
    Implicit Runge-Kutta Processes

    By J. C. Butcher

    1. Introduction. A Runge-Kutta process is a means of obtaining an approximation $\hat{y}$ to the solution at $x = x_0 + h$ for the system

    $$\frac{dy}{dx} = f(y), \qquad y = y_0 \ \text{at}\ x = x_0,$$

    where $y$ is a vector of $n$ elements and $f(y)$ a vector function of these elements. The equations defining $\hat{y}$ for a $\nu$ stage Runge-Kutta process are*

    $$g^{(i)} = f\Bigl(y_0 + h \sum_{j=1}^{\nu} a_{ij}\, g^{(j)}\Bigr), \qquad i = 1, 2, \ldots, \nu,$$

    $$\hat{y} = y_0 + h \sum_{i=1}^{\nu} b_i\, g^{(i)},$$

    where the coefficients $a_{ij}$, $b_i$ ($i, j = 1, 2, \ldots, \nu$) are numerical constants.

    It was shown [1] that the true solution $\bar{y}$ and the approximation $\hat{y}$ can be expanded in power series given by the equations†

    (1) $\bar{y} = y_0 + \sum$ …


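    For convenience we shall designate the process by an array as follows

    $$\begin{array}{cccc|c}
    a_{11} & a_{12} & \cdots & a_{1\nu} & c_1 \\
    a_{21} & a_{22} & \cdots & a_{2\nu} & c_2 \\
    \vdots & \vdots & & \vdots & \vdots \\
    a_{\nu 1} & a_{\nu 2} & \cdots & a_{\nu\nu} & c_\nu \\ \hline
    b_1 & b_2 & \cdots & b_\nu &
    \end{array}$$

    where $c_i = \sum_{j=1}^{\nu} a_{ij}$. A well known example of an explicit process is the following due to Kutta [3]; in this case $p = 3$.

    $$\begin{array}{ccc|c}
    0 & 0 & 0 & 0 \\
    \tfrac12 & 0 & 0 & \tfrac12 \\
    -1 & 2 & 0 & 1 \\ \hline
    \tfrac16 & \tfrac23 & \tfrac16 &
    \end{array}$$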

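    In contrast we have as examples of implicit and of semi-explicit processes the following:

    $$\begin{array}{ccc|c}
    0 & 0 & 0 & 0 \\
    \tfrac{5}{24} & \tfrac13 & -\tfrac{1}{24} & \tfrac12 \\
    \tfrac16 & \tfrac23 & \tfrac16 & 1 \\ \hline
    \tfrac16 & \tfrac23 & \tfrac16 &
    \end{array}
    \qquad\qquad
    \begin{array}{ccc|c}
    0 & 0 & 0 & 0 \\
    \tfrac14 & \tfrac14 & 0 & \tfrac12 \\
    0 & 1 & 0 & 1 \\ \hline
    \tfrac16 & \tfrac23 & \tfrac16 &
    \end{array}$$

    The first of these is equivalent to a process due to R. F. Clippinger and B. Dimsdale and quoted by Kunz [8]. On the other hand the semi-explicit process appears to have been previously overlooked even though it is of comparable accuracy and more convenient for practical use. Using results proved in section 2 of this paper or by simply verifying (3) for the appropriate $\Phi$ it is easy to verify that $p = 4$ in each case.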
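    The orders quoted for these three examples can be checked mechanically. The sketch below is an illustration rather than part of the original argument: it assumes the eight order conditions for trees with at most four nodes in their usual matrix form, and it assumes the three arrays have the entries commonly quoted for Kutta's third-order process, the Clippinger-Dimsdale process and the semi-explicit process. The names `order_conditions` and `order` are hypothetical.

```python
from fractions import Fraction as Fr

def order_conditions(A, b):
    """Pairs (left side, 1/gamma) of the order conditions, grouped by order r <= 4."""
    n = len(b)
    c = [sum(row) for row in A]                      # c_i = sum_j a_ij
    def mat(v):                                      # (A v)_i = sum_j a_ij v_j
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    def dot(v):                                      # sum_i b_i v_i
        return sum(b[i] * v[i] for i in range(n))
    c2 = [x ** 2 for x in c]
    c3 = [x ** 3 for x in c]
    Ac = mat(c)
    return {
        1: [(dot([Fr(1)] * n), Fr(1))],
        2: [(dot(c), Fr(1, 2))],
        3: [(dot(c2), Fr(1, 3)), (dot(Ac), Fr(1, 6))],
        4: [(dot(c3), Fr(1, 4)),
            (dot([c[i] * Ac[i] for i in range(n)]), Fr(1, 8)),
            (dot(mat(c2)), Fr(1, 12)),
            (dot(mat(Ac)), Fr(1, 24))],
    }

def order(A, b):
    """Largest p <= 4 such that every condition of order <= p holds exactly."""
    p = 0
    for r, conds in sorted(order_conditions(A, b).items()):
        if any(lhs != rhs for lhs, rhs in conds):
            break
        p = r
    return p

B_ROW = [Fr(1, 6), Fr(2, 3), Fr(1, 6)]              # all three examples share this row
KUTTA = [[Fr(0), Fr(0), Fr(0)], [Fr(1, 2), Fr(0), Fr(0)], [Fr(-1), Fr(2), Fr(0)]]
IMPLICIT = [[Fr(0), Fr(0), Fr(0)],
            [Fr(5, 24), Fr(1, 3), Fr(-1, 24)],
            [Fr(1, 6), Fr(2, 3), Fr(1, 6)]]
SEMI_EXPLICIT = [[Fr(0), Fr(0), Fr(0)], [Fr(1, 4), Fr(1, 4), Fr(0)], [Fr(0), Fr(1), Fr(0)]]

for name, A in [("Kutta", KUTTA), ("implicit", IMPLICIT), ("semi-explicit", SEMI_EXPLICIT)]:
    print(name, order(A, B_ROW))
```

    Exact rational arithmetic is used so that each condition is tested as an identity rather than to rounding error. Because only conditions up to order 4 are included, a result of 4 means "at least 4" as far as this check is concerned.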

    It is well to consider what we might hope to gain by relaxing the restriction of allowing only explicit processes. In explicit processes we have $\nu(\nu + 1)/2$ ($= N_E$, say) coefficients $a_{ij}$, $b_i$ to choose while in the semi-explicit case the number is increased to $N_S = \nu(\nu + 3)/2$. However, for implicit processes we have available the complete set of $N_I = \nu(\nu + 1)$ coefficients. It is reasonable to hope that with $N$ variables we can satisfy $M$ restrictions as long as $M \le N$.

    The numbers $N_E$, $N_S$, $N_I$ are shown in Table 1 as are the numbers of restrictions $M$ corresponding to different values of $p$. This number is, of course, the number of elementary differentials with orders not exceeding $p$, or what is equivalent, the number of rooted trees with no more than $p$ nodes [9].

    Basing our comparison on this table we see that the gain in accuracy that can be achieved by the use of implicit rather than explicit processes is not more than one power of $h$ and in some cases semi-explicit processes do not give even this gain.

    License or copyright restrictions may apply to redistribution; see http://www.ams.org/journal-terms-of-use


    Table 1

    ν or p    N_E    N_S    N_I     M
       1        1      2      2     1
       2        3      5      6     2
       3        6      9     12     4
       4       10     14     20     8
       5       15     20     30    17
       6       21     27     42    37
       7       28     35     56    85
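    The entries of Table 1 follow directly from the three closed forms for $N_E$, $N_S$, $N_I$ and from a count of rooted trees. The sketch below is illustrative only: it uses the standard recurrence for the number of rooted trees with a given number of nodes, which is an assumption brought in here and not a method taken from the paper, and the function names are hypothetical.

```python
def rooted_trees(nmax):
    """t[r] = number of rooted trees with exactly r nodes, for r = 1..nmax."""
    t = [0, 1]
    for n in range(1, nmax):
        total = 0
        for k in range(1, n + 1):
            # sum over divisors d of k of d * t[d]
            s = sum(d * t[d] for d in range(1, k + 1) if k % d == 0)
            total += s * t[n - k + 1]
        t.append(total // n)          # the recurrence guarantees exact divisibility
    return t

def table1(nmax):
    """Rows (nu or p, N_E, N_S, N_I, M) as in Table 1."""
    t = rooted_trees(nmax)
    rows = []
    for v in range(1, nmax + 1):
        n_e = v * (v + 1) // 2        # explicit processes
        n_s = v * (v + 3) // 2        # semi-explicit processes
        n_i = v * (v + 1)             # implicit processes
        m = sum(t[1:v + 1])           # trees with no more than v nodes
        rows.append((v, n_e, n_s, n_i, m))
    return rows

for row in table1(7):
    print(row)
```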

    However, the type of argument based on merely counting the number of equations of the form (3) that must be satisfied ignores the relationships between them. In fact, it happens that the $M$ restrictions of Table 1 can often be satisfied with considerably fewer than $M$ variables.

    Section 2 of this paper will be devoted to some general results on the relationships between the different $\Phi$, and the following sections will contain a study of a set of processes which are generalizations of the Gauss-Legendre quadrature formulae.

    2. Some General Results. It is convenient to restrict ourselves in this paper to processes in which $c_1, c_2, \ldots, c_\nu$ are all distinct and none of $b_1, b_2, \ldots, b_\nu$ vanishes. If these restrictions are relaxed some results of this section will require slight modifications.

    We shall use the symbols $A$, $B$, $C$, $D$, $E$ to represent certain statements about the numbers $a_{ij}$, $b_i$. These statements will depend on one or, in the case of $E$, two integral parameters which we will write as though they were arguments and the statement symbol a function of them. In the definitions of the symbols which now follow, $k$ and $l$ will always denote positive integers.

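    $A(\xi)$: $\Phi = \dfrac{1}{\gamma}$ whenever $r \le \xi$.

    $B(\xi)$: $[\phi^{k-1}] = \sum_{i=1}^{\nu} b_i c_i^{k-1} = \dfrac{1}{k}$ for $k \le \xi$.

    $C(\xi)$: $\sum_{j=1}^{\nu} a_{ij} c_j^{k-1} = \dfrac{1}{k} c_i^{k}$ for $i = 1, 2, \ldots, \nu$ and $k \le \xi$.

    $D(\xi)$: $\sum_{i=1}^{\nu} b_i c_i^{k-1} a_{ij} = \dfrac{1}{k} b_j (1 - c_j^{k})$ for $j = 1, 2, \ldots, \nu$ and $k \le \xi$.

    $E(\xi, \eta)$: $[\phi^{k-1}[\phi^{l-1}]] = \sum_{i=1}^{\nu} \sum_{j=1}^{\nu} b_i c_i^{k-1} a_{ij} c_j^{l-1} = \dfrac{1}{l(k + l)}$ for $k \le \xi$ and $l \le \eta$.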
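    The statements $B$, $C$, $D$ and $E$ are finite sets of polynomial identities in the coefficients, so they can be tested directly for a given array. The following sketch is an illustration under stated assumptions: it takes the four statements in the sum form written above, uses the semi-explicit array with rows $(0, 0, 0)$, $(\tfrac14, \tfrac14, 0)$, $(0, 1, 0)$ and $b = (\tfrac16, \tfrac23, \tfrac16)$ as its test case, and the function names are hypothetical.

```python
from fractions import Fraction as Fr

def stmt_B(xi, A, b, c):
    return all(sum(bi * ci ** (k - 1) for bi, ci in zip(b, c)) == Fr(1, k)
               for k in range(1, xi + 1))

def stmt_C(xi, A, b, c):
    n = len(b)
    return all(sum(A[i][j] * c[j] ** (k - 1) for j in range(n)) == c[i] ** k / k
               for i in range(n) for k in range(1, xi + 1))

def stmt_D(xi, A, b, c):
    n = len(b)
    return all(sum(b[i] * c[i] ** (k - 1) * A[i][j] for i in range(n))
               == b[j] * (1 - c[j] ** k) / k
               for j in range(n) for k in range(1, xi + 1))

def stmt_E(xi, eta, A, b, c):
    n = len(b)
    return all(sum(b[i] * c[i] ** (k - 1) * A[i][j] * c[j] ** (l - 1)
                   for i in range(n) for j in range(n)) == Fr(1, l * (k + l))
               for k in range(1, xi + 1) for l in range(1, eta + 1))

# Semi-explicit example (entries assumed as stated in the lead-in).
A = [[Fr(0), Fr(0), Fr(0)], [Fr(1, 4), Fr(1, 4), Fr(0)], [Fr(0), Fr(1), Fr(0)]]
b = [Fr(1, 6), Fr(2, 3), Fr(1, 6)]
c = [sum(row) for row in A]

print(stmt_B(4, A, b, c), stmt_C(2, A, b, c), stmt_D(2, A, b, c), stmt_E(2, 2, A, b, c))
```

    For this array $B(4)$, $C(2)$ and $D(2)$ all hold, and $E(2, 2)$ holds as well, while $C(3)$ fails.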

    A number of theorems which are needed in the later sections of this paper can be expressed conveniently in terms of these symbols. After a statement of these the proofs will follow.

    Theorem 1. If $A(\xi)$, then $B(\xi)$.

    Theorem 2. If $A(\xi + \eta)$, then $E(\xi, \eta)$.

    Theorem 3. If $B(\xi + \eta)$ and $C(\eta)$, then $E(\xi, \eta)$.

    Theorem 4. If $B(\xi + \eta)$ and $D(\xi)$, then $E(\xi, \eta)$.

    Theorem 5. If $B(\nu + \eta)$ and $E(\nu, \eta)$, then $C(\eta)$.

    Theorem 6. If $B(\xi + \nu)$ and $E(\xi, \nu)$, then $D(\xi)$.

    Theorem 7. If $B(\zeta)$, $C(\eta)$ and $D(\xi)$, where $\zeta \le \xi + \eta + 1$, $\zeta \le 2\eta + 2$, then $A(\zeta)$.

    To prove Theorem 1, we note that $r = k$ for $\Phi = [\phi^{k-1}]$ …


    Using this symbol we see immediately that if $i = $< (i = 1, 2, , s) then[$!$2 ... $j = [$/c>2' $,']. As a preliminary to the proof of the theorem weshow by induction the lemma that

    7$ s r[ = [$i$2 $] and that

    (8) 7i*i - r^-'lfsince the orders n , r2 r, of $i, 4?2 ,* are each less than r, we have

    y = ryi72 7.

    so that

    7$ = ryi 7j 7[1 $2 ... $,]

    -mm---fJI#'1"lI^] [*r'_1]]

    rSUinE 3 c/1_1 jf r2 Z y cPj \r^l aa tyu)

    Using $C(\eta)$ and the fact that none of $r_1, r_2, \ldots, r_s$ can exceed $\eta$ we find

    $\gamma\Phi = r \sum b$ …

    … $\eta + 1$ we write $\Phi = [\Phi_1 \Phi_2 \cdots \Phi_s]$ and suppose that $\Phi'$ is one of $\Phi_1, \Phi_2, \ldots, \Phi_s$ chosen so that its order $r'$ is not exceeded by any of $r_1, r_2, \ldots, r_s$. For $r' \le \eta$ the lemma still holds so we need consider only the cases $r' > \eta$. We now use induction on $r$, $r'$, assuming the result true for all lower values of $r$ and, with a given $r$, for all lower values of $r'$.

    No two of $r_1, r_2, \ldots, r_s$ can exceed $\eta$ for otherwise we would have

    $$r = 1 + r_1 + r_2 + \cdots + r_s \ge 2\eta + 3 > \zeta.$$

    Using the same sort of calculation as in the proof of the lemma we find $\gamma\Phi = r\gamma'\,[\Phi'$ …


    and, by the induction hypothesis,

    $ = .7

    Now $r - r' \le (\xi + \eta + 1) - (\eta + 1) = \xi$. Hence, using $D(\xi)$ we have

    y* - -m - eT')x!r r i=i; (*' - [pr-r */*' CD

    r _ r'

    where $\Phi'$, being of order greater than 1, can be written

    $$\Phi' = [\Phi_1' \Phi_2' \cdots \Phi_{s'}']$$

    The orders of $\Phi_1', \Phi_2', \ldots, \Phi_{s'}'$ are each less than $r$ so, using the induction hypothesis, a short calculation enables us to evaluate

    Hence

    proving Theorem 7.

    [r r $1'$2/ $'..] = .

    7 T

    7$ = -^(i7--!) = l,r r \y yr/

    3. Integration Processes Based on Gaussian Quadrature Formulas. For the system

    (9) $\dfrac{dy_1}{dx} = 1, \quad \dfrac{dy_2}{dx} = f(y_1), \qquad y_1 = x_0,\ y_2 = 0$ at $x = x_0$,

    the integration process reduces to a means of evaluating $I = \int_{x_0}^{x_0 + h} f(x)\,dx$ using the quadrature formula $I \approx h \sum_{i=1}^{\nu} b_i f(x_0 + h c_i)$. For this formula to be accurate to terms in $h^p$, it is not necessary for all the conditions implied by $A(p)$ to be satisfied but only those contained in the statement $B(p)$. Hence to each Runge-Kutta process there corresponds a quadrature formula characterized by the values of $b_1, b_2, \ldots, b_\nu$; $c_1, c_2, \ldots, c_\nu$. In this section we concern ourselves with the well known Gauss-Legendre quadrature formulae [10]. It will be found that to each such formula, as adapted for integrating in the range $(x_0, x_0 + h)$, there corresponds a unique Runge-Kutta process with the same order of accuracy.

    For the form of the Gauss-Legendre formula that we use, $c_1, c_2, \ldots, c_\nu$ are the roots of the equation $P_\nu(2c - 1) = 0$, where $P_\nu(x)$ is the Legendre polynomial of degree $\nu$. The consequence of this choice is that when any $\nu$ of the equations of the set $B(2\nu)$ hold the rest do also, so that $b_1, b_2, \ldots, b_\nu$ can be found as solutions to these equations.

    The important results of this section are the following:

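    Table 2. Processes based on Gauss-Legendre quadrature formulas

    $\nu = 1$:

    $$\begin{array}{c|c}
    \tfrac12 & \tfrac12 \\ \hline
    1 &
    \end{array}$$

    $\nu = 2$:

    $$\begin{array}{cc|c}
    \tfrac14 & \tfrac14 - \tfrac{\sqrt3}{6} & \tfrac12 - \tfrac{\sqrt3}{6} \\
    \tfrac14 + \tfrac{\sqrt3}{6} & \tfrac14 & \tfrac12 + \tfrac{\sqrt3}{6} \\ \hline
    \tfrac12 & \tfrac12 &
    \end{array}$$

    $\nu = 3$:

    $$\begin{array}{ccc|c}
    \tfrac{5}{36} & \tfrac29 - \tfrac{\sqrt{15}}{15} & \tfrac{5}{36} - \tfrac{\sqrt{15}}{30} & \tfrac12 - \tfrac{\sqrt{15}}{10} \\
    \tfrac{5}{36} + \tfrac{\sqrt{15}}{24} & \tfrac29 & \tfrac{5}{36} - \tfrac{\sqrt{15}}{24} & \tfrac12 \\
    \tfrac{5}{36} + \tfrac{\sqrt{15}}{30} & \tfrac29 + \tfrac{\sqrt{15}}{15} & \tfrac{5}{36} & \tfrac12 + \tfrac{\sqrt{15}}{10} \\ \hline
    \tfrac{5}{18} & \tfrac49 & \tfrac{5}{18} &
    \end{array}$$

    $\nu = 4$:

    $$\begin{array}{cccc|c}
    \omega_1 & \omega_1' - \omega_3 + \omega_4' & \omega_1' - \omega_3 - \omega_4' & \omega_1 - \omega_5 & \tfrac12 - \omega_2 \\
    \omega_1 - \omega_3' + \omega_4 & \omega_1' & \omega_1' - \omega_5' & \omega_1 - \omega_3' - \omega_4 & \tfrac12 - \omega_2' \\
    \omega_1 + \omega_3' + \omega_4 & \omega_1' + \omega_5' & \omega_1' & \omega_1 + \omega_3' - \omega_4 & \tfrac12 + \omega_2' \\
    \omega_1 + \omega_5 & \omega_1' + \omega_3 + \omega_4' & \omega_1' + \omega_3 - \omega_4' & \omega_1 & \tfrac12 + \omega_2 \\ \hline
    2\omega_1 & 2\omega_1' & 2\omega_1' & 2\omega_1 &
    \end{array}$$

    where

    $$\omega_1 = \tfrac18 - \tfrac{\sqrt{30}}{144}, \quad \omega_1' = \tfrac18 + \tfrac{\sqrt{30}}{144}, \quad \omega_2 = \tfrac12 \sqrt{\tfrac{15 + 2\sqrt{30}}{35}}, \quad \omega_2' = \tfrac12 \sqrt{\tfrac{15 - 2\sqrt{30}}{35}},$$

    $$\omega_3 = \omega_2 \Bigl(\tfrac16 + \tfrac{\sqrt{30}}{24}\Bigr), \quad \omega_3' = \omega_2' \Bigl(\tfrac16 - \tfrac{\sqrt{30}}{24}\Bigr), \quad \omega_4 = \omega_2 \Bigl(\tfrac{1}{21} + \tfrac{5\sqrt{30}}{168}\Bigr), \quad \omega_4' = \omega_2' \Bigl(\tfrac{1}{21} - \tfrac{5\sqrt{30}}{168}\Bigr),$$

    $$\omega_5 = \omega_2 - 2\omega_3, \quad \omega_5' = \omega_2' - 2\omega_3'.$$

    $\nu = 5$:

    $$\begin{array}{ccccc|c}
    \omega_1 & \omega_1' - \omega_3 + \omega_4' & \tfrac{32}{225} - \omega_5 & \omega_1' - \omega_3 - \omega_4' & \omega_1 - \omega_6 & \tfrac12 - \omega_2 \\
    \omega_1 - \omega_3' + \omega_4 & \omega_1' & \tfrac{32}{225} - \omega_5' & \omega_1' - \omega_6' & \omega_1 - \omega_3' - \omega_4 & \tfrac12 - \omega_2' \\
    \omega_1 + \omega_7 & \omega_1' + \omega_7' & \tfrac{32}{225} & \omega_1' - \omega_7' & \omega_1 - \omega_7 & \tfrac12 \\
    \omega_1 + \omega_3' + \omega_4 & \omega_1' + \omega_6' & \tfrac{32}{225} + \omega_5' & \omega_1' & \omega_1 + \omega_3' - \omega_4 & \tfrac12 + \omega_2' \\
    \omega_1 + \omega_6 & \omega_1' + \omega_3 + \omega_4' & \tfrac{32}{225} + \omega_5 & \omega_1' + \omega_3 - \omega_4' & \omega_1 & \tfrac12 + \omega_2 \\ \hline
    2\omega_1 & 2\omega_1' & \tfrac{64}{225} & 2\omega_1' & 2\omega_1 &
    \end{array}$$

    where

    $$\omega_1 = \tfrac{322 - 13\sqrt{70}}{3600}, \quad \omega_1' = \tfrac{322 + 13\sqrt{70}}{3600}, \quad \omega_2 = \tfrac12 \sqrt{\tfrac{35 + 2\sqrt{70}}{63}}, \quad \omega_2' = \tfrac12 \sqrt{\tfrac{35 - 2\sqrt{70}}{63}},$$

    $$\omega_3 = \omega_2 \Bigl(\tfrac{452 + 59\sqrt{70}}{3240}\Bigr), \quad \omega_3' = \omega_2' \Bigl(\tfrac{452 - 59\sqrt{70}}{3240}\Bigr), \quad \omega_4 = \omega_2 \Bigl(\tfrac{64 + 11\sqrt{70}}{1080}\Bigr), \quad \omega_4' = \omega_2' \Bigl(\tfrac{64 - 11\sqrt{70}}{1080}\Bigr),$$

    $$\omega_5 = 8\omega_2 \Bigl(\tfrac{23 - \sqrt{70}}{405}\Bigr), \quad \omega_5' = 8\omega_2' \Bigl(\tfrac{23 + \sqrt{70}}{405}\Bigr), \quad \omega_6 = \omega_2 - 2\omega_3 - \omega_5, \quad \omega_6' = \omega_2' - 2\omega_3' - \omega_5',$$

    $$\omega_7 = \omega_2 \Bigl(\tfrac{308 - 23\sqrt{70}}{960}\Bigr), \quad \omega_7' = \omega_2' \Bigl(\tfrac{308 + 23\sqrt{70}}{960}\Bigr).$$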

    Theorem 8. If $A(2\nu)$ then $B(2\nu)$, $C(\nu)$, $D(\nu)$.

    Theorem 9. If $B(2\nu)$ (hence a Gauss-Legendre formula) then either of $C(\nu)$, $D(\nu)$ implies the other and either implies $A(2\nu)$.

    Theorem 10. Given values of $b_1, b_2, \ldots, b_\nu$, $c_1, c_2, \ldots, c_\nu$, either $C(\nu)$ or $D(\nu)$ defines $a_{ij}$ ($i, j = 1, 2, \ldots, \nu$) uniquely.

    To prove Theorem 8 we use Theorems 1 and 2 to deduce $B(2\nu)$ and $E(\nu, \nu)$ and then Theorems 5 and 6 to deduce $C(\nu)$ and $D(\nu)$.

    Theorem 9 also follows easily from the previous results. From Theorems 3 and 4, $B(2\nu)$ with either of $C(\nu)$, $D(\nu)$ implies $E(\nu, \nu)$; whereas from Theorems 5 and 6, $B(2\nu)$ and $E(\nu, \nu)$ imply both $C(\nu)$ and $D(\nu)$. Finally we use Theorem 7 with the values $\eta = \xi = \nu$, $\zeta = 2\nu$ to deduce $A(2\nu)$.

    Theorem 10 follows from the fact that $C(\nu)$, $D(\nu)$ each consist of $\nu$ sets of $\nu$ equations in $\nu$ unknowns and, since $b_1, b_2, \ldots, b_\nu$ are all non-zero and $c_1, c_2, \ldots, c_\nu$ are all distinct, the matrices of the coefficients in the various sets are non-singular.

    The sets of equations symbolized by $C(\nu)$ are slightly simpler than those symbolized by $D(\nu)$ so it is the former sets that we use for practical evaluations of $a_{11}, a_{12}, \ldots, a_{\nu\nu}$. To find the parameters in a $\nu$ stage Runge-Kutta process of order $2\nu$ we may thus perform the following steps:

    (1) Evaluate the roots $c_1, c_2, \ldots, c_\nu$ of $P_\nu(2c - 1) = 0$.

    (2) For each value of $i$ ($i = 1, 2, \ldots, \nu$) find $a_{ij}$ ($j = 1, 2, \ldots, \nu$) as solutions to the linear system

    $$\sum_{j=1}^{\nu} a_{ij} c_j^{k-1} = \frac{1}{k} c_i^{k} \qquad (k = 1, 2, \ldots, \nu).$$

    (3) Find $b_j$ ($j = 1, 2, \ldots, \nu$) as solutions to the linear system

    $$\sum_{j=1}^{\nu} b_j c_j^{k-1} = \frac{1}{k} \qquad (k = 1, 2, \ldots, \nu).$$

    In Table 2 values of the parameters are given for $\nu = 1, 2, 3, 4, 5$. It will be noticed that for the case $\nu = 2$ the process is the same as one suggested by Hammer and Hollingsworth [11].
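    Steps (1) to (3) translate almost line for line into a short numerical routine. The sketch below is one possible floating-point realization, not the procedure used to prepare Table 2: it assumes NumPy, takes the Legendre roots from `numpy.polynomial.legendre.leggauss` and maps them from $[-1, 1]$ to $[0, 1]$, and solves the Vandermonde-type systems with `numpy.linalg.solve`. The function name `gauss_process` is hypothetical.

```python
import numpy as np

def gauss_process(nu):
    """Return (A, b, c) for the nu-stage process, following steps (1)-(3)."""
    # Step (1): roots of P_nu(2c - 1) = 0, i.e. Legendre roots x mapped by c = (x + 1) / 2.
    x, _ = np.polynomial.legendre.leggauss(nu)
    c = (x + 1.0) / 2.0

    k = np.arange(1, nu + 1)
    V = c[None, :] ** (k[:, None] - 1)            # V[k-1, j] = c_j^(k-1)

    # Step (2): for each i, sum_j a_ij c_j^(k-1) = c_i^k / k  (k = 1..nu).
    rhs = c[None, :] ** k[:, None] / k[:, None]   # rhs[k-1, i] = c_i^k / k
    A = np.linalg.solve(V, rhs).T                 # column i of the solution is row i of A

    # Step (3): sum_j b_j c_j^(k-1) = 1 / k  (k = 1..nu).
    b = np.linalg.solve(V, 1.0 / k)
    return A, b, c

A, b, c = gauss_process(2)
print(A)
print(b, c)
```

    For moderate $\nu$ this is adequate, but the coefficient matrix becomes ill-conditioned as $\nu$ grows, so closed forms such as those of Table 2 remain preferable where they are available.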

    4. The Error Term. For a process of order $p$, the two series (1) and (2) are identical for $r \le p$. Hence the error vector $\hat{y} - \bar{y}$ is given by

    $$\hat{y} - \bar{y} = \sum_{r=p+1}^{\infty} \frac{h^r}{(r-1)!} \sum \alpha \Bigl(\Phi - \frac{1}{\gamma}\Bigr) F$$

    and we shall suppose that $h$ is sufficiently small for this to be approximated by

    $$\frac{h^{p+1}}{p!} \sum_{r=p+1} \alpha \Bigl(\Phi - \frac{1}{\gamma}\Bigr) F = \frac{h^{p+1}}{p!} \sum_{r=p+1} \alpha\,\delta F = h^{p+1} \sum_{r=p+1} \varepsilon F$$

    where

    $$\delta = \Phi - \frac{1}{\gamma}, \qquad \varepsilon = \frac{\alpha\,\delta}{p!}.$$

    The coefficient of $h^{p+1}$ in $\hat{y} - \bar{y}$ is called by Henrici [12] "the principal error function" and our approximation is to assume that the principal error function is the only important contributor to the truncation error.

    We now restrict ourselves to the processes considered in the previous section so that $p = 2\nu$. We shall study properties of $\delta$ for the different $F$ of order $2\nu + 1$ so that a simple procedure can be found for evaluating $\varepsilon$ in these cases.

    Suppose for an elementary differential $F$ of order $2\nu + 1$ we have

    (10) $F = \{F_1 F_2 \cdots F_s\}$

    and the orders of $F_1, F_2, \ldots, F_s$ are $r_1, r_2, \ldots, r_s$. Then since

    $$1 + r_1 + r_2 + \cdots + r_s = 2\nu + 1$$

    it is clear that the number of $r_1, r_2, \ldots, r_s$ which exceed $\nu$ is either 0 or 1. In the former case we describe $F$ as a "central" elementary differential and in the latter case a "non-central" elementary differential.

    If $F$ is non-central, suppose $r_1 > \nu$ and that

    (11) $F_1 = \{\bar{F}_1 \bar{F}_2 \cdots \bar{F}_t\}$.

    Consider

    (12) $F' = \{\bar{F}_1 \bar{F}_2 \cdots \bar{F}_t \{F_2 F_3 \cdots F_s\}\}$


    so that the orders $\bar{r}_1, \bar{r}_2, \ldots, \bar{r}_t$, $2\nu + 1 - r_1$ of $\bar{F}_1, \bar{F}_2, \ldots, \bar{F}_t$, $\{F_2 F_3 \cdots F_s\}$ are each less than $r_1$ while the order of $F'$ is the same as that of $F$.

    Either $F'$ is central or we may form $F''$ from $F'$ in the same way as $F'$ was formed from $F$. We may continue this process, forming in turn $F, F', F'', \ldots$ until the sequence is terminated at a central elementary differential, $F^*$ say. It will be proved that $\delta = -\delta'$ and hence that $\delta = -\delta' = \delta'' = \cdots = \pm\delta^*$ where $\delta^*$ corresponds to $F^*$ and the sign to be used is determined by the parity of the number of members in the sequence $F', F'', \ldots, F^*$.

    We now summarize the principal results of this section.

    Theorem 11. If $F = \{f^{2\nu}\}$, $\delta = -(\nu!)^4 / \{(2\nu)!\,(2\nu + 1)!\}$.

    Theorem 12. If $F$ is central, $\delta = -(\nu!)^4 / \{[(2\nu)!]^2 \gamma\}$.

    Theorem 13. If $F$ is non-central, and $F'$ is as defined by (12), $\delta = -\delta'$.

    To prove Theorem 11 we write the polynomial $P_\nu(2c - 1)$ as

    $$P_\nu(2c - 1) = p_0 + p_1 c + p_2 c^2 + \cdots + p_\nu c^\nu$$

    and consider the equations $B(2\nu)$,

    $$\sum_{i=1}^{\nu} b_i c_i^{k-1} = \frac{1}{k},$$ …


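    so that

    $$\delta = -\frac{D_{\nu+1}}{D_\nu} = -\frac{(\nu!)^4}{(2\nu)!\,(2\nu + 1)!}$$

    where we have written (see for example [13])

    $$D_N = \begin{vmatrix}
    1 & \tfrac12 & \cdots & \tfrac1N \\
    \tfrac12 & \tfrac13 & \cdots & \tfrac{1}{N+1} \\
    \vdots & \vdots & & \vdots \\
    \tfrac1N & \tfrac{1}{N+1} & \cdots & \tfrac{1}{2N-1}
    \end{vmatrix}
    = \frac{[1!\,2!\,3! \cdots (N-1)!]^3}{N!\,(N+1)! \cdots (2N-1)!}.$$

    This completes the proof of Theorem 11.

    Since $r_1, r_2, \ldots, r_s$ are no more than $\nu$ for a central elementary differential, (8) holds and the proof of the lemma that was introduced before that of Theorem 7 holds in this case as well. We thus have

    $$\gamma\Phi = (2\nu + 1)[\phi^{2\nu}] = 1 - \frac{(\nu!)^4}{[(2\nu)!]^2}$$

    and Theorem 12 follows on dividing by $\gamma$.

    Using the notation of (10), (11) and (12) we compute using, as usual, equation (31) of [1]

    $$\gamma_1 = r_1 \bar\gamma_1 \bar\gamma_2 \cdots \bar\gamma_t,$$

    $$\gamma = r_1 (2\nu + 1)\, \bar\gamma_1 \bar\gamma_2 \cdots \bar\gamma_t\, \gamma_2 \gamma_3 \cdots \gamma_s,$$

    $$\gamma' = (2\nu + 1)(2\nu - r_1 + 1)\, \bar\gamma_1 \bar\gamma_2 \cdots \bar\gamma_t\, \gamma_2 \gamma_3 \cdots \gamma_s,$$

    so that

    $$\frac{1}{\gamma} + \frac{1}{\gamma'} = \frac{1}{(2\nu - r_1 + 1)\, \gamma_1 \gamma_2 \cdots \gamma_s}.$$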

    Again employing the methods of Theorem 7 we arrive at the results

    $ =7i 73 7,

    [*!

    $' =

    i2v r + 1)72 73 7

    r; T T r ,2yr

    ($1 - [#1 #2 *,


    Hence

    $$\Phi + \Phi' = \frac{1}{(2\nu - r_1 + 1)\, \gamma_1 \gamma_2 \gamma_3 \cdots \gamma_s} = \frac{1}{\gamma} + \frac{1}{\gamma'}$$

    which is equivalent to the result of Theorem 13.

    Although in principle these results enable us to write down the coefficients $\varepsilon$ for each elementary differential of order $2\nu + 1$ occurring in $\hat{y} - \bar{y}$, the practical value of this is lessened by the large numbers of such terms which actually occur. However, as some guide to the truncation error we shall evaluate the coefficients $\varepsilon_1$ and $\varepsilon_2$†, say, for $F_1 = \{f^{2\nu}\}$ and $F_2 = \{_{2\nu} f\}_{2\nu}$.

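    These may be regarded as extreme cases; $F_1$ is the only elementary differential of order $2\nu + 1$ which involves derivatives of order $2\nu$ of the elements of $f(y)$ while $F_2$ is the only elementary differential of the same order involving only first derivatives. If $\dfrac{\partial f}{\partial y}$ is the matrix whose $(i, j)$ element is $\dfrac{\partial f_i}{\partial y_j}$ then we can write

    $$F_2 = \Bigl(\frac{\partial f}{\partial y}\Bigr)^{2\nu} f, \qquad
    F_1 = \sum_{i_1=1}^{n} \sum_{i_2=1}^{n} \cdots \sum_{i_{2\nu}=1}^{n} \frac{\partial^{2\nu} f}{\partial y_{i_1} \partial y_{i_2} \cdots \partial y_{i_{2\nu}}}\, f_{i_1} f_{i_2} \cdots f_{i_{2\nu}}.$$

    $\delta_1$ is given by Theorem 11 as $-(\nu!)^4 / \{(2\nu)!\,(2\nu + 1)!\}$ while $\alpha_1$ is given immediately from Theorem 7 of [1] as 1.

    Hence, we have

    $$\varepsilon_1 = \frac{\alpha_1 \delta_1}{(2\nu)!} = \frac{-(\nu!)^4}{[(2\nu)!]^3 (2\nu + 1)}.$$

    On the other hand, $F_2$ is non-central and the central $F^*$ corresponding to it is $\{\{_{\nu-1} f\}_{\nu-1}^{2}\}$. $\gamma^*$ is $(\nu!)^2 (2\nu + 1)$ so that

    $$\delta_2 = \pm\delta^* = \frac{\mp(\nu!)^2}{(2\nu)!\,(2\nu + 1)!}$$

    and the upper (or lower) sign is to be used if $\nu$ is even (or odd). It is easy to see that $\alpha_2$ is $(2\nu)!$. Thus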

    $$\varepsilon_2 = \frac{\alpha_2 \delta_2}{(2\nu)!} = \frac{\mp(\nu!)^2}{(2\nu)!\,(2\nu + 1)!}.$$

    The first few values of $\varepsilon_1$ and $\varepsilon_2$ may be found from Table 3.

    It is natural to compare the integration processes described here with methods based on quadrature formulae such as the method of Stoller and Morrison [14]. In this type of method $y$ is estimated at $x_0 + c_i h$ ($i = 1, 2, \ldots, \nu$) by some convenient integration process and $\hat{y}$ is finally found at $x_0 + h$ by the quadrature formula

    $\hat{y} = y_0 + h \sum b_i f(y$ …

    † The convention concerning subscripted constants $\alpha$, $\beta$, $\gamma$ will be extended to $\delta$ and $\varepsilon$.

    Table 3

    ν             1/ε_1            1/ε_2
    1               -24              +12
    2             -4320             -720
    3          -2016000          +100800
    4       -1778112000        -25401600
    5    -2534876467200     +10059033600
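    The entries of Table 3 can be reproduced from the closed forms for $\varepsilon_1$ and $\varepsilon_2$ derived in this section. The following sketch is illustrative only; it assumes $\varepsilon_1 = -(\nu!)^4 / \{[(2\nu)!]^3 (2\nu + 1)\}$ and $\varepsilon_2 = \mp(\nu!)^2 / \{(2\nu)!\,(2\nu + 1)!\}$ with the upper sign for even $\nu$, and the function name is hypothetical.

```python
from fractions import Fraction
from math import factorial

def error_constants(nu):
    """Return (eps1, eps2) for the nu-stage process of order 2*nu as exact fractions."""
    f = factorial
    eps1 = Fraction(-f(nu) ** 4, f(2 * nu) ** 3 * (2 * nu + 1))
    sign = -1 if nu % 2 == 0 else 1            # upper (minus) sign when nu is even
    eps2 = Fraction(sign * f(nu) ** 2, f(2 * nu) * f(2 * nu + 1))
    return eps1, eps2

for nu in range(1, 6):
    e1, e2 = error_constants(nu)
    print(nu, 1 / e1, 1 / e2)                  # reciprocals, as tabulated in Table 3
```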


    We define $l$ as the greatest of

    $$|a_{21}|,\ |a_{31}| + |a_{32}|,\ \ldots,\ |a_{\nu 1}| + |a_{\nu 2}| + \cdots + |a_{\nu,\nu-1}|,$$

    $u$ as the greatest of

    $$|a_{11}| + |a_{12}| + \cdots + |a_{1\nu}|,\ |a_{22}| + |a_{23}| + \cdots + |a_{2\nu}|,\ \ldots,\ |a_{\nu\nu}|$$

    and $a$ as $l + u$. For a vector $v = (v_1, v_2, \ldots, v_n)$ we denote by $\|v\|$ the greatest of $|v_1|, |v_2|, \ldots, |v_n|$ (any other norm could be used in the discussion which follows but we choose this one for definiteness). We now prove the following result:

    Theorem 14. If $f(y)$ satisfies a Lipschitz condition,

    $$\|f(y) - f(z)\| \le L \|y - z\|$$

    and $|h| < 1/(La)$ then the equations defining $g^{(1)}, g^{(2)}, \ldots, g^{(\nu)}$ have a unique solution and $g_N^{(1)}, g_N^{(2)}, \ldots, g_N^{(\nu)}$ defined by (13) tend to this solution as $N$ tends to infinity.

    To prove that there is no more than one solution we assume, on the contrary, that there are two and denote these by $g^{(1)}, g^{(2)}, \ldots, g^{(\nu)}$ and $\bar{g}^{(1)}, \bar{g}^{(2)}, \ldots$ so that

    (14) $g^{(i)} = f\Bigl(y_0 + h \sum_{j=1}^{\nu} a_{ij}\, g^{(j)}\Bigr)$ …

    Hence we have

    $g^{(i)} - \bar{g}^{(i)} = f\Bigl(y_0 + h \sum a_{ij}\, g$ …


    $$\max_i \|g_N^{(i)} - g_{N-1}^{(i)}\| \le \frac{|h| L u}{1 - |h| L l} \max_j \|g_{N-1}^{(j)} - g_{N-2}^{(j)}\|$$

    $$= \Bigl(|h| L a - \frac{|h| L l\,(1 - |h| L a)}{1 - |h| L l}\Bigr) \max_j \|g_{N-1}^{(j)} - g_{N-2}^{(j)}\|$$

    $$\le k \max_j \|g_{N-1}^{(j)} - g_{N-2}^{(j)}\|$$

    where $k = |h| L a < 1$.

    From this it easily follows that every element of $g_N^{(i)} - g_{N-1}^{(i)}$ has modulus less than $M k^N$ for some fixed $M$ and this ensures the convergence of the sequence $g_N^{(i)}$, $N = 1, 2, \ldots$.

    Suppose the limit of the sequence is $g^{(i)}$. It is trivial to show that $g^{(1)}, g^{(2)}, \ldots$ … is a "contraction mapping" [15] and Theorem 14 is a special case of a fundamental theorem on such mappings in complete metric spaces.
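    The convergence asserted by Theorem 14 is easy to observe numerically. The sketch below is an illustration under explicit assumptions: equation (13) does not survive in this text, so the iteration is taken to be a sweep in which stage $i$ uses the newest values of stages $j < i$ and the previous values of stages $j \ge i$, which is what the split of $a$ into $l$ and $u$ suggests. The starting guess $g_0^{(i)} = f(y_0)$, the test problem $y' = -y$ and all function names are likewise assumptions.

```python
import numpy as np

def lu_bound(A):
    """a = l + u, with l from the strictly lower part and u from the diagonal and above."""
    absA = np.abs(A)
    n = len(A)
    l = max(absA[i, :i].sum() for i in range(n))
    u = max(absA[i, i:].sum() for i in range(n))
    return l + u

def stage_iteration(f, y0, h, A, sweeps):
    """Iterate g^(i) = f(y0 + h * sum_j a_ij g^(j)), updating the stages in place."""
    nu = A.shape[0]
    g = np.array([f(y0) for _ in range(nu)], dtype=float)   # assumed starting guess
    for _ in range(sweeps):
        for i in range(nu):
            # stages j < i already hold new values, stages j >= i still hold old ones
            g[i] = f(y0 + h * (A[i] @ g))
    return g

# Two-stage Gauss-Legendre process (nu = 2 in Table 2).
s3 = np.sqrt(3.0)
A = np.array([[0.25, 0.25 - s3 / 6], [0.25 + s3 / 6, 0.25]])
b = np.array([0.5, 0.5])

f = lambda y: -y              # Lipschitz constant L = 1
y0 = np.array([1.0])
h = 0.5                       # satisfies |h| < 1 / (L * a)

g = stage_iteration(f, y0, h, A, sweeps=60)
y1 = y0 + h * (b @ g)
print(lu_bound(A), y1[0], np.exp(-h))
```

    For this linear problem the converged step equals $y_0 (1 + z/2 + z^2/12)/(1 - z/2 + z^2/12)$ with $z = -h$, which gives a direct check that the iteration has reached the solution of the stage equations.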

    University of Canterbury
    Christchurch, New Zealand

    1. J. C. Butcher, "Coefficients for the study of Runge-Kutta integration processes," J. Aust. Math. Soc., v. 3, 1963, p. 185-201.

    2. C. Runge, "Über die numerische Auflösung von Differentialgleichungen," Math. Ann., v. 46, 1895, p. 167-178.

    3. W. Kutta, "Beitrag zur näherungsweisen Integration von Differentialgleichungen," Zeit. Math. Physik, v. 46, 1901, p. 435-453.

    4. E. J. Nyström, "Über die numerische Integration von Differentialgleichungen," Acta Soc. Sci. Fennicae, v. 50, 1925, p. 1-55.

    5. S. Gill, "A process for the step-by-step integration of differential equations in an automatic digital computing machine," Proc. Cambridge Philos. Soc., v. 47, 1951, p. 96-108.

    6. A. Huta, "Une amélioration de la méthode de Runge-Kutta-Nyström pour la résolution numérique des équations différentielles du premier ordre," Acta Fac. Nat. Univ. Comenian. Math., v. 1, 1956, p. 201-224.

    7. A. Huta, "Contribution à la formule de sixième ordre dans la méthode de Runge-Kutta-Nyström," Acta Fac. Nat. Univ. Comenian. Math., v. 2, 1957, p. 21-24.

    8. K. S. Kunz, Numerical Analysis, 1st edition, McGraw-Hill, New York, 1957, p. 206.

    9. R. H. Merson, "An operational method for the study of integration processes," Proceedings of Conference on Data Processing and Automatic Computing Machines at Weapons Research Establishment, Salisbury, South Australia, 1957, paper No. 110.

    10. Z. Kopal, Numerical Analysis, 1st edition, Chapman and Hall, London, 1955, p. 368.

    11. P. C. Hammer & J. W. Hollingsworth, "Trapezoidal methods of approximating solutions of differential equations," MTAC, v. 9, 1955, p. 92-96.

    12. P. Henrici, Discrete Variable Methods in Ordinary Differential Equations, 1st edition, John Wiley & Sons, Inc., New York, 1962, p. 131.

    13. T. Muir, A Treatise on the Theory of Determinants, Dover, New York, 1960, p. 437.

    14. L. Stoller & D. Morrison, "A method for the numerical integration of ordinary differential equations," MTAC, v. 12, 1958, p. 269-272.

    15. A. N. Kolmogorov & S. V. Fomin, Elements of the Theory of Functions and Functional Analysis, v. 1, Graylock Press, Rochester, 1957, p. 43.
