I
ITERATIVE METHODS FOR THE SOLUTION OF EQUATIONS
J. F. TRAUB BELL TELEPHONE LABORATORIES, INCORPORATED
MURRAY HILL, NEW JERSEY
- ii -
i
TO SUSANNE
/
- Ill -
PREFACE
This book presents a general theory of iteration algorithms for the numerical solution of equations and systems of equations. The relationship between the quantity and quality of information used by an algorithm and the efficiency of the algorithm are investigated. Iteration functions are divided into four classes depending on whether they use new information at one or at several points and whether or not they reuse old information. Known iteration functions are systematized and new classes of computationally effective iteration functions are introduced. Our interest in the efficient use of information is influenced by the widespread use of computing machines.
The mathematical foundations of our subject are treated with rigor but rigor in itself is not the main object. Some of the material is of wider application than to our theory. Belonging to this category are Chapter 3 , "The Mathematics of Difference Relations"; Appendix A, "Interpolation and Appendix D "Acceleration of Convergence". The inclusion of Chapter 1 2 , "A Compilation of Iteration Functions" permits the use of this book as a handbook of iteration functions. Extensive numerical experimentation was performed on a computer; a selection of the results are reported in Appendix E.
1
The solution of equations is a venerable subject. Among the mathematicians who have made their contribution are Cauchy, Chebyshev, Euler, Fourier, Gauss, Lagrange, Laguerre, and Newton. E, Schroder wrote a classic paper on the subject in 1 8 7 0 . A glance at the bibliography indicates the level of contemporary interest. Perhaps the most important recent contribution is the book by Ostrowski; papers by Bodewig and Zajta are also outstanding.
Most of the material is new and unpublished. Every attempt has been made to keep the subject in proper historical perspective. Some of the material has been orally presented at meetings of the Association for Computing Machinery in 1 9 6 l , 1 9 6 2 , 1 9 6 3 , the American Mathematical Society in 1 9 6 2 and 1 9 6 3 * and the International Congress of Mathematicians in 1 9 6 2 .
I wish to acknowledge with sincere appreciation the assistance I have received from my friends and colleagues at Bell Telephone Laboratories, Incorporated. I am particularly indebted to M. D. McIlroy, J. Morrison, and H. 0 . Pollak for numerous important suggestions. I want to thank A. J. Goldstein, R. W. Hamming, and E. N. Gilbert for stimulating conversations and valuable comments. My thanks to Professor G, E. Forsythe of Stanford University for his encouragement and comments during the preparation of the manuscript and for reading the final manuscript. My appreciation to S. P. Morgan and Professor A. Ralston
T
- v -
of Stevens Institute of Technology who also read the manuscript. I am particularly grateful to J. Riordan for suggesting numer-
/— ous improvements in style. I want to thank Miss Nancy Morris for always dig
ging up just one more reference and Mrs. Helen Carlson for editing the final manuscript. I am grateful to
^ Mrs. Elizabeth Jenkins for her splendid supervision of the preparation of the difficult manuscript and to Miss Joy Catanzaro for her speedy and accurate typing.
To my wife, for her never-failing support and encouragement as well as her assistance in editing and proofreading, I owe the principal acknowledgment.
J. P. TRAUB
- vi -
TABLE 1 0 F CONTENTS
PREFACE TERMINOLOGY GLOSSARY OF SYMBOLS 1 . GENERAL PRELIMINARIES
1 . 1 Introduction 1 . 2 Basic Concepts and Notations
1 . 2 1 Some concepts and notations 1 . 2 2 Classification of iteration functions
1.23 Order 1.24 Concepts related to order
2 . GENERAL THEOREMS ON ITERATION FUNCTIONS 2 . 1 The Solution of a Fixed Point Problem 2 . 2 Linear and Superlinear Convergence
2.21 Linear convergence 2 . 2 2 Superlinear convergence 2.23 The advantages of higher order iteration
functions 2 . 3 The Iteration Calculus
2 . 3 1 Preparation 2 . 3 2 The theorems of the iteration calculus
- vii -
3 . THE MATHEMATICS OF DIFFERENCE RELATIONS 3 . 1 Convergence of Difference Inequalities 3 . 2 A Theorem on the Solutions of Certain
Inhomogeneous Difference Equations 3 .3 On the Roots of Certain Indicial Equations
3 . 3 1 The properties of the roots 3 . 3 2 An important special case
3 . 4 The Asymptotic Behavior of the Solutions of Certain Equations 3 . 4 1 Introduction 3.42 Difference equations of type 1
3 . 4 3 Difference equations of type 2
4, INTERPOLATOR! ITERATION FUNCTIONS 4 . 1 Interpolation and the Solution of Equations
4 . 1 1 Statement and solution of an interpolation problem
4 . 1 2 Relation of interpolation to the calculation of roots
4 . 2 The Order of Interpolatory Iteration Functions 4.21 The order of iteration functions generated
by Inverse interpolation 4 . 2 2 The equal information case 4 . 2 3 The order of Iteration functions generated
by direct interpolation 4 . 3 Examples
- vili -
1
ONE-POINT ITERATION FUNCTIONS 5 . 1 The Basic Sequence E g
5 . 1 1 The formula for E 0
s 5 . 1 2 An example 5 # 1 3 The structure of E o
s 5 . 2 Rational Approximations to E o
s 5 . 2 1 Iteration functions generated by
rational approximation to E a
5 . 2 2 The formulas of Halley and Lambert 5 # 3 A Basic Sequence of Iteration Functions
Generated by Direct Interpolation 5 . 3 1 The basic sequence $
o j s 5 . 3 2 The iteration function $ q
5 . 3 3 Reduction of degree 5 . 4 The Fundamental Theorem of One-Point Iteration
Functions 5 . 5 The Coefficients of the Error Series of E Q
s 5 . 5 1 A recursion formula for the coefficients 5 . 5 2 A theorem cohcerning the coefficients
ONE-POINT ITERATION FUNCTIONS WITH MEMORY 6 . 1 Interpolatory Iteration Functions
6 . 1 1 Comments 6 . 1 2 Examples
- ix -
6 . 2 Derivative Estimated One-Point Iteration Functions with Memory 6 . 2 1 The secant iteration function and its
generalization 6 . 2 2 Estimation of f^" 1) 6 . 2 3 Estimation of g^" 1^ 6.24 Examples
6 . 3 Discussion of One-Point Iteration Functions
with Memory 6 . 3 1 A conjecture 6 . 3 2 Practical considerations 6 . 3 3 Iteration functions which do not use
all available information
6 . 3 4 An additional term in the error equation
MULTIPLE ROOTS 7 . 1 Introduction 7 . 2 The Order of E g
7 . 3 The Basic Sequence & s
7 . 3 1 Introduction 7 . 3 2 The structure of g g
7 . 3 3 Formulas for g s
7 . 4 The Coefficients of the Error Series of g R
7 . 5 Iteration Functions Generated by Direct Interpolation 7 . 5 1 The error equation 7 . 5 2 On the roots of an indicial equation 7 . 5 3 The order 7 . 5 4 Discussion and examples
7 . 6 One-Point Iteration Functions with Memory 7 . 7 Some General Results 7 . 8 An Iteration Function of Incommensurate Order MULTIPOINT ITERATION FUNCTIONS 8 . 1 The Advantages of Multipoint Iteration Functions 8 . 2 A New Iterpolation Problem
8 . 2 1 A new interpolation formula 8 . 2 2 Application to the construction of
multipoint iteration functions 8 . 3 Recursively Formed Iteration Functions
8 . 3 1 Another theorem of the iteration calculus 8 . 3 2 The generalization of the previous theorem 8 . 3 3 Examples 8 . 3 4 The construction of recursively formed
iteration functions 8 . 4 Multipoint Iteration Functions Generated by
Derivative Estimation 8 . 5 Multipoint Iteration Functions Generated by
Composition 8 . 6 Multipoint Iteration Functions with Memory
- xi -
i
9 . MULTIPOINT ITERATION FUNCTIONS: CONTINUATION 9 . 1 Introduction 9 . 2 Multipoint Iteration Functions of Type 1
9 . 2 1 The third order case 9 . 2 2 The fourth order case
9 . 3 Multipoint Iteration Functions of Type 2
9 . 3 1 The third order case 9 . 3 2 The fourth orjder case
9 . 4 Discussion of Criteria for the Selection of an Iteration Function
1 0 . ITERATION FUNCTIONS WHICH REQUIRE NO EVALUATION OF DERIVATIVES 1 0 . 1 Introduction 1 0 . 2 Interpolatory Iteration Functions
1 0 . 2 1 Direct Interpolation 1 0 . 2 2 Inverse Interpolation
1 0 . 3 Some Additional Iteration Functions 1 1 . SYSTEMS OF EQUATIONS
1 1 . 1 Introduction 1 1 . 2 The Generation of Vector-Valued Iteration
Functions by Inverse Interpolation 1 1 ; # 3 Error Estimates for Some Vector-Valued
Iteration Functions
1
- xii -
1 1 . 3 1 The generalized Newton iteration
function 1 1 . 3 2 A third order iteration function
1 1 . 3 3 Some other vector-valued iteration functions
1 1 . 3 4 A test function 1 1 . 4 Vector-Valued Iteration Functions which
Require No Derivative Evaluations 1 2 . A COMPILATION OF ITERATION FUNCTIONS
1 2 . 1 Introduction 1 2 . 2 One-Point Iteration Functions 1 2 . 3 One-Point Iteration Functions with Memory 1 2 . 4 Multiple Roots
1 2 . 4 1 Multiplicity known 1 2 . 4 2 Multiplicity unknown
1 2 . 5 Multipoint Iteration Functions 1 2 . 6 Multipoint Iteration Functions with Memory 1 2 . 7 Systems of Equations
A. INTERPOLATION A.l Introduction A.2 An Interpolation Problem and Its Solution
A . 2 1 Statement of the problem A.22 Divided differences A.23 The Newtonian formulation A.24 The Lagrange-Hermite formulation A.25 The interpolation error
APPENDICES
! i
- xiii -
A.3 The Equal Information Case A.31 The Newtonian formulation A.32 The Lagrange-Hermite formulation A.33 The interpolation error
A . 4 The Approximation of Derivatives A.41 Statement of the problem A.42 Derivative estimation from the
Newtonian formulation A . 4 3 Derivative estimation from the
Lagrange-Hermite formulation A . 5 The Error in the Approximation of Derivatives
A . 5 1 Discussion A . 5 2 First proof of the error formula A . 5 3 Second proof of the error formula
ON THE jTH DERIVATIVE OF THE INVERSE FUNCTION SIGNIFICANT FIGURES AND COMPUTATIONAL EFFICIENCY ACCELERATION OF CONVERGENCE D.l Introduction
2 D.2 Aitkenfs 5 Transformation D . 3 The Steffensen-Householder-Ostrowski Iteration
Function
- xiv -
E« NUMERICAL EXAMPLES E.l Introduction E . 2 Growth of the Number of Significant Figures E . 3 One-Point and One-Point with Memory Iteration
Functions E,4 Multiple Roots E.5 Multipoint Iteration Functions E.6 Systems of Equations
F> AREAS FOR FUTURE RESEARCH
BIBLIOGRAPHY
I r
- XV -
TERMINOLOGY
Page asymptotic error constant 1 . 2 - 1 2
basic sequence 1 . 2 - 1 8
derivative estimated iteration function 6 . 2 - 3
informational efficiency 1 . 2 - 1 6
informational usage 1 . 2 - 1 6
interpolatory iteration function 4 . 1 - 5
iteration function 1 . 2 - 4
multipoint iteration function 1 . 2 - 1 1
multipoint Iteration function with memory 1 . 2 - 1 1
one-point iteration function 1 . 2 - 1 0
one-point iteration function with memory 1 . 2 - 1 0
optimal iteration function 1 . 2 - 1 8
optimal basic sequence 1 . 2 - 1 8
order 1 . 2 - 1 2
order is multiplicity-dependent 1 . 2 - 1 3
order is multiplicity-independent 1 . 2 - 1 3
I I
- xvi -
GLOSSARY OP SYMBOLS
The following list, which is intended only for reference, contains the symbols which occur most frequently in this book.
- xvii -
a j ( x )
A j ( x )
a
B, (x)
B s ,n 1,3
f ( j )(x)/j!
aj(x)/a1(x)
Lagrange-Hermite coefficient
a zero of f
a , H m - l | x ) ma w(x) m v 7
A - ; » ( t )
( a )
t=x.
Pk,a C
C[x,l]
7 '1,3 (t)
d
1,3
dominant zero of g, Q(t)
asymptotic error constant binomial coefficient
Newton coefficient
inf orma11ona1 us age (s)
'1,3 (t)
t=X.
e e i EPF
x - a x^ - a
informational efficiency
ir
- xviii -
Page E g a certain family of iteration 5*1-2
functions *E o a family of iteration functions 6.2-5 n»o
generated by derivative estimation
" 4 s Q a family of iteration functions 6.2-14 n, o generated by derivative estimation
S fx,f.m) = E o(x,f l / m,l) 7 . 3 - 3
f function whose zero is sought 1 . 2 - 1
*f£ S) an estimate of f^S^ 6 . 2 - 5
f [ x ± , 7 0 ; x ± - 1 , 7 1 j x 1 _ n , 7 n ]
confluent divided difference A.2-4
f a certain kind of interpolatory 8 . 2 - 2
function
3 the inverse function to f 1 . 2 - 2
1 a £ S ) an estimate of 6 . 2 - 1 4
k -1
g k j a(t) t k - a £ tJ 3 . 3 - 1
the inverse of the Jacobian matrix 1 1 . 2 - 1 ,
L P . iteration function 1 . 2 - 4
class oi order p
I class of iteration function of 1 . 2 - 1 3
- xix -
class of Iteration functions with informational usage d and order p Jacobian matrix
gs(x,f,m) = Z-\t^(m)(x-a)1
the multiplicity of a the number of points at which old information is reused
0 u(x) = Sv^x-a)
order (in sense of order of magnitude)
mu(x) = 2cD^(m)(x-a)
order hyperosculatory polynomial for f
names of Iteration functions family of iteration functions generated by Inverse interpolation family of Iteration functions generated by direct Interpolation family of iteration functions generated by rational approximation to E
s
hyperosculatory polynomial for 3
s(n+l)
- XX -
R
p s , j ( m )
P,3
T
u(x)
V(x)
W ( x )
Yj(x)
Z j(x)
ij • • «
S(n+1)
s S s + 1(x,f,m) = x - £ p s^(m)Z (x,f,l)
j=l s - 1 derivatives are used in many functions s - 1
Stirling numbers of the first kind
Page 6 . 2 - 4
7 . 3 - 4
1 . 2 - 1 7
6 . 2 - 4
7 . 3 - 7
Wjx,f,m) = ^ a t^(m)Zj(x,f,l) 7 . 3 - 5 .
Stirling numbers of the second kind 7 . 3 - 7
E g(x) = ZtttaU-a)1
f(x)/f'(x) cp(x) - a (x-a) P
cp(x) - a u P(x)
y3(x,t,m) = Z3(x,tl/m,l)
approximant to a
( • l ) J " V j ) f y )
j U * ' ( y ) ] J
Y J(x)u J(x)
y=f(x)
5 . 5 - 4
1 . 2 - 6
2 . 3 - 3
2 . 3 - 5
7 . 3 - 3
1 . 2 - 4
1 . 2 - 7
5 . 1 - 1 3
1 1 . 3 - 9
- xxi -
Page approximate equality between 1 . 2 - 8
numbers same order of magnitude 1 . 2 - 8
the set of x for which the 1 . 2 - 9
proposition p(x) is true
1.0-1
CHAPTER 1
GENERAL PRELIMINARIES
The basic concepts and notations to be used throughout this book will be introduced in Section 1.2.
1 . 1 Introduction The general area into which this book falls may be
labeled algorithmics. By algorithmics we mean the study of algorithms in general and the study of the convergence and efficiency of numerical algorithms in particular.
More specifically, we shall study algorithms for the solution of equations. Our approach will be to examine the relationship between the quantity and quality of information used by an algorithm and the efficiency of that algorithm We shall investigate the effect of reusing old information and of gathering new information at certain felicitous points.
Our interest in the efficient use of information is influenced by the widespread use of high-speed computing machines. The introduction of computers has meant that many algorithms which were formerly of only academic interest become feasible for calculation. They are, in fact, used many times in many establishments on a wide variety of problems. The efficiency of these algorithms is therefore most important Furthermore, there are situations where the acquisition of more than a certain amount of data is prohibitively expensive. It is then imperative that as much information as possible be squeezed from the available data. Here again the question of efficiency is of paramount importance.
I
1 . 1 - 2
Iteration algorithms for the solution of equations will be studied in a systematic fashion. In the course of this study, new families of computationally effective iteration algorithms will be introduced and certain well-known iteration algorithms will be identified as special cases. It is hoped that this comprehensive approach will moderate, if not prevent, the rediscovery of special cases. This uniform approach will lead to certain natural classification schemes and will permit the uniform rigorous error analysis and the uniform establishment of convergence criteria for families of iteration algorithms. The final verdict on the usefulness of the new methods will not be available until the new methods have been tried on a variety of problems arising in practice. At present, however, extensive numerical experimentation on test problems support the theoretical error analysis.
Although we shall confine ourselves to the solution of real equations and systems of real equations, the field of potential application of this work is of much broader scope. Thus, analogous techniques may be applied to such problems as the solution of differential and integral equations and the calculation of eigenvalues. The generalization of our results to abstract spaces is of interest. The reader is referred to Appendix P for some additional discussion of this point.
\ i I
1.2-1
1.2 Basic Concepts and Notations 1.21 Some concepts and notations. The symbols to
be introduced below are global; they will preserve their meaning for the remainder of this book. A few of these symbols will be used with other than their global meanings (for example, as dummy indices in summation) but context will always make this clear.
Our investigation concerns the approximation of a zero of f(x), or equivalently, the approximation of a root of the equation f(x) = 0 . The zero and root formulations will be used interchangeably. There does not seem to be a good phrase for labeling this problem. If one says that one is solving an equation, one may be concerned with the solution of a differential equation. The adjectives algebraic and transcendental are usually used to distinguish between the cases where f is or is not a polynomial. Perhaps the best generic term is root-finding.
We will restrict ourselves to f(x) which are real single-valued functions of a real variable, possessing a certain number of continuous derivatives in the neighborhood of a real zero a. The number of continuous derivatives assumed will vary upwards from zero. The restriction to real zeros is not essential. With the exception of Chapter 11, f(x) will be a scalar function of a scalar variable; in Chapter 11, f(x) will be a vector function of a vector variable.
r
1 . 2 - 2
The "independent variable" will sometimes appear explicitly and will sometimes not appear explicitly. This will depend on whether the function or the function evaluated at a point in its domain is what is meant. It will also depend on the readability of formulas. Thus we will use both f and f(x). See Boas F l . 2 - 1 , pp. 67-681.
(l) Derivatives of f will be denoted either by f v ' with
f(°) = f, or by f'9f»9... • If f does not vanish in a (l)
neighborhood of a and if f v ' is continuous in this neighbor-(l)
hood, then f has an inverse 3, and !r ' is continuous in a
neighborhood of zero. A zero a is of multiplicity m if
f(x) = (x-a) mg(x),
where g(x) is bounded at a and g(a) is nonzero. We shall always take m as a positive integer. Ostrowski [ 1 . 2 - 2 , Chap. 5 ]
deals with the case that m is not a positive integer. If m = 1 , a is said to be simple; if m > 1 , a is said to be nonsimple. If a is nonsimple, it is called a multiple zero.
Perhaps the most primitive procedure for approximating a real zero is the following bisection algorithm. Let a and b be two points such that f(a)f(b) < 0 . Then f has at least one real zero of odd multiplicity on (a,b). For the purpose of explaining the algorithm it is sufficient to take
the interval as (0,l) and to assume that f ( 0 ) < 0 , f(l) > 0 .
Calculate f(J>). If f Q ) = 0 , a zero has been found. If f(l) < 0 , the zero lies on (|,l) and one next calculates f(|). If f(^) > 0 , the zero lies on ( 0 , - | ) and one next calculates f(-j ). The zero will either be found by this procedure or the zero will be known to lie on an interval of length 2 ~ q after the bisection operation has been applied q times. By then estimating the value of the zero to be the midpoint of this last interval, the value of the zero will be known to within a maximum error of 2 ~ ^ " " 1 . Once the zero has been bracketed it takes just q evaluations of f to achieve this accuracy. Observe that the accuracy to which the zero may be found is limited only by the accuracy with which f may be evaluated. Indeed, it is only necessary to be able to decide on the sign of f.
The bisection algorithm just described is an example of a one-point iteration function with memory, as defined below. Since no use is made of the structure of f, such as the values of its derivatives, the rate of convergence is not very high. On the other hand, the method is guaranteed to converge. Gross and Johnson [ 1 . 2 - 3 ] use a property of f to achieve faster convergence. See also Hooke and Jeeves [ 1 . 2 - 4 ]
and Lehmer [ 1 . 2 - 5 ] . Kiefer [ 1 . 2 - 6 ] gives an optimal search strategy for the case of a maximum of a unimodal function.
By using additional information we can do far better than the bisection algorithm. The natural information available are the values of f and the values of its derivatives. We shall use the words information, samples, and data quite interchangeably.
Let xi>xi-i9 •••>xi-n b e n + 1 a P P r o x l m a n t s t o a-Let be uniquely determined by information obtained at xi*xj__]_* • • • * x i - n * t l i e function that maps x-J/XJL_I> • • • >x±-n
into xi + 1 be called cp. Thus
We call cp an iteration function. The abbreviation I.F. will henceforth be used for iteration function and its plural. Let
We shall also write y ±_j for f(x±_^) and for ^ ^ ^ . . j ) -Since the information used at x
i _ j a r e the values of f and its derivatives, we may write
cp = cp (lQ) (tn)"
i ' i ' ' l 7 i - l ' i - l ' ' 1-1 ' ' l - r r l - r r l - n
(1-2) We shall not permit I.F. as general as thisj the types of I.F. to be studied will be specified in Section 1.22.
I
1 . 2 - 5
Rather than writing c p(x i,x i_ 1, .. . , x i _ n ) , it is more convenient to write cp or cp(x) . Observe further that cp is a functional depending on f and should perhaps be written cp(x,f) , This notation is not necessary and will therefore not be used except in Chapter 7 .
Even the simplest iteration algorithm must consist of initial approximation(s), an I.F., and numerical criteria for deciding when "convergence" is attained. We shall only be concerned with the I.F. Thus, for us, iteration algorithm will be the same as I.F.
Two I.F. are almost universally known. They are the Newton-Raphson I.F. and the secant I.F. For the former,
c p ( x . ) - x . - — 7 , ( 1 - 3 )
while for the latter,
cp(x1,x1_1) = x± - f ±
x l xl-l f -f I I I l - 1
f l * fi-l« ( 1 - 4 )
We shall henceforth call the former Newton's I.F. The secant I.F. is closely related to regula falsi. This last method, as it is usually defined, keeps two approximants which bracket the root; the secant I.F. always uses the latest two approximants.
1 . 2 - 6
V
In much of our work we shall deal not with f, but with a normalized f defined by
u = I 9 f' ^ 0 . ( 1 - 5 )
If f' = 0 , f ^ 0 , the u is undefined. If f' = 0 , f = 0 ,
which is the case at a multiple zero, we define u = 0 . The importance of u in our future work cannot be overestimated. One reason for its importance is that
For simple zeros,
11m x a x-a m
11m x a1
u(x) x-a = 1 ,
Now, x - a is the error in x but is not known till a is known. On the other hand, u is known at each step of the iteration. In the new notation, Newton's I.F. becomes
cp = x - u, ( 1 - 6 )
For later convenience we introduce some additional notation. Let
e i = x i ~ a, e = x - a. ( 1 - 7 )
1 . 2 - 7
Thus e i is the error of the ith approximant. Let
a j (x) .
A ( x ) » V ; 3 It'{x)>
B J , m ( x )
• H f f l -
ma(x) ' ( 1 - 8 )
a ( y ) = g ( J ) ( y )
v ( x ) . ( - p ^ ^ ' f y )
J J ! [ s , ( y ) ] J y-f(x)
The first of these is the Taylor series coefficient for f, while the second is a "normalized Taylor series coefficient" for f. The third is a "generalized normalized Taylor series coefficient" which will be used when the multiplicity m is greater than unity. Note that B. -,(x) = A.(x). The fourth is a "normalized Taylor series coefficient" for 3. The last Is a Taylor series coefficient for 3 with a different normalization and with f(x) substituted after differentiation.
We shall use the symbols 0 , ~ and ^ according to the following conventions:
1 . 2 - 8
If
lim g(x.) = C, l — oo 1
we shall write
g(x 1) - C or If
x lim g(x) = C, -* a
we shall write
g(x) C or
How the limit is taken will be clear from context.
where C is a nonzero constant, we shall write
f * Cg.
For approximate equality between numbers, the symbol be used. Thus
i(i+v^) ~ 1 . 6 1 8 ^ 1 . 6 2 .
The use of these symbols is illustrated in
f = 0(g) or
where
1 . 2 - 9
EXAMPLE 1 - 1 . Let
i+1 M±ef + L±e\,
e, 0 , M. K / 0, N ± - L ^ 0 ,
and where e± Is nonzero for all finite i. Then
since
ei+i - *A + ^ ( e i >
L i e i L.
Also,
since e ± + 1 * M e 1 ,
= M ± + L ± e 1 -+ M.
We shall use the notation
= {x|p(x)|
to denote the set of x for which the proposition p(x) is true Thus
J = [x||x| < lj
denotes the interior of the unit interval.
1-2-10
1.22 Classification of iteration functions. We shall classify L P . by the information which they require. Let be determined only by new information at x^. No old information is reused. Thus
xi+l = ' t 1" 9)
Then cp will be called a one-point I.F. Most I.F. which have been used for root-finding are one-point I.F. The most commonly known example is Newton's I.F.
Next let be determined by new information at x. and reused information at x. x. . Thus
i-1* i-n
xi+l = <P(XI'* x i - l " • #^ xi-n^ • U " 1 0 )
Then cp will be called a one-point I.F. with memory. The semicolon in (1-10) separates the point at which new data are used from the points at which old data are reused. This type of I.F. is now of special interest since the bid information is easily saved in the memory (or store or storage) of a computer. The case of practical interest is when the same information, the values of f and f' for example, is used at all points. The best-known example of a one-point I.F. with memory is the secant I.F.
1.2-11
Now let be determined by new information at x 1 , x i - 1 > .. . ,x 1_ k, k ^ 1. No old information is reused. Thus
Then cp will be called a multipoint I.F. There are no well-known examples of multipoint I.F. They are being introduced because of certain characteristic limitations on one-point I.F. and one-point I.F. with memory.
Finally let be determined by new information
at xi>xi-i> • • • i-k a n d reused information x^^-l' # * * , xi-n #
Thus
xi-l-l = ^ x i ' x i - 1 9 * * * *xi*-k* xi-k-l* * * * * xi-n^ 9 ft > k. (1-12)
Then cp will be called a multipoint I.F. with memory. The semicolon in (1-12) separates the points at which new data are used from the points at which old data are reused. There are no well-known examples of multipoint I.F. with memory.
1.2-12
1.23 Order. We turn to the important concept of the order of an I.F. Let x Q,x 1,...,x i,... be a sequence converging to a. Let e 1 = x^ - a. If there exists a real number p and a nonzero constant C such that
6 1 + 1 " C , ( 1 - 1 3 ) e p e i
then p is called the order of the sequence and C is called the asymptotic error estimate.
The remainder of this section will consist of commentary on this definition. We wish to associate the concept of order with the I.F. which generates the x ^ To emphasize this point, we can write ( 1 - 1 3 ) as
I cp(x) - a| / , v lv±-L—^-^c# (1-14) |x-a|p
We will associate an order to an I.F. whether or not the sequence generated by cp converges. The order assigned to an I.F. is the order of the sequence it generates when the sequence converges.
Recall that cp is a functional which depends on f. Hence the order of cp may be different for different classes of f. For the type of f and cp that we shall study, we shall insist that cp be of a certain order for all f whose zeros are
1 . 2 - 1 3
of a certain multiplicity. We shall always assume that we are in the neighborhood of an isolated zero of f and that the order possibly depends on the multiplicity of the zero but is otherwise independent of f. (With the exception of Chapter 7 ,
the zero will always be simple unless the contrary is explicitly stated.) To indicate that cp belongs to the class of I,P. of order p, we shall write
cp e I p. ( 1 - 1 5 )
The following definitions relate order to multiplicity. If the order is the same for zeros of all multiplicities, we say that the order is multiplicity-independent. If the order depends on the multiplicity we say that the order is multiplicity-dependent. If in particular the order is greater than one for simple zeros and one for all nonsimple zeros, we say that the order is linear for all nonsimple zeros. (The adjectives linear and quadratic will sometimes be used instead of first and second.)
Observe that if the order exists then it is unique. Assume that a convergent sequence has two orders, p^ and Pg. Let p Q = p n + 6 , 6 > 0. Then
1,2-14
and
lim:. i -+ oo
= 9 ,
which contradicts the assumption that the last limit is nonzero. If the order is integral, then the absolute values
may be dropped in the definition of order. We shall find that
then cp is of order p and C - c p V P ;(a)/p!. Equation (l - l 6 ) may be written as
c p ( x ) - a ~ C ( x - a ) p .
In his classic paper of 1 8 7 0 , E. Schroder [ 1 . 2 - 7 ]
defined order as follows: c p ( x ) is of order p if
c p ( a ) = a ; c p ^ ( a ) = 0 , J = l,2,...,p-l; c p ^ ( a ) f Q .
This definition is only valid for I.F. of one variable with p continuous derivatives. Although many authors have followed Schro'der's lead, we prefer to state ( 1 - 1 7 ) in the conclusion of Theorem 2 - 2 . However, a generalization of ( 1 - 1 7 ) will serve as the definition of order for the case of systems of equations in Chapter 1 1 .
I.F. with memory never have integral order. If cp (p+D is continuous and if
c p ( x ) - a = C ( x - a ) p [ l + O ( x - a ) ] , ( 1 - 1 6 )
( 1 - 1 7 )
1 . 2 - 1 5
The advantage of high-order I.F. will be discussed in Section 2 . 2 3 .
EXAMPLE 1 - 2 . For Newton's method,
f i + i - A (a) A = -2 *2\a), h 2 2 f / . e i
For the secant method,
| e |
, ±^p - A ( a ) , p = i(l+ v / 5 ) - 1 . 6 2 .
In Section 7 - 8 we will give an example of an I.F. whose order is incommensurate with the scale defined by ( 1 - 1 3 ) .
1 . 2 - 1 6
1.24 Concepts related to order. A measure of the Information used by an I.P. and a measure of the efficiency of the I.P. are required. A natural measure of the former is the informational usage, d, of an I.F. which we define as the number of new pieces of information required per iteration. Since the information to be used are the values of f and its derivatives, the informational usage is the total number of new derivative evaluations per iteration. We use here the convention that the function is the zeroth derivative. Ostrowski [ 1 . 2 - 8 , p. 1 9 ] has suggested the "Horner" as the unit of informational usage. To indicate that cp belongs to the class of I.F. of order p and informational usage d we write
To obtain a measure of the efficiency, we make the following definition: The informational efficiency. EFP, is the order divided by the informational usage. Thus
Another measure of efficiency called the computational efficiency, which takes into account the "cost" of calculating different derivatives, will be discussed in Appendix C. An alternative definition of informational efficiency is
( 1 - 1 8 )
EPF = d # ( 1 - 1 9 )
*EFF = p (1-20)
1 . 2 - 1 7
Ostrowski [ 1 . 2 - 9 , p. 2 0 ] calls *EFF an efficiency index; see Appendix C for a discussion of *EFF. Our definition of informational efficiency was chosen because it permits the simple statement of certain important results.
EXAMPLE 1 - 3 . For Newton's I.F.,
p = 2 , d = 2 , EFP = 1 , *EFF = ^ 2 .
For the secant I.F.,
P = 4 ( 1 + ^ 5 ) - 1 - 6 2 , d = 1 , EFP = *EFF = i(l+v/5) .
For one-point I.F. and one-point I.F. with memory there can be only one new evaluation of each derivative. If the first s - 1 derivatives are used, the informational usage will be s and
EPF = ^ . ( 1 - 2 1 )
Let n be the number of points at which old information is reused in a one-point I.F. with memory. Let
r = s(n+l). ( 1 - 2 2 )
1 . 2 - 1 8
Hence r is the product of the number of pieces of new information with the total number of points at which information is used. The quantities s, n, and r will characterize certain families of I.F. These symbols will occasionally be used with other meanings as long as there is no danger of confusion
We shall prove in Section 5~4 that for one-point I.F EFF 1. We anticipate this result to define a one-point I.F. as optimal if EFF = 1 . For optimal one-point I . F . , p = d = s . A basic sequence of I.F. is an infinite sequence of I.F. such that the pth member of the sequence is of order p. This concept is defined only for I.F. of integral order. An optimal basic sequence is a basic sequence all of whose members are optimal. In Section 5 - 1 the properties of a particular basic sequence, which will be used extensively throughout this book, will be developed.
2.0-1
CHAPTER 2
GENERAL THEOREMS ON ITERATION FUNCTIONS
In this chapter I.F. will be discussed without regard to their structure. In Section 2.1 the existence and uniqueness of the iterative solution of a fixed point problem will be proven under the assumption that the I.F. satisfies a Lipschitz condition. In Section 2.2 the difference between linear and superlinear convergence will be studied in some detail while in Section 2.3 an "iteration calculus" will be developed for I.F. which possess a certain number of continuous derivatives.
2.1 The Solution of a Fixed Point Problem We propose to study the solution of
cp(x) = x (2-1) by the iteration
x 1 + i = q>(xi) • (2-2)
If a satisfies ( 2-l), then a is called a fixed point of cp. The problem of finding the fixed points of a function occurs in many branches of mathematics. To see its relation to the solution of f(x) = 0 we proceed as follows. Let g(x) be any f u n c t i o n such that g(a) is finite and nonzero. Let
Then a is a solution of f(x) = 0 if and only if a is a solution of ( 2 - 1 ) .
fixed point problem has a unique s o l u t i o n and that the iteration defined by ( 2 - 2 ) converges to this solution. Because of the restrictive nature of our hypotheses* the proof will be very simple . Many other proofs have been given both for real functions and in abstract spaces and under a variety of hypotheses. See, for example Antosiewicz and Rheinboldt [ 2 . 1 - 1 ] , Collatz [ 2 . 1 - 2 , Chap. II], Coppel [ 2 . 1 - 3 ] ,
Ford [ 2 . 1 - 4 ] , Householder [ 2 . 1 - 5 , Sect. 3 - 3 ] , John [ 2 . 1 - 6 ,
Chap. 4 , Sect. 6 ] , and Ostrowski [ 2 . 1 - 7 , Chap. 4 ] .
cp(x) = x - f (x)g(x) . ( 2 - 3 )
We shall show that under certain hypotheses the
»
2 . 1 - 2
LEMMA 2 - 1 . Let cp be a continuous function from J = [a,b] to J. Then there exists an a, a < a < b, such that cp(a) = a.
PROOF. Since the function is from J to J,
cp(a) a, cp(b) £ b.
Let h(x) = cp(x) - x. Then
h(a) ^ 0, h(b) £ 0 ,
and the intermediate value theorem guarantees the existence of a such that h(a) = 0 . The result follows immediately.
In order to draw additional conclusions we must impose an additional condition on cp. Let
M s ) - cp(t)| < L|s-t|, 0 < L < 1 , ( 2 - 4 )
for arbitrary points s and t in J. Observe that this Lipschitz condition implies continuity. We can now show that the solution of cp(x) = x is unique.
We assume that cp is defined on some closed and bounded interval J = [a,b] and that its values are in the same interval. Then if x is in J, all the x. are in J. To
o 9 i guarantee that ( 2 - 1 ) has a solution we must assume the continuity of cp. We have
i li
2 . 1 - 3
LEMMA 2 - 2 . Let cp be a function from J to J which satisfies the Lipschitz condition ( 2 - 4 ) . Then cp(x) = x has at most one solution.
PROOF. Assume that there are two distinct solutions, and . Then
a 1 = c p ( a 1 ) , a 2 = c p ( a 2 )
and
al " a 2 = " <p(a2) •
An application of ( 2 - 4 ) yields
l ^ - a g l = | c p ( a x ) - c p ( a 2 ) | ^ L l o ^ - a i g l < l a ^ - a g l ,
which is a contradiction.
The existence and uniqueness of a solution a of the equation cp(x) = x has now been verified. It is a simple matter to show that the sequence defined by x^ ^ = cp(x^) converges to this solution.
THEOREM 2 - 1 . Let J be a closed, bounded interval and let cp be a function from J to J which satisfies the Lipschitz condition
2 . 1 - 4
|cp(s) - cp(t)| << L|s-t|, 0 1 L < 1 ,
for arbitrary points s and t in J. Let X q be an arbitrary point of J and let = cp(x^) . Then the sequence {x^} converges to the unique solution of cp(x) = x in J.
PROOF. Lemmas 2 - 1 and 2 - 2 assure us of the existence and uniqueness of a solution a of the equation cp(x) = x, This solution is the candidate for the limit of the sequence defined by x^ ^ = cp(x i) . Thus
x 1 + 1 - a = cp(x ±) - a = cp(x ±) - cp(a) .
From the Lipschitz condition we conclude that
l xi+l" a' ^ Ljx^ a l , L < 1 .
Thus
| x ± - a | £ L 1 | x Q - a |
and x^ a.
If one did not know that c p ( x ) = x had a unique solution, one would have to show that the x i satisfy the Cauchy criterion. A proof based on the Cauchy criterion may be found in Henrici [ 2 . 1 - 8 , Chap. 4 ] .
1 B
2 . 1 - 5
An estimate of the error of the pth approximant which depends only on the first two approximants and the Lipschitz constant may be derived as follows. Prom the Lipschitz condition we can conclude that for all i,
K + i ~ x i l ^ l 1 I x I " x
0 I • ( 2 ~ 5 )
Let p and q be arbitrary positive integers. Then
Xp+q " X p = (xp+q- xP4<i-l> + (Xp+q-l-Xp+q-2) + + ^ p H - l ' V
and
lXp+q-Xpl * l XP +q- X
P +q-ll + ' Xp+q - r Xp-Kl-2 • + ••• + l Xp +l- Xpl
An application of ( 2 - 5 ) shows that
|x p + q-x p| £ L p(l+L+...+L q" 1)|x 1-x o|, or
Let q —• oo. Then
' V a | lxl-xol«
This gives us a bound on the error of the pth approximant. Observe that if L is close to unity, convergence may be very slow. A good part of our effort from here on in will be devoted to the construction of I.P. which converge rapidly. To do this we shall have to impose additional conditions on cp
2.2-1
2.2 Linear and Superlinear Convergence In Section 2.1 the analytical problem, cp(x) = x,
suggested the iterative procedure = cpCx^) . This is not typical of the problems on which we will work. In general we will be given the equation f = 0 and we will want to construct a functional cp which will be used as an I.F. In Section 2.1 we showed that if cp satisfies a Lipschitz condition, then the iteration will converge to the unique solution of the fixed point problem. From now on, rather than seeking weak sufficient conditions under which the iterative sequence converges, we will be interested in constructing I.F. such that the sequence converges rapidly. We will be quite ready to impose rather strong conditions. In particular, we shall assume that f has a zero in the interval in which we work.
2 . 2 - 2
2 . 2 1 Linear convergence. Let us assume that cp' is continuous in a neighborhood of a. We will certainly insist that
cp(a) = a. ( 2 - 7 )
For if x. a and cp is continuous, then
a = lim x i —• oo ± + 1 = lim cp(x±) = cp( lim x ±) = cp(a) i —• oo i —• oo
Now xi+l = °P(xi) = <P(a) + cP'(^i)( xi" a)^
where £^ lies in the interval determined by x^ and a. An application of ( 2 - 7 ) yields
xi+l ~ a " cp^^) (x±-a) ( 2 - 8 )
We shall require that |cp'(£^)| <i K < 1 in the neighborhood of a. Since cp' is continuous it will be sufficient to require that | cp' (ct) | £ K < 1 . Then there is a neighborhood of a such that
and
Then
I cp' (x) | £ L, 0 £ L < 1
x±+l~a\ ^ L x i ~ a l •
l ^ - a l £ L | x 0 - a
and x ± —• a.
«
2 . 2 - 3
If a is such that for every starting point x Q in a sufficiently small neighborhood of a the sequence {x^} converges to a, then a is called a point of attraction. This terminology is due to J. P. Ritt. See Ostrowski [ 2 . 2 - 1 ,
p. 2 6 ] for a discussion of this and the definition of point of repulsion. In this terminology we can state our result as follows. If cp' is continuous in a neighborhood of a and if cp(a) = a and |cp'(a)| <C L < 1 , then a is a point of attraction.
We can conclude something more. Let c p ' ( a ) be differ ent from zero. Then cp' does not vanish in a neighborhood of a Let e i = x^ - a. Then ( 2 - 8 ) may be written as
ei+l " $'(£>±)e±-
It is clear that if e is not zero and cp' does not vanish, o ^
then e. does not vanish for any finite i. That is, the iteration cannot converge in a finite number of steps as long as the iterants lie in the neighborhood of a where cp' does not vanish. Hence
ei+l ,/> \
and e i+1 _ • / \ - • c p ' ( a ) e i
This is the case of linear or first-order convergence.
2 . 2 - 4
In the case of superlinear convergence It is not necessary to require that | c p ' ( a ) | < 1; the iteration always converges in some neighborhood of a.
i ir
2 . 2 - 5
2 . 2 2 Superlinear convergence. We now assume that
1 , is cp(a) = a. Then
P ^ 1* is continuous. As before we require that
q / P ^ U ) x i + l = < P ( x i . ) = a + cp' ( a ) ( x ± - a ) + . . . + ^ ( x ± - a ) p ,
where £^ lies in the interval determined by x i and a. It should be clear that cp is of order p only if c p ^ ^ ( a ) = 0, j = l,2,...,p-l, and if c p ^ (a) is nonzero. Hence we impose the following conditions on cp:
c p ( a ) = a ; < p ^ J ^ ( a ) = 0 , j = l,2,...,p-l; c p ^ P ^ ( a ) ? 0. ( 2 - 9 )
Then
< p (p ) U j ) p
p s- , ±— p ^
i+1 pi i'
where = x 1 - a. Since q/P^(a) is nonzero, q/p^ does not vanish in some neighborhood of a. Then the algorithm cannot converge in a finite number of steps provided that e Q is nonzero and that the iterants lie in the neighborhood of a where q/p^ does not vanish. (See also the beginning of Section 2 . 2 3 . ) Furthermore,
fj±i-.<p ( p )(«) ( 2 - 1 0 ) e i
Hartree [2.2-2] defines the order of an I.F. by the conditions (2-9) . This definition of order cannot be used for one-point I.F. with memory. Hence we prefer to state these conditions as part of a theorem. We summarize the results in
THEOREM 2-2. Let cp be an I.F. such that cp^ is continuous in a neighborhood of a. Let ej_ = xj_ ~ a* Then cp is of order p if and only if
cp(a) = a; cp^(a) = 0 , J = l,2,...,p-l; cp^(a) ^ 0.
Furthermore,
el+l , q> ( p )(a) e P Pi- '
Recall that we assign an order to cp whether or not the sequence generated by cp converges. Sufficient conditions for a sequence to converge are derived below.
2.2-7
EXAMPLE 2-1. Let cp = x - f/f, (Newton's I.P.). Let f"'be continuous. A direct calculation reveals that cp(a) = a, cp'(a) = 0, cp"(a) = f"(a)/f'(a) ^ 0. Hence Newton's I.F. is second order and
We require that f"' be continuous in order to satisfy the hypothesis of Theorem (2-2) that cp" be continuous. It will be shown in Chapter 4 that we only need f" continuous.
by a linear I.F. need not converge in any neighborhood of a. We shall show that a sequence formed by a superlinear I.F. always converges in some neighborhood of a. In the following analysis the cases of linear and superlinear convergence are handled jointly.
A 2(a), A 2 = f " 2f' •
We observed in Section 2.21 that a sequence formed
We start our analysis from
(2-11)
Let
i
2 . 2 - 8
(the notation is specified at the end of Section 1 . 2 1 ) and let c p ^ P ^ be continuous on J. Let x Q e J and let
JpUx) I
for all x e J. Since x Q e J,
Hence
M 0 I £ M , e Q| r .
Let
Then M T P - 1 = L < 1 . ( 2 - 1 2 )
| e i | 1 LT < r .
Hence e J. We now proceed to prove by induction that if ( 2 - 1 2 ) holds, then for all i,
x ± e J, |e ±| £ L^T. ( 2 - 1 3 )
Assume that ( 2 - 1 3 ) holds. Then
| e i + 1 l = i M j I e j P ^ M l e j P ^ l e J
£ m | r | p _ 1 | e ± | i l l j t = L 1 + 1 r < r .
r
2 . 2 - 9
we conclude that x ± -* a. We have proven
THEOREM 2 - 3 . Let q/p^ be continuous on the interval J,
J = |x |x-a| < r^ .
Let x Q e J and let
| q > ( p ) ( x ) l £ M
for all x e J. Let
M T P - 1 < 1 .
Then for all 1 , x^ e J and x i -»• a.
Hence ( 2 - 1 3 ) holds for 1 + 1 which completes the proof by induction. Since
|e ±| 1 L1]?, L < 1 ,
2.2-10
2.23 The advantages of higher order iteration functions. If cp' Is continuous in a neighborhood of a and if cp is first order, then the sequence generated by cp will converge to a if and only if |cp'(a)| < 1 unless one of the x i
becomes equal to a. That the last restriction is necessary is shown by the example
c p ( x ) = i 2x for |x| £ 1
for |x| > 1
Here cp' is certainly continuous in the neighborhood of zero and c p ' ( O ) = 2 but the iteration converges for all X q in the unit interval. Baring such cases, c p ' ( a ) will have to be less than unity for convergence. If, on the other hand, p > 1 and c p ^ ^ is continuous in the neighborhood of a, then there is always a neighborhood of a where the iteration is guaranteed to converge.
This is one of a number of advantages of superlinear convergence over linear convergence. Perhaps the most important advantage is roughly summarized by the statement that xi+l a S r e e s with a to p times as many significant figures as x^. See Appendix C.
The higher order I.F. will often require fewer total samples of f and its derivatives. Recall that the informational efficiency of an I.F. is given by the ratio of
i ir
2 . 2 - 1 1
the order to the number of pieces of new information required per iteration. As will be shown in Chapter 5 , there exist I.F. of all orders whose informational efficiency is unity. We defined such I.F. as optimal. To simplify matters, let cp
and ^ be optimal I.F. of orders 2 and 3 respectively. Let x^ be generated from x Q by the application of cp three times and let y^ be generated from x Q by the application of ^ twice. Although each process requires six pieces of informa-
o q tion, - a = 0[(xQ-a) ] whereas y^ - a = 0[(x Q-a)^]. Ehrmann [ 2 . 2 - 3 ] considers a certain family of I.F. whose members have arbitrary order. He discusses which order is "best" under certain assumptions.
The main drawbacks to the use of high order one-point I.F. are the increasing complexity of the formulas and the need for evaluating higher derivatives of f. In Chapters 8 and 9 we shall see how these disadvantages may be overcome by using multipoint I.F.
Observe that if f satisfies a differential equation it may be cheaper to calculate the derivatives of f from the differential equation than from f itself. In particular, let f satisfy a second order differential equation. After f and f have been calculated, f" is available from the differential equation. By differentiating the differential equation one may calculate f"'. This process may be continued as long as it is feasible.
i 1
2 . 2 - 1 2
It might appear that it is difficult to generate I.F. of higher order for all f. Such is not the case and algorithms for constructing I.F. of arbitrary order have been given by many authors. See, for example, Bodewig [ 2 . 2 - 4 ] ,
Ehrmann [ 2 . 2 - 5 ] , Ludwig [ 2 . 2 - 6 ] , E. Schroder [ 2 . 2 - 7 ] ,
Schwerdtfeger [ 2 . 2 - 8 ] , and Zajta [ 2 . 2 - 9 ] . Many of these algorithms depend on the construction of I.F. such that
cp(a) = a, q/J)(a) = 0 , j = l,2,...,p-l. ( 2 - l 4 )
We shall propose numerous techniques for generating I.F. of arbitrary order which also satisfy certain other criteria. It will turn out that it is not necessary to use ( 2 - l 4 ) to construct these I.F.
From two I.F. of first order, it is possible to construct an I.F. of second order by the technique of Steffensen-Householder-Ostrowski. The reader is referred to Appendix D.
In many branches of numerical analysis one constructs approximations by insisting that the approximation be exact for a certain number of powers of x. There is one class of I.F., those generated by direct interpolation (Section 4 . 2 3 ) , for which the I.F. yield the solution a in one step for all polynomials of less than a certain degree. This is not a general property of I.F. In fact, powers of x could not be used if for no other reason than that x m has a root of multiplicity m.
2 . 2 - 1 3
EXAMPLE 2 - 2 . Let
m - x - X cp - x jr?'
Then cp is second order and yields a in one step if f is any linear polynomial. Let
f 2 if = x - YT + f .
Then if/ is second order. Let f = x. Then ^ will not yield the answer in one step and will not converge if |x o| > 1 .
2 . 3 The Iteration Calculus We will develop a calculus of I.F. The results to
be proven in this section will play a key role in much of the later work. They will not apply to I.F. with memory. The nature of our hypotheses will be such that the order will be integral. No assumptions will be made as to the structure of the I.F. Theorems 8 - 1 and 8 - 2 also belong to the iteration calculus but are deferred to Chapter 8 because their applications are in that Chapter. Some of these results, for the case of simple roots, may be found in the papers by Zajta [ 2 . 3 - 1 ] and Ehrmann [ 2 . 3 - 2 ] .
2 . 3 - 2
EXAMPLE 2 - 3 - Let u = f/f and let
u cp1 = x - u, cp2 = x - mu, cp = * -
Then cp- e I 2 f ° r simple zeros whereas cp- e 1^ for multiple zeros. On the other hand, cp2 e I 2 for zeros of multiplicity m and cp e I 2 for zeros of arbitrary multiplicity. See Chapter 7 for details.
Let / P^ be continuous in a neighborhood of a. If cp e I p, then from Theorem 2 - 2 ,
cp(a) = a; cp^(a) = 0 for j = l,2,...,p-lj cp^(a) ^ 0
( 2 - 1 5 )
2 . 3 1 Preparation. In this section we do not restrict ourselves to simple zeros. The multiplicity of the zero is denoted by m and we indicate that cp is a member of the class of I.F. whose order is p by writing cp e I . We always insist that cp is of order p for all functions f whose zeros have a certain multiplicity. However cp may be of order p for simple zeros but of a different order for multiple zeros. Hence our hypotheses contain phrases such as "Let cp e I for some set of values of m."
2 . 3 - 3
The expansion of cp(x) in a Taylor series about a yields
c p ( x ) - a = ^ p V 0 U - a ) p , ( 2 - 1 6 )
where i lies in the interval determined by a and x. Now, i is not a continuous function of x but q/P^(£) may be defined as a continuous function of x as follows. Let
V(x) = <P<X) - q for x ^ a, V ( a ) - ^ , { a ) .
(x-a) p p *
Then V(x) is continuous wherever cp(x) is continuous and x ^ a. A p-fold application of L'Hospital's rule reveals that
l l m v ( x ) . a ^ l a l .
x a p • Hence
cp(x) - a = V(x)(x-a) P, (2-17)
where V(x) is a continuous function of x and V(a) is nonzero. Thus one may characterize I.F. of order p by the condition that cp(x) - a has a zero of multiplicity p at a.
By comparing ( 2 - 1 7 ) with (l - l 4 ) we observe that
C = V(a)
where C is the asymptotic error constant.
2 . 3 - 4
Equation ( 2 - 1 7 ) may be put into a form which is more convenient for many applications. Let f' be continuous in a neighborhood of a and let a be a simple root. Then f'(a) is nonzero and
f (x) = f '(T])(x-a) . Let
M x ) = Jkp- for x ^ a, * ( a ) = f'(a) .
Then
f(x) - X ( x ) ( x - a ) , ( 2 - 1 8 )
where A(x) is continuous if f is continuous and 7\(a) / 0, Prom ( 2 - 1 7 ) and ( 2 - 1 8 ) we conclude that
cp(x) - a = T(x)f p(x), ( 2 - 1 9 )
where T(x) = V(x)/A p(x). Then T(x) is continuous wherever cp^ p^(x) and f'(x) are continuous and f'(x) does not vanish. Furthermore,
T(a) - <P ( P )( a) n ^ 0 . pi[f'(a)] p
If the multiplicity m is greater than unity, then f is no longer proportional to x - a and so we cannot obtain an expression like ( 2 - 1 9 ) which involves f. However,
r
2 . 3 - 5
u = f/f' has only simple zeros and hence ( 2 - 1 9 ) is easily generalized. Let f( m) be continuous and let a have multiplicity m with m j> 1 . Proceeding as before we conclude the existence of continuous functions A-^(x) and ^g( x) s u c h that
f ( x ) = A 1 ( x ) ( x - a ) m , f ' ( x ) = ^ ( x M x - a ) 1 1 1 - 1 ,
with
Hence
with
u ( x ) = -fj^y = p ( x ) ( x - o ) , ( 2 - 2 0 )
P ( a ) - i .
Finally, from ( 2 - 1 7 ) and ( 2 - 2 0 ) ,
cp(x) - a = W(x)u p(x), ( 2 - 2 1 )
and W(x) continuous and with
W ( x ) = ^ L , w ( a ) = ^ i a I ^ 0 .
p p(x) p '
The importance of ( 2 - 1 7 ) , ( 2 - 1 9 ) , and ( 2 - 2 1 ) in our future work cannot be overestimated. In particular, we will have occasion to use ( 2 - 2 1 ) in the form
a = cp(x) - W ( x ) u p ( x ) . ( 2 - 2 2 )
I
2 . 3 - 6
2 . 3 2 The theorems of the Iteration calculus. The hypotheses of the theorems to be proven below will not be weighed down by the statement of continuity conditions on the derivatives of cp and f; they should be clear from the theorems. The notation is the same as in Section 2 . 3 1 .
THEOREM 2 - 4 . Let c p x ( x ) e I p 3 c p 2 ( x ) e J p 2
f o r
some set of values of m and let c p ^ ( x ) = c p 2 f c p 1 ( x ) ] . Then for these values of m, c p Q ( x ) e I .
1 2 PROOF. From ( 2 - 1 7 ) ,
P-i p c p 1 ( x ) = a •+ V 1 ( x ) ( x - a ) , c p 2 ( x ) = a + V 2 ( x ) ( x - a ) 2 .
Then
c p 3 ( x ) = c p 2 [ c p 1 ( x ) ] = a + V 2 [ C p 1 ( x ) ] [ < p 1 ( x ) - af2,
c p 3 ( x ) - a = V 2 [ c p 1 ( x ) ] V ^ 2 ( x ) ( x - a ) P l P 2 ,
P 2 ,
Let
V 3(x) = V 2 [ c p 1 ( x ) ] V 1d ( x )
The fact that
C 3 = V 3 ( a ) = V 2 ( a ) V P 2 ( a ) = C ^ 2 ± 0 ( 2 - 2 3 )
completes the proof.
2 . 3 - 7
N O T E . O b s e r v e t h a t ( 2 - 2 3 ) g i v e s t h e f o r m u l a f o r
t h e a s y m p t o t i c e r r o r c o n s t a n t o f t h e c o m p o s i t e I . F . i n t e r m s
o f t h e a s y m p t o t i c e r r o r c o n s t a n t s o f t h e c o n s t i t u e n t I . F .
C O R O L L A R Y . L e t
c p ^ ( x ) e I p , I = 1 , 2 , . . . , k
f o r s o m e s e t o f v a l u e s o f m . L e t
cp(x) = cpj Cqpj ( . . . ( c p j ( x ) ) . . . ) )
w h e r e t h e a r e a n y p e r m u t a t i o n o f t h e n u m b e r s l , 2 , . . . , k .
T h e n f o r t h e s e v a l u e s o f m , q>(x) e l . I n p a r t i c u l a r ,
p l p 2 * * , p k i f p . = p f o r I = 1 , 2 , . . . , k , t h e n <p(x) e l , , . i p K
T h e n
E X A M P L E 2 - 4 . L e t
q p 1 ( x ) = c p 2 ( x ) = x - u ( x ) , ( N e w t o n ' s I . F . ) .
$1 e 2' ^2 6 129 C l = C 2 = A 2 ^ a ^ A 2 8 3 2f7*
H e n c e
c p 3 ( x ) = c p 2 [ c p 1 ( x ) ] 6 l 4 , C = [ A 2 ( a ) ] 3 .
I
2 . 3 - 8
THEOREM 2 - 5 - Let <p(x) e I , with p > 1 , for some set of values of m. Then for these values of m there exists a function H(x) such that
cp(x) - x = - u(x)H(x), H(a) ± 0 .
PROOF. Since cp(a) = a, there exists a function G(x) with G(a) = 0 such that cp(x) =• x - G(x) . Then for p > 1 ,
c p ' ( a ) = 0 = 1 - G'(a) .
Therefore G(x) has a simple zero at a and G(x) = u(x)H(x) for some H(x) with H(a) V 0 .
Theorem 2 - 5 may be rephrased as follows. If a is a p-fold zero, p > 1 , of the function c p ( x ) - a, then it is a simple zero of the function c p ( x ) x . Since e p - ^ ( x ) - x - f ( x )
and c p g C x ) = x - f ( x ) are both of order one, the theorem is false when p = 1 .
EXAMPLE 2 - 5 - Since < p ( x ) - x has only simple zeros if the order of c p ( x ) is greater than unity, any I.F. which is of order p for functions f ( x ) with simple zeros will be of order p if f ( x ) is replaced by c p ( x ) - x . Thus if f ( x ) is replaced by c p ( x ) - x in Newton's I.F.,
II I
2 . 3 - 9
and f(x) e Ig. Let cp(x) be Newton's I.P. Then
x u(x)
This generalization of Newton's I.P. is of order two for roots of arbitrary multiplicity; it will be discussed in Chapter 7.
THEOREM 2 - 6 . Let <p, (x) e I n and cp0(x) e I_ for some set of values of m. Then for these values of m there exists a function U(x), with U(x) bounded at a and U(a) ^ 0 ,
such that
<P2(x) = cpx(x) + U(x)u p(x), with
p = minfp-^Pg], if p 2 ? p 2 ;
( P - , ) x ( P T )
p = p x , if p 2 = p 2 and (a) £ cp2 (a);
P > Plf if P x = P 2 and ^ (a) = cp2 (a).
.1
2 . 3 - 1 0
P R O O F . C a s e 1 . p1 / p A s s u m e i n p a r t i c u l a r
t h a t p 1 > p T h e n
P-i p
c p - ^ x ) = a + W x ( x ) u - L ( x ) , c p 2 ( x ) = a + W ( x ) u 2 ( x ) ,
c p 2 ( x ) - c p ^ x ) = u ^ ( x ) W 2 ( x ) - W 1 ( x ) [ u ( x ) ] P x - P 2
D e f i n e
U ( x ) = W 2 ( x ) - W 1 ( x ) [ u ( x ) ] p l - P 2
T h e f a c t t h a t U ( a ) = W 2 ( a ) ^ 0 c o m p l e t e s t h e p r o o f o f C a s e 1
( P n ) ( P J C a s e 2 . p1 = p 2 a n d c p x
x ( a ) / c p 2 ( a ) .
P-,
q > 2 ( x ) " c p ^ x ) = u - L ( x ) [ W 2 ( x ) - W 1 ( x ) ] .
D e f i n e U ( x ) = W 2 ( x ) - W - ^ x ) . T h e f a c t t h a t
U ( a ) = W 2 ( a ) - W l { a ) - ^ [ " ^ ( a ) - ^ ( a ) ? 0
c o m p l e t e s t h e p r o o f o f C a s e 2 .
( P i ) , ( P n )
C a s e 3 . P x = P 2 a n d ( a ) = q> 2 ( a ) . U n l e s s
^ ( x ) = c p 2 ( x ) , t h e r e e x i s t s a n i n t e g e r q , q > p s u c h t h a t
c p [ q ) ( a ) ^ c p ^ q ^ ( a ) a n d t h e p r o o f m a y b e c o m p l e t e d a s a b o v e .
!
2 . 3 - H
THEOREM 2 - 7 . Let q>-(x) e I , and let p l
cp2(x) = cp1(x) + U(x)up(x) for some set of values of m. Then for these values of m, cpp(x) e l . , where
P 2
P 2 = min[p,p 1], if p ^ P l -
P 2 = p , if p x = p and U(a) ^ -
P 2 > p , if p x = p and U(a) = -
4 P ) ( a ) m p
p l j
cp{ p )(a)m p
PROOF. Case 1 . p ^ p 1 . Assume p > P 1. Then
p l <px(x) = a + W 1(x)u (x), cp2(x) = <p1(x) + U(x)u p(x),
Hence
cpQ(x) - a = u (x) W 1(x) + U(x)[u(x)]P P l
Define W 2(x) = W 1(x) + U(x)[u(x)]P P l . The fact that W 2(a) = W-^a) ^ 0 completes the proof of Case 1 .
The case of most interest for later applications is ( P 1 > , , <Pl>, X
p l = p 2 ' ^1 ^ ^2 * W e P r o v e a converse to Theorem 2-6,
r
2 . 3 - 1 2
The "comparison theorem" just proved will permit us to deduce the order of a given I.F. if it differs by terms of order u p from an I.F. whose order is known.
'*
The following theorem permits the calculation of the asymptotic error constant of an I.F. of order p if the asymptotic error constant of another I.F. of order p is known. Recall that the asymptotic error constant C is defined by
C = l l m - « (2-24) x -» a (x-a) p
Absolute values are not required in (2-24) since p is an integer.
Case 2. p = P 1 and U(a) f - <pjP^ (a)mp/p J. Then
cp 2(x) - a = uP(x)[W1(x) + U(x)].
Define Wg(x) = W^x) + U(x) . The fact that
4 p ) ( a ) m P W 2(a) = W 1(a) + U(a) = 1
p , + U(a) ^ 0
completes the proof of Case 2.
Case 3 . P = P1 and U(a) = - cpjP^ (a)mp/p J . This case may be completed in a similar manner.
r
2 . 3 - 1 3
THEOREM 2 - 8 . Let cp- x) e I and cpg(x) e I p for some set of values of m. Let
o ( » ) - - ; / x ) ,
(x-a) p
Let C. and be the asymptotic error constants of cp and cpg Then for these values of m,
C = C 1 + 11m G(x). x a
PROOF. A p-fold application of L'Hospital's rule yields
4 p ) ( « ) - q > i p ) ( ° ) llm G(x) = -s ± = C - C x — a p ' d 1
EXAMPLE 2 - 6 . Let
cp1(x) = x - mu(x), cp2(x) = x - •
Then cp- Cx) e I 2 and cp2(x) e Ig, for m arbitrary. As will be shown in Section 7 * ^ 1
mu(x) = x - a - ^ i " ! (x-a) 2 + 0[(x-a) 3], m^ '
r
2 . 3 - 1 4
where a m(x) = f '(x)/ml. Then It is easy to verify that
n _ m + 1
1 ~ raam m C 2 -m+1
m lim G(x) = -2a "m+1
x a ma 9
m
as predicted by Theorem 2 - 8 . Let
c p ( x ) = ^ [ c p 1 ( x ) + c p 2 ( x ) ] = x - | u ( x ) m + u'(x)
Since and C 2 are equal in magnitude but opposite in sign, it is clear that cp(x) e for all m.
A more useful result than the one obtained in Theorem 2 - 8 is given by
THEOREM 2 - 9 . Let ^(x) e I p and let cp2(x) e I for some set of values of m. Let
c p 2 ( x ) - cp ( x ) f
H ( x ) = 1 s x ^ a , u = • i r ( x ) 1
Let C 1 and C 2 be the asymptotic error constants of c p - j ^ x ) and c p 2 ( x ) . Then for the values of m,
C 2 = C + l l m H(x) .
2!. 3 - 1 5
PROOF. It Is easy to show that
lim x -+ a
u(x) = 1
From the previous theorem,
lim H(x) = lim G(x) x a x -*• a
x-a
and the result follows
EXAMPLE 2-7. Let m = 1. Then
cp1(x) = x - u(x) - A 2(x)u 2(x), cp2(x) = x - 2 - A ^ f x ) ' u ( x )
are of third order and
C 1 = 2A 2(a) - f(i) A 3 ( a ) , Aj_ = j7j7
Since
cp2(x) = x - u(x) 1 + A2(x)u(x) + A 2(x)u 2(x) + 0 [ v T ( x ) ] ,
r
2 . 3 - 1 6
lim H(x) = -A 2(a) x —• a
Hence C 2 = Ag(a) - ^(a) .
THEOREM 2 - 1 0 . Let m = 1 and let U(x) be an arbitrary function which may depend on f(x) and its derivatives and which is bounded at a. Let q>-,(x) e I and let
Jtr
U(a) t - 1
P![f'(a)] p' Then
cp(x) = ^(x) + U(x)fp(x) ( 2 - 2 5 )
is the most general I.P. of order exactly p. PROOF. From ( 2 - 1 9 ) ,
( q,,(x) = a + T (x)f p(x), T^a) = * ( a )
p . 1 1 1 Pl[f'(a)] p
By hypothesis, «p(x) = cp- x) + U(x)fp(x) .
Then
<p(x) = a + f p(x)[T 1(x) + U(x)]. ( 2 - g 6 )
2 . 3 - 1 7
Define T(x) = T x(x) + U(x).
Since T(a) is nonzero, ep(x) e I p. The observation that U(x)fp(x) is the most general addend to cp- (x) which le^ds to ( 2 - 2 6 ) completes the proof.
Note that we do not assume that U(a) ^ 0 . It is clear that
cp(x) = Cp1(x) + *U(x)u q(x), q > p (2-27)
is also of order p. Setting U(x) = *U(x)u q" p(x) puts ( 2 - 2 7 )
into the form of ( 2 - 2 5 ) • Note also that Theorem 2 - 1 0 need not hold if m > 1 . For example,
cp(x) = cp1(x) + U(x)
is also of order p if m > 1 .
The following two theorems lead to another characterization of I . .F. of order p.
2 . 3 - 1 8
f(x) = * ( x ) ( x - a ) m
and
Then
f [ < p ( x ) ] = [<p(x) - a ] m M c p ( x ) ]
= [ V ( x ) ( x - a ) p ] A [ c p ( x ) ]
= V r a ( x ) M c p ( x ) ] [ ( x - a ) m ] P .
Since M a ) ^ 0 , A(x) does not vanish in some neighborhood of a and we may write
f [ c p ( x ) ] = V m ( x ) X [ c p ( x ) ] A - p ( x ) f p ( x ) .
Define
Q ( x ) = Vm(x)7v[cp(x)]7v- p(x).
THEOREM 2 - 1 1 . Let cp(x) 6 I p for some set of values of m. Then for these values of m, there exists a function Q ( x )
such that Q(a) ± 0 and f[cp(x)] = Q(x)fp(x) . PROOF, Let a be a zero of multiplicity m. As was
shown in Section 2 . 3 1 * there exists a continuous function M x )
such that
! r
2 . 3 - 1 9
The fact that
Q ( a ) = p!
m "f ( m )(a)
ml ? 0
completes the proof.
As a converse to Theorem 2 - 1 1 , we have
THEOREM 2 - 1 2 - Let f[cp(x)] = Q(x)fp(x) for some
set of values of m with Q(a) ^ 0 . Then for these values of m,
cp(x) e I p. PROOF. Let A(x) be defined as in the previous
theorem. Then
f [cp(x)j = [cp(x) - a]mA[cp(x)],
and from the hypothesis,
f[cp(x)] = Q(x)fp(x) = Q(x)A p(x)(x-a) m p.
Define V(x) by
V^x) = X-1[cp(x)]Ap(x)Q(x). Then
cp(x) - a = V(x)(x-a) p
and the fact that V(a) ^ 0 completes the proof.
! r
2 . 3 - 2 0
Theorems 2 - 1 1 and 2 - 1 2 show that cp(x) e I If and only If f [q>(x)] = Q(x)fP(x) with Q(a) ? 0.
EXAMPLE 2 - 8 . Let m = 1 and let
1 JT" cp(x) = x - u(x)H(x), H(x) = x , A ( x) u(x)* A 2 ^ x ) = 2 F 7
This is Halley's I.P. which will be studied in Section 5 . 2 .
Then
f [ c p ( x ) ] = f ( x ) - u ( x ) H ( x ) f ( x ) + | u 2 ( x ) H 2 ( x ) f " ( x ) + 0 [ u 3 ( x ) ] ,
H ( x ) = 1 + A 2 ( x ) u ( x ) + A|(X)U2(X) + 0 [ u 3 ( x ) ] .
Thus
f [ c p ( x ) ] = f ( x ) - u ( x ) f " ( x ) - A 2 ( x ) u 2 ( x ) f ' ( x ) + | u 2 ( x ) f " ( x ) + 0 [ u 3 ( x ) ]
= 0 [ u 3 ( x ) ] = 0 [ f 3 ( x ) ] .
Therefore Halley's I.P. is third order for m = 1 .
THEOREM 2 - 1 3 . Let m = 1 and cp(x) e I . Then
d Pf T c p f x ) !
dx* = f'(a)cp(p)(a)
x=a
2 . 3 - 2 1
cp(x) - f [<p(x)] = a + V 1(x)(x-a) P. ( 2 - 2 8 )
Since cp(x) e I , ir
<p(x) = a + V(x)(x-a) P. Therefore
f[cp(x)] = [V(x) - V1(x)](x-a)P. (2-29)
A second expression for f[cp(x)] may be derived as follows. Define A(x) by
f(x) = A(x)(x-a), A(a) = f'(a). Then
f[cp(x)] = A[<p(x) ][cp(x) - a] = Mcp(x)]V(x)(x-a)p. ( 2 - 3 0 )
From ( 2 - 2 9 ) and ( 2 - 3 0 ) ,
A[cp(x)]V(x) = V(x) - V 1(x). ( 2 - 3 1 )
PROOF. Let i/(x) = x - f (x) . Clearly ^(x) e I 2.
Let
¥(x) = T tcp(x)] = q>(x) - f[<p(x)].
By Theorem 2 - 4 , f(x) e I . Hence
r
2 . 3 - 2 2
From ( 2 - 2 8 ) ,
9(P)(„, . " M f W l v(a) - A. a pf[g<x)]
x=a
Taking x = a in ( 2 - 3 1 ) yields the desired result.
We turn to a generalization of the previous theorem,
THEOREM 2-14. Let m = 1 and <p(x) e I . Then
d Jf [CD(X)1 dx'
f'(a)cp ( j )(a), 0 < j < 2p
x=a
PROOF. Let = d^/dx^. It is clear from Theorem 2-11 that D^f[cp(x)]| _ = 0 for 0 < j < p. Since q/^(a) = 0 for 0 < j < p, the theorem is proved for these values of j.
Let p £ j < 2 p , Now
f(x) = f'(a)(x-a) + x(x)(x-a) 2,
cp(x) = a + V(x)(x-a) p.
r
2 . 3 - 2 3
It is clear that if f and cp possess a sufficient number of continuous derivatives, then T and V may be defined so as to possess as many continuous derivatives as required. We have
f[cp(x)] = f (a)[cp(x) - a] + T[cp(x)][cp(x) - af = f'(a)v(x)(x-a)p + S(x).
Then
D Jf [cp(x)j = f'(a) £ C[j,k]D J _ kV(x)D k(x-a) P + D JS(x), k=0
where C[j,k] denotes a binomial coefficient. Hence
D Jf [cp(x)]| x = a = f'(a)p!C[j,p]D J _ pV(x)| x = a.
It is easy to show that
D J - p V ( x ) | x = a = - 4 f H < p ( J ) ( a ) ,
which completes the proof.
CHAPTER 3
THE MATHEMATICS OF DIFFERENCE RELATIONS
In this chapter we shall lay the mathematical foundations prerequisite to a careful analysis of the convergence and order of one-point I.F. and one-point I.F. with memory. The two theorems of Section 3 ^ will be basic for that analysis.
It is assumed that the reader is familiar with the elementary aspects of difference equation theory. If this is not the case, Hildebrand [ 3 . 0 - 1 , Chap. 3 ] * Jordan [ 3 . 0 - 2 ,
Chap. 1 1 ] , Milne-Thomson [ 3 . 0 - 3 Land Norlund [ 3 . 0 - 4 ] may be used as references.
3.1-1
3.1 Convergence of Difference Inequalities
LEMMA 3-1. Let 7 Q * 7 ^ * • • • > 7 n
b e nonnegative integers and let q - 2 j = o y L e t M b e a P o s i t i v e constant and let the 6^ be a sequence of nonnegative numbers such that
n
and such that
Then
5 I + I ^ m n J
5 , 1 T , i * 0 , 1 , . . . , n . ( 3 - 1 )
M r q _ 1 < i
I m p l i e s t h a t 6^-*0. ' P R O O F . L e t
L = M r q _ 1 . ( 3 - 2 )
L e t t b e t h e f i r s t s u b s c r i p t f o r w h i c h y^ i s n o n z e r o . T h e n
5 n + l * M 5 n - t II ( 5 n - j ) 7 j
• J - t + 1
* M r q " 1 5 n _ t = L 6 n . t
W e n o w p r o v e b y i n d u c t i o n t h a t
5 i + l ^ L 5 i - t ' L < 1, ( 3 - 3 )
f o r a l l i . N o t e t h a t ( 3 ~ l ) a n d ( 3 ~ 3 ) i m p l y 6 ± + 1 £ r.
!
3 . 1 - 2
Let ( 3 - 3 ) hold for 1 = 0,1,.. .,k-l. Then
5 k + i ^ M 5 £ t n ( v / J
j=t+i
* ^ ^ ^ k - t - L 5k-t
This completes the induction. Prom ( 3 - 3 ) we conclude that § x — 0 .
LEMMA 3-2. Let yQ be a positive integer and let 7 1,7 2, ...,yn be nonnegative integers. Let q = Z^Q7y Let M and N be positive constants and let 6^ be a sequence of nonnegative numbers such that
5 I + I ^ M 5 I ° n ( 5 i + 5 ± - j ) 7 j + N 5 I o + 1 *
and such that
Then
6 < r, 1 = 0,1,...,n. ( 3 - 4 )
q ~ r o a-1 yo 2 °Mr q + NT < 1
Implies that 6^ -+ 0.
T h e n
3.1 -3
P R O O F . L e t
q ~ 7 o a - 1 7 o L - 2 M T q + N T . ( 3 - 5 )
5 n + l * M 5 n° II ( 5 n + 5 n-P 3 + N 5 n j = l
5 n + l * 5 n
n
M T 7 0 " 1 II ( 2 r ) 7 j + N T 7 °
= 8 n
2 " ' u M r q _ 1 + N r 0 = L 6 n .
W e n o w p r o v e b y i n d u c t i o n t h a t
5 i + l ^ L 6 i ' L < 1 , ( 3 - 6 )
f o r a l l i . N o t e t h a t ( 3 - 4 ) a n d ( 3 - 6 ) i m p l y 6 i + 1 £ r .
L e t ( 3 - 6 ) h o l d f o r i = 0 , l , . . . , k - l . T h e n
5 k + l * 5 k M r
7 . - 1 n
j = l
= L 6 ,
T h i s c o m p l e t e s t h e i n d u c t i o n . F r o m ( 3 - 6 ) w e c o n c l u d e t h a t
6 4 — 0 .
!
3 . 2 - 1
3 - 2 A Theorem on the Solutions of Certain Inhomogeneous Difference Equations
Let
n
X *j ai+j - °* % - 1 ( 3 - 7 )
be a homogeneous linear difference equation with constant coefficients. The indlcial equation corresponding to ( 3 - 7 )
is the algebraic equation
N
I " ,
J = 0
t J = 0 . ( 3 - 8 )
Now, if all the roots of ( 3 - 8 ) are simple, the general solution of ( 3 - 7 ) is given by
N
4=1
where the are the roots of ( 3 - 8 ) and where the c^ are constants determined by the initial conditions. It is obvious that if all the have moduli less than unity, then all solutions of ( 3 - 7 ) converge to zero.
We wish to extend this result to the case of a linear difference equation whose homogeneous part has constant coefficients and whose inhomogeneous part is a sequence converging to zero. Hence we shall consider
3 . 2 - 2
n
I K j a i + J " K N = 1 0 - 9 )
. n o
w h e r e 3 ^ -*• 0 .
L e t u s f i r s t c o n s i d e r t h e s p e c i a l c a s e o f t h e
f i r s t - o r d e r d i f f e r e n c e e q u a t i o n
0 i + l + K o a i = p i * ( 3 - 1 0 )
O b s e r v e t h a t t h e s o l u t i o n o f t h e i n d i c i a l e q u a t i o n i s g i v e n
b y t » -K q = p ^ . We s h a l l a s s u m e t h a t i s l e s s t h a n u n i t y
a n d t h a t P i -* 0 . T h e d i f f e r e n c e e q u a t i o n m a y b e " s u m m e d " a s
f o l l o w s . L e t I a n d n b e a r b i t r a r y w i t h n - 1 ;> t . T h e n
al+l + Ko°l " H'
°t+2 + Ko°l+l =
V l + K o a n - 2 = p n - 2 '
a n + K o a n - l - P n - 1 -
M u l t i p l y i n g t h e e q u a t i o n w h o s e l e a d i n g t e r m i s an _ j b y ( - K Q ) ^
o r a n d a d d i n g a l l t h e e q u a t i o n s y i e l d s
° n " 0 l ' \ + I P l P ? " 1 " 1 - ( 3 - 1 1 )
i = t
3 . 2 - 3
\on\ * I P i T ^ l a J + £ IPJIPJ 1 1" 1" 1. ( 3 - 1 2 )
Since 0 we can fix I so large that the sum has magnitude less than e/ 2 . Since p. is less than unity we can then take n so large that the first term on the right side of ( 3 - 1 2 )
has magnitude less than e/ 2 . We conclude that —> 0. This is the desired result for the case of a first-order difference equation.
We shall now derive a generalization of (3~ll) for the case of an Nth order difference equation. Thus we start with
_N
i+j " P i ' K N " 1 '
Let py j .= 1 ,2,...,N be the roots of the indlcial equation, Let d n i be constants whose values we shall choose at our later convenience. Let I and n be arbitrary integers such that n - N I. Then
n^N _N n-N
I dni Z V i + J = Z dni pi' 1=1 j=0 1=1
Hence
n - 1
3 . 2 - 4
or n^N i+N n^N
X dni Z Ks-i as - Z dni pi* 1=1 s=l 1=1
Changing the order of summation yields
n^N t+N-1 s
n i s - i
n-N s n n-N
* I ° = I d n l V i + I ° s X s * t + N i « s - N s = n - N + l i = s - N
d . K . n i s - i
- 1 + 2 + 3 .
Consider 2 , This is zero if we choose d n i as a solution of the homogeneous difference equation. It is convenient to define d n i by
d m - i v r 1 - 1 ( 3 - i 3 )
where the constants will be chosen later. Consider 3 . Using ( 3 - 1 3 ) we can show that
n n^N N-l N _N
s=n-N+l i=a-N A=0 r=N-7\ j=l
3 . 2 - 5
We have the N coefficients D-^Dg, ... ,D N at our disposal. We choose them so that the coefficient of is unity and the coefficients of the a n_^ are zero for 7\ > 0. Thus, recalling that k n - 1,
N
N N
( 3 - 1 4 )
I * r I V j 4 * " 1 = ° ' * = 1 , 2 , . . . , N - 1 . r - N - y j = l
U s i n g t h e f a c t t h a t
N 1 \
I
r = 0
* r . P i = 0 , j - 1 , 2 , . . . , N ,
i t i s n o t d i f f i c u l t t o p r o v e t h a t t h e s y s t e m ( 3 - 1 4 ) i s e q u i v a
l e n t t o t h e s y s t e m
N
D j p j = 5r,N-l' r = 09l9...9N-l ( 3 - 1 5 )
where 6^ is the Kronecker symbol. The determinant of this system is a Vandermonde determinant. Hence Dj satisfying ( 3 - 1 5 ) i and consequently ( 3-14), exist and are indeed the ratios of certain Vandermonde determinants. With this choice of the Dy 3 reduces simply to o^. We conclude that
3.2-6
n-N t+N-1 s a n - I < W * i " I
as I dni Ks-i
1=1 s=l 1=1
We summarize our results in
LEMMA 3 - 3 . Let
N \jui+J " p i ' K N
I N
J=0
be a linear difference equation whose homogeneous part has constant coefficients. Let the roots of the indicial equation be simple. Let
N d
3=1
where the are the roots of the indicial equation and where the Dj are determined by.
N
I V j " 5r,N-l' r " 0,1,...,N-l;
3 . 2 - 7
W e a r e n o w r e a d y t o p r o v e
T H E O R E M 3 - 1 . L e t
N
I V i + j - P i
b e a l i n e a r d i f f e r e n c e e q u a t i o n w h o s e h o m o g e n e o u s p a r t h a s
c o n s t a n t c o e f f i c i e n t s . L e t t h e r o o t s o f t h e i n d i c i a l e q u a
t i o n b e s i m p l e a n d h a v e m o d u l i l e s s t h a n u n i t y . L e t £ ^ -+0.
T h e n o± 0 f o r a l l s e t s o f i n i t i a l c o n d i t i o n s .
P R O O F . L e t
t + N - 1 s G 3 l - D J Z a s Z - s - i P j 1 " 1 ' ( 3 - 1 8 )
3 = 1 1 = 1
5 r N - l i s t h e K r o n e ^ i k e r s y m b o l . T h e n
n - N t + N - i s
a n - I d n A - Z ° s l d n i K s - i ' < 3 - 1 7 ) 1 = 1 s=l 1 = 1
w h e r e t a n d n a r e a r b i t r a r y i n t e g e r s s u c h t h a t n - N j> t .
3 . 2 - 8
- 1 v r 1 - 1 -ni J=l
and since the having moduli less than unity, it is clear that there exists a constant A such that
n-N
I i=l
d n i l < A -
Let e > 0 be preassigned and arbitrary. Since ^ -* 0, there exists a number L such that
< 2A f o r a 1 1 1 ^ L-
Fix £ such that I ;> L. Observe that is independent of n. With I fixed there exists a constant B such that
N
Using ( 3 - 1 6 ) , ( 3 - 1 7 ) , and ( 3 - 1 8 ) we can write
n-N N °n = E d n A " I GilPy
1=1 j=l
The notational adjustments which would be necessary if one of the roots of the indicial equation were equal to zero are obvious. Since
N d
3 . 2 - 9
P ^ 2B"
Then for all n > T|, we have
n-N N l * n | . * I | d n l | | P l | + Z I ^ J I P j l ^
i=t J=l
a n l < 2 T A + ^ B = e
which completes the proof.
For the applications that we have in mind, the inhomogeneous part of the difference equation will converge to a nonzero constant. We wish to show that all solutions of the inhomogeneous equation will converge to a constant. This result follows easily from the above theorem. We have
Let p = • m a x [ | . p 1 | , | p 2 | , . . . , | p N | ] .
Let T| be so large that
3 . 2 - 1 0
COROLLARY. Let
N
j=0
be a linear difference equation whose homogeneous part has constant coefficients. Let the roots of the indicial equation be simple and have moduli less than unity. Let co cd.
Then
CD
'i N
for all sets of initial conditions. PROOF. Let
a i 8 5 *i + ~N CD
Then
N
X * .
CD
"i+J r N
J = 0 J
I
N *±+j + * = ° V
Let = cd^ - cd. Then the conditions of the theorem apply and we conclude that 0 and hence that
0)
i N
0=0
3 . 3 - 1
3 . 3 On the Roots of Certain Indicial Equations The properties of the roots of the polynomial
equation k-1
g k j a(t) = t k - a £ t J = 0 ( 3 - 1 9 )
j=0
will be derived. This will turn out to be the indicial equation for certain families of difference equations which will be encountered in our later work. The case a = 1 has been treated by Ostrowski [ 3 . 3 - 1 , pp. 8 7 - 9 0 ] .
3 . 3 - 2
3 . 3 1 The properties of the roots. If k = 1 , the only root of ( 3 - 1 9 ) is t = a. For the remainder of this section we shall assume k > 2 . We shall permit a to be any real number such that
ka > 1 . ( 3 - 2 0 )
Since
g k , a ( l ) = 1 " k a '
t = 1 is not a solution of ( 3 - 1 9 ) • It is convenient to define
^ a ^ = ( t _ l ) g k , a ( t ) = t k + 1 " ( a + 1 H k + a - ( 3 - 2 1 )
Hence G. 0(t) has a root at t = 1 and roots at the roots of k. , a
By Descartes1 rule, g, Q(t) has exactly one real jk , a
positive simple root. This unique root will be labeled p. . ic j a
Since
gk,
we have
k-1
, a(a) = - £ aJ, S k , a ( D = 1 - ka, g k a(a+l) = 1
j=l
LEMMA 3 - 4 . The equation
k-1
g k , a ( t ) = t k " a £ t J = 0 J=0
3 . 3 - 3
has a unique real positive simple root, , and
max[l,a] < £ k ^ a < a + 1 .
Thus p k a has modulus greater than one. We shall show later that all other roots have moduli less than one. In the next lemma we shall prove that p. o is a strictly increas-
K. $ a ing function of k.
LEMMA 3-5. P k . l j f i < P k # a . PROOF. This follows from the observation that the
recurrence relation
t&krl.aM - a = *k,a^)
implies that gjj a(P k_i a ) l s negative.
The following inequality will be needed below.
LEMMA 3 - 6 . Let ka > 1. Then
^ > H ( i • £ ) k -
PROOF. Let k be fixed and define
Then J'(a) = (a+l)k(ka-l)/a2 and therefore J(a) is a strictly increasing function of a for ka > 1. The observation that
completes the proof.
Descartes1 rule shows that (3, Is a simple root. .k f a
Furthermore
LEMMA 3-7. All the roots of g k &(t) = 0 are simple PROOF. Define
G k , a ( t ) = ^ - ^ S i c a ^ ) - t k + 1 " ( a + D t k + a .
Then G k a(t) has only one nonzero root,
v (a+1).
The fact that v is positive completes the proof.
I r
3 . 3 - 5
In the following two lemmas we derive better bounds
LEMMA 3 - 8 .
k+t < a + 1> < Pk,a < a + 1.
Therefore, for a fixed
lim P k a = a + 1,
NOTE. Since ka > 1 implies that (a+1)k/(k+1) > 1, we: need not write
max[l, (a+1)] <
PROOF. Let v = (a+1)k/(k+1). Then
n ( v ) _ a ( a + l ) k + 1 /_ , l^" k
G k , a ( v ) = a " k+1 V1 + k,
An application of Lemma 3 - 6 shows that G, 0(v) < 0 , which ic j a
together with an application of Lemma 3 - 4 completes the proof,
LEMMA 3-9.
(a+l) k k ' a (a+l) k
where e denotes the base of common logarithms.
r
3.3-6
PROOF. Let v = (a+1)k/(k+1). The only positive root of the equation G," Q(t) = 0 is t = v - (a+l)/(k+l). Therefore G, 0(t) does not change sign in the interval K. , a v £ t a + 1. Since
G^ a(a+1) = 2k(a+l) k _ 1 > 0,
G, _(t) is convex in the interval. Since the secant line K. f a through the points [v,G, 0(v)], [a+l,G, (a+l)] intersects the
K j a ic f a t axis at
t = a + 1 - a
(a+1)
whereas the tangent line at [a+l,G, (a+l)] intersects the ic j a
t axis at
t = a + 1 -(a+l) k'
we conclude that
a + l - - ^ ( l ^ ) k O k , P < a + l - a
(a+l) K V ^ - - K , a - ( a + l ) k «
An application of the well-known inequality (l + l/k) k < e completes the proof.
Il
3-3-7
We will next study the polynomial generated by dividing g, Q(t) by the factor t - p. . Define c,.
K5a j 0-£ J £ k - 1, by
- £ § 7 7 " I W J * ( 3 ' 2 2 )
K ' a j=0
We first prove
LEMMA 3-10.
k-1 V c = ka-1
°J P k a-i* j=0 K ' a
PROOF. The result is obtained by setting t = 1 in (3-22) and observing that g. Q(l) = 1 - ka.
iv , a
In the following three lemmas we shall prove that all the roots of the quotient polynomial have moduli less than unity. It will be convenient to sometimes abbreviate
3.3-8
c Q = 1, C j = p " - a £ p 6 ,
t=0 (3-23)
A s e c o n d f o r m u l a f o r t h e c ^ m a y b e d e r i v e d b y n o t i n g t h a t
k-1 p k - a £ 3 * - = 0. (3-24)
t=0
B r i n g i n g t h e l a s t k - j t e r m s o f (3-24) t o t h e r i g h t s i d e o f
k - 1 t h e e q u a t i o n a n d d i v i d i n g b y p 0 y i e l d s
t = o t = i
c ^ * » a £ - p * - a y. p " " 6 , 1 £ J £ k - 1.
L e t 0 = p ' 1 . T h e n
°3 = a 0 ( l " i - e J ) » ( 3 - 2 5 )
T h e f a c t t h a t 6^1 c o m p l e t e s t h e p r o o f .
L E M M A 3-12.
c j - l c j + l < °y k > 2> 1 1 H k " 2.
L E M M A 3-11. C j > 0 f o r 0 £ j £ k - 1.
P R O O F . I t i s e a s y t o s h o w t h a t
li.
3 . 3 - 9
PROOF. Case 1 . j = 1 .
°o°2 < cl o r f r o m ( 3 ~ 2 3 ) '
We must show that
< a - a^k,a " a < ^k,a- a> 2'
This simplifies to 0 < 1 + a - 0. _ which is true from xV , CI
Lemma 3 - 4 .
Case 2 . J > 1 . Then from ( 3 - 2 5 ) , the result is equivalent to
or 0 < 1 - 29 + 6* which holds since 6^1.
LEMMA 3 - 1 3 . All the roots of the equation g k a(t) = 0 ,
other than the root 0, . have moduli less than one. K, a
PROOF. The proof depends on the following theorem which Ostrowski [ 3 . 3 - 2 , p. 9 0 ] attributes to independent discoveries by Kakeya [ 3 . 3 ~ 3 ] and Enestrom.
(i-e , k - J - l )(i-e
If in the equation g(x) = 2 j ^ Q t > n _ j X all coefficients bj are positive, then we have for each root i that
1 4 1 <: max b /b l ^ K n J
3.3-10
Applying this theorem to (3-22) we find, by virtue of Lemmas 3-11 and 3-12, that
c c l^Kk-l CJ-1 c o
Furthermore c,/c = - a < 1 from (3-23) and Lemma 3~4. jl o k. a
This completes the proof for the case k > 2. For k = 2,
= t + p Q - a
and Ul = P 2 > a - a < 1.
The major results of this section are summarized in
THEOREM 3-2. Let
k-1
If k 1, this equation has the real root p n _ = a. Assume j.,a
k ^ 2 and ka > 1. Then the equation has one real positive simple root p, 0 and a
max[l,a] < p v Q < a +.1.
3.3-H
Furthermore,
r~ o 4. i §a a + 1 ~ — s s r i z < P„ a < a + 1 -( a + l ) k ^ k , a — — ( a + l ) k *
where e denotes the base of common logarithms. Hence lim 6, „ = a + 1. All other roots are also simple and v k,a
have moduli less than one.
r
3 . 3 - 1 2
3 . 3 2 An important special case, A case of special interest occurs when a is a positive integer. (We will not be interested in nonintegral values of a until we investigate multiple roots in Chapter 7 . ) Lemma 3-4- niay be simplified to read
a < P k ^ a < a + 1 , k > 1 .
Values of p, for low values of k and a may be found in it j a Table 3 - 1 . Observe that p. has almost attained its limit,
ic j a a •+ 1 , by the time k has attained the value 3 or 4 . This is particularly true if a is large. This will have important consequences later.
3 . 3 - 1 3
TABLE 3 - 1 . VALUES OF p
"a 1 2 3 4
1 k
1 1 . 0 0 0 2 . 0 0 0 3 . 0 0 0 4 . 0 0 0
2 1 . 6 1 8 2 . 7 3 2 3 . 7 9 1 4 . 8 2 8
3 1 . 8 3 9 2 . 9 2 0 3 . 9 5 1 4 . 9 6 7
4 • 1 . 9 2 8 , 2 . 9 7 4 3 . 9 8 8 4 . 9 9 4
"5 1 . 9 6 6 2 . 9 9 2 3 . 9 9 7 4 . 9 9 9
6 1 . 9 8 4 2 . 9 9 7 3 . 9 9 9 5 . 0 0 0
7 1 . 9 9 2 2 . 9 9 9 4 . 0 0 0 5 . 0 0 0
3 . 4-1
3.4 The Asymptotic Behavior of the Solutions of Certain Difference Equations
3 .41 Introduction. Consider the difference equation
n ei+i - K II e t y ( 3 - 2 6 )
where K is a constant and s is a positive integer. Taking logarithms in ( 3 - 2 6 ) leads to a linear difference equation with constant coefficients whose indicial equation is
n t n + 1 - s t j = 0. (3-27)
This is a special case of the equation studied in Section 3 . 3
with k = n 4- 1 and a = s. The properties of the roots of ( 3 - 2 7 ) , derived in the previous section, permit us to analyze the asymptotic behavior of the sequence {e^}•
Now consider the difference equation
n ei+i = M i II el-y ( 3 - 2 8 )
where K. Is the asymptotic behavior of the sequence generated by this equation the same as that generated by ( 3 - 2 6 ) ? The answer turns out to be in the affirmative. It will turn out that the errors of certain important families of I.P. are governed by difference equations of the form specified by ( 3 T 2 8 ) .
3 - 4 - 2
In Section 3 * 4 3 we shall show that the asymptotic behavior of the sequence
n ei+i = M i e i n ( v e i ) S + N i e i + 1 ^ ( 3 - 2 9 )
J=I
where M. K, N ± —• L, is the same as the asymptotic behavior of the sequence generated by ( 3 - 2 6 ) or ( 3 - 2 8 ) . Difference equations of the type defined by ( 3 - 2 9 ) will be encountered in Chapter 6 . We shall designate difference equations of the type defined by ( 3 - 2 8 ) and ( 3 - 2 9 ) as of type 1 and type 2, respectively.
3 . 4 - 3
3 . 4 2 D i f f e r e n c e e q u a t i o n s o f t y p e 1 , W e s h a l l
s t u d y t h e a s y m p t o t i c b e h a v i o r o f t h e s o l u t i o n s o f t h e d i f f e r
e n c e e q u a t i o n
n
e i + i = M I n ^ 3 ° )
w h e r e s i s a p o s i t i v e i n t e g e r . W e s h a l l s h o w t h a t i f
M ± - K, ( 3 - 3 1 )
a n d i f t h e m a g n i t u d e s o f e0 > $ ^ , > * • • > e
n a r e s u f f i c i e n t l y s m a l l .
t h e n t h e s e q u e n c e o f c o n v e r g e s t o z e r o . O b s e r v e t h a t i f
n o n e o f t h e M 4 a r e z e r o a n d i f n o n e o f e - e - , . . . , e ^ a r e z e r o , l o l* * n
t h e n e ^ i s n o t z e r o f o r a n y f i n i t e i . W e s h a l l p r o v e t h a t
t h e r e e x i s t s a n u m b e r p g r e a t e r t h a n u n i t y s u c h t h a t
| e 1 + 1 | / | e . j j * \ c o n v e r g e s t o a n o n z e r o c o n s t a n t . I t i s e a s y t o
p r o v e ( s e e S e c t i o n 1 . 2 3 ) t h a t i f s u c h a n u m b e r p e x i s t s , t h e n
i t i s n e c e s s a r i l y u n i q u e .
L e t
5 ± = [ e ± | , r = s ( n + l ) > ( 3 - 3 2 )
S i n c e c o n v e r g e n t s e q u e n c e s a r e n e c e s s a r i l y b o u n d e d , t h e r e
e x i s t s a c o n s t a n t M s u c h t h a t
3 . 4 - 4
for all 1 . Then
n
Let
B 1 + 1 * M II B j . 4
J = 0
5 . ^ r , 1 = 0 , 1 , . . .,n.
An application of Lemma 3 _ 1 with all the y^ equal to s and with r replacing q shows that if
M r r - 1 < 1 ,
then 6 i - » • 0
Assume that and K are nonzero. We will investigate how fast the sequence {e^} converges to zero. Let
o± = In 5 ± = ln|e ±|, J± = lnJMj. ( 3 - 3 3 )
Then from ( 3 ~ 3 0 ) , we have that
_n ai+l " J i + s Z °±-y ( 3 - 3 4 )
J = 0
Let t be an arbitrary parameter whose value will be chosen later. Let
O c_ = 1, cj(t) = t3 - s ^ tl, j * l,2 , .o.,n+l,
1=0
( 3 - 3 5 )
3 . 4 - 5
r
and let D ±(t) = a 1 - tff±_1. ( 3 - 3 6 )
Then it is not difficult to see that ( 3 - 3 4 ) is identical with
X ° ^ \ + l - 3 { t ) + c n + l ( t K - n = J i ^ - 3 7 )
for all values of t. Consider the equation
n c n + 1 ( t ) = t n + 1 - s y t V « 0. ( 3 - 3 8 )
t=0
Observe that this is precisely the indicial equation of the difference equation derived from ( 3 - 2 6 ) by taking logarithms. It is a special case of the equation
k-1
S k, a(t) - t k - a £ t J = 0 ( 3 - 3 9 )
J-0
studied in Section 3 . 3 with k = n + 1 , a = s. The analysis of ( 3 - 3 9 ) was based on the assumption that ka > 1 . Let r > 1
Since ka = (n-fl)s = rs the conclusions of Section 3 . 3 apply. Hence all the roots of ( 3 - 3 8 ) are simple and there is one
f
3 . 4 - 6
real positive root greater than unity; all other roots have moduli less than unity. The real positive root is labeled
o • We shall abbreviate 6 1 o by p and choose the n+JL y s 11+1 • s parameter t equal to p. Hence
Let
c n + 1(p) - 0. (3-40)
C j = Cj(p), = Dj(p).
Then ( 3 - 3 7 ) becomes
n
The c. are just the coefficients of the polynomial gotten by J dividing c ^ t ) by t - p. [See ( 3 - 2 3 ) J Hence the indicial equation corresponding to the difference equation ( 3 - 4 l ) has only simple roots whose moduli are less than unity. Since
—• K, we conclude that -* ln|K|. All the conditions of the Corollary to Theorem 3 - 1 now apply and we conclude that
D ± - ^ L . (3-42)
c j j=0
3.4-7
An application of Lemma 3 - 1 0 , with ka = (n+l)s = r and with
^k,a * Pn+ 1 , 8 = p ' s h 0 W S t h a t
n I CJ ' P i - (3.43)
From (3-33), (3-36), (3-40), (3 - 4 2 ) , and (3-43), we conclude that
61+1* _ |K|(p-l)/(r-l)
We summarize these results in
THEOREM 3-3. Let
n
ei+i - M I n ei-j
with
| e ± | ^ T, 1 = 0 , 1 ,...,n.
Let s be a positive integer and let r = s(n+l) > 1 . Let M ± -»• K and let | M ± | £ M for all 1 . Let
M r r _ 1 < 1 .
3 . 4 - 8
Then e A - • 0 . Let p be the unique real positive root of
n t n + 1
- 8 £ - 0 .
Let M i and K be nonzero. Then
e i p i K | ^ - ^ ^ r - ^ . ( 3 - 4 4 )
Table 3 - 1 gives values of p = P n + 1 s for low values of n and s. Note that no a priori assumption has been made of the existence of a number p such that
* o .
if e p
If the existence of such a number is assumed a priori, it is not difficult to prove that it must satisfy the indicial equation ( 3 - 3 8 ) .
3.4-9
3.43 Difference equations of type 2, We shall study the asymptotic behavior of the solutions of the difference equation
n ei+i = M i e i n ( ei-j- ei) s + N i e i + 1 ' ( 3 ~ 4 5 )
J=I where
M ^ K / O , N ± -» L, (3-46)
and where n and s are positive integers. We shall show that if the magnitudes of ®0*E^S * • * > E
N
a ^ e sufficiently small, then e^ 0 and there exists a number p greater than unity such that I 1 /| e JL | p converges to a nonzero constant. InJPaet,. we shall demonstrate that the asymptotic behavior of the sequence generated by (3-45) is identical with the asymptotic behavior of the sequence generated by
n e i + l = M i II e|.j, M ± - K
which was studied in the previous section. Let
Q± = |e ±|, r = s(n+l) (3-4?)
and let |M±| £ M, |N±j ^ N
I
3.4-10
f o r a l l i . T h e n
n
s + 1 5 i + i ^ M 5 I n ( 5 i - j + 5 i ) s +
N 5 I
• J - l
L e t
5 i <> R , 1 - 0 , 1 , . . . , n .
A n a p p l i c a t i o n o f L e m m a 3-2, w i t h a l l t h e y^ e q u a l t o s a n d
w i t h q e q u a l t o r , s j i o w s t h a t i f
2 r - s M r r - l + < x ^
t h e n 5 i -»• 0 .
A s s u m e t h a t e ^ i s n o n z e r o f o r a l l f i n i t e 1 . W e m a y
w r i t e (3-4-5) a s
n
e i + l - H II 4-y (3-48) w i t h
T i = M i A i + N i 9 i ' ^3-49)
w h e r e
» i " 5 ( l - A ) " . - ~n~~^ • (3-50)
J = 1 n
W e s h a l l d e m o n s t r a t e t h a t -*• 1 a n d 6^ •—• 0 .
3.4 - 1 1
THEOREM 3-4. Let
n s+1
j-i
To show that \^ -* 1 we note from (3-45) that e 1 + 1 / e 1 - * 0 . Therefore e j _ A j _ « j ~* 0 f o r a 1 1 finite j and the result follows.
To show that 6^ 0 we proceed as follows. Prom
(3~45) and (3-50),
n e ± - Mi-i x±-i n e i - i - j + N i - i e l - i -
Hence
n 6 1 " M i - l V l e L l - n + N i - 1 n ^ " 1 • < 3 - 5 D
n 'i.3 n 4-i
Repeat this process for the second term on the right side of ( 3 - 5 1 ) * Carry out this reduction a total of n - 1 times. The problem is reduced to proving th at e
i + 1 / e ^ ~* 0 and this is clearly true from (3-45) •
Since 7^ 1 , 6^ 0, and M i K, we conclude that T i K. Theorem 3-3 niay now be applied to (3-48) and we arrive at
I
3 . 4 - 1 2
Let
2 r - s M r r - l + j j p S ^
Then e^ 0. Let p be the unique real positive root of
n t n + 1 - s £ ^ = 0 .
J-0
Assume that e i ^ 0 for all finite i. Then
f U l " , K | < P - l ) / < r - l ) .
with
| e ± | i r , 1 = 0 , 1 , . . , , n .
Let s be a positive integer and let r = s(n+l) > 1 . Let M i •* K / 0 , N ± L, and let JMj ^ M, |N ±| £ N for all 1
4.0-1
CHAPTER 4
INTERP0LAT0RY ITERATION FUNCTIONS
In this chapter we shall study I.F. which are generated by direct or inverse hyperosculatory interpolation; such I.F. will be called interpolatory I.F. The major results concerning the convergence and order of interpolatory I.F. will be given in Theorems 4-1 and 4-3. A sweeping generalization of Fourier's conditions for monotone convergence to a solution will be given in Theorem 4-2.
4 . 1 - 1
4 . 1 Interpolation and the Solution of Equations 4 . 1 1 statement and solution of an interpolation
problem. The reader is referred to Appendix A for a discussion of hyperosculatory interpolation theory. Certain salient features of that discussion which are needed for the development of the theory of this chapter will be repeated here.
Consider the following rather general interpolation problem. We seek a polynomial P such that
(k«) (k.) P J (xi_j) = f (xi-j) f o r J =0,l,...,n;
( 4 - 1 )
k^ = 0,1, ... ,7^-1, ^ 1 ; x±_k ± x±_t if k I.
That is, the first 7^-1 derivatives of P are to agree with the first 7j-l derivatives of f at the n + 1 points
xi' xi _2* • • l^t
n
q = I 7 r
Then there exists a unique polynomial P of n,7Q,71» • • - ,7n
degree q - 1 which satisfies ( 4 - 1 ) . For the sake of brevity we write
P = p (h-g) n,7 n,7 Q,7 1, .. .,7 n' v '
4.1-2
i
where 7 signifies the vector y 9y^ ...,y . If, in particular, 7^ = s for all yy then we write the interpolatory polynomial
as P n . n, s Let f( q)(t) be continuous in the interval determined
by x 1,x 1 - 1,..-jX 1 - n,t. Then
f(q)[| (t)] n 7 f(t) - P B f y ( t ) . II (t-xj.j) J , (4-3)
where ^(t) lies in the interval determined by x±>xi-i> • • • >xj_-r
Let f be nonzero and let f^q^ be continuous on an
interval J. Let f map J into K. Then f has an inverse S and
zj(q^ is continuous on K. We can state the interpolation
problem for 3 as follows. We seek a polynomial Q such that
(kj (k.) Q ( y i - p = S ( y i - j ^ f o r J = 0,1, ...,n;
(4-4)
kj = o,i,...,7^-i, 7j i; y ±_ k / y ±_ t if + i.
Then there exists a unique polynomial 7 Q,y^ 5 ...,7 n of degree q - 1 which satisfies (4-4) . For the sake of brevity we write
Qn,7 ^n,yQs71s ..-,7 n*
If, in particular, y^ = s for all 7^, then we write the interpolatory polynomial as ^(t).
(4-5)
4 . 1 - 3
The error of the interpolation is given by
« ( a ) [ M t ) ] " s(t) = Q^ 7(t) + — ^ II ( t - y , ^ ) 8 , ( 4 - 6 )
where 6^(t) lies in the interval determined by
^i'^i-l* • • • '^i-n'^ *
4 . 1 - 4
f
4 . 1 2 Relation of interpolation to the calculation
of roots. Let x±>xi-i9 • • • > xi- n b e n + 1 a P P r o x l r n a n t s t o a
zero a of the function f. To calculate a new approximant it is reasonable to calculate the zero of the polynomial which interpolates f at the points x^,x^_^,...,x1-n. The process is then repeated for the set xi+i' xi'•••> xi-n+l• One drawback of this procedure is that a polynomial equation must be solved at each step of the iteration. A second drawback is that the polynomial will have a number of zeros, some of which may be complex, and criteria are required to select one of these zeros as x
i + 1 - Once such criteria are established the point is uniquely determined by the points X • • X • -i • . . . . X . . We define $ as the function which I J 1-1 I-n n,7
maps xi>xi„i> • • • > xi- n i r r t o xi+i #
The difficulties of having to solve a polynomial equation may be avoided by interpolating the inverse to f, at the points y ^ y ^ i * • #•* yi-n' a n d evaluating the interpolatory polynomial at zero. The point is uniquely determined; let the function which maps xi>x±^iL> • • • * x i - n
i n t o x j _ + 1 be labeled cp n j /-
For either of the two processes described above, x Q,x 1,...,x n must be available. One method for obtaining these starting values is described in Section 6 . 3 2 .
4 . 1-5
If
I.F. which are generated through direct or inverse hyperosculatory interpolation will be called interpolatory I.F. Hyperosculatory interpolation is not the only means by which I.F. may be generated. Other techniques will be studied in various parts of this book. It is a very useful technique however and gives a uniform method for deriving I.F. and of studying their properties and, in particular, their order. The most widely known I.F. are all examples of interpolatory I.F. A useful by-product of generating I.F. by hyperosculatory interpolation is the introduction of a natural classification scheme.
II
4 . 2 - 1
4 . 2 The Order of Interpolatory Iteration Functions 4 . 2 1 The order of iteration functions generated by
inverse interpolation. Let xJ_*XJ__]_j •••' xi- n b e n + 1 a P P r o x l ~
mations to a zero a of f. Let 0^ ^ be the polynomial which interpolates S at the points y ^ y ^ i * • • • >y±-n
i n t J r i e s e n s e °f ( 4 - 4 ) . Define a new approximation to a by
x±+i - ^ J 0 ) '
.Then Repeat this procedure using the points x i + i j x ^ • • • > xi- n+l We shall investigate how the error of depends on the errors at the previous n + 1 points.
We observed ( 4 - 6 ) that
3 ( q )[e.(t)] " 7
- ^ , 7 ( t ) + q T n (Wi-J> J-
where 0^(t) lies in the interval determined by
y 1,y 1_ 1,...,y 1_ n,t, and where q = jy S e t t = °* T n e n
7, xi+l - a - - - ^ f S ( q ) ( e ± ) [I (y ±-j) J> ( 4 - 7 )
where 0 ^ = 9^(0) . Let j = xi-j _ a« W e c a n write ( 4 - 7 )
in terms of either the y^_j or the ^. Since
yi + i • f( xi+i) = f ' ( W e i + i '
M , = -
(4-8)
4.2-2
where T\±+1 lies in the interval determined by x ± + 1 and a, we
conclude that n
y±+i = * Mi n *l-y
(-i)qz{(i)(e±)
* Mi = ~ q!«;(p1+i> '
where p ± + 1 = f (Tl 1 + 1) . Since
we can also conclude that
n ei+i - M i II et-y
(4-9)
(-l)^( q)(e 1)
qi II [3'(pi_j)] J
where p ±_j = f (T|±_^) . It is clear that if none of the set e Q,e^,...,e n is
zero and if does not vanish on the interval of iteration, then e^ is not equal to zero for any finite 1.
4.2-3
We shall show, however, that If the initial approximations x Q,x 1,...,x n, are sufficiently close to a, then e ± -+ 0 We will use (4-9) for the proof. Let
J = -jx |x-a| 1 T\.
Let f( q^ be continuous on J and let f' be nonzero on J. Let f map the interval J into the interval K. Then 3 ^ is continuous on K and 0' is nonzero there. Let
l 8 ( q ^ ' ^ V | . ' ( y ) l i » a
for all y e K; that is, for all x e J. Let
A 2
Let x Q,x 1,...,x n e J. Then
max[|eo|,|e1|,...,|en|] £ r.
We shall show that if
M pq-1 ^ 1 } (4-10)
then all the e J.
!
4.2-4
Since x Q,x 1,...,x n e J, |Mn| £ M. Hence
l en +ll = lMn« II l e i . j l ^ i H T ^ r ,
where the last inequality is due to (4-10). We proceed by
induction. Let x ± e J, for i = 0,1,...,k. Hence |Mk| £ M
and
J=o
which completes the induction. Since all the x. e J, | j ^ M
for all i.
A modification of this proof could be used to show
that e^ -»• 0. Instead we note that
n 7 5 i + l ^ M II (^-j) =
and use Lemma 3-1 to arrive at
LEMMA 4-1. Let q = 7y Let
J = -jx | x-a | £ r|
4.2-5
and let f ^ be continuous and f nonzero on J. Let
x Q,x 1,...,x n e J. Let
for all x e J and let M = ^ A ^ - Suppose that
M r q - 1 < 1 >
Then e i 0.
Since -*• 0, x± -*• a, and we can conclude that
4 . 2 - 6
4 . 2 2 The equal Information case. We will now specialize to the case of Interest for our future work. Let
7j - s, j = 0,1,...,nf
Then the same amount of information will be used at each point. Thus the first s - 1 derivatives of the interpolatory polynomial are to agree with the first s - 1 derivatives of 3 at
yi' yi-l'•• #' yi-n # L e t
r = s(n+l).
The results of the previous section hold for this case if we
replace q by r and 7j by s. In particular,
n - * M i II y±-y
3=o ( 4 - 1 1 )
(-i) r^^(e ±) * M i - " r ! « ' ( p 1 + 1 ) '
and
n e
i + i - M i II el-y 3=0 ( 4 - 1 2 )
M. = - i-i)r^r)(e±) ' 1 n
r ! II r * ' ( P i - j ) ] '
3=0
1
4 . 2 - 7
Let the conditions of Lemma 4 - 1 hold with q replaced by r and 7 , replaced by s. Then |M±| <; M for all i, e ± 0 , and
J M 1 - * Y r ( a ) , where v
r(x) was defined ( 1 - 8 ) as
_ _ (-D r3 ( r )(y) r'[*'(y)] r
y=f(x)
All the conditions of Theorem 3-3 are now satisfied and we
conclude that
K ± l l _ | Yj a ){p-Mr-\ ( 4 . 1 3 )
where p is the unique real positive root with magnitude
greater than unity of the equation
n tn+l _ s \ tJ = o.
J=0
Furthermore, *M ± — -(-l) ra r ( 0 ), where CZr(y) is defined ( 1 - 8 )
as
Hence
I y x i
4 . 2 - 8
i l m i - . |a (o) fp-^r-i). ( 4 . 1 4 )
It is not difficult to see that one can pass from ( 4 - l 4 ) to
( 4 - 1 3 ) by observing that y ±_j = f'(\_3)e±_y
Our results concerning the convergence and order of I.F. generated by inverse interpolation are summarized in
THEOREM 4 - 1 . Let
J = jx |x-a| £ rj-.
Let r = s(n+l) > 1. Let f( r^ be continuous and let f ' S ^ ^ 0 on J. Let x Q,x 1,...,x n e J and let a sequence (x i) be defined as follows: Let g be an interpolatory polynomial for 3 such that the first s - 1 derivatives of 0^ g are equal to the first s - 1 derivatives of 2f at the points y^fY^-^* • • • i-n' Define
xi+l = V s ( x i ; x l - l " - " x i - n ) - %3^0)'
4 . 2 - 9
Let e ^ = x ^ j - a. Let
,r-l for all x e J, and let M = A g . Suppose that MT < 1 ,
Then, x ± e J for all 1 , e ± -* 0 , and
( 4 - 1 5 )
where p is the unique real positive root of
and where
n t n + 1 - s
«J=o ( 4 - 1 6 )
Also
Y r(x) = (-DrZ{r)(v) r! [S'(y)]]
y=f(x)
i ± ^ ^ | a ( o ) | M / M
where
4 . 2 - 1 0
We close this section with a number of comments. We have proven that the order of any I.F. generated by inverse interpolation is given by a certain number p without making any a priori assumptions about the asymptotic behavior of the sequence of errors. If we had assumed a priori the existence of a number p such that I e ^ + ] _ I/I e i I P converges to a limit, then it would have been much easier to prove that this number p was determined by the indicial equation ( 4 - l 6 ) .
If n = 0 , then m is a one-point I.F.; if n > 0 , n, s
then cp is a one-point I.F. with memory. The order p is an n, s integer if and only if n = 0,- that is, if the I.F. has no
memory. Observe that the asymptotic error constant of the
sequence {y^} depends upon OL^ whereas the asymptotic error constant of the sequence {x^ depends upon Y^. We shall see that whenever we deal with direct interpolation, the asymptotic error constant of the sequence (x i) will depend upon A^. Recall ( 1 - 8 ) that A r is the same function of f as <2r is of 3. Thus, in a certain sense, {y^} plays the same role for inverse interpolation that (x^} plays for direct interpolation.
Values of p for different values of n and s may be
found in Table 3 - 1 with k = n + 1 and a = s.
4 . 2 - 1 1
4 . 2 3 The order of iteration functions generated by direct interpolation. We now investigate the case where a new approximation to a is generated by solving the polynomial which interpolates f. We immediately turn to the case where all the 7j are equal to s.
Let xj^xj__]_* • • • ' x i - n b e n + 1 a P P r o x l m a t i o n s to a
zero a of f. Let P be the polynomial whose first n, s
s - 1 derivatives are equal to the first s - 1 derivatives
of f at xJ_jXJ__I' • • • > xj_- n' Define a new approximation to a by
Then repeat this procedure for xi+i>xi> • • • ' xj_- n+l #
Since P is a polynomial of degree r - 1 , where n , s r = s(n+l), xj_+2 w i l 1 n o t generally be uniquely specified by ( 4 - 1 7 ) • It is not even clear a priori that P has a real ' * n, s zero in the neighborhood of a. We shall prove that under suitable conditions P^ does possess a real zero in the
n , s neighborhood of a. Under certain hypotheses we shall, in fact, be able to prove much more.
4.2-12
LEMMA 4-2. Let
J = |x||x-a| £ r|.
Let f ^ be continuous on J and let f' /O.on J. Let
of a. Let
for all x e J. Suppose that
^ i a L U , ; ! . ( 4 - 1 8 ) v 2 r
Then P has a real root, x.,-., which lies in J. n,s -L+J-
Let J = jx |x-a| <; r|- and let xJ_>XJ__;L> • • • * x i - n e J #
Let f , be nonzero on J. If the ^..j bracket a, then it Is clear that P^ n has a real zero in J. Hence it is sufficient ^ n j s to investigate the case where all the x-j__j li e o n o n e side
of a. We shall first prove
I ]
4.2-13
and f' > 0. Hence P n s(x 1) * fCXj ) is positive. We shall prove that Q(a-r) is negative. Since P M Q interpolates f, n, s n} s
p(t) - f ( t ) . - f ( r ) | p ^ 5 ( t - x ^ ) 8 , «J=o
where £(t) lies in the interval determined by x^ jXj^_-^, .,. *x^_^, t. Then
p(orr) = f(a-r) - f ( * f o > n (a-r-x^j) 3,
where £ s £(a-r). Furthermore,
r n
p(a-r) = - rf(n) - - ^ - f ^ U ) J] ( r + X j L . r a ) s ,
where T| lies in (a-r,a). Hence P(a-r) < 0 if
J=o
PROOF. To prove the lemma It is sufficient to prove
the result for the case that
> a, j = 0,1, .. .,n,
4 . 2 - 1 4
Since
2 1
the proof Is complete.
Hence P M has a real root in J provided that ( 4 - 1 8 ) holds. We shall now show that under certain conditions we can prove a much stronger result without demanding that ( 4 - 1 8 ) holds.
LEMMA 4 - 3 . Let
J = jx |x-a| £ rj-.
Let f ^ be continuous on J and let f 'f(r) / 0 on J. Let x 0>x^,...,x n e J and assume that these points all lie on one side of a. Let these points be labeled such that x n is the closest point to a. Let
f(x n)f ( r )(x n) > 0 , r even, ( 4 - 1 9 )
f'(x n)f( r )(x n) < 0 , r odd. ( 4 - 2 0 )
4.2-15
r
Then P„ „ has a real root x^ such that n,s n+i
minta.x^] < x n + 1 < max[a,x n].
PROOF. There are four possible cases depending on the signs of f(* n) and f . We shall prove the result for only two cases which will give the flavor of the proof; the other cases may be handled analogously.
Case 1. f(* n) > 0, f' > 0. We need only prove %
that P„ (a) < 0. Since P„ a interpolates f, n, s n y s
p n, s(t) = f(t) - f ( r ) i ; p n n ( t - x ^ j ) 8 . (4-2i)
Hence
P n , s ^ =-^T^r)(i) II ( X l . r a ) s , J=0
where £ = £(a) . For this case x^_j - a > 0. Hence P n Q(a)
is negative if (-l)r_1f^r^ is negative. The proof of Case 1 may now be easily completed.
Case 2. f ( xn ) < 0, f > 0. We need only prove
that P n^ s(a) > 0. From (4-21),
*»,.<«> - - P - n (c-x^j) 3.
r
4.2-16
For this case a - x ±_j > 0. Hence P n^ s(a) Is positive if
-f( r) is positivej that is if
f(x n)f( r^(x n) > 0, f'(x n)f ( r )(x n) < 0.
Let the hypotheses of the preceding lemma hold,
We can conclude that if a < x R , then
a < x ± + 1 < x ±, i = n,n+l,... . (4-22)
If x < a, then
x i ^ xi+l ^ CL' 1 = n ' n + 1 ' » " • (4-23)
Let (4-22) hold. Then the sequence {x^} is monotone decreas
ing and bounded from below. Hence it has a limit and
xi+l " xi-j ~* °' J - 0 , l , . . . , n . (4-24)
Label the limit £. Let t = x ± + 1 in (4-21). Then
f - £ 4 F n ("iw-'i.,) 8.
4 . 2 - 1 7
where | = l( xj_ + 1) a n d where we have used the fact that P (x 4 j,) = 0 . Let i oo. An application of (4-24) yields n,s x i+ 1 '
f(5) = 0 . Since f' is assumed to be nonzero. £ = a and hence x 1 ~* a. The same conclusion would have been obtained if we had started with (4-23) • We summarize our results in
THEOREM 4 - 2 . Let
J = jx |x-a| £ rj*.
Let r = s(n+l) > 1 . Let f^r^ be continuous on J and let
f, f(r) o on J. Let x ,xn,...,x_ e J and assume that these ' o 1 n points all lie on one side of a. Suppose that
f(x ±)f( r)(x ±) > 0 , r even ( 4 - 2 5 )
f'(x ±)f( r)(x ±) < 0 , r odd ( 4 - 2 6 )
where i is any of 0 , 1 ,...,n. Define a sequence {x^ as follows: Let P be an interpolatory polynomial for f such
n, s that the first s - 1 derivatives of P_ ^ are equal to the
n, s first s - 1 derivatives of f at the points xi'xj__i* • • • > x±- n
9
Let xjL+^ e a P 0^ 3 1^ (whose existence we have verified in the preceding discussion) such that x. . is real, P^ (x. ...) = 0 ,
i"t"j. n, s i"T"jL
and
min[a,xiJ < xi + 1 < max[a,x i].
Then the sequence {x^} converges monotonically to a.
r
4 . 2 - 1 8
This result is well known for the special case n = 0 , s = 2 which is Newton's I.F. For this case the conditions of Theorem 4 - 2 are known as Fourier conditions (Fourier [ 4 . 2 - 1 ] ) . Theorem 4 - 2 gives a sweeping generalization of Fourier 16 result. Although the sufficiency of the Fourier conditions are geometrically self-evident for Newton's method, they are not self-evident in the general case.
Observe that the hypotheses of Theorem 4 - 2 do not place any restrictions on the size of the interval where
(r) / monotone convergence is guaranteed other than that f'fv ' f 0 . Note that the condition that x ,xn,...,x all lie on one side o 1 n of J is automatically satisfied if n = 0. If ( 4 - 2 5 ) and ( 4 - 2 6 ) do not hold, then we shall have to place restrictions on the size of the interval where convergence is guaranteed. We have already seen in the proof of Lemma 4 - 2 that in the general case we require a condition on the size of the interval in order to assure that P_ has a real zero in the interval.
n, s To prove convergence in the general case we start
again with
f ( r ) t e , ( t ) ] n
*»..<*>-'<*> — F T — n <*-*i-j> s-j=o
1
4 . 2 - 1 9
Then
n p
n , s ( a ) = - f ( r ) ( C i ) n ^ ei-j - XI-J - A > ^ ^ ( A ) -J-0
Let the conditions of Lemma 4 - 2 hold. Then there exists a
real x 1 + 1 e J such that P n s ( x 1 + 1 ) = 0 . Furthermore,
P n, 8(a) = (a-x ± + 1)P'(Tl 1 + 1),
where Tl 1 + 1 lies in the interval determined by a and Then
i+r
Assume that p' does not vanish in the interval determined n,s by a and x
i + 1 * Then
n
= Hi+i II el-y
( 4 - 2 7 )
i+1 "i+1 II "i-j-j=0
1 + 1 r ! c < w
4.2-20
Observe that if none of the set e0> ei>•••* e
n i s
zero and if f d o e s not vanish on the interval of Iteration, then e ^ is not equal to zero for any finite i.
We shall show, however, that if the initial approximations x Q,x 1,...,x n, are sufficiently close to a, then e A 0 . Let
for all x e J. Let
Pn,s| ^ 2
for all x In the interval determined by a and xi + 1 « l^t
H = v1/|x2. Then | H ± + 1 | £ H and
n 5i+i * H II dl-y 5i-j - K - j l -
An application of Lemma 3-1 shows that if
H T r _ 1 < 1
then eA -*• 0.
4 . 2 - 2 1
Observe that the argument used In the proof of Lemma 4 - 1 could not be used because depends upon
Since
we conclude that
H i+ 1 - (-l)rAr(^). A r = ijp-.
All the conditions of Theorem 3-3 are now satisfied and we
conclude that
l e '
i n a U | A r ( a ) H / ( r _ l ) ( 4 - 2 8 )
where p is the unique real positive root with magnitude greater than unity of the equation
t n + 1 - s £ tJ = 0.
Our results concerning the convergence and order of I.F. generated by 'direct interpolation are summarized in
4 . 2 - 2 2
r
THEOREM 4 - 3 . Let
J = |x |x-a| £ rj.
Let r = s(n+l) > 1. Let f^r^ be continuous and let
f, f(r) ^ o on J. Let x ,x1,...,x e J. Let
^T-1 1 v,, |f'| 1 v.
for all x e J. Suppose that
21 2r rr-l < 1 # ( 4 _ 2 9 )
v 2
Define a sequence {x^} as follows: Let P n g be an interpolators polynomial for f such that the first s - 1 derivatives of P are equal to the first s - 1 derivatives of f at the n , s points x ^ x ^ . ^ • . . > x^_ n- Let be a point such that x^+i
is real, P n s( x^ +l) = O j a n d xj_+i e J* T h e e x i stence of such a point is assured by ( 4 - 2 9 ) • Define $ by
n, s
x |+1 ~ $ n , S ( x i ; xi-l' • • " xi-n) •
Assume that Pn,s ;> |i2 for all x in the interval determined by a and x 1 + 1 • Let H = v1/|x2 and suppose that H r 1 ^ 1 < 1. Let e ±_j = x ^ - a.
4 . 2 - 2 3
Then, x^ e J for all 1 , e 1 -* 0 , and
I f l d i U u ( „ ) ( P - l > ^ , (4-30)
where p is the unique real positive root of
t n + 1 - 8 £ t J = 0 ,
J=o
and where
f ( r )
r r!f>#
NOTE. The point may not be uniquely defined by the hypotheses of the theorem. Additional criteria must then be imposed in order to make $ a single-valued function.
n , s
The hypotheses of this theorem may seem rather strong; in the cases of greatest practical interest, however, some of the conditions are automatically satisfied. See the examples of the next section.
The form of ( 4 - 3 0 ) is strikingly similar to the form of ( 4 - 1 5 ) • As before, the only parameters that appear are r and p.
If n = 0 , then <f> is a one-point I.F,; if n > 0 , n, s then $ is a one-point I.F. with memory, n, s
4 . 3 Examples In all the examples of this section J will denote
the interval defined by
We shall find that both the Newton I.F. and the secant I.F. may be derived from both direct and inverse interpolation. For a given I.F., the conditions sufficient for convergence may vary depending on the method of generation of the I.F. This should not be surprising since Theorems 4 - 1 and 4 - 3 were derived for general families of I.F. Hence the conditions sufficient for convergence, for a special case which Is covered by both theorems, need not agree. The notation in this section is the same as in the previous section. The Yj are defined by ( 1 - 8 ) .
EXAMPLE 4 - 1 . We perform inverse interpolation with n = 0 and s = 2 . Hence r = 2 and
Q o , 2 ( t ) = *i + (^i)*!' S i = ^ i ^
Then x l + l - O o ^ * ! * = Q o , 2 ( 0 ) = X i " u u
II
4 . 3 - 2
This is Newton's I.F. Then
^ 1 *"( 9 i ) 2 1 f"(*i) , , 2 2 ei+l = " i 7 T T e i " 5 ^ [f TU, ] 2e 2,
( 4 - 3 1 )
where 0 ± = f(£ ±), p ^ = f (T^) . Let x Q e J and let f" be * continuous and f'3" ^ 0 on J. Let
for all x e J. Let M = 7\1/A| and suppose that Mr < 1. Then 0 and
eI+l t t - Y 2(a). (4-32)
No absolute value signs are required in (4-32) since the method is of integral order.
EXAMPLE 4 - 2 . We perform inverse interpolation with n = 0 and s = 3• Hence r = 3 and
2 " Q o , 3 ( t ) = 3 i + (t-yi)^! + V
Then
x i+l - c P o , 3( x i ) = * o , 3 ( 0 ) = x i " u i " V x i K ' A 2 - • & >
and
_ 3
e ± + 1 " 6 [ 3 ' ( P l n 3 e ± '
Let x Q e J and let f"' be continuous and / 0 on J, Let £ £ 1 3 ' | ^ * 2 for all x e J. Let M = \/X^ and
p suppose that Mr < 1. Then -+ 0 and
ei+i f * - Y 3(a).
The I.F. of the form cp are of such importance that they are given the special designation E . They will be
s studied in considerable detail in Chapter 5 ; their properties form the basis for the study of all one-point I.F.
EXAMPLE 4 - 3 . We perform inverse Interpolation with n = 1 , s = 1 . Therefore r = 2 and
W l
where we have used the Newtonian form of the interpolatory polynomial as given In Appendix A. Then
xi+l - *1,1{X±; X i - 1 } = Q l , l ( 0 ) = X i " f i
This is the secant I.P. Then
ei+l " " 2 S'lp^S'tPi^) ei ei-l'
Let x Q,x^ e J and suppose that the other conditions of Example 4 - 1 hold. Then e i -* 0 and
l!i±l[-. |Y ( a ) ! ^ 1
|e±|P ' 2 ( ^ '
where p = |(l+v/5) ~ 1 . 6 2 .
Q l , l ( t ) - *i + (t-yi)
xl~ xi-l fi" fi-l
4 . 3 - 5
EXAMPLE 4 - 4 . We perform direct interpolation with n = 0 , s = 2 . Therefore r = 2 and
?o,2^ - f i + ( t - ^ K . Then
xi+l " *o,2 ( xi> = x i " u i *
This is Newton's I.F. again. Since P Q 2 is a linear poly-nomial, is always uniquely specified. Since P Q 2 * f the hypothesis of Theorem 4 - 3 which is concerned with the
nonvanishing of P is automatically satisfied by the condi-n, s
tion on the nonvanishing of f . Also ( 4 - 2 7 ) becomes
f " U ± ) 2
ei+l = 2 f ' ( X i ) e i - ( 4 ^ 3 3 )
Let x Q s J and let f" be continuous and f 'f" ^ 0
on J. Let f(x }f"(x ) > 0 . From Theorem 4 - 2 , we conclude o o that if x Q < a, then x^ converges to a monotonically from below while if x Q > a, then x ± converges to a monotonically from above. Furthermore
^ T - A ( a ) . ( 4 - 3 4 ) e i
Observe that since Yg(a) = Ag(a), ( 4 - 3 2 ) and ( 4 - 3 4 ) do not contradict each other.
If we perform direct interpolation with n = 1 and s == 1 we will again derive the secant I.F. The I.F. of the form * will be studied in Section 5.3 with emphasis on the study of * ,. The case n = 2, s - 1 will be studied in Section 10.2.
5 . 0 - 1
CHAPTER 5
ONE-POINT ITERATION FUNCTIONS
In this chapter we shall study the theory of one-point I.F. These I.F. are of integral order. One particular basic sequence, E , will be studied in considerable detail
s in Section 5 . 1 . By using E as a comparison sequence, we draw conclusions about certain properties of arbitrary one-point I.F. We consider Theorem 5 - 3 to be the "fundamental theorem of one-point I.F."
5.1 The Basic Sequence E 2 s We recall our definition of basic sequence. A
basic sequence of I.F. is an infinite sequence of I.F. such that the pth member of the sequence is of order p. Technique for generating basic sequences, some of which are equivalent, are due to Bodewig [ 5 . 1 - 1 ] , Curry [ 5 . 1 - 2 ] , Ehrmann [ 5 . 1 - 3 ] *
E. Schroder [ 5 . 1 - 4 ] , Schwerdtfeger [ 5.1-5L and
Whittaker [ 5 . 1 - 6 ] , among others. See also Durand [ 5 . I - 7 ] ,
Householder [ 5 . 1 - 8 , Chap. 3 ] * Korganoff [ 5 .1-9* Chap. 3 ] *
Ludwig [ 5.1-10], Ostrowski [5.1-11, Appendix J], and Zajta [5.1-12]- From Theorem 2 - 6 we know that two I.F. of order p can differ only by terms proportional to u^. Hence if the properties of one I.F. of order p are known, many of the properties of arbitrary I.F. of order p may be deduced. If the properties of a basic sequence are known, then many of the properties of arbitrary I.F. of any order may be deduced. In Section 5 . H we shall study a basic sequence whose simplicity of structure makes it useful as a comparison sequence.
5 . 1 - 2
5 . 1 1 The formula for E . The I.P. m were s_ T n , s defined and studied in Section 4 . 2 2 . If n = 0 , these I.F. are without memory. Because of their importance, the cp
O , S are given the special designation E a . The conditions for the
s convergence of a sequence generated by E a were derived in
s Section 4 . 2 2 ; we assume that these conditions hold. In contrast with the careful analysis which is required in the general case, we shall find that the proof that E o is of
s order s is almost trivial.
In order that the material on E g be self-contained, we start anew. Let f' be nonzero in a neighborhood of a and
(s) let f v ' be continuous in this neighborhood. Then f has an inverse 3f, and 2fv vis continuous in a neighborhood of zero.
o be the polynomial whose first s - 1 derivatives agree o, s with 2f at the point y *» f (x) . Then
«(*) - ^ . . ( t ) + « ( ' ) f f t n (t-y) 8.
and
- 1 ,(J>
where e(t) lies in the interval determined by y and t. Define
E => 0 (0) a ^o,sK ''
5 . 1 - 3
Hence
E s " I ^ ^ J ) f J ( 3 - D ,J=0
or
>S=*-Z^^. (5-2) Furthermore,
a » Eg .+ Lzl^lVs)(e)fs, ( 5 . 3 )
where 6 = 6(0).
In ( 5 - 2 ) , E o is expressed as a power series in f . For some applications It is more useful to express E as a
s power series in u where u = f/f. Hence we write
s-1
-1 In practice fj is not known and we must express E in terms
s of f and its derivatives. With the definition
Y (x) = ( - D ' W ^ f y ) J JI[S'(y)]J
y-f(x)
w e m a y w r i t e
s ^ l
E s ( x ) = *[ ~ Z Y J u J ( 5 - 4 )
a n d
a = E + (-D s * ( s ) (e) u s t
8 s ! [ S ' ] s
( 5 - 5 )
T h u s
a = E g + 0 [ u s ] . ( 5 - 6 )
W e m a y w r i t e f o r m a l l y t h a t
00
a = x " Z V J ' ( 5 - 7 ) j = l
T h e s t r u c t u r e o f t h e w i l l b e i n v e s t i g a t e d i n S e c t i o n 5 . 1 3 .
A s s u m e t h a t 0 ^
a b o u t z e r o . P r o m ( 5 - 5 ) ,
A s s u m e t h a t 3 ( ( s ) d o e s n o t v a n i s h i n a n i n t e r v a l
v a _ (-i) s- 1g( s)f 0) / _ u _ y ( x - a ) s s ! [ 3 ' ] s '
S i n c e
we conclude that
(x-a) s (5-8)
Let x. = x, x 1 + 1 = E s ' e l 8 5 x i " 1 1 1 6 1 1 (5-8) may be
written as
Y Q(a).
Hence E g Is of order s and has Y g(a) as its asymptotic error constant. Since the informational usage of E is s, its informational efficiency is unity. Hence E is an optimal basic sequence. (These terms are defined in Section 1.24.)
It is easy to see that
- (-l) 3" 1^(0),
where
y A = f(x ±),
Bodewig [ 5 . 1 - 1 3 ] attributes E o to Euler [5.1-14]. s
In the Russian literature, these formulas are credited to Chebyshev who wrote a student paper entitled "Ca.lcul des racines d'une equation" for which he was awarded a silver medal. This paper which was written in 1 8 3 7 or 1 8 3 8 has not been available to me.
We show that the formula of E. Schroder [ 5 . 1 - 1 5 ]
is equivalent to E g . Observe that
jl 1 d e v W _ _ \ _ 1
dy f '(x) dx' 75 K Y ' " f '(x) •
Then from ( 5 - 2 ) ,
s L y. 1 (x)\T^Tx) dxj TiJFf'
which is Schroder's formula. Compare with the formula for a Burmann series given by Hildebrand [ 5 . 1 - 1 6 , p. 2 5 ] .
5 . 1 2 A n e x a m p l e . I n c e r t a i n c a s e s o n e c a n s h o w
t h a t t h e f o r m a l i n f i n i t e s e r i e s f o r a c o n v e r g e s t o a . A
s e r i e s s o l u t i o n o f a q u a d r a t i c e q u a t i o n i s s t u d i e d b y
E . S c h r o d e r [ 5 . 1 - 1 7 ] .
E X A M P L E 5 - 1 . C o n s i d e r f ( x ) = x n - A , w i t h n a n
i n t e g e r . I f n ^ 2 t h i s l e a d s t o a f o r m u l a f o r n t h r o o t s ,
w h i l e i f n = - 1 t h i s l e a d s t o a f o r m u l a f o r t h e r e c i p r o c a l
o f A . I f f ( x ) = x n - A , t h e n
»(y) = ( A + y ) l / n
a n d
J - l
n
T h e n
x + x ( 5 - 9 )
I n p a r t i c u l a r ,
5 . 1 - 8
If n = 2 , this is Heron1 s method for the approximation of square roots. The formula ( 5 - 9 ) was derived by Traub [ 5 . 1 - 1 8 ]
using the binomial expansion. If n = - 1 ,
s-l E s = x £ (l-Ax) J, ( 5 - 1 0 )
<J=0
and
00
This geometric series converges to l/A if |l-Ax| < 1 .
Rabinowitz [ 5 . 1 - 1 9 ] points out that ( 5 - 1 0 ) may be used to carry out multiple-precision division.
See Durand [ 5 . 1 - 2 0 , pp. for the approximation of In A, sin"'1A, and tan""1A.
5 . 1 - 9
5 . 1 3 The structure of E . We defined E as . s s
rr s-i y j-i J y ' J![*'(y)]J
E = x s y=f(x)
It is easy to show that Y^ satisfies the difference-differential equation,
jY^x) - 2(j-l)A2(x)YJ_1(x) + Y ^ U ) = 0, Y^x) = 1, Ag(x) =
( 5 - 1 1 )
The first few Yj may be calculated directly from this equation, An explicit formula for the Y^ may be derived from
the formula for the derivative of the inverse function derived in Appendix B. We have
'f* I (-l)r(j+r-l)! II " ^ j - , ( 5 - 1 2 )
1 = 2 V
with the sum taken over all nonnegative integers p such that
2, (i-DPi = J " 1 , ( 5 - 1 3 ) 1 = 2
and where r = 2 ± £ 2 p 1 # For j = l, p ± = o for all i. Prom the definition of Y^ and ( 5 - 1 2 ) we have
5 . 1 - 1 0
THEOREM 5 - 1 . Let
( - l ) J - y j ) ( y ) Y,(x) - , J «Ji[3'(y)]J
A ( X ) =
y=f(x)
where f and are Inverse functions. Then
i=2 1
with the sum taken over all nonnegative integers such that
^ (i-l)p± = J - 1 , i=2
and where r = 0 -
The first few Yj are given in Table . 5 - 1 • Observe that Yj is independent of f; it depends only on the derivatives of f. We conclude that
j=l 1 = 2 1
( 5 - 1 4 )
where the Inner sum Is taken over all nonnegative Integers p ±
such that 2 ± £ 2 (i-l)^ 1 = j - 1 and where r = 2 1 £ 2 p ± .
5.1-11
TABLE 5-1. FORMULAS FOR Yj
1
5A| - 5A gA 3 + A 4
l4Ag - 21AgA3 + 6AgA^ + 3A^
5 . 1 - 1 2
Eg - x - u,
E3 = E 2 - A 2 u 2 ,
E 4 = E 3 - (j2A| - A 3)u 3, ( 5 - 1 5 )
E 5 = E 4 - (5A 3 - 5A 2A 3 + k^*h.
The following corollaries follow easily from Theorem 5 - 1 .
COROLLARY a. Y^ is a polynomial in A g,A 3,. .., A^
COROLLARY b. A^ is the same polynomial in
9 f * • • 9 that Yj is in Ag • A^ ••#<«Aj«
COROLLARY c. The sum of the coefficients in this polynomial is unity.
By replacing the upper limit of the first sum by <», we obtain a formal infinite series formula for a.
The first few E are given byx s
5.1-13
For some applications it is convenient to work with
Zj - Y j UJ . (5-16)
Then
s-1
E s = x - £ Z r (5-17)
We have
COROLLARY e. The form of is given by
where = 0, ip A = -1. Thus Zj is a polynomial, homogeneous of degree zero and isobaric of weight -1.
PROOF. The polynomial is an identity in x. Let f(x) - (l-x)" 1. At x = 0, A j = 1 f ° r a l l J* The Inverse to f(x) is Sf(y) = 1 - y - 1 and when x = 0, y = 1. A short calculation shows that at y = 1, Y^ = 1 for all j.
COROLLARY d, Y 1 depends on A 1 only as (-l)^A1.
5.1-14
It Is easy to show that Z^ satisfies the difference-
differential equation,
JZj(x) - (j-l)ZJ_1(x) + u(x)zj_1(x) = 0 , Z 1(x) =u(x). (5-18)
In terms of the forward difference operator A, this equation may be written as
At-JZjU) ] + Z1(x)Zj(x) = 0.
In Lemmas 5-1 and 5-2 and in Theorem 5-2, we shall make use of the fact that E is of order s. The following
s lemma enables us to calculate the asymptotic error constant of an arbitrary I.P. of order p.
LEMMA 5-1. Let ep be an I.F. of order p and let C be its asymptotic error constant. Let
cp(x) - E (x) G(x) = -f , x ji a.
(x-a) p
Then C Y (a) + lim G(x) .
p x -+ a
1
5.1-15
PROOF. Take cp2 = cp and = E p in Theorem 2-8 and observe that the asymptotic error constant of E is Y (a).
1? JP
A more useful result is given in
LEMMA 5-2. Let cp be an I.F. of order p and let C be its asymptotic error constant. Let
<p(x) - E (x) H(x) = — - E , x a .
Then u p(x)
C - Y (a) + lim H(x). p x a
PROOF. From the previous lemma,
lim H(x) = lim G(x) x a x a
x-q IP = C - Y p(a),
since
lim Ji(*l = i. x - a x " a
r
5.1-16
E X A M P L E 5-2. L e t
<p = x -1 - A 2 u *
T h i s i s H a l l e y ' s I . F . w h i c h w i l l b e d e r i v e d i n S e c t i o n 5.21
S i n c e E ^ = x - u [ l + A 2 u ] , i t i s e a s y t o s h o w t h a t
l i m H ( x ) - - A ^ ( a ) . T h e n
C = Y 3(a) - A | ( a ) - A|(o) - A^a) .
T h e o r d e r a n d a s y m p t o t i c e r r o r c o n s t a n t o f a n I . F .
w i t h I n t e g e r - v a l u e d o r d e r m a y b e c a l c u l a t e d u s i n g t h e r e s u l t s
o f T h e o r e m 2-2. I t i s a w k w a r d h o w e v e r t o a p p l y t h i s t h e o r e m
f o r a l l b u t t h e s i m p l e s t I . F . T h e f o l l o w i n g t h e o r e m p e r m i t s
t h e c a l c u l a t i o n o f t h e o r d e r a n d a s y m p t o t i c e r r o r c o n s t a n t o f
a n I . F . b y c o m p a r i n g i t w i t h E p + ^ . T h i s p r o c e d u r e t u r n s o u t
t o b e p a r t i c u l a r l y u s e f u l i n t h e d e v e l o p m e n t o f C h a p t e r 9.
T H E O R E M 5-2. cp i s o f o r d e r p i f a n d o n l y i f
l i m
x —• a
' c p ( x ) - E p + 1 ( x ) '
. u p ( x )
5.1-17
exists and is nonzero. Furthermore
C = lim x -+ a . uP(x)
where C is the asymptotic error constant of cp. PROOF. From (5-6),
a = E p + 1(x) + 0[u p + 1(x)].
Therefore,
,(») -a_*W - V l ^ ) + 0 > P + 1 ( x ) ] f x x {
(x-a) p u p(x) x-
Since
cp(x) - E p + 1(x) / u ( x ) N P
u p(x) V x " a / /u(xi} V x-a y
we conclude that
lim x -+ a
q>Cx) - a . (x-a) p J
<p(*) - E p + 1(x) u p(x)
which completes the proof.
5.1-18
Observe that Lemmas 5-1 and 5-2 each involve two I.F. of the same order; Theorem 5-2 involves two I.F. whose orders differ by unity.
The next lemma is used in Section 5.51.
LEMMA 5-3-
W * > - >.(«) - ^ B,(x)
I J - 2
PROOF. From (5-18),
JZj(x) - £ (tJ-l)Zj_1(x) + u(x) £ Zj-i(x) = 0 J=2 J=2
This telescopes to
sZ. (x) - Z 1(x) + u(x) £ zj_i( x) = ° J=2
Since Z-^x) = u(x),
sZ s(x) - u(x) 1 - £ z ^ w J=2
-» u(x)Es(x)
5.1-19
The fact that
completes the proof.
EXAMPLE 5-3. Let s = 2. Then
Eq = Eo " £UE' = x - u - Ju(l-u'), J3 "2 s ^ 2 and
E^ = x - u - |u(2A2u) = x - u - AgU 2
5.2 R a t i o n a l A p p r o x i m a t i o n s t o E a . s
I t i s c o m m o n k n o w l e d g e t h a t r a t i o n a l f u n c t i o n s a r e
o f t e n p r e f e r a b l e t o p o l y n o m i a l s f o r t h e a p p r o x i m a t i o n o f
f u n c t i o n s . T h e r a t i o n a l f u n c t i o n a p p r o x i m a t i o n s t o a f u n c
t i o n m a y b e a r r a n g e d I n t o a t w o - d i m e n s i o n a l a r r a y i n d e x e d b y
t h e d e g r e e s o f t h e p o l y n o m i a l s o f t h e n u m e r a t o r a n d
d e n o m i n a t o r . S u c h a n a r r a y I s c a l l e d a P a d e t a b l e . S e e
K o b e t l i a n t z [5-2-1], K o p a l [5-2-2, C h a p . I X ] , a n d W a l l [5.2-3,
C h a p . 20].
S i n c e E s i s a p o l y n o m i a l i n u w i t h c o e f f i c i e n t s
d e p e n d i n g o n t h e d e r i v a t i v e s o f f , w e e x t e n d t h e u s u a l p r o
c e d u r e a n d f o r m a P a d e t a b l e o f I . F . T h e r a t i o n a l a p p r o x i m a
t i o n s t o E a a r e c o n s t r u c t e d s o a s t o b e o r d e r - p r e s e r v i n g .
s
F o r e a c h s w e o b t a i n s - 1 o p t i m a l I . F . o f o r d e r s . I n
p a r t i c u l a r w e o b t a i n t h e o f t e n r e d i s c o v e r e d H a l l e y ' s I . F .
M o s t o f t h e m a t e r i a l o f t h i s s e c t i o n f i r s t a p p e a r e d i n
T r a u b [5-2-4],
r
5-2-2
5-21 I t e r a t i o n f u n c t i o n s g e n e r a t e d b y r a t i o n a l
a p p r o x i m a t i o n t o E . I t i s c o n v e n i e n t t o d e f i n e a p o l y -s
n o m i a l Y ( u , s - l ) b y
s-1 Y ( u , s - l ) = £ Y j ( x ) u j ( x ) .
T h e n EQ = x - Y ( u , s - l ) . W e w i l l s t u d y r a t i o n a l a p p r o x i m a t i o n s
t o Y ( u , s - l ) w h i c h a r e o r d e r - p r e s e r v i n g . D e f i n e
V b • x - fc:5)- » + * • • - 1 . « > °.
w h e r e
R(u,s,a) = ^ R s^(x)u J(x), Q(u,s,b)> ^ ^ (x)uJ(x), Q g q ( x ) . 1.
N o t e t h a t t h e " c o n s t a n t t e r m " i s a b s e n t i n R ( u , s , a ) a n d
p r e s e n t i n Q ( u , s , b ) . T h i s g u a r a n t e e s t h a t ^ a ^ ( a ) = a .
T h e a + b p a r a m e t e r s
R s ^ j ( x ) , J = l , 2 , . . . , a ; Q s ^ j ( x ) , J - 1 , 2 , . . . , b ,
a r e c h o s e n s o t h a t
R ( u , s , a ) - Y ( u , s - l ) Q ( u , s , b ) = 0 [ u s ( x ) ] . (5-19)
5 . 2 - 3
r
Then
and by Theorem 2 - 7 , i/a is of order s. Equivalently, b
is of order a + b •+ 1. The conditions imposed by ( 5 - 1 9 ) are satisfied if the R .,Qa , are chosen so that
s, j s, j
r3
J=0
where cd. _ » 1, for I £ a, cd, = 0 , for I > a, and v , a c, a
k * min(t-l,b). For parameters thus chosen,
Since the Yj(x), 1 ^ J <. s - 1 , depend only on f ^ , 1 <L J <1 9 - 1* and since the R .(x),Qo .(x) depend only on these Yj(x), we conclude that the ^ a fc are all optimal I.F. A number of formulas of type $ , , together with their asymp-
a, d
totic error constants, may be found in Table 5 - 2 . Note that
^a,o s-1
For s fixed, which of the s - 1 I.F. generated by this process is to be preferred? There are indications (Frame [ 5 . 2 - 5 ] and Kopal [ 5 . 2 - 6 ] ) that the I.F. which lie near the diagonal of the Pade table are best.
3 )
TABLE 5-2. FORMULAS FOR v and C *a,b a,
Formulas for f Asymptotic Error Constants
^a b" a
C = lim a ' b
7 x —•a (x-a;
s = 2: tJj~ = x - u, Newton rl, o ci,« = Y2(a)
s = 3: y 2 ^ = x - u[l+Y2u] = Y3(a)
*L,1 = X • 1-Y2u' H a l l e y = Y3(a) - r|(a)
s = k: d = x - u[l+Y2u+Y3u2]
Y 2 + (f2 - Y3)u]
*2,1 = X " U Y 2-Y 3u C2,l
- \(a)
= V a )
Y 3 ( a )
" Y2(a)
1 , 1 , 2 X 1 - Y 2u + (y2, - Y^)u2 Cl,2 = V a ) - 2Y2(a)Y3(a) + Y^a)
5.2-5
EXAMPLE 5-4. If two I.F. have the same order, then a measure of which I.F. will converge faster is given by the relative magnitudes of the asymptotic error constants. Let
3a,b f(x) m xn - A, a = A1//n. Define C h by
Then
and
^a b " a
(x-a) 8 a ' b
_ (n-l)(2n-l) s = 3: C 2'° 6c2 '
2 , n - n - 1
1 1 ~ 2* 1 , 1 12<Xd
s . 4. c = (n-l)(2n-l)(3n-l) 3,o 2 4 a 3
C - (n2-l)(2n-l) 2' 1 " 72a3 '
_ c n(n 2-l)
c c lim = 4, lim p 3 - ^ = 9,
n -+ oo 1,1 n oo 2,1
lim = •3-n eo
c2,l 2 '
5.2-6
The " o a r e n o t t n e o n l y optimal I.F. of rational form. Thus Kiss [ 5 - 2 - 7 ] suggests a fourth order formula which in our notation may be written as
u(l-Apu) cp = x ^ ( 5 - 2 0 )
l - 2 A 2u+A 3u*
Snyder [ 5 - 2 - 8 ] derives ^ 1 (Halley's I.F.) by a "method of replacement,11 and a fourth order formula by a "method of double replacement •!f When Snyder's fourth order formula is translated into our notation, it is seen to be identical with ( 5 - 2 0 ) which Kiss derives by entirely different methods. Hildebrand [ 5 - 2 - 9 , Sect. 9 - 1 2 ] studies I.F. generated by Thiele's continued-fraction expansions.
5.22 T h e f o r m u l a s o f H a l l e y a n d L a m b e r t . P e r h a p s
t h e m o s t f r e q u e n t l y r e d i s c o v e r e d I . P . i n t h e l i t e r a t u r e i s
•1,1 m x ~ T % i r
I t h a s r e c e n t l y b e e n r e d i s c o v e r e d b y F r a m e [5.2-10],
R i c h m o n d [5.2-11], a n d H . W a l l [5.2-12]. I t w a s d e r i v e d b y
Ey S c h r o d e r [5-2-13, p . 352] i n 1870. S a l e h o v [5.2-14]
I n v e s t i g a t e s t h e c o n v e r g e n c e o f t h e m e t h o d w h i c h h e c a l l s t h e
m e t h o d o f t a n g e n t h y p e r b o l a s . Z a g u s k i n [5.2"15, P . 113]
p o i n t s o u t t h a t ^ m a y b e d e r i v e d b y t h e m e t h o d o f D o m o r y a d .
B a t e m a n [5.2-16] p o i n t s o u t t h a t t h e m e t h o d i s d u e t o
H a l l e y [5.2-17] (1694).
I f f =s x 1 1 - A , H a l l e y ' s I . F . b e c o m e s
9 = x [ ( n - l ) x n + (n+l)A]f ( 5 _ 2 1 )
( n + l ) x n + ( n - l ) A
a n d
( x - a ) 0 12a*
I n t h e c u r r e n t l i t e r a t u r e , (5-21) i s o f t e n a s c r i b e d t o
B a i l e y [5.2-18] (l94l). E q u a t i o n (5-21) w a s d e r i v e d b y
U s p e n s k y [5.2-19] (1927). R . N e w t o n [5.2-20] p o i n t s o u t t h a t
D a v i e s a n d P e c k [5.2-21] (1876) c a l l i t H u t t o n ' s m e t h o d b u t
t h e y g i v e n o r e f e r e n c e . D u n k e l [5.2-22] n o t e s t h a t t h e f o r m u l a
w a s k n o w n t o B a r l o w [5.2-23] (l8l4). K i s s [5.2,-24] a n d
M u l l e r [5.2-25] a s c r i b e t h e m e t h o d t o L a m b e r t [5.2-26] (1770),
5 , 3 A B a s i c S e q u e n c e o f I t e r a t i o n F u n c t i o n s G e n e r a t e d b y
D i r e c t I n t e r p o l a t i o n .
I n S e c t i o n 5 . 1 w e s t u d i e d t h e b a s i c s e q u e n c e E o ^ m e
S O j s g e n e r a t e d b y i n v e r s e I n t e r p o l a t i o n a t o n e p o i n t . We t u r n t o
t h e b a g i c s e q u e n c e * o g e n e r a t e d b y d i r e c t i n t e r p o l a t i o n a t
O y S o n e p o i n t . T h e s e l a t t e r I . F . h a v e t h e d r a w b a c k t h a t f o r
s > • ? , a p o l y n o m i a l o f d e g r e e s - 1 m u s t b e s o l v e d a t e a c h
i t e r a t i o n . T h e y h a v e t h e v i r t u e o f b e i n g e x a c t f o r a l l p o l y
n o m i a l s o f d e g r e e l e s s t h a n o r e q u a l t o s - 1 . I n S e c t i o n 5.33*
w e i n v e s t i g a t e a t e c h n i q u e w h i c h r e d u c e s t h e d e g r e e o f t h e
p o l y n o m i a l t o b e s o l v e d b u t w h i c h p r e s e r v e s t h e o r d e r o f t h e
I . F . g e n e r a t e d .
r
5.3-2
5 . 3 1 The basic sequence 4> CT. The I.F. <J> was O j s n , s
defined and studied in Section 4.23. If n = 0 , these I.F.
are without memory. Th$ conditions for the convergence of a
sequence generated by <j> were derived in Section 4 . 2 3 ; we o , s
assume that these conditions hold. In contrast with the
careful analysis which is required in the general case, we
shall find that the proof that $ is of order s is almost o , s
trivial.
In order that the material on S> 0 be self-Q , S
contained, we start anew. Let be the polynomial whose o, s
first s - 1 derivatives agree with f at the point x i f Then
f(t) = P 0, s(t) + gr 4 ( t - x ± ) s , ( 5 - 2 2 ) and
where i^(t) lies in the interval determined by x. and t.
Define x i + 1 by
Po,s( xi +l> " <>• ( 5 - 2 3 )
Let a real root of ( 5 ~ 2 3 ) be chosen by some criteria. Let
the function that maps x. into x. be labeled * . Thus i 1+1 o,s
xi+l - * o , B( x i > '
5.3-3
The order and asymptotic error constant of $ are now o, s
easily obtained. Set t = a in (5-22). Then
° = *o,a<«> • * # ' ( a ) < e 1 ) . ; .
where e i » - a and where £^ = £^(a) . Since
where Tl^+^ lies in the interval determined by x 1 + 1 and a, we conclude that
' t \ f ( B ) ( ^ s P o , s^i+l^ ei+l ^ s! e i #
Le* P ^ o fce nonzero in the interval determined by x . a n d a ( s)
and let f v ' be nonzero in the interval of iteration. Observe that p' a -*f'(a). Then o, s
^ - ( - l ) s A s ( a ) , A 8 - f ^ . (5-24) e i
Slnoe s is an integer, it is not necessary to take absolute
values in (5-24). We conclude that $ Q fl is a basic sequence
5.3-4
r
5.32 The Iteration function ^ 0 ^ > The I.F. <t>Q 2
is Newton's I.F. We turn to Q . This I.F. was already
studied by Gauchy [5,3-1]. See also Hltotumato [5.3-2]. We
must solve
0 = f(x) + f'(x)(t-x) + |f"(x)(t-x) 2 « 0. (5-25)
Then
$ = x . £1 + l t d fi_4 A u)i U = X A 0,3 — grr v-1- 2 ' f ,' 2 2f' *
a will be a fixed point of $ 0 ^ if and only if we take
the + sign if f' > 0 and the - sign if f' < 0. If this choice
of sign is made, then $ - differs only by terms of O(u^)
from E^. Thus
^0,3 = x ~ T» + JT> (l-^A 2u)^. (5-26)
For x close to a, severe cancellation is bound to
occur in (5-26). This may easily be avoided by observing that
- b + (b 2-4ac)^ _ -2c 2 a b + (b 2-4ac)*'
and hence taking
« = x (5_ 2 7) 0 , 5 1 + (l - 4 A 0u)t
r
5.3-5
This last form is clearly best for computational purposes.
Although generations of schoolboys have been drilled to
rationalize the denominator, it is preferable, in this case,
to irratlonalize the denominator.
The generalized Fourier conditions of Theorem 4 - 2
assume a particularly simple form for $ Q ^. Let
f'f'"< 0
over the interval of Iteration. Then if x Q > a, the x^ con
verge to a monotonically from above; if x Q < a, the con
verge to a monotonically from below. Observe that in the
case of Newton's method, monotone convergence is only possible
from one side. This is because we demand that f(x )f"(x ) > 0
for Newton 1s I.F. and f must change sign as x goes through a.
5.3-6
5.33 Reduction of decree. We observed that the
use of $ requires the solution of an (s-l) degree poly-o f s nomial at each iteration. We can effect certain degree-
reducing changes which change the asymptotic error constant
but which do not affect the order.
Consider the following change. Replace one of
the t - x factors in (t-x) ~ by Eg - x = -u. (Eg = x - u
is Newton's I.F.) Define
( 5 - 2 8 )
We shall show that although R o Is a polynomial of degree s - 2 s
In t - x, it still leads to an I.F. of order s. Observe that
p0 , 3 < * > - V * > - ^ r ^ f 1 <*-*> s- 2<t-E 2)
and
E 2 - a - V(x)(x-a) 2, v ( a ) - Ag(a),
f(t) - P 0 #,(t) + f ( 3 ) f p ) l ( t . , ) B >
Since
and
5.3-7
we conclude that
0 «• f(a) = R s(a) + (-l)s(x-a) f ( B ) ( 0 . s J ( s - l ) l ' V W
where i lies In the interval determined by x and a. Define
i+ 1
x , „ by
Then
R s ( x i + 1 ) = 0.
R B(a) - - R B ( T l 1 + 1 ) ( x 1 + 1 - a ) ,
where T ) 1 + 1 lies in the interval determined by x ± + 1 and a.
Assume that R S(T1 1 + 1) does not vanish. Then if e± 0,
'i+1 (-l) s[A g(a) - A ^ ^ ^ A g C a ) ] . ( 5-29)
EXAMPLE 5-5. Let s = 3. Then we must solve R 3(t) = 0. Prom (5-28),
0 - f(x) + (t-x)[f'(x) + if"(x)(-u)]
or
* = X " I^AgH-
5 . 3 - 8
This is Halley's I.F. which we have now generated by linear
izing a second degree equation. From ( 5 - 2 9 ) ,
'i+1 A ^ a ) - A 2(a) = Y 3(a) - Y 2 ( a ) ,
as we found in Section 5 . 2 1 .
EXAMPLE 5 - 6 . Let s = 4 . Then
0 = f(x) + f'(x)(t-x) + (t-x) ; \ f"(x) - \ f"{x)u
or
0 = u + (t-x) + (t-x) 2[A 2-A 3u]
and
cp = x - 2.u 1 + [ 1 - 4 u(A 2-A 3u)]i
Then
'i+1 - [A 4(a) - A 2(a)A 3(a)]. ( 5 - 3 0 )
One could also arrive at ( 5 - 3 0 ) by expanding <p into a power
series in u and applying Lemma 5 - 2 .
r
5.3-9
/
E X A M P L E 5-7. I n
P 0 j 3 ( t ) = f ( x ) + ( t - x ) f ' ( x ) 4- i ( t - x ) 2 f " ( x ) ,
r e p l a c e ( t - x ) 2 b y ( E g - x ) 2 = u 2 . T h e n
0 = f ( x ) + ( t - x ) f ' ( x ) + ^ f ' ( x )
o r
p <p = x - u - u A 2 ,
w h i c h i s j u s t E ^ .
E X A M P L E 5-8. I n
P o , 4 ( t ) = f ( x > + ( t - x ) f ' ( x ) + | ( t - x ) 2 f " ( x ) + \ ( t - x ) 3 f ' " ( x ) ,
T h e r e a r e n u m e r o u s o t h e r w a y s b y w h i c h t h e d e g r e e
o f P (t) c a n b e l o w e r e d w i t h o u t c h a n g i n g t h e o r d e r . I f o n e
O y S
o f t h e t - x t e r m s i n ( t - x ) 8 " " 1 w e r e r e p l a c e d b y E ^ - x , t h e n
t h e d e g r e e w o u l d b e l o w e r e d b y o n e b u t n e i t h e r t h e o r d e r n o r
t h e a s y m p t o t i c e r r o r c o n s t a n t w o u l d b e c h a n g e d . W e s h a l l
^ c o n t e n t o u r s e l v e s w i t h t w o m o r e e x a m p l e s .
5.3-10
replace one of the t - x In the quadratic term by E^ - x and
replace (t-x) in the cubic term by (Eg-x) 2 = u . Then
0 - f(x) + (t-x) f'(x) - \ f "(x)(u+A 2u 2) + \ f"'(x)uJ
cp = x - u
1 + A gu + [k^ - Ag^u'
This is y1 2 derived in Section 5,21,
5 . 4 - 1
5.4 The Fundamental Theorem of One-Point Iteration Functions
We review some of the terminology introduced in
Section 1.24 which will be used in this section. The
informational usage, d, of cp is defined as the number of new
pieces of information required per iteration. If an I.F.
belongs to the class of I.F. of order p and informational
usage d, we write cp e ^ 3 . * T^e informational efficiency 9 EFF,
of cp is defined by EFF = p/d. If EFF = 1 , cp is an optimal I.F,
In this section we consider both simple and multiple zeros.
If the order of cp is independent of the multiplicity m of the
zero, then we say that its order is multiplicity-independent.
The reader familiar with I.F. has no doubt observed
that one-point I.F. of order p depend explicitly on at least
f and its first p - 1 derivatives. Hence the informational
usage of the I.F. is at least p. A theorem which gives a
formal proof of this fact is given below. It is this theorem
which causes us to label as optimal those I.F. whose informa
tional efficiency is unity. The theorem is quite simple to
prove and the result is in the "folklore11 of numerical
analysis. Its importance is that it makes us look for types
of I.F. whose informational efficiency is greater than unity.
Multipoint I.F. and I.F. with memory are not subject to the
conclusions of this "fundamental theorem of one-point I.F."
Recall that E , which was studied in Section 5 . 1 , is of order s p = s. This fact will be emphasized in this section by
writing E . We give two equivalent formulations of
THEOREM 5 - 3 * Let m = 1 . Let cp denote any one-
point I.F. Then there exist cp of all orders such that
EFF(cp) = 1 and for all cp, EFF(cp) <; 1 . Moreover cp must depend
explicitly on at least the first p - 1 derivatives of f,
ALTERNATIVE FORMULATION. Let m = 1 . . Let cp denote
any one-point I.F. Then for all p, there exist cp e I and
if cp e H^r>* then d ^ p. Moreover cp must depend explicitly
on the first p - 1 derivatives of f.
PROOF. Since E^ e I . there exist optimal I.F. p ° p p' of all orders. This disposes of the first part of the proof.
Let cp e Ip. From Theorem 2 - 1 0 , the most general I.F. of
order p is
cp = cpx + Uf p
where U is any function bounded at a. Hence U cannot contain I P
any terms in l/f. We take cp1 = E_. The most convenient form
to take for E is given by ( 5 - 2 ) ,
E p = x . P f i ^ i g ( j ) f J „ L J!
E p depends explicitly on f , f , f ( p _ 1 ) . Since the
highest power appearing in E is f p _ 1 , none of its terms can
be cancelled by Uf p. Since cp is a one-point I.P., none of
5 . 4 - 3
the f(^) can be approximated by lower derivatives to within 0
terms of 0(u ), I > 0 . Hence cp must depend explicitly on
f ,f', . .. ,f P""1^ which completes the proof.
The restriction to one-point I.F. is essential
as the following considerations show. Observe that
f [ V ( x S X ) 3 = V x > u 2 ( x > + °-tu3(x) ] .
Recall that
E 3(x) = x - u(x) - A 2(x)u 2(x).
Since
c p ( x ) = x - u( X) - f [ \ : $ x ) ]
and E^ differ only by terms of 0[u 3(x)], cp is third order.
That is, the second derivative appearing explicitly in has
been approximated in cp. Observe that cp uses information at
x and at x - u(x) and is therefore an example of the multi
point I.F. which are studied in Chapters 8 and 9 . Such
approximation of derivatives is impossible for one-point I.F.
We turn to the case of multiple zeros.
COROLLARY. Let m be arbitrary and known. There
exist cp of all orders, with cp depending explicitly on m, such
that EPF(cp) = 1 and for all cp, EFF(cp) £ 1 . Moreover cp must
depend explicitly on the first p - 1 derivatives of f.
ALTERNATIVE FORMULATION. Let m be arbitrary and
known. Then for all p there exist cp depending explicitly on m
such that cp e p I p and if cp e then d > p. Moreover, cp
must depend explicitly on the first p - 1 derivatives of f.
PROOF. Define F = f1//m. Then F has only simple
zeros and F ^ depends only on f ^ , I <; j. An application
of Theorem 5-3 completes the proof.
Note that this corollary assures us of the existence
of optimal I.F. whose order is multiplicity-independent. A
basic sequence of such I.F. will be explicitly given in
Section 7-3. Observe that if we define G = f ( m - 1 ) and insert
G into any optimal one-point I.F. of order p, we obtain a
one-point I.F. with informational efficiency equal to unity.
^ This approach leads to I.F. which depend explicitly on f(m-l) .(m) f(m+p - 2 )
A case of greater interest Is-when m is not known. i Note
that u = f/f' has only simple zeros. Replacing f by u in any
/-^ optimal one-point I.F. of order p leads to an I.F. which is
of order p. We conclude that there exist I.F. of all orders
such that EFF(cp) = p/(p+l) for zeros of all multiplicities.
These I.F. do not contain m explicitly. These matters will be
taken up in greater detail in Chapter 7.
r
5 . 5 - 1
5 - 5 The Coefficients of the Error Series of E s ^ We saw in Section 5 . 1 that
E g - a = Y g(a)(x-a) s + 0 [ ( x - a ) s + 1 ] . ( 5 - 3 1 )
In certain special cases, an explicit expression may be found
for the error of E . Thus for the calculation of square p
roots, f = x - A and
e 2 4 E 2 " a = 2(a+e)' a = A > e = x - a.
In general, the expression for E - a, is an infinite series s
whose leading term is given by the first term on the right
side of ( 5 - 3 1 ) • The coefficients of the infinite series are E i^(a)/J!- A recursion formula for these coefficients will s be found which does not involve differentiation. An interest
ing property of the coefficients will be proven in Section 5 . 5 2 .
5 . 5 - 2
5 . 5 1 A recursion formula for the coefficients.
We recall the definitions of the following symbols which will
be used frequently;
( \ f ( x ) f v f ( j ) ( x ) . f , f ( j ) ( x ) u ( x ) = f 7 ^ , a . ( x ) = _ ^ A . ( x ) = _ | ^ e - x - a .
The expansion of u(x) into a power series in e which
will be needed below is derived now. Define by
00
u(x) = f(x)/f'(x) = £ vtel. ( 5 - 3 2 )
1=1
Since
00
f(x) = £ a ^ , f'(x) = £ J a j e J _ 1 ' J=l j=l
and u(x)f'(x) = f(x), we obtain
I al = I
V q ( t + 1 - q ) a t + l - q ' 1 - 1 * 2 , . . . ,
q=l
or
t Al = Z
V q ( t + 1 - ^ V l - q ' 1 - 1 , 2 , . . . ,
q=l ( 5 - 3 3 )
5 . 5 - 3
and finally
Vl = h " Z V t + 1 " q ) V l - q ' t = l , 2 , . . . , ( 5 - 3 4 )
q-i
as a recursion formula for the calculation of the with
= 1. In ( 5 - 3 3 ) and ( 5 - 3 4 ) . as throughout this section, the
functions which occur are evaluated at a unless otherwise
Indicated.
An explicit formula for the may also be obtained.
It is not difficult to prove that
l V j I W"- J " ^ V \ (5-35) J-l i=l
where r = a i a n d w h e r e t h e inner sum is taken over all integers such that ia^ = j.
From either ( 5 - 3 4 ) or ( 5 - 3 5 ) , the first few v l may
be calculated as
v, = 1,
V 2 = ~A2'
v 3 = 2A 2 - 2A 3,
v 4 = - 4A| + 7 A 2 A 3 - 3 A 4 .
We are ready to turn to the problem of finding the
coefficients of the error series. Define t » a by v j S
00
(x) = £ T t / B e 1 ' . ( 5 - 3 6 ) 1=0
Since E (x) e I 0, we expect x, o = 0 , for 0 < I < s, and S S C • s
t = a. This may be proven directly by induction on s. o, s Let s = 1 . Then
E 1(x) = x « a + (x-a) = a + e.
Now assume t . a = 0 , 0 < I < s, t a = a. Substitute ( 5 - 3 6 ) Ks } S O , S
into the formula of Lemma 5 - 3 ,
« s « W = E . W - f ! > > . ( 5 - 3 7 )
to find
00 00 00
t=0 l=S l=s
which completes the induction. Substituting ( 5 - 3 6 ) into ( 5 - 3 7 ) yields
00 00 00
V ^ V ^ . u(x) V # ^t-i A
t=0 t=0 1=0
r
5 . 5 - 5
and using the expansion of u(x) given by ( 5 - 3 2 ) , multiplying t
the series, and equating the coefficient of e to zero yields
t - 1
S T t , s + l + ^ - s W , s + I r v t + l - r T r , s - °- ( 5 ~ 3 8 ) r=l
Since the may be considered known, ( 5 - 3 8 ) can be used to
determine the t * 0 . A number of i9 are given in Table 5 - 3 c, s c, s
These results are summarized in
THEOREM 5 - 4 . Let
00
E 1=0 1=1
Then
t - 1
, s + l + U - s K , s + I r vt+l-r Tr,s - °'
r=l
with t a = a, T-, -1 = 1 , t . n = 0 for I > 1 , and t , „ = 0 O , S JL,J_ v , -L C , S
for 0 < l < s and s > 1 .
a 5.5-6
TABLE 5 - 3 - FORMULAS FOR t . •I, s
t 1 2 3 k i I 0 a a a a 1 1
2 A2
3 - 2A2 + 2A3 2 A 2 " A3
h 4 A | - 7 A 2 A 3 + 3 A ^ - 9 A | + 12A 2A 3 - 3 A U 5 A ^ - 5 A 2 A 3 + A^
5 . 5 - 7
Using Table 5 - 3 we find
E 2 ( x ) - a = A 2 e 2 + (- 2k\ + 2 A 3 ) e 3 + (kk\ - T A ^ + lk^k + 0 ( e 5 ) ,
E ( x ) - a = (2k* - A 3 ) e 3 + ( - 9 A 3 + 1 2 A £ A 3 - 3 A ^ ) e U + 0 ( e 5 ) , ( 5 " 3 9 )
E 4 ( x ) - a = (5A3, - 5A 2 A 3 + A 4 ) e 4 + 0 ( e 5 ) .
Observe that for the cases worked out above, the coefficient
of e s in the expansion of E (x) - a is Y (a), as expected. s s
(See Table 5 - 1 for the formulas of Y .) s
r
5 . 5 2 A theorem concerning the coefficients. The
following theorem may be used to check tables of the t „ _
and has a somewhat surprising corollary.
THEOREM 5 - 5
I
s=l
PROOF. Note that since t , a = 0 , for I < s, (5-40)
is equivalent to Z t« _ = A,. The proof is by induction S —JL v , S v
on t. For t = 1 , T ^ s = T i 1 = 1 = Ai« Let
Z o^i t „ „ — A . for r - 1 , 2 , w i t h I > 2 . Sum the
recursion formula for t» _ to obtain v , S
£ *C ' t^- l
t + l - r l r , s
s = l s = l s = l s = l r = l
or
I l-l r
I H,s - I r v t + l - r I ^ s ' s=l r=l s=l
5.5-9
Finally, the recursion formula for the v^, (5-34), yields
s=l
which completes the proof.
COROLLARY. Let k be an arbitrary positive integer.
Then
s=l
PROOF.
J - X\. 00
s=l s=l t=0
k k oo k
4=1 8=1 t=k+l 8=1
where we have used the fact that t = 0 for r < s. An r, s
application of the inductive hypothesis yields
I l - l I I Tt,s * I r v t + l - r A r " I ^ + 1 " ^ v
q V l - q " t v l s=l r=l q=l
5 . 5 - 1 0
Since
s=l s=l
an application of Theorem 5 - 5 yields
k k £ E g(x) = ka + £ A^e* + 0 ( e k + 1 ) s=l 1 = 1
The fact that
1=1 1=1
and hence that
t = l
yields the corollary.
Taking the limit as k oo of ( 5 - 4 l ) yields a special
case of the general theorem that if a series is convergent,
then it is also Cesaro summable.
I
6 . 0 - 1
CHAPTER 6
ONE-POINT ITERATION FUNCTIONS WITH MEMORY
Two classes of one-point I.F. with memory are:
studied: Interpolatory I.F. and derivative estimated I.F.
These I.F. are always of nonintegral order. The structure
of the main results, given by Theorems 6 - 1 to 6 - 4 , Is
remarkably simple.
6.1-1
6.1 Interpolatory Iteration Functions
In Chapter 4 we developed the general theory of
I.F. generated by direct or inverse hyperosculatory inter
polation; such I.F. are called interpolatory I.F. If n * 0 ,
the interpolatory I.F. are one-point I.F,; if n > 0 , they
are one-point I.F.'with memory. The reader is referred to
Theorems 4 - 1 , 4 - 2 , and 4 - 3 for results concerning the conver
gence and order of interpolatory I.F. The generalization to
multiple zeros is handled in Chapter J\ • In this section
we limit ourselves to comments and examples. The notation is
the same as in Chapter 4 .
6 . 1 - 2
6 . 1 1 Comments. Observe that Interpolatory one-
point I.F. with memory use s pieces of new information at x^
and reuse s pieces of old Information at the n points x i - l ' x i - 2 > # * #' xi-n # Thus their informational usage is s.
Their order is determined by the unique positive real root
of the equation
n t n + 1 - s ) t J
= 0 . ( 6 - 1 )
j= 0
As was shown in Section 3 . 3 , bounds on this root are given by
Hence bounds on the informational efficiency are given by
Furthermore,
1 < EFF < 1 + i s
n1 i f Pn+l,s = 3 + 1
n ~* oo 9
Theorem 5 - 3 states that the informational efficiency of any
one-point I.F. is less than or equal to unity. Hence the
increase in informational efficiency for interpolatory I.F.
with memory is directly traceable to the reuse of old infor
mation. On the other hand, we conclude from ( 6 - 2 ) that the
old information adds less than one to the order of an Inter
polatory I.F.
6 . 1 - 3
T h e d e p e n d e n c e o f t h e o r d e r o n n a n d s m a y b e s e e n
f r o m T a b l e 3 - 1 w i t h k = n + 1 a n d a = s . O b s e r v e t h a t t h e
o r d e r a p p r o a c h e s i t s l i m i t i n g v a l u e , s •+ 1 , q u i t e r a p i d l y a s
a f u n c t i o n o f n ; t h i s i s p a r t i c u l a r l y t r u e f o r l a r g e s . T h e
c a s e o f m o s t p r a c t i c a l i n t e r e s t i s n = 1 . T h e n ( 6 - 1 ) m a y b e
s o l v e d e x a c t l y a n d
pl,s"" + ( s 2 + 4 s ) ^ ] .
A d r a w b a c k o f i n t e r p o l a t o r y I . F . w i t h m e m o r y i s t h a t
m u l t i p l e p r e c i s i o n a r i t h m e t i c m a y h a v e t o b e u s e d f o r a t l e a s t
p a r t o f t h e c a l c u l a t i o n . A d r a w b a c k o f i n t e r p o l a t o r y I . F .
g e n e r a t e d b y d i r e c t i n t e r p o l a t i o n i s t h a t a p o l y n o m i a l o f
d e g r e e r - 1 m u s t b e s o l v e d f o r e a c h i t e r a t i o n . A r e d u c t i o n
o f d e g r e e t e c h n i q u e , d e m o n s t r a t e d f o r o n e - p o i n t I . F . i n
S e c t i o n 5 . 3 3 , i s a l s o a v a i l a b l e f o r o n e - p o i n t I . F , w i t h m e m o r y
a n e x a m p l e i s g i v e n i n S e c t i o n 1 0 . 2 1 .
6 . 1 - 4
6.16 Examples. The reader is referred to Appendix A
for the interpolation formulas used in the following examples.
The notation is the same as in Chapter 4. We shall not give
the conditions for convergence; such conditions may be found
in the examples of Section 4.3. The first three examples use
inverse interpolation; the next three examples use direct
interpolation. We take y ^ = f 1 = f(x^)^ !J1 - (y^),
EXAMPLE 6 - 1 . n ~ 1 , s * 1 , (secant I.F.). In the
Newtonian formulation,
f i = * ± ~ y i ' ^ i ^ i - i ] - x i - f t x 1 > x 1 , 1 ] '
In the Lagrange-Hermite formulation,
fi xi-l" l'i^l xl
e |Ya(oi) P " 1 , P «• i ( l + v / 5 ) - 1 , 6 2 .
e i
EFF - £ -v 1 . 6 2 .
II
6.1-5
EXAMPLE 6-2. n » 2, s = 1
f 4f cp2
+ 1 j-* r i i i , 1 " + [f Lx i,x i_ 1J " f L x ^ x ^ g ] / '
- ^ f t — |Y 3(a)|^P- 1), p „ !.84. e l '
E P F = 1.84.
EXAMPLE 6-3. n = 1 , s = 2.
' 1 , 2 = ' 0 , 2 + f i H '
a, - X ^ 'o,2 - x i " T 7 ' I I
H . — 1 Xi 1 ,] V l fl_ . _ _ 1 _ 2 f i - f i - i i f ; ^ I ^ T ; (f.-f,.,) 2 t f; f ; _ / ^ ^ i
| e^±|i- l Y ^ a ) ! * ^ " 1 ) , p = l + v / 3 ^ 2.73. e ' p
• 1
EFF - -| 1.37.
6 . 1 - 6
EXAMPLE 6 - 4 . n = 1 , s = 1 , (secant I.F.). In the Newtonian formulation,
P l , l ( t ) = f i + (t-x ±)f[x 1,x 1_ 1],
" x i ' fLx 1,x 1_ 1J
In the Lagrangian, formulation,
<D = fl Xi-l" fi-l Xl 1 , 1 f i _ f i - i
Observe that for the case n = 1 , s = 1 , the I.F. generated by
direct and inverse interpolation are identical. This is the
only case, for n > 0 , where this is so.
! ei+l - | A p ( a ) | p - 1 = |Yja)|P-\ e P ' 2
• 1
p « |(l+ v / 5 ) ~ 1 . 6 2 ,
EFF = ~ 1 . 6 2 ,
s
6 . 1 - 7
h = x i " xi-l* f [ x i ' x i - l ' x i - 2 ] " x i " x i - 2
must be solved for t - x i .
i f ^ ^ l M a ) ! * ^ 1 . ) , p . 1.84.
EFP = f ~ 1.84
This I.F. is discussed in Section 1 0 , 2 1
EXAMPLE 6 - 6 , n = 1 , s = 2 . The third degree poly
nomial equation
0 = P x 2(t) - f ± + (t-x±)f'± + (t- X l) 2H,
H = f i - ^ x i ^ x i - l ^
xi" xi-l + (t-x ±)
f i + fi-l " 2 f t x i * x i - i ]
( x 1 - x j L _ 1 ) 2
must be solved for t - x
EXAMPLE 6 - 5 . n = 2, s = 1 . The second degree
polynomial equation
P 2 ^ ( t ) = f ± + (t-x ±)f[x 1,x 1_ 1] + (t-x ±)(t-x 1+h)f[x ±,x 1_ 1,x 1_ 2],
6 . 1 - 8
e i 4 - 1 i ± l i - | A , ( a ) p ( p _ l ) , P = 1 + v / 3 - 2 . 7 3
EPF = ^ 1.37
r
6 . 2 - 1
6 . 2 Derivative Estimated One-Point Iteration Functions With Memory
6 . 2 1 The secant iteration function and its
generalization. We have used direct and inverse interpolation
to derive one-point I.F. with memory. We observed that the
secant I.F. could be generated from either direct or inverse
interpolation. We now give two additional derivations of the
secant I.F.; the method of derivation suggests a second general
technique for generating one-point I.F. with memory.
The secant I.F., together with slight modifications
thereof, must share with Halley 1s I.F. (Section 5 . 2 ) the dis
tinction of being the most often rediscovered I.F. in the
literature. Discussions of the secant I.F. may be found in
B a c h m a n n [ 6 . 2 - 1 ] , Collatz [ 6 . 2 - 2 , Chap. Ill], Hsu [ 6 . 2 - 3 ] ,
Jeeves [ 6 . 2 - 4 ] , Ostrowski [ 6 . 2 - 5 , Chap. 3 ] , and Putzer [ 6 . 2 - 6 ] .
Its order seems to have been first given by Bachmann [ 6 . 2 - 7 ]
( 1 9 5 4 ) . Let 3 be the inverse to f. One way of writing
Newton 1s I.F. is
Newton 1 s I.F. is second order and uses two pieces of Informa-
tion. We estimate 3'(y) by % 1 where
" 3 ( y ) - s ( y i _ i ) '
Q 1 / L ( t ) - * ( y ) + (t-y) y-yi_ i-l
6 , 2 - 2
is the first degree polynomial which agrees with S at the
points y and y^.^* It is convenient to use the symbols y
and y^ and the symbols x and x i interchangeably. Replacing
3 ' by ^ in (6-3) leads to
xi" xl-l
which is the secant I.F.
On the other hand, we may write
* 2 x f '(x)
and estimate f'(x) by differentiating the first degree poly
nomial
P 1^ 1(t) m f(x) + (t-x) f(x) - t(xlmmly x-x i-1
which agrees with f at the points x and X j ^ - Again the secant
I.F. is generated.
In the following sections we will deal with two
broad generalizations of these ideas.
a. Rather than estimating the first derivative of an
optimal I.F. of second order, we estimate the
(s-l)st derivative of an optimal I.F. of order s.
b. Rather than estimating the first derivative from
two values of the function, we estimate the
(s-l)st derivative from n + 1 values (one new and
n old) of the first s - 2 derivatives.
In general the estimation of f 1 1 and the estimation of
2 r 1 leads to different I.F. We investigate the former
in Section 6 . 2 2 and the latter in Section 6 . 2 3 . After study
ing the estimation of derivatives in the optimal basic
sequence E , we will be able to deal with the case of arbi
trary optimal I.F. The one-point I.F. with memory thus gen
erated will be said to be derivative estimated.
r
6 . 2 - 4
6 . 2 2 Estimation of f f . We first develop the
theory of one-point I.F. with memory which are generated by
estimating f v ' in the I.F. E o ; the corresponding theory s
for arbitrary optimal I.F. then follows easily. From
Section 5 - 1 1 , s - 1
E = x + s > 1
( 6 - 5 )
_ ( - i ) H » ( j ) ( y ) y,(x) = - — , 3 J ! [ 3 ' ( y ) ] J
9
y-f(x)
f U ~ J J .
E uses s - 1 derivatives and hence s pieces of information, s We estimate f^ s^ 1^(x 1) from f ^ f x ^ ) , with j * 0 , 1 , . . . , n ,
and I = 0 , 1 , . . . , s - 2 . The symbols x and x i will be inter
changeably. The I.F. so generated use s - 1 pieces of new
information at the point x^ and reuse s - 1 pieces of informa
tion from the previous n points. Hence s - 1 will be of basic
importance and we define
S - s - 1 . ( 6 - 6 )
In place of r = s(n+l), we introduce
R « S ( n + 1 ) . ( 6 - 7 )
The advantage of using S and R instead of s and r will become
evident below.
r
6 . 2 - 5
Let P Q(t) be the polynomial such that Ti y O
P n ^ S ( x i - p " f ^ ( x i - j ) ' J »- 0,l,...,n, I - 0,1,...,S-1.
( 6 - 8 )
We estimate f^ S^(x) by p j ; S I ( x ) . Let IJ y O
* f n S ) < x ) ~ P n ! s ( x ) - ( 6 _ 9 )
As shown in Appendix A,
n f ( S ) ( x ) . . f U ) ( x ) = | j f ( R ) ( | i ) jj ( x - x ^ j ) 3 , ( 6 - 1 0 )
where ^ lies in the interval determined by x i , x i . .. j X ^ .
Let * E ^ o be generated from E c , n by estimating
f ^ ( x ) by *f^ s)(x). A new approximation to a is defined by
ki+l " ' j 5n,S^ xi j xi-l' * * " x i - n ^ * x,., = *E_
We derive the error equation for * E n g. Recall that
is independent of f^S^ for j < S. Let * ^ n > s be generated
from Y g by estimating f^S^ by * f ^ . Then
Sri _ - v _l_
Jn,S * E - " = x + I V* + X Y n , S u S - E S + * Y n , S u S * ( 6 ' 1 ; L )
J-l
On the other hand,
a = E s + Y s u S + « ( S + l ) ( © 1 ) f S 4 ' : L , ( 6 - 1 2 )
where 0^ lies in the interval determined by 0 and y^. Hence
* En,S " a " u SI* Yn,S-V " i f ^ T T T 8 ( S + 1 ) ( « i ) f S + 1 -
(s)
Since by corollary d of Theorem 5 - 1 * depends on f v 7 only
as (-l)SAs, A s = f^SV(S!f')* conclude that
* E a _ (-D Su S r. f(S) _ f(S)l _ (-l)°;x g ( S + l ) ( 0 ) f
An application of ( 6 - 1 0 ) yields
S + 1 (Sfl), f l N-S+l
*E_ „ - a = -J n , S R ! f
Let
ei-j - xl-j ' a> ei+l = xi+l " a = * En,S " °"
7± - F'TV6! = g 7 ^ - p where T|1 lies in the interval determined by x i and a and
P l = f (Tl1) . Then
6.2-7
- * M i e i S ( e i - r e i ) S + * N i e i + 1 -
( - i ) R f ( R ) ( ^ ) *M, - - v - / " q l i 1 ' [ f ' C n J ] 3 , ( 6 - 1 3 )
( _ 1 ) S + i 5 ( s + i ) ( }
*N, = -1 ( S + l ) ! [ * ' ( P i n
S+l*
ume
This is the error equation for „; we use It to n f o
derive the conditions for convergence and the order. Ass
that e i is nonzero for all finite i. We shall show that if x o ' x l ' * # # , x n a r e s u f f i c : J - e n t l y ° l o s e to a, then e ^ ^ + 0 .
Let
|x-a| £ r j .
Let f v ' be continuous on J and let f be nonzero on J. Let
f map the interval J into the interval K, Since n > 0 ,
R ^ S + 1 , Hence we can conclude that g ( s + 1 ) is continuous
on K. Let
•(H)
6 . 2 - 8
Let
3(8+1)1
for all y e K; that is, for all x e J. Let
*N = x
Let x .x,,,..,x e J. Then o 1 n
max[|e o|,|e 1|,...,|e n|] £ r .
By an inductive argument analogous to one employed in
Section 4 . 2 1 , it can be shown that if
2 R " S * M r R " 1 + *NT S £ I ,
then x ± e J for all 1. Hence | * M ± | £ *M, 1* 1 ^ *N for all i
Then
n 5 1 + 1 < * M 5 i II ( 6 l - j + 5 i ) S + ^sf 1, b ± _ 5 = l e ± _ j I .
j=l
for all x e J. Let
S v l v 3
1 1 S+l* v 2
An application of Lemma 3-2 shows that if
+ * N r S < 1,
then 6 4 - * 0 . Hence e. -* 0 . 1 i
Observe that
*M ± -+ - (-l) nA R(a), *N. ± - Y s + 1 ( a ) .
All the conditions of Theorem 3-4 now apply and we conclude
that
(6-14)
where p is the unique real positive root with magnitude greater
than unity of the equation
n t n + 1 - S
j= 0
We summarize our results in
THEOREM 6 - 1 . Let
r
6.2-10
Let R m s(n+l), n > 0. Let f^ R^ be continuous and let f be
nonzero on J. Let x0 * x i > * " > x
n e J and let a sequence { x ^
be defined as follows; Let be defined by (6-8) and (s)
(6-9). Let * E n g be generated from E s + 1 by estimating f v '
by • Define x i+1 * * E n , s ( x i ' x i - l ' ' ' " x i - n ) '
Let ~ xi-j " °"
T s + D
for all x e J and let
v2 X2
Assume that e A is nonzero for all finite i. Suppose that
a^^Mr1*-1 + *NTS < 1. Then, x± e J for all 1, e± -+ 0, and
i y * ) l ( p ~ 1 ) / ( R ~ 1 ) . < 6- 1 5 )
6 . 2 - 1 1
where p Is the unique real positive root of
n t n + 1 - S > t J = 0 ,
J« 0
and where A R = f^/iRlf).
(S)
We turn to the case where f v ' is estimated In an
arbitrary optimal one-point I.F. It will turn out that the
order and asymptotic error constant of the I.F. so generated (S)
are identical with the case where f v ' is estimated in £ 5 + 1 •
From Theorem 2 - 1 0 ,
* s + i - V i + U f S + 1 < 6 " 1 6 )
is the most general I.F. of order S + 1. Let cpg+1 be an
optimal one-point I.F. Let *cpn s be generated from cp s + 1 by (S) ' estimating f v ' by
analogously. Then
estimating f ^ by *f£ S) in cp g + 1 and let * U ^ S be defined
S+l n,S ~ "an,S " "n,S J
6 . 2 - 1 2
Hence
s + i *<p . - a - *ETI q - a + * u f ,
Y n , S n , S n , f a
n *U * c p Q - a * e . ^ = * M . e f [ f ( e . < - e . ) S + ( * N . + S + X
y n,S l+l i i ii- l-J 1 ' t 1 [ S ' ( p . ) ] J —1 l
( 6 - 1 7 )
where and were defined In ( 6 - 1 3 ) . We can proceed as
before to arrive at
THEOREM 6 - 2 . Let
x |x-a| £ ry.
Let R = S(n+l), n > 0. Let f^ R^ be continuous and let f be
nonzero on J. Let x Q,x^,...,x n e J and let a sequence {x i}
be defined as follows$' Let *f£ s) be defined by ( 6 - 8 ) and ( 6 - 9 )
Let <P S l f l t>e a r* arbitrary optimal one-point I.F.j let S+1
cp s + 1 m E g + 1 + Uf . Let *cpn g be generated from cp s + 1 by
estimating f^ S^ by in <p s + 1; then
Define
xi+l = * cPn,S ( xi ; x i - i " " ' x i - n ^
6 . 2 - 1 3
Let e 1-J = x 1-J - a, Let
R1 v 3 1 |f'I > v 2 ,
for all x e J. Let
q V n V 0 +
S+'l* .S+l* v 2 A 2
Assume that e ± Is nonzero for all finite 1. Suppose that 2 H - S # M r R - l + * N r^. < x >
Then, x± e J for all 1 , e ± -»• 0, and
1 ^ ± 4 - IVa)l ( p- l ) / ( R- l ), ( 6 - 1 8 )
where.p is the unique real positive root of
tn+l _ s \ tJ , 0 j
n
j - 0
and where A R = /(Rlf) .
6 . 2 - 1 4
6 . 2 3 Estimation of g^"" 1^. We turn to one-point f s-l)
I.F. with memory which are generated by estimating £ v y in the I.F. E ; the corresponding theory for arbitrary optimal s I.F. then follows easily.
We again take S = s - 1 . R = S(n+l); the
symbols y and y^ are used interchangeably. Let g be the
polynomial such that
( 6 - 1 9 )
We estimate ^ S \ y ) by oj s(y) . Let
lz{
n
3)(y) s ^ s ( y ) . ( 6 - 2 0 )
As shown in Appendix A,
n s(s)(y) - ^ s ) ( y ) = f T S f ( R ) ( e i ) II (y-y±-j)S> ( 6 - 2 1 )
where 6± lies in the interval determined by y^Y^i> • • • >y±-n'
Let 1 E n s be generated from E g + 1 by estimating
g^ s)( y) by 1 3 ^ S ^ ( y ) . A new approximation to a is defined by
xi+l ~ l E n , S ^ x i j xi-l'*'* , xi-n^
r
6 . 2 - 1 5
We derive the error equation for I E N g . We have
_ S-l 1
i- v E _ x + V L D l B(J) fJ + I z l i f y s ) f s n,S - x + L 31 S! tfn
= E g + J b | ^ S ) f S f ( 6 . 2 2 )
a - E s + ^ ^ S ) f s + ^+1he±)f2+1,
( 6 - 2 3 )
where 8 i lies in the Interval determined by 0 and y^. Prom
( 6 - 2 1 ) , ( 6 - 2 2 ) , and ( 6 - 2 3 ) ,
kS+l
Since
V j T a
where Tl i + 1 lies in the interval determined by x 1 + 1 and a and
P i + 1 = f ^ i + l ^ w e c o n c l u d e that
n
y±+i - X + i ^ II ( y i - j - y i ) s + V ^ f " 1 -
M i + 1 = - R ! g ' ( p 1 + 1 ) >
L ( - i ) 8 * 1 ^ 1 ^ ) N i + 1 ~ ~ ( S H ) J * ' ( P i + 1 ) '
r
6 . 2 - 1 6
Because of the dependence of (6-24) on Y±„y we
shall focus our attention on H,
H = |y| |y| £ A}-
Let 3 ^ R ) be continuous on H and let 3 ' be nonzero there.
Let y o J y 1 , . . . , y n e H. Since M 1 + 1 and N 1 + 1 depend on y 1 + 1
through the parameter P i + 15 we cannot prove that all yi e H in a manner analogous to the proof of the last section. Instead
we assume that y^ e H for all I and that yi Is nonzero for all
finite i. Let
R! ^ V l 3'I^V (s+1)! ^ V
for all y e H. Let
Then
V * 2
n S , 1S+1 | y i + 1 l <; lM\y±\s II (| y i _jl + |y ± J) s + 1k\j±+1\
An application of Lemma 3 - 2 shows that if
2R-S ljy^R-1 + i N A S < l f
6 . 2 - 1 ?
( - 1 ) % ( 0 ) , a R ( y ) = f r ^ { f }
All the conditions of Theorem 3-4 are now satisfied and we can
conclude that
where p Is the unique real positive root of
n .n+1 „ V 4-J -t" T A - S ^ t = °-
j= 0
Since e
where lies in the interval determined by y ^ j a n d a* w e
have that
We summarize our results in
J > l i - | y R ( „ ) | ( P - l ) / ( » - l ) .
then y A -*• 0 . Hence e± -*• 0 . Observe that
S(R)
6 . 2 - 1 8
THEOREM 6 - 3 . Let
I <; a } H = iy
Let R = S(n+l), n > 0 . Let be continuous and let be
nonzero on H. Let x Q,x^,...,x n be given and let a sequence
{x ±} be defined as follows: Let i ^ S ^ be defined by ( 6 - 1 9 )
and ( 6 - 2 0 ) . Let 1 E n g be generated from E s + 1 by estimating
3 ^ by i3I5IS^ . Define
i+1 _ "n,S^ xi ; xi-l'''*' xi-n)'
Assume that y ± e H for all 1 and that y± is nonzero for all
finite i. Let
l * ( R ) l s , K H s * 1 S ( S + 1 ) 1 , R!
for all y e H and let
\ 2 A 2
Suppose that 2 R _ S 1 M r R _ 1 + lWS < 1 .
6 . 2 - 1 9
Then y ± -• 0 and
^ i ± l l . | V 0 ) | ( P - l ) / ( H - l ) , ( 6 - 2 5 )
where p Is the unique real positive root of
n t n + i - s y t j = o , i
and where
(R)
Furthermore,
K ± i " _ l V a ) | ( p - i ) / ( H - i ) , ( 6 - 2 6 )
where
Y R(x) = . ( - D R g ( R ) ( y ) R! [ 3'(y) ] J y=f(x)
/—
The case where % K ' is estimated in an arbitrary
optimal one-point I.F. may be handled in a fashion which
should by now be familiar. We summarize the results in
6 . 2 - 2 0
THEOREM 6 - 4 . Let
H = {y| |y| 1 A}.
Let R = S(n+l), n > 0 . Let 3 f ( R ) be continuous and let 3 ' be
nonzero on H. Let x ,x^,.,.,x be given and let a sequence
{x±} be defined as follows: Let be defined by ( 6 - 1 9 )
and ( 6 - 2 0 ) . Let cpg+1 be an arbitrary optimal one-point I.P.; S + 1 1
let cp s + 1 * Es + 1 + u f • Let <Pn 5 be generated from cp s + 1 t»y
estimating 0 ^ by in cp s + 1; then
Define
xi+l " ± c p n , S ^ x i ; xi-l' * * * , xi-n^
Assume that y± e H for all 1 and that y ± is nonzero for all
finite i. Let
\ ^ S + l ) \ .lu I s K
( S + 1 ) ! * V I V s 1 *
6,2-21
for all y e H. Let
1 X 3 + V K
Suppose that 2 R " S X M A R _ 1 + ^NA^ < 1
Then y i -*• 0 and
- ^ 4 - l ^ o ) ! ^ " 1 ' ^ - ! ) .
where p Is the unique real positive root of
n .n+1 Q ^ +.J _ t 1 1 T X - S > t J = 0.
j=0
Furthermore,
6,2-22
6.24 Examples. The reader is referred to Appendix A
for the approximate derivative formulas used in the following
examples. We shall not give the conditions for convergence.
The notation is the same as in Sections 6 . 2 2 and 6 . 2 3 .
EXAMPLE 6-7. n ~ 1, S = 1 , (secant I.F.).
* Ei,i = x i ' Tp-
^ ± 4 - lAgCcOp- 1, P - i(l+V*>) - 1 . 6 2 .
EXAMPLE 6-8. n = 2, S » 1 .
*f' - f[x 1,x 1_ 1] + f[x ±,x ±_ 2] - f [ x 1 _ 1 , x 1 _ 2 ] ,
* E 2 , i - x i " ;7'
*2
I ^ ± 4 - . l A g C a ) ! * ^ " ^ . P . 1.84. ' e i
r
6 . 2 - 2 3
EXAMPLE 6 - 9 . n = 1 , S = 2 .
* fl " x p f ^ { 2 f I + ^ ' 3 f C x i ' X i - l ] } '
2 r 1 U i " f i
l A ^ C a ) ! ^ 5 " 1 5 , P - 1 + 7 3 - 2 . 7 3
EXAMPLE 6 - 1 0 . n = 1 , S = 2 . We estimate f" by *f 1
in Halley's I.F. rather than in . Thus
u i - *i " 1 - I ( u X j <
| A 4 ( c t ) | * ( P - D , p . i + v / 3 ^ 2 . 7 3 .
e i '
6.2-24
EXAMPLE 6 - 1 1 . n = 1 , S = 1 , (secant I.F.).
1 , ' 1
* 1 - f [ x ± , x 1 - 1 J '
1 1 '
E l , l " x i ~ f i V
I * ± i i - | Y 2 ( a ) | p - \ p = | ( 1 + v / 5 ) - 1 . 6 2 ,
EXAMPLE 6 - 1 2 . n = 2 , S - 1 .
• U ' 1 , 1 1 3 2 * f i [ x 1 , x 1 _ 1 J + f [ x 1 , x 1 _ 2 J " f L x ^ ^ x ^ J '
1 1 ' E 2 , l " X l " f l
8 2 '
b ± i j ^ , y ( o),i(P-l), p . 1.84. e l
6 . 2 - 2 5
EXAMPLE 6 - 1 3 . n = 1 , S = 2
1 " 5 i f -f T + ,
i M - l 1 _ -3
f, , f [ x i ' x i - l J
S , 2 - X i " U i +
lA
^ ± 4 - l Y . f a ) ! ^ " 1 ) , P = 1 + 7 3 ^ 2,
| . , I P " 4
7 3
6 . 3 Discussion of One-Point Iteration Functions With Memory
6 , 3 1 A conjecture. The theory of interpolatory I.F.
and derivative estimated I.F. has been developed in
Sections 6 . 1 and 6 . 2 . In the case of interpolatory I.F.,the
first s - 1 derivatives of f are evaluated. Hence s pieces
of new information are required for each iteration. For the
case of derivative estimated I.F., the (s-l)st derivative of
an optimal one-point I.F. is estimated from the first
s - 2 derivatives at n + 1 points. If we set S - s - 1 , then
S pieces of new information are used for each iteration. For
interpolatory I.F.,, r » s(n-fl) represents the product of the
number of new pieces of information per iteration with the
number of points at which information is used; R = S(n+l) plays
the corresponding role for derivative estimated I.F. A glance
at Theorems 4 - 1 , 4 - 3 , 6 - 1 , 6 - 2 , 6 - 3 , and 6 - 4 , reveals a remark
able regularity in the structure of the results. The only
parameters which enter are p and r or p and R; p depends only
on s and n or S and n. When we deal with inverse interpolation
the asymptotic error constant depends on Y^; when we deal with
direct interpolation the asymptotic error constant depends
on A r . From Theorems 6 - 2 and 6 - 4 we can conclude that the
asymptotic behavior of a sequence generated by a derivative
estimated I.F. is independent of the optimal one-point I.F. in
which the derivative is estimated.
r
Values of p for different values of n and s or n and
S may be found In Table 3 - 1 ,
The effect of estimating the highest derivative of
optimal one-point I.F, has been studied. The idea of estimat
ing the two highest derivatives suggests itself. Calculations
of orders of such methods indicate that such a procedure would
not be profitable.
We have proved, for the case of interpolatory and
derivative estimated I.F,, that the old information adds less
than unity to the order. We conjecture that this is true no
matter how the old information is reused.
CONJECTURE. Let
cp 388 cp[x^; x i - i , x i - 2 ' ** 9*xl-n^
« G ^JL*^1^'^i# * 0 # * ^ * 1 ^i-l'^l—l ,^i , p ,l'" * # *
f U - i ) v f f' f U - i ) ~ i~l ' • * " x l - n ' I i - n ' I i - n " " ' I i - n
be any one-point I.F. with memory. (The semicolon shows that
new information is used only at x^.) Let cp e ^I^. Then
p < I . + 1 .
In particular, we conjecture that it is impossible
to construct a one-point I.F, with memory which is of second
order and which does not require the evaluation of derivatives.
The fact that neyf information is used at only one point is
critical; in Section 8.6 we give I.F. which are of order
greater than two and which do not require the evaluation of
any derivatives. Those I.F. are, however, multipoint I.F.
with memory.
It must be emphasized that the conjecture does not
apply to the case where the sequence of approximants generated
by an I.F. is "milked" for more information. See Appendix D
for a discussion of acceleration of convergence.
6 . 3 2 Practical considerations. One-point I.F, with
memory typically contain terms which approach o/o as the
approximants approach a. Consider, for instance, the I.F. of
Example 6 - 9 :
# f l " x p i j ^ { 2 f i + fi-l " 3 f [ x ^ x , ^ ] } ,
( 6 - 2 9 )
2
1 U i * E l , 2 - x i - u i - 5 7 * f i -
x i
In theory, *f-L -*f"(a). In practice, it approaches a o/o form
which naturally poses computational difficulties. Note that " 2 *f-L is multiplied by u£ which goes to zero quite rapidly. Note
also that the last term of ( 6 - 2 9 ) may be regarded as a correc-
tlon term to x^ u^. Hence *f^ need not be Imown too
accurately. It might be worthwhile to do at least part of the
computation with multiple precision arithmetic; this matter
has not been fully explored.
In order to use an I.F. such as CP , n + 1 approxi-Yn,s
mants to a must be available. This suggests using g ,
which requires but two approximants, followed successively
by <Po a i ? Q o ' * » « ' 9 v * a * a t t h e beginning of a calculation.
6 . 3 - 5
6 . 3 3 Iteration functions which do not use all
available information. All the I.F. studied in this chapter
use all the old information available at n points. It is
possible to construot I.F. which use only part of the old
information available at n points. This may lead to simpler
I.F. but ones which are not of as high an order. For example,
the simplest estimate of f" is
1 • x r x i - i •
Define + E 1 2 by
2
+ » i . 2 - » i - » i - a ? t f i - ( 6 " 3 0 )
It may be shown that the indicial equation associated with
this I.F. is t 2 - 2 t - 1 - 0 , with roots 1 ± y / 2 . Thus the 4-
order of E^ 2 is 2 . 4 l . On the other hand. * E 1 2 , which
estimates f" from f ^ f ^ ^ f ^ f ^ p is of order 2 . 7 3 - If the
evaluation of f and f is expensive, it is preferable to use
the slightly more complicated *E-^ g.
r
6 . 3 - 6
6 . 3 4 A n a d d i t i o n a l t e r m i n t h e e r r o r e q u a t i o n . I n
S e c t i o n 5 . 5 a d i f f e r e n c e e q u a t i o n w a s d e r i v e d w h i c h p e r m i t t e d
t h e r e c u r s i v e c a l c u l a t i o n o f t h e c o e f f i c i e n t s o f t h e e r r o r
s e r i e s o f E . G e n e r a l i z a t i o n t o I . F . w i t h m e m o r y h a s n o t b e e n
a t t e m p t e d . T h e f i r s t t w o t e r m s o f t h e e r r o r s e r i e s h a v e b e e n
w o r k e d o u t f o r a n u m b e r o f I . F . a n d a r e g i v e n b e l o w . T h e
l e a d i n g t e r m o f t h e s e I . F . i s , o f c o u r s e , g i v e n b y T h e o r e m 6 - 1 .
*E 1 , 1 " a * A 2 ^ a ^ e i e i - 1 " ^(a) - ( a ) e - e
i " i - r s e c a n t I . F . ,
( 6 - 3 1 )
* E 2 , 1 - ° * - A 3 ( ° 0 e i e i _ i e i - 2 + A 2 ( a ) e i > ( 6 - 3 2 )
* E n 1 , 2 • a * V a ) e i e i - i + 12 A|( a) - V a ) e{. ( 6 - 3 3 )
7 . 0 - 1
CHAPTER 7
MULTIPLE ROOTS
I n S e c t i o n 7 . 2 w e s h o w t h a t a l l E a r e o f l i n e a r
s
o r d e r f o r n o n s i m p l e z e r o . I n S e c t i o n 7 - 3 w e s t u d y a n o p t i m a l
b a s i c s e q u e n c e { g } w h o s e o r d e r i s m u l t i p l i c i t y - i n d e p e n d e n t . s
O u r r e s u l t s o n I . F . g e n e r a t e d b y d i r e c t i n t e r p o l a t i o n a r e
e x t e n d e d t o t h e c a s e o f m u l t i p l e r o o t s i n S e c t i o n 7 * 5 .
T h e o r e m 7 - 6 g i v e s a n e c e s s a r y a n d s u f f i c i e n t c o n d i t i o n f o r a n
I . F . t o b e o f s e c o n d o r d e r f o r r o o t s o f a r b i t r a r y m u l t i p l i c i t y .
A n I . F . o f i n c o m m e n s u r a t e o r d e r i s s t u d i e d i n S e c t i o n 7 . 8 .
7 . 1 - 1
7 . 1 I n t r o d u c t i o n
A n u m b e r o f d e f i n i t i o n s w h i c h r e l a t e o r d e r t o
m u l t i p l i c i t y a r e g i v e n i n S e c t i o n 1 . 2 3 .
T h r e e f u n c t i o n a l s a r e o f p a r t i c u l a r i n t e r e s t :
v-frs G = fi^1), p = f l A . ( 7 - 1 )
O b s e r v e t h a t u h a s o n l y s i m p l e z e r o s ; G a n d P h a v e o n l y s i m p l e
z e r o s p r o v i d e d t h a t t h e z e r o o f f h a s m u l t i p l i c i t y m . T h e
m u l t i p l i c i t y m u s t b e k n o w n a p r i o r i i f G o r F a r e t o b e u s e d .
I f f a n d i t s d e r i v a t i v e s a r e r e p l a c e d b y u , G , o r F a n d t h e i r
d e r i v a t i v e s i n a n y I . F . , t h e n t h e e n t i r e t h e o r y w h i c h p e r t a i n s
t o s i m p l e z e r o s m a y b e a p p l i e d . I f , f o r e x a m p l e , w e r e p l a c e f
b y u i n N e w t o n ' s I . F . , t h e n w e g e n e r a t e
u
9 - X -
w h i c h i s s e c o n d o r d e r f o r z e r o s o f a l l m u l t i p l i c i t i e s a n d
c p - a u " ( a )
/ n 2 2 u ' a ) #
( x - a ) v 1
I t i s w e l l k n o w n t h a t N e w t o n ' s I . F . i s o f l i n e a r
o r d e r f o r a l l n o n s i m p l e r o o t s . E . S c h r o d e r [ 7 . 1 - 1 , p . 3 2 4 ]
p o i n t s o u t t h a t
cp = x - m u ( 7 - 2 )
7 . 1 - 2
i s o f s e c o n d o r d e r f o r z e r o s o f m u l t i p l i c i t y m ; t h i s f a c t h a s
b e e n o f t e n r e d i s c o v e r e d . I n S e c t i o n 7 . 2 w e w i l l p r o v e t h a t
a l l E a r e o f l i n e a r o r d e r f o r n o n s i m p l e z e r o s . I n s
S e c t i o n 7 - 3 w e w i l l c o n s t r u c t a n o p t i m a l b a s i c s e q u e n c e o f
w h i c h ( 7 - 2 ) i s t h e f i r s t m e m b e r .
7 . 2 - 1
7 . 2 T h e O r d e r o f E o
s
E g w a s d e r i v e d f r o m t h e T a y l o r s e r i e s e x p a n s i o n
o f 3 u n d e r t h e a s s u m p t i o n t h a t f h a s o n l y s i m p l e z e r o s . W e
c a n , n e v e r t h e l e s s , i n q u i r e a s t o t h e b e h a v i o r o f E w h e n
s a p p l i e d t o t h e c a l c u l a t i o n o f m u l t i p l e z e r o s . W e p r o v e
T H E O R E M 7 - 1 . T h e o r d e r o f E g i s l i n e a r f o r a l l
n o n s i m p l e z e r o s . M o r e o v e r ,
E _ , n - a / -i \ s
• S i - - 1 ^ - n a - * - ) . ( 7 - 3 ) 3 I m t = i
P R O O F . F r o m ( 5 - l 6 ) a n d ( 5 - 1 7 ) ,
s
E s + 1 ( x ) = x - £ Z j ( x ) , Z j ( x ) = Y j ( x ) u J ( x )
j = l
H e n c e
E g + 1 ( x ) - a - x - a - ^ Z ^ ( x ) .
J - l
D e f i n e 7 ^ b y
Z j ( x ) = £ 7tt3(*-a)1. ( 7 - 4 ) t = l
7 . 2 - 2
H e n c e
E g + 1 ( x ) - a =
s
1 - I ( x - a ) + 0 [ ( x - a ) 2 ] . ( 7 - 5 )
S i n c e Z j ( x ) s a t i s f i e s t h e d i f f e r e n c e - d i f f e r e n t i a l e q u a t i o n
J Z j ( x ) - ( j - l ) Z J _ 1 ( x ) + u ( x ) Z j _ 1 ( x ) = 0 , Z 1 ( x ) = u ( x ) ,
( 7 - 6 )
w e f i n d o n s u b s t i t u t i n g ( 7 - 4 ) i n t o ( 7 - 6 ) a n d n o t i n g
u ( x ) = ( x - a ) / m + 0 [ ( x - a ) 2 ] , t h a t
I7l9i - ( M ) 7 l j H + £ 7 1 ) y 1 - 0 , y 1 } 1 = 1 . ( 7 - 7 )
S e t t i n g M = l / m y i e l d s
' 1 , J
, 1 - 1 - M 7
1 , J - 1 ' 7 1 , 1 = M ,
H e n c e
( 7 - 8 )
w h e r e C [ M , J ] d e n o t e s a b i n o m i a l c o e f f i c i e n t . S u b s t i t u t i n g
( 7 - 8 ) i n t o ( 7 - 5 ) y i e l d s
E g + 1 ( x ) - a = ( x - a ) £ (-DJC[M,J] J = 0
+ 0[(X-AF] = ( x - a ) ( - l ) s C [ M - l , s ] + 0[(X-AF],
r
7 . 2 - 3
w h e r e w e h a v e u s e d t h e w e l l - k n o w n I d e n t i t y
£ ( - l ) J C [ M , j ] - ( - 1 ) 8 C [ M - 1 , b ] .
J = 0
H e n c e
E s + 1 ( x ) - a = ( x - a )
- t = i
+ 0 [ ( x - a ) ] .
S i n c e m i s a p o s i t i v e i n t e g e r , t h e c o e f f i c i e n t o f ( x - a ) i s z e r o
i f a n d o n l y i f m = 1 ; f a c t o r i n g o u t l / m f r o m e a c h t e r m o f t h e
p r o d u c t c o m p l e t e s t h e p r o o f .
E X A M P L E 7 - 1 .
E 2 ( x ) - a = ( l - ( x - a ) + 0 [ ( x - a ) 2 ]
w h i c h i s a w e l l - k n o w n r e s u l t ,
T h e a s y m p t o t i c e r r o r c o n s t a n t i s g i v e n b y t h e r i g h t
s i d e o f ( 7 - 3 ) . We h a v e
C O R O L L A R Y . L e t
G ( M , S ) = L = I L [ J ( i - t m )
s Jm 1=1
7 . 2 - 4
T h e n
Q ( m , s ) < 1 , l i m Q ( m , s ) = 1 ,
m —•• oo
P R O O F . S i n c e G ( m , s ) m a y b e w r i t t e n a s
s
G ( m , s ) = II ( l -
t = l
t h e r e s u l t f o l l o w s i m m e d i a t e l y .
I f m > 1 a n d i f a s e q u e n c e o f a p p r o x i m a n t s i s f o r m e d
b y x ± + 1 = E 2 ( x ± ) , t h e n
x i + l " a = C 1 " m ) ( x i " a ) + 0 [ ( X i - a ) 2 ] . ( 7 - 9 )
S i n c e t h e s e q u e n c e c o n v e r g e s l i n e a r l y , w e c a n a p p l y A i t k e n ' s
5 f o r m u l a ( A p p e n d i x D ) a n d e s t i m a t e a b y
a * = x - ( x i + 2 ~ x i + l ) 2
i + 2 x i + 2 " 2 x i + l + x i *
U s i n g a * a s a n e s t i m a t e f o r a ,
x i + l " a w C 1 - m ) ( x i " a )
m a y b e u s e d t o c a l c u l a t e a n e s t i m a t e f o r m . N o t e t h a t m i s
a n i n t e g e r - v a l u e d v a r i a b l e . O n c e m i s k n o w n , t h e s e c o n d o r d e r
I . F . , cp = x - m u m a y b e u s e d .
7.3-1
7.3 T h e B a s i c S e q u e n c e g s
7 . 3 1 I n t r o d u c t i o n . I n S e c t i o n 5 * 1 3 * e x p l i c i t
f o r m u l a s w e r e d e r i v e d f o r E g ; t h e s e I . P . f o r m a n o p t i m a l b a s i c
s e q u e n c e f o r s i m p l e z e r o s . I n S e c t i o n 7 . 2 w e s h o w e d t h a t t h e
o r d e r o f E i s l i n e a r f o r n o n s i m p l e z e r o s a n d h e n c e t h a t { E ) s s
i s n o t a b a s i c s e q u e n c e f o r m > 1 . A n o p t i m a l b a s i c s e q u e n c e
i s c o n s t r u c t e d b e l o w f o r t h e c a s e w h e r e m i s a r b i t r a r y b u t
k n o w n .
a p r i o r i , t h e r e s u l t s a r e o f l i m i t e d v a l u e a s f a r a s p r a c t i c a l
p r o b l e m s a r e c o n c e r n e d . T h e s t u d y i s , h o w e v e r , o f c o n s i d e r a b l e
t h e o r e t i c a l i n t e r e s t a n d l e a d s t o s o m e s u r p r i s i n g r e s u l t s .
We w i l l f i n d t h a t m u l t i p l i c a t i o n o f t h e t e r m s o f E g b y c e r t a i n
p o l y n o m i a l s i n m l e a d s t o I . F . w i t h t h e d e s i r e d p r o p e r t i e s .
T h e s e p o l y n o m i a l s a r e f o u n d e x p l i c i t l y ; t h e i r c o e f f i c i e n t s
d e p e n d o n S t i r l i n g n u m b e r s o f t h e f i r s t a n d s e c o n d k i n d .
O n e m a y d e r i v e t h e s e n e w I . F . b y t h e f o l l o w i n g
t e c h n i q u e . L e t a b e a z e r o o f m u l t i p l i c i t y m . L e t
T h e n h ( x ) h a s a s i m p l e z e r o a t a . L e t H b e t h e i n v e r s e t o h .
T h e n p r o c e e d i n g a s i n S e c t i o n 5 . 1 1 i t i s c l e a r t h a t
S i n c e t h e m u l t i p l i c i t y o f a z e r o i s o f t e n n o t k n o w n
h ( x ) = f^m(x) = z .
s - 1
j=0
7 . 3 - 2
i s o f o r d e r s f o r a l l m . I n p a r t i c u l a r .
. p l / r r i / \
c p 2 = H ( z ) - z H ' ( z ) = x - h ' ( x ] T = X " m u ( x ) >
( 7 - H )
cp 3 = H ( z ) - z H ' ( z ) + | z 2 H " ( z ) = x - |m(3-m)u(x) - m 2 A 2 ( x ) u 2 ( x ) .
c p 2 w a s k n o w n t o E . S c h r o d e r [ 7 - 3 - 1 ] ( 1 8 7 0 ) . S e e a l s o
B o d e w i g [ 7 - 3 - 2 ] a n d O s t r o w s k l ' [ 7 . 3 - 3 , C h a p . 8 ] .
R a t h e r t h a n u s i n g ( 7 - 1 0 ) t o g e n e r a t e h i g h e r o r d e r
I . F . o f t h i s t y p e , w e a t t a c k t h e p r o b l e m f r o m a n o t h e r p o i n t
o f v i e w .
T
4
7 * 3 - 3
7 . 3 2 T h e s t r u c t u r e o f g . I t i s a d v a n t a g e o u s t o
e x t e n d t h e n o t a t i o n f o r I . F . s o t h a t f a n d m a p p e a r e x p l i c i t l y
a s p a r a m e t e r s . T h u s w e r e p l a c e ( 5 - 1 7 ) b y
E a + 1 ( x , f , l ) = x - ^ Z j ( x , f , l ) .
L e t
F = f l / m . ( 7 - 1 2 )
O b s e r v e t h a t a z e r o o f m u l t i p l i c i t y m o f f i s a z e r o o f
m u l t i p l i c i t y 1 o f F . C l e a r l y , { E 1 ( x , P , l ) ) i s a n o p t i m a l
b a s i c s e q u e n c e f o r a l l m . ( I t i s c o n v e n i e n t t o u s e E g + 1
r a t h e r t h a n E t h r o u g h o u t t h i s s e c t i o n . ) L e t
s
Z j ( x , P , l ) = W j ( x , f , m ) .
T h e n ( 5 - 1 8 ) b e c o m e s
J W j ( x , f , m ) - ( j - l ) W J _ 1 ( x , f , m ) + m u ( x ) W j _ 1 ( x , f , m ) = 0 ,
( 7 - 1 3 )
W ^ ( x , f , m ) = rau(x),
w h i l e L e m m a 5 _ 3 b e c o m e s
7 . 3 - 4
L E M M A 7 - 1 .
e s + 1(x,f,m) = fia(x,f,m) - e^(x,f,m). s w s
We s e e k c o e f f i c i e n t s p . . ( m ) s u c h t h a t
e s + 1 ( x , f , m ) = x - £ p s ^ J ( m ) Z j ( x , f , l ) ;
P s ^ j ( m ) = 0 , s < j ,
( 7 - 1 4 )
a n d s u c h t h a t { £ s + 1 ( x , f , m ) } i s a n o p t i m a l b a s i c s e q u e n c e .
S u b s t i t u t i n g ( 7 - l 4 ) i n t o t h e f o r m u l a o f L e m m a 7 - 1
y i e l d s
x - ^ p s ^ ( r a ) Z J ( x , f , l )
J = l
s-1
3=1 - u ( x ) s v '
s-1 1 - y p B . l i J ( m ) Z j ( x , f , l )
We u s e ( 5 - 1 8 ) t o e l i m i n a t e u ( x ) z l ( x , f , l ) a n d f i n d
X Z j U ^ l H - s p S j . ( m ) + sps_1}i(m) + m J p B . l j J . 1 ( m ) - n J p 8 . l j J ( m ) ] = 0, j = l
w h e r e w e h a v e t a k e n p _ ( m ) , w h i c h m a y b e c o n s i d e r e d a s t h e s , o
c o e f f i c i e n t o f x i n ( 7 - l 4 ) , e q u a l t o u n i t y . T h e n ,
s p s ^ ( m ) + ( m j - s ) p s _ 1 ^ j ( m ) - m J p s - 1 ^ - : L ( m ) = 0 , ( 7 - 1 5 )
w i t h p (m) = 1 f o r s > 0 a n d p . ( m ) = 0 , f o r s < j a s s , o s , j
i n i t i a l c o n d i t i o n s . E q u a t i o n ( 7 - 1 5 ) p e r m i t s t h e r e c u r s i v e
c a l c u l a t i o n o f t h e p ^ ( m ) . E q u a t i o n ( 7 - 1 5 ) s h o w s t h a t S 9 J
p ^ ( m ) i s a p o l y n o m i a l i n m ; a n e x p l i c i t f o r m u l a i s d e r i v e d S 9 J
b e l o w .
D e f i n e t h e a s s o c i a t e d f u n c t i o n s a . . ( m ) b y ^9 J
W t ( x , f , m ) = ^ a t ^ j ( m ) Z J ( x , f , l ) , t < j . ( 7 - l 6 )
j = l
T h e ^ ( m ) w e r e i n t r o d u c e d b y Z a j t a [ 7 . 3 - 4 ] . S i n c e
s s t
e s + 1 ( x , f , m ) = x - ^ W t ( x , f , m ) = x - ^ ^ a ^ f m ) Z ^ ( x , f , l )
1=1 t = l j = l
s
j = l l=i
= X
7 . 3 - 6
r
w e h a v e
s
P s ^ ( m ) = £ a t , j ( m ) ' ( 7 " 1 7 )
1=3
T o f i n d a r e c u r s i o n f o r m u l a f o r t h e j ( m ) w e s u b s t i t u t e
( 7 - 1 6 ) i n t o ( 7 - 1 3 ) , a n d u s e ( 5 - 1 8 ) t o e l i m i n a t e u ( x ) Z ^ ( x , f , 1 ) .
T h e n
1-
^ Z j ( x , f , l ) [ t a ^ j ( m ) - U - l ^ ^ j C m ) - m « J 0 ^ _ i ^ j _ i ( m ) + m J a t - l , j ( m ^ = ° *
J = l
H e n c e
( t + l ) a t + 1 ^ ( m ) + ( m j - t ) a t ^ ( m ) - m j a ^ ^ C m ) = 0 ( 7 - 1 8 )
w i t h a (m) = 1 , a , (m) = 0 f o r t > 0 , a n d a Am) = 0 f o r
I < j , a s i n i t i a l c o n d i t i o n s . E q u a t i o n ( 7 - 1 8 ) s h o w s t h a t
a . . i s a p o l y n o m i a l i n m .
We d i g r e s s b r i e f l y t o l i s t s o m e d e f i n i t i o n s f r o m t h e
C a l c u l u s o f F i n i t e ' D i f f e r e n c e s . T h e r e a d e r i s r e f e r r e d t o
J o r d a n [ 7 - 3 - 5 ] o r R i o r d a n [ 7 * 3 - 6 ] f o r t h e s t a n d a r d t h e o r y .
O u r n o t a t i o n i s n o t q u i t e s t a n d a r d . We d e f i n e :
7.3-7
i-i
C _ [ x , t ] = Ax],/II " R i s i n g F a c t o r i a l C o e f f i c i e n t "
[ x ] ^ = ^ S ^ " S t i r l i n g N u m b e r s o f t h e F i r s t K i n d "
I J=0
xl " S t i r l i n g N u m b e r s o f t h e S e c o n d K i n d "
j=0
S t i r l i n g n u m b e r s o f t h e f i r s t a n d s e c o n d k i n d a r e o f t e n d e n o t e d
b y t w o d i f f e r e n t t y p e s o f s ; f o r e x a m p l e , b y s a n d S . W e
w i l l u s e S a n d T , T h e r e i s n o d a n g e r o f c o n f u s i n g t h e l a t t e r
w i t h t h e u s u a l s y m b o l f o r a C h e b y s h e y p o l y n o m i a l .
W e c o n t i n u e o u r s t u d y o f p . ( m ) a n d ap ^ ( m ) . A
g e n e r a t i n g f u n c t i o n f o r t h e j ( m ) ^ a y b e d e r i v e d b y d e f i n i n g
hj(x,m) = ^ ot^{m)xl. t = 0
[ x ] t = J[ ( x - i ) " P a l l i n g F a c t o r i a l "
1 = 0
l-l p [ x ] t - ( x + i ) " R i s i n g F a c t o r i a l "
1 = 0
C[x,l] = [x],/ll " F a l l i n g B i n o m i a l C o e f f i c i e n t "
7 . 3 - 8
I t f o l l o w s f r o m ( 7 - 1 8 ) t h a t h j ( x , m ) s a t i s f i e s
( l - x ) h j ( x , m ) + m j h j ( x , m ) - m j h J _ 1 ( x , m ) - 0 , ( 7 - 1 9 )
w h e r e m I s a p a r a m e t e r . A s o l u t i o n o f ( 7 - 1 9 ) I s
h j ( x , m ) = [ 1 - ( l - x ) m ] J .
I t i s n o t d i f f i c u l t t o v e r i f y t h a t t h e f u n c t i o n s k - . ( m ) w h i c h
^9 J s a t i s f y
00
[ 1 - ( l - x ) m ] J = £
1=0
a l s o s a t i s f y ( 7 ~ l 8 ) a n d I t s I n i t i a l c o n d i t i o n s , a n d h e n c e
00
[ 1 - ( l - x ) m ] J = £ a l f i ( m ) x l .
O b s e r v i n g t h a t
m
1 - ( l - x ) m = £ (-D^Cim.rW r = l
a n d a p p l y i n g t h e m u l t i n o m i a l t h e o r e m y i e l d s
a t j J ( m ) = J . ( - l ) < - + J I n • ( 7 - 2 0 )
r = l
7 . 3 - 9
w i t h t h e s u m t a k e n o v e r a l l n o n n e g a t i v e i n t e g e r s a r s u c h
t h a t
L rar - l> I a r = J '
r = l r = l
O n t h e o t h e r h a n d ,
[ 1 - ( l - x ) m ] J = £ C [ j , r ] ( - l ) r ( l - x ) r m
r=0
t = j r=0
T h u s
a t , j ( m ) = 1 ( - 1 ) r c [ J ^ H - l ) t C [ r m , t j . ( 7 - 2 1 )
r ^ O
S i n c e C [ r m , t ] i s a p o l y n o m i a l i n m o f d e g r e e I, ( 7 - 2 1 ) e x h i b i t s
e x p a n s i o n o f cj^ j ( m ) i n t h e p o l y n o m i a l s C [ r m , t ] . S i n c e a n
r=0 k-0
k=0 r=0
r
7 . 3 - 1 0
U s i n g
J
r = 0
y i e l d s
v
atJm) - ( - l ) t + J # I S t , k T k , J m k ' ^ - 2 2 )
E q u a t i o n ( 7 - 2 2 ) e x h i b i t s a . , ( m ) a s a p o l y n o m i a l i n m . I t
i s n o t d i f f i c u l t t o s h o w t h a t
C r [ m x , t ] - £ C r [ x , j ] ( - 1 ) ^ J | X \ , k T k , J
j = 0 k = j
k m
a n d h e n c e t h a t
C r [ m x , t ] = ^ a t ^ ( m ) C r [ x , j ]
J = 0
T h u s C [ m x , t ] i s a g e n e r a t i n g f u n c t i o n f o r t h e a . . ( m ) r e l a -
t i v e t o t h e b a s e f u n c t i o n s C [ x , j ] . S i n c e C [xsl] = C[x+l-l9l]
w e h a v e e q u i v a l e n t l y t h a t
V
C [ m x + t - l , t ] = ^ a t ^ ( m ) C [ x + J - l , j ] .
j = 0
7 . 3 - 1 1
1=3
a n d u s i n g ( 7 - 2 2 ) y i e l d s
s
S i n c e
k = j t = k
t ! " * , , k ~ s ! S s + l , k + l '
t = k
k
O u r r e s u l t s a r e s u m m a r i z e d i n
W e n o w d e r i v e a n e x p l i c i t f o r m u l a f o r t h e p . ( m ) . R e c a l l i n g
t h a t
7 . 3 - 1 2
T H E O R E M 7 - 2 . D e f i n e p 0 , ( m ) b y
fis+1(x,f,m) = x - ^ p s ^ ( m ) Z J ( x , f , l ) ; P s ^ ( m ) = ° ; f o r s < J«
D e f i n e a . ( m ) b y
^ 9 J
W t ( x , f , m ) = £ a t ^ ( m ) Z j ( x , f , 1 ) ; c r ^ j ( m ) = 0 . f o r l< j .
J - l
T h e n
00
[ 1 - ( l - x ) m ] J = £ CTt,j(m)x^ ( 7 - 2 4 )
t = 0
j
= ]T ( - l ) r C [ j , r ] ( - l ) t C [ r m , t ] , ( 7 - 2 5 ) r = 0
,j(m) = ( - D t + J f f £ S ^ k T k ^ m k , ( 7 - 2 6 )
C r [ m x , t ] = Y at ^ ( m ) C r [ x , j ] , ( 7 - 2 7 )
j = 0
P 8 j J ( » ) - ( - D J + S f t I
S s + l , k + l T
k , J m k ' ^ '
7 . 3 - 1 3
T h e f o l l o w i n g c o r o l l a r i e s f o l l o w f r o m t h e v a r l o u
p a r t s o f T h e o r e m 7 - 2 . T h i s I s n o t a c o m p l e t e l i s t o f t h e
p r o p e r t i e s o f j ( m ) a n d j ( m ) *
C O R O L L A R Y a . j ( m ) I s a p o l y n o m i a l i n m o f
d e g r e e I.
P R O O F . T h i s f o l l o w s f r o m ( 7 - 2 6 ) .
C O R O L L A R Y b . ^(m) = ml.
P R O O F . T h i s f o l l o w s f r o m ( 7 - 2 6 ) , s i n c e
T — q _ -1
C O R O L L A R Y c . 1 ( m ) = ( - l ) l + 1 C [ m , I ]
P R O O F . T h i s f o l l o w s f r o m ( 7 - 2 5 ) .
C O R O L L A R Y d . j ( m ) = 0 , f o r l < j a n d I > m j .
P R O O F . F o r I < j , ^ ( m ) w a s d e f i n e d a s z e r o .
F o r I > m j , t h i s f o l l o w s f r o m ( 7 - 2 4 ) .
7 . 3 - 1 4
r
C O R O L L A R Y e .
00
t = j t = o
P R O O F . S e t x = 1 I n ( 7 - 2 4 ) a n d u s e C o r o l l a r y d .
C O R O L L A R Y f . j ( l ) = & t j ( K r o n e c k e r d e l t a ) .
P R O O F . S e t m = 1 I n ( 7 - 2 4 ) .
C O R O L L A R Y g .
«t 00
^ a ^ j ( m ) = X a £ , j ( m ) = C t m + t " 1 ' ^ *
j = o ' J - 0
P R O O F . S e t x = 1 I n ( 7 - 2 7 ) .
C O R O L L A R Y h . T h e l e a d i n g c o e f f i c i e n t o f a . , ( m ) i s
( - l ^ C i y i ] ! 1 .
1 = 0
P R O O F . T h i s f o l l o w s f r o m ( 7 - 2 6 ) a n d n o t i n g t h a t
i = 0
r
7 . 3 - 1 5
C O R O L L A R Y J . p ( m ) = m S .
P R O O F . T h i s f o l l o w s f r o m ( 7 - 2 8 ) .
C O R O L L A R Y k . p o , ( m ) = 1 + ( - l ) s + 1 C [ ' m - l , s ] .
P R O O F . T h i s f o l l o w s f r o m ( 7 - 2 3 ) a n d C o r o l l a r y c
C O R O L L A R Y £ . T h e l e a d i n g c o e f f i c i e n t o f p , ( m ) i s
g i v e n b y
.8 ^
4 ^ £ ( - l ^ C t j , ! ] ! 3 .
1=0
P R O O F . T h i s f o l l o w s f r o m ( 7 - 2 8 ) a n d
T s , J = ^ i V - I ( - D ^ t j , ! ] ! 3 . 1=0
C O R O L L A R Y 1 . p 0 , ( m ) i s a p o l y n o m i a l i n m o f s , j
d e g r e e s .
P R O O F . T h i s f o l l o w s f r o m ( 7 - 2 8 ) .
7 . 3 - 1 6
r
C O R O L L A R Y ra. p n . ( m ) = 1 f o r e v e r y m s u c h t h a t s ^ m j s , j
P R O O F . T h i s f o l l o w s f r o m ( 7 - 2 3 ) a n d C o r o l l a r i e s d
a n d e .
C O R O L L A R Y n . p , ( l ) = 1 . s > J
P R O O F . T h i s f o l l o w s f ^ o m C o r o l l a r y ( ( 7 - 2 8 ) .
C O R O L L A R Y o .
s
p s ^ ( m ) ( - l ) J + 1 C [ l / m , j ] = 1 , s = 1 , 2 , 3 , . . . .
J = l
P R O O F .
s
J = l
r
7 . 3 - 1 7
i s t o h o l d f o r ^ a r b i t r a r y f . T a k e f ( x ) = x m . T h e n
P ( x ) = f 1 / i n ( x ) = x a n d h e n c e E s + 1 ( x , x , l ) = g s + 1 ( x , x m , m ) = 0 ,
f o r s = 1 , 2 , 3 , . . . . I t i s n o t d i f f i c u l t t o s h o w t h a t
Z j [ x , x m , l ] - ( - l ) J + 1 C [ l / m , j ] x . T h u s
s
0 = x
J - l
- £ p s ^ ( m ) i ( - l ) J + 1 C [ l / m , j ] x ,
a n d t h e r e s u l t f o l l o w s .
C o r o l l a r y n s h o w s t h a t f o r m = 1 ,
s
e s + 1 ( x , f , m ) . - x - £ p s ^ ( m ) Z J ( x , f , l )
r e d u c e s a s e x p e c t e d t o
E g + 1 ( x ) = x - ^ Z j ( x , f , l ) .
J - l
T h u s e g + 1 ( x , f , l ) - E g + 1 ( x ) .
C o r o l l a r y e l e a d s t o a n i n t e r e s t i n g r e s u l t . F r o m
( 7 - 1 4 ) a n d ( 7 - 2 3 ) ,
s s
e s + 1 ( x , f , m ) = x - ^ ^ a ^ j ( m ) Z j ( x , f , l ) .
j = l t = j
7 . 3 - 1 8
T h e n
00 00
l i m S s + 1 ( x , f , m ) = x - V V a t j j ( m ) Z J ( x J f , 1 )
> —+ 00 t—i, . _ .
00
= X - £ Z j ( x , f , l ) = l i m E s + 1 ( x , f , l )
j = l 3 0 0
7 . 3 - 1 9
7 . 3 3 F o r m u l a s f o r e . T a b l e 7 - 1 l i s t s s o m e o f t h e s
a „ . ( m ) e x p r e s s e d i n t e r m s o f C [ r m , t ] , w h i l e T a b l e 7 - 2 l i s t s
^9 J s o m e o f t h e a , . ( m ) e x p r e s s e d i n p o w e r s o f m . T a b l e 7 - 3 l i s t s
^90
s o m e o f t h e p o ^ ( m ) . U s i n g T a b l e 7~3j r e c a l l i n g t h a t
S 9 J Z j ( x ) = Y j ( x ) u ^ ( x ) , a n d u s i n g t h e e x p r e s s i o n s f o r Y j ( x ) g i v e n
i n T a b l e 5 ~ 1 * e n a b l e s u s t o c a l c u l a t e a n u m b e r o f t h e
e s + 1 ( x , f , m ) ;
S2 = x - mu(x),
£Q = x - mu(x) | (3-m) + mA2(x)u(x)
Pi P r
= x - mu(xWg (m -6m+ll) + m(2-m)A2(x)u(x) + m£ 2A|(x) - A3(x) u2(x)k
S5 = x - mu(x)|- (m3-10m2+35m-50) + ~ m(7m2-30m+35)A2(x)u(x)
12 m"(5-3m) 2Ag(x) - A3(x) u2(x)
+ nr 5A (x) - 5A2(x)A3(x) + A4(x) u3(x)j
J
T
J
o
+
M i—i O
O
7 . 3 - 2 0
H I
o
oo
o
+
O
l
oo
s OJ o oo
o oo
H I
OO
o oo
-3-
o oo
CVI
OJ
n toj
o
+
oo
OJ
o
+
OJ
o OJ
OO
o OJ
-3-
o
OJ
H OJ oo -3-
O o o o
OJ oo
7 . 3 - 2 1
CO
OJ I
9
CO
OJ OJ
H
OJ
oo
OJ
OJ
H I
oo
OJ
OO
H l
0
3 OJ
oo I 0
OJ I
3
I
-3-OJ
OJ OO
T A B L E 7 - 3 . p . ( m ) = ( - l ) S + J \ S , . . , . T . . m k . J * s , j v ' K ' s i s + l , k + l k , j
? i * 2 3 k
i 3
1 m
2 - ( m / 2 ) ( m - 3 ) 2
m
3 ( m / 6 ) ( m 2 - 6 m + l l ) - m 2 ( m - 2 ) m3
4 - ( m / 2 4 ) ( m 3 - 1 0 m 2 + 3 5 m - 5 0 ) ( m 2 / l 2 ) ( 7 m 2 - 3 0 m + 3 5 ) I - ( m 3 / 2 ) ( 3 m - 5 ) k
m
F r o m C o r o l l a r y " k , t h e e x p r e s s i o n s i n t h i s c o l u m n c a n b e r e p l a c e d b y
p s , i ( m ) - 1 + n 1 = 1
7 . 4 - 1
r
7 o 4 T h e C o e f f i c i e n t s o f t h e E r r o r S e r i e s o f £ g
T h e e r r o r s e r i e s f o r E ( x , f , l ) w a s s t u d i e d i n s
S e c t i o n 5 . 5 . We n o w d e r i v e a n a l g o r i t h m f o r c a l c u l a t i n g t h e
c o e f f i c i e n t s o f t h e e r r o r s e r i e s o f e o ( x , f , m ) . s
R e c a l l t h e d e f i n i t i o n s o f t h e f o l l o w i n g s y m b o l s
w h i c h w i l l b e u s e d f r e q u e n t l y :
* M - m > v - ' - ^ - » , w ^
a . . - ( x )
O b s e r v e t h a t B . n ( x ) = A . ( x ) .
L e t a b e a z e r o o f m u l t i p l i c i t y m . T h e e x p a n s i o n
o f u ( x ) i n t o a p o w e r s e r i e s i n e , w h i c h w i l l b e n e e d e d b e l o w ,
i s d e r i v e d n o w . T h r o u g h o u t t h i s s e c t i o n , a l l f u n c t i o n s a r e
e v a l u a t e d a t a u n l e s s o t h e r w i s e i n d i c a t e d . D e f i n e a ) ^ ( m ) b y
( x ) = £ o ) t ( m ) e 4 . ( 7 - 2 9 )
1=1
S i n c e
00 oo
f ( x ) = ] [ arer, f ' ( x ) = £
r = m r = m
r a p e r - 1
3
7 . 4 - 2
r
a n d m u ( x ) f , ( x ) = m f ( x ) , w e o b t a i n
l+l-m
Y C D q ( m ) ( ^ + 1 " ( l ) a t + l - q , 1 = m ' r a + 1 >
q = l
o r
l-l
q = l
( 7 - 3 0 )
S i n c e ^ = A ^ , w e o b s e r v e t h a t ( 7 ~ 3 0 ) r e d u c e s t o ( 5 - 3 4 ) w h e n
m = 1 , a n d h e n c e t h a t ^ ( l ) = a s e x p e c t e d .
I t i s n o t d i f f i c u l t t o p r o v e t h a t a n e x p l i c i t
f o r m u l a f o r o o ^ ( m ) i s g i v e n b y
r ^ 1 J [ ( m + i ) B , _ , , j " 1
» t c - ) - - B t > B + - 1 B t . J > m ^ ( - D r
r ! n — 3 7 1 ^ -j = i i = i
( 7 - 3 1 )
w h e r e r = > a n d w h e r e t h e i n n e r s u m i s t a k e n o v e r a l l
A n o n n e g a t i v e i n t e g e r s a ± s u c h t h a t ) i a ± = j . O b s e r v e t h a t
1 = 1
o ) p ( m ) i s t h e s a m e f u n c t i o n o f t h e B . a s v 0 i s o f t h e A . t l f > m t i
e x c e p t f o r c o e f f i c i e n t s w h i c h d e p e n d o n m o n l y . P r o m e i t h e r
( 7 - 3 0 ) o r ( 7 - 3 1 ) * t h e f i r s t f e w u ^ ( m ) n i a y b e c a l c u l a t e d a s
7 . 4 - 3
c u ^ m ) = 1 ,
= _ B 2 , m '
c 3 ( m ) = ( m + l ) B ^ m - 2 B ^ m ,
c o 4 ( m ) = - ( m + l ) 2 B ^ m + ( S r n ^ B ^ ^ - 3 B ^ m .
We t u r n t o t h e p r o b l e m o f f i n d i n g t h e c o e f f i c i e n t s
o f t h e e r r o r s e r i e s . D e f i n e 7\P ( m ) b y
Ks $ S
oo
e s ( x , f , m ) = ^ ^ t , s ( m ) e t - ( 7 - 3 2 )
1=0
S i n c e e ( x , f , m ) e I . w e e x p e c t t » „ = 0 f o r 0 < I < s , a n d
o S C , S
t = ex. T h i s m a y b e p r o v e n d i r e c t l y b y i n d u c t i o n o n s .
L e t s = 1 . T h e n
e i ( x , f , m ) = x = a + ( x - a ) = a .+ e .
N o w a s s u m e A ( m ) = a a n d A , ( m ) = 0 , f o r 0 < I < s e
U , o v • S S u b s t i t u t e ( 7 - 3 2 ) i n t o t h e f o r m u l a o f L e m m a 7 - 1 ,
S B + 1 ( x , f , m ) = C ( x , f , m ) - c ' ( x , f , m ) , ( 7 - 3 3 )
t o f i n d
00 00
t=0 t = s t = S
w h i c h c o m p l e t e s t h e i n d u c t i o n .
S u b s t i t u t i n g ( 7 - 3 2 ) i n t o ( 7 - 3 3 ) , u s i n g
oo
( x ) = Y <»t(m)el, m u %
1=1
i a n d e q u a t i n g t o z e r o t h e c o e f f i c i e n t o f e , w e a r r i v e a t
T H E O R E M 7 - 3 . L e t t h e m u l t i p l i c i t y o f a b e m . L e t
00 00
e a ( x , f , m ) = ^ -Kl)S(m)el, m u ( x ) = ^ ' ast(m)el. 1=0 ' 1=1
T h e n
l-l
s \ , s + l ( m ) + C ^ s J ^ s ^ ) + ^ r a 3 t + l - r ( m ^ r , s = ° '
r = l
( 7 - 3 4 )
w i t h ^ 0 j S ( m ) = a , ^ ^ ( m ) = 1, A ^ - ^ m ) = 0 f o r I > l , a n d
A t ^ g ( m ) = 0 f o r 0 < I < s a n d s > 1.
7.4-5
1
PQ
+
B <N OO
Q oo cvi pq^ pq g OO OJ
+ +
oo
-=J-pq
s « °° OO <N
pq oo oj B pq
I ^ oo pq
q a + a OJ OJ OJ CvJ
pq ^ pq 00 H oo
1 i i «—riOJ r-rjCM OO
CvJ
B
pq oo +
B B OO oo
+ OJ B ^ PQ
OJ *» -3-pq OJ oj +
OO OJ pq
OJ
r—1 Q H
^ O H CvJ OO ^ j -
7.4-6
S i n c e t h e o ) ^ ( m ) m a y b e c o n s i d e r e d a s k n o w n , (7"3^)
c a n b e u s e d t o d e t e r m i n e t h e 7v„ ( m ) . N o t e t h a t A - ( m ) i s t h e
v yS S
s a m e f u n c t i o n o f a ) , , ( m ) a s t « _ i s o f v . . S o m e o f t h e A - ( m ) v V / S C v ^ S
a r e l i s t e d i n T a b l e 7-4. U s i n g t h i s t a b l e w e f i n d
e 2 ( . x , f , m ) - a = B g e + ( m + l ) B + 2 B -2 ^ m
( m + l ) 2 B ^ - ( 3 m + 4 ) B p B - + 3 B , ^
| ( ^ 3 ) B ^ m - B
| ( m + l ) ( 2 m + 7 ) B 3 + 3 ( m + 3 ) B B _ - 3 B ,
e ^ ( x , f , m ) - a = 2 o
• K m + 6 m + 8 ) B ^ - ( m + 4 ) B Q B _ + B, ,
7 . 5 - 1
7 . 5 I t e r a t i o n F u n c t i o n s G e n e r a t e d B y D i r e c t I n t e r p o l a t i o n
I n t h i s s e c t i o n w e g e n e r a l i z e T h e o r e m 4-3 t o t h e c a s e
o f m u l t i p l e r o o t s . We h a v e t o a s s u m e s t r o n g e r c o n d i t i o n s t h a n
i n t h e c a s e o f s i m p l e r o o t s i n o r d e r t o c a r r y t h r o u g h o u r
a n a l y s i s .
7 . 5 - 2
7 . 5 1 T h e e r r o r e q u a t i o n * L e t x J ^ X ^ _ I J • • • ' x i - n b e
n + 1 9 . p p r o i l i m ^ i i t B t o a z e r o a o f m u l t i p l i c i t y m . L e t
o b e t h e p o l y n o m i a l w h o s e f i r s t s - 1 d e r i v a t i v e s a r e n j s
e q u a l t o t h e f i r s t s - 1 d e r i v a t i v e s o f f a t x j ^ x i _ i * • • • ' x i - n *
D e f i n e a n e w a p p r o j t i m a h t t o a b y
P n , s ( x i + l ) - ° - ( 7 - 3 5 )
T h e n r e p e a t t h i s p r o c e d u r e u s i n g t h e p o i n t s x ^ + ^ , x i ^ * * • ' x i - n + l
W e a s s u m e t h a t w e c a n f i n d a r e a l r o o t o f P w h i c h s a t i s f i e s n , s
( 7 - 3 5 ) . I f P „ 0 h a s a n u m b e r o f r e a l r o o t s 9 o n e o f t h e m i s n 9 s
c h o s e n a s b y s o m e c r i t e r i a .
W e h a v e
^ ) - + n 1 n ( t - x ^ ) 3 ,
w h e r e ^ ( t ) l i e s i n t h e i n t e r v a l d e t e r m i n e d b y
+•. = v
i + 1 x i ' x l - l ' • ' x i - n ' t ' a n d w h e r e r = s ( n + l ) . S e t t = x
T h e n
f^'te ] n
f( xi +i) = — n (-i+i-i-j)3* J = o
7-5-3
w h e r e = £±(x±+i) • S i n c e t h e m u l t i p l i c i t y o f a i s m ,
f ( x i + l ) = m l < x i + l - a ) '
w h e r e T | ^ + 1 l i e s i n t h e i n t e r v a l d e t e r m i n e d b y x ^ + ^ a n d a .
L e t t i n g e j _ _ j = x i - j " a * w e a r r i v e a t
m / n \ r m l v s i + l y rr / x s t „
e i + l = ^ r T ~JjnT7Z r 11 ^ e i - j - e i + l ) • (7-36) 1 U l i + 1 ; j = 0
E q u a t i o n (7-36) i s t h e e r r o r e q u a t i o n f o r t h e c a s e o f m u l t i p l e
r o o t s . W e a s s u m e t h a t
m < r = s ( n + l ) . (7-37)
B e f o r e a n a l y z i n g t h i s e r r o r e q u a t i o n w e i n v e s t i g a t e t h e r o o t s
o f a n i n d i c i a l e q u a t i o n .
7.5-4
7.52 O n t h e r o o t s o f a n I n d i c i a l e q u a t i o n . I n
S e c t i o n 3.3 w e i n v e s t i g a t e d t h e p r o p e r t i e s o f t h e r o o t s o f
t h e p o l y n o m i a l e q u a t i o n
k-1 g k > a ( t ) = t k - a £ t J = 0 , (7-38)
u n d e r t h e a s s u m p t i o n t h a t f o r k > 1,
k a > 1. (7-39)
T h e o r d e r o f t h e I . F . w h i c h a r e b e i n g s t u d i e d i n S e c t i o n 7.5
i s d e t e r m i n e d b y t h e r o o t s o f t h e e q u a t i o n
n
m t n + 1 - s Y t j = 0 , (7 - 4 0 )
w h i c h i s o f t h e f o r m (7-38) w i t h
k = n + 1, a = (7_4i)
S i n c e w e d e m a n d t h a t m < r = s ( n + l ) , (7-39) i s s a t i s f i e d a n d
T h e o r e m 3-2 b e c o m e s
7 . 5 - 5
T H E O R E M 7 - 4 . L e t
s n + l , s / m ( t ) = t
n - f l s_
m
n
i t J = 0
I f n = 0 , t h i s e q u a t i o n h a s t h e r e a l r o o t ^ = s / m .
A s s u m e n J> 1 a n d m < r = s ( n + l ) . T h e n t h e e q u a t i o n h a s o n e
r e a l p o s i t i v e s i m p l e r o o t P n + 1 S / / m a n d
m a x 1 ^ ± s m ^ ^ n + l , s / m ^ m
< - + 1 ,
F u r t h e r m o r e , j
* + 1 " m
e s / m . o • s , -i _
v n + l ^ p n + l , s / m ^ m
m
m + 1
m + 1
n + 1
w h e r e e d e n o t e s t h e b a s e o f n a t u r a l l o g a r i t h m s . H e n c e
^ P n + l , s / m = m = ^ + 1 .
n — • o o
A l l o t h e r r o o t s a r e a l s o s i m p l e a n d h a v e m o d u l i l e s s t h a n o n e .
T a b l e s 7 - 5 , 7 - 6 , 7 - 7 , a n d 7 - 8 g i v e v a l u e s o f p .
a = s / m , f o r m = 1 , 2 , 3 , a n d 4 r e s p e c t i v e l y . T a b l e 7 - 5 i s
i d e n t i c a l w i t h T a b l e 3 - 1 a n d i s r e p e a t e d h e r e t o f a c i l i t a t e
c o m p a r i s o n w i t h t h e o t h e r t h r e e t a b l e s . T h e c a s e s i n w h i c h
t h e c o n d i t i o n k a > 1 i s v i o l a t e d a r e l e f t b l a n k .
r
7 . 5 - 6
T A B L E 7 - 5 . V A L U E S O F P k
"a 1 / 1 2 / 1 3 / 1 V i
1 2 . 0 0 0 3 . 0 0 0 4 . 0 0 0
2 1 . 6 1 8 2 . 7 3 2 3 . 7 9 1 4 . 8 2 8
3 1 . 8 3 9 2 . 9 2 0 3 . 9 5 1 4 . 9 6 7
4 1 . 9 2 8 2 . 9 7 4 3 . 9 8 8 4 . 9 9 4
5 1 . 9 6 6 2 . 9 9 2 3 . 9 9 7 4 . 9 9 9
6 1 . 9 8 4 2 . 9 9 7 3 . 9 9 9 5 . 0 0 0
7 1 . 9 9 2 2 . 9 9 9 4 . 0 0 0 5 . 0 0 0
7 . 5 - 7
T A B L E 7 - 6 . V A L U E S O P 0 . K , d
"a 1 / 2 2 / 2 3 / 2 4 / 2
i k
1 1 . 5 0 0 2 . 0 0 0
2 1 . 6 1 8 2 . 1 8 6 2 . 7 3 2
3 1 . 2 3 4 1 . 8 3 9 2 . 3 9 0 2 . 9 2 0
4 1 . 3 4 9 1 . 9 2 8 2 . 4 5 9 2 . 9 7 4
5 1.410 1 . 9 6 6 2.484 2 . 9 9 2
6 1 . 4 4 5 1 . 9 8 4 2 . 4 9 4 2 . 9 9 7
7 1.466 1 . 9 9 2 2 . 4 9 8 2 . 9 9 9
7 . 5 - 8
T A B L E 7 - 7 . V A L U E S O P p.
"a 1 / 3 2/3 3 / 3 4 / 3
i k 1 1 . 3 3 3
2 1 . 2 1 5 1 . 6 1 8 2 . 0 0 0
3 1.446 1 . 8 3 9 2 . 2 1 0
4 1 . 1 2 6 1 . 5 5 2 1 . 9 2 8 2 . 2 8 4
5 1 . 1 9 9 1 . 6 0 4 1 . 9 6 6 2 . 3 1 3
6 1 . 2 4 3 1 . 6 3 1 1 . 9 8 4 2 . 3 2 5
7 1 . 2 7 1 1.646 1 . 9 9 2 2 . 3 3 0
I
7 . 5 - 9
T A B L E 7 - 8 . V A L U E S O F P k &
a* 1 / 4 2/4 3 / 4 4 / 4
\ k
1
2 1 . 3 1 9 I . 6 1 8
3 1 . 2 3 4 1.548 1 . 8 3 9
4 1 . 3 4 9 1.648 1 . 9 2 8
5 1 . 0 7 9 1.410 1 . 6 9 7 1 . 9 6 6
6 1 . 1 3 0 1 . 4 4 5 1 . 7 2 1 1 . 9 8 4
7 1 . 1 6 3 1.466 1 . 7 3 4 1 . 9 9 2
1
7 . 5 - 1 0
7 . 5 3 T h e o r d e r . We r e t u r n t o t h e a n a l y s i s o f t h e
e r r o r e q u a t i o n ,
n
e T + i = M± n ( e i . j - e i + i ) s >
^ f ( r ) U )
W e s h a l l a s s u m e t h a t
H e n c e e ^ 0 .
We m a y r e w r i t e ( 7 - 4 2 ) a s
n
' i + 1
J = 0
w h e r e
( 7 - 4 2 )
e i + l
- T ^ - 0 - ( 7 - ^ 3 )
( 7 - 4 5 )
P r o m ( 7 - 4 3 ) , w e c o n c l u d e t h a t A ± 1 . A s s u m e t h a t e± d o e s n o t
v a n i s h f o r a n y f i n i t e i . L e t
a i = l n 5 i = l n | e ± | , T ± = l n | T 1 | . (7-46)
r
P r o m ( 7 - 4 4 ) ,
7 . 5 - 1 1
n
m o i + 1
T± + s ' 1 - J '
j = 0
o r
' i+1
T
^ + ^ m m
n
( 7 - 4 7 )
N o w , ( 7 - 4 7 ) i s i d e n t i c a l w i t h ( 3 - 3 4 ) , e x c e p t t h a t T ± / m r e p l a c e s
J± a n d s / m r e p l a c e s s . O b s e r v e t h a t
T ± I n m i f ( r ) ( a )
f W ( a )
w h e r e a s
J ± -•> I n K .
F u r t h e r m o r e , ( 3 - 4 3 ) i s r e p l a c e d b y
n
I
J = 0
m x
p - i
B y m e t h o d s a n a l o g o u s t o t h o s e u s e d i n t h e p r o o f o f T h e o r e m 3 - 3
o n e m a y s h o w t h a t
' i + 1 m i f ^ ( c t )
( p - l ) / ( r - m )
W e s u m m a r i z e o u r r e s u l t s i n
7.5-12
T H E O R E M 7-5. L e t
J = | x | x - a | ^ rj*
w h e r e a i s a z e r o o f m u l t i p l i c i t y m . L e t r = s ( n + l ) a n d
a s s u m e t h a t m < r . L e t f ( r ) b e c o n t i n u o u s a n d l e t f ( m ) f ( r )
b e n o n z e r o o n J . L e t x - x . . , . . . e J a n d d e f i n e a
o 1* n
s e q u e n c e { x ^ a s f o l l o w s : L e t P N g b e a n i n t e r p o l a t o r y
p o l y n o m i a l f o r f s u c h t h a t t h e f i r s t s - 1 d e r i v a t i v e s o f
P a r e e q u a l t o t h e f i r s t s - 1 d e r i v a t i v e s o f f a t t h e n , s
p o i n t s x i ^ x i » i > • • • > x i - n ' A s s u m e t h a t t h e r e e x i s t s a r e a l
n u m b e r , xJL+1 £ J # s u c h t h a t P N s ( x i + i ) = ° * D e f i n e <& n g b y
x l + l " ° n , s ( x i 5 x i - l ' # 0 - ' x i - n ) #
L e t e ± _ j = x ± _ j - a a n d a s s u m e t h a t e ± d o e s n o t v a n i s h f o r
a n y f i n i t e i b u t t h a t ei + 1 / e j _ ~ * ° «
T h e n
;i+ll l ei
P m l f ( r ) ( a )
r I f < m > ( a )
( p - l ) / ( r - m )
(7-48)
w h e r e p i s t h e u n i q u e r e a l p o s i t i v e r o o t o f
n
n+1 - 2- Y t J = 0 . m /_j
j = 0
7 . 5 - 1 3
N o t e t h a t (7-48) r e d u c e s t o ( 4 - 3 0 ) i f m = 1 . F o r
t h e c a s e m = 1, w e s h o w e d t h a t e ^ 0 p r o v i d e d t h e i n i t i a l
e r r o r s w e r e s u f f i c i e n t l y s m a l l . F o r t h e c a s e m > 1, w e
a s s u m e e u i / e , 0 , w h i c h i m p l i e s e . - * 0 „
7 . 5 4 D i s c u s s i o n a n d e x a m p l e s . T h e r e s u l t s o f
S e c t i o n 7 * 5 3 a r e v e r y s a t i s f y i n g ; t h e o n l y p a r a m e t e r s t h a t
a p p e a r i n t h e c o n c l u s i o n o f T h e o r e m 7~5 a r e t h e o r d e r , t h e
p r o d u c t o f t h e n u m b e r o f n e w p i e c e s o f i n f o r m a t i o n w i t h t h e
n u m b e r o f p o i n t s a t w h i c h i n f o r m a t i o n i s u s e d , a n d t h e m u l t i
p l i c i t y o f t h e z e r o . T h e e f f e c t o f t h e m u l t i p l i c i t y i s t o
r e d u c e t h e f a c t o r s , w h i c h a p p e a r s l i n e a r l y i n t h e e q u a t i o n
w h i c h d e t e r m i n e s t h e o r d e r , t o s / m . F o r n f i x e d , t h e o r d e r
d e p e n d s o n l y o n t h e r a t i o o f s t o m . T h u s t h e o r d e r i s t h e
s a m e f o r n = l , s = 1 , m = 1 ( s e c a n t I . F . ) , a s f o r n = 1 ,
s ~ 2 , m = 2 . F u r t h e r m o r e , t h e l i m i t i n g v a l u e o f t h e o r d e r
a s n - * 00 i s s i m p l y 1 + s / m .
E X A M P L E 7 - 2 . s = 3 , n = 0 . T h i s I . F . w a s d i s c u s s e d
i n S e c t i o n 5 . 3 2 f o r t h e c a s e o f s i m p l e r o o t s . I f m = 1 ,
e i
I f m = 2 ,
r
7 . 5 - 1 5
' i + 1
1 * 1
| A . ( a ) p ( p _ l ) , p ^ 1.84,
I f m = 2,
' i + 1 1 1 f'"(a 3 f"(a
p - 1
p <v 1 . 2 3 .
E X A M P L E 7 - 3 . s - 1 , n - 2 . T h i s I . F . i s d i s c u s s e d
i n S e c t i o n 1 0 . 2 1 f o r t h e c a s e o f s i m p l e r o o t s . I f m = 1 ,
T
7 - 6 - 1
7 . 6 O n e - P o i n t I . F . W i t h M e m o r y
A n u m b e r o f t e c h n i q u e s f o r c o n s t r u c t i n g o n e - p o i n t
I . F . w i t h m e m o r y f o r t h e c a s e o f m u l t i p l e z e r o s a r e g i v e n i n
t h i s s e c t i o n . ^ S i n c e t h e s e t e c h n i q u e s a r e v a r i a t i o n s o n
e a r l i e r t h e m e s w e s h a l l p a s s o v e r t h e m l i g h t l y .
T h e I . F . s t u d i e d i n S e c t i o n 7.5 a r e o n e - p o i n t w i t h
m e m o r y i f n > 0. We s h o w e d t h a t i f m < r = s ( n + l ) , t h e n
b o u n d s o n t h e o r d e r p a r e g i v e n b y
m a x 1 , —
L M
S i n c e u = f / f ' h a s o n l y s i m p l e z e r o s , w e c a n a p p l y t h e
t h e o r y w h i c h p e r t a i n s t o s i m p l e z e r o s b y r e p l a c i n g f a n d i t s
d e r i v a t i v e s b y u a n d i t s d e r i v a t i v e s . A s a n e x a m p l e , w e
c o n s i d e r d i r e c t i n t e r p o l a t i o n s t u d i e d i n S e c t i o n 4 . 2 3 . T h e
c o n c l u s i o n o f T h e o r e m 4 - 3 s t a t e s t h a t
. ( r ) ( p - l ) / ( r - l )
L e t <& Q ( u ) b e t h e I . F . g e n e r a t e d f r o m 0 b y r e p l a c i n g f n , s n , s
b y u . T h e n
x - a | p
u < r \ a
r l u ' ( a
( p - l ) / ( r - l )
I
7 - 6 - 2
I n S e c t i o n 7 . 4 , c o , ( m ) w a s d e f i n e d b y
m u
00
(x) = Y utWel> e = x - a .
1=1
H e n c e
\x-a\y
I n p a r t i c u l a r ,
l , l ( u ) - X i " U i
x i " x i - l
u i " u i - l
( 7 - 5 0 )
i s o f o r d e r § ( 1 + ^ 5 ) ^ 1 . 6 2 f o r a l l m . I t i s n o t d i f f i c u l t
t o s h o w t h a t t h e f i r s t t w o t e r m s o f t h e e r r o r s e r i e s a r e
g i v e n b y
* L , l ( u ) " a ~ - B 2 , m ( a ) e i e i - l e . e * B . - J t e i ,
i j . m ma 9
9 m
B = a J + " i - l a = f ( j )
j , m ma m 9 a j JT
( 7 - 5 1 )
7.6-3
T
^ h a s a n i n f o r m a t i o n a l u s a g e o f t w o a n d a n
i n f o r m a t i o n a l e f f i c i e n c y o f . 8 1 f o r a l l m . * ] _ ^ s u f f e r s f r o m
t h e d r a w b a c k t h a t a s t h e a p p r o x i m a n t s c o n v e r g e t o a , u
a p p r p a c h e s a o / o f o r m w h i c h m a y n e c e s s i t a t e m u l t i p l e p r e c i s i o n
a r i t h m e t i c .
I f m i s k n o w n , a n u m b e r o f o t h e r t e c h n i q u e s a r e
a v a i l a b l e . S i n c e f ^ 1 7 1 " 1 ) h a s o n l y s i m p l e z e r o s , t h e t h e o r y
w h i c h p e r t a i n s t o s i m p l e z e r o s i s a p p l i c a b l e t o t h e I . P .
g e n e r a t e d b y r e p l a c i n g f b y f ( m - 1 ^ # S u c h I . F . h a v e t h e a d v a n
t a g e t h a t n o o / o f o r m s o c c u r ; t h e y h a v e t h e d i s a d v a n t a g e t h a t
r a t h e r h i g h d e r i v a t i v e s o f f m u s t b e ' u s e d .
We c a n a l s o r e p l a c e f b y F = f 1 ^ 1 1 1 . W e c a n p e r f o r m
d e r i v a t i v e e s t i m a t i o n o n t h e n e w I . F . F o r e x a m p l e , d e f i n e
* F n a n a l o g o u s l y t o * f n [ . ( S e e S e c t i o n 6 . 2 2 . ) I t i s c l e a r
t h a t t h e I . F .
* E n 1 ( F ) = x " ( 7 " 5 2 ) *n
h a s o r d e r s r a n g i n g f r o m ^ ( 1 + ^ / 5 ) t o 2 , a s a f u n c t i o n o f n .
A s a f i n a l e x a m p l e , w e p e r f o r m d e r i v a t i v e e s t i m a t i o n
o n t h e s e c o n d o r d e r I . F . ,
f £ 2 * b x - m
7.6-4
D e f i n e
* £ n 1 = x " m ^ 7 " 5 3 ) n
I t m a y b e s h o w n t h a t t h e o r d e r o f t h i s I . F . r a n g e s f r o m 1 t o
^ ( l + v / 5 ) a s a f u n c t i o n o f n a n d m . ( s e c a n t I . F . ) i s o f
l i n e a r o r d e r f o r a l l n o n s i m p l e z e r o s . T h e e s s e n t i a l d i f f e r e n c e
b e t w e e n ( 7 - 5 3 ) a n d ( 7 - 5 2 ) i s t h a t t h e d e n o m i n a t o r o n t h e r i g h t
s i d e o f ( 7 - 5 3 ) i s a l i n e a r c o m b i n a t i o n o f ?±_y 0 < j £ n,
w h e r e a s t h e d e n o m i n a t o r o n t h e r i g h t s i d e o f ( 7 - 5 2 ) i s a
l i n e a r c o m b i n a t i o n o f f ^ j , 0 ^ j ^ n .
1!
7 . 7 - 1
7 . 7 S o m e G e n e r a l R e s u l t s
T h e f o l l o w i n g s t a t e m e n t s t y p i f y t h e r e s u l t s w h i c h
w e h a v e o b t a i n e d i n t h i s c h a p t e r :
a . E . s = 2,3*...., i s o f l i n e a r o r d e r f o r a l l n o n -s
s i m p l e z e r o s .
b . S . s = 2,3,...., i s o p t i m a l ; i t d e p e n d s e x p l i c i t l y s ^
o n m a n d i t s o r d e r i s m u l t i p l i c i t y - i n d e p e n d e n t .
c . A n o n o p t i m a l m u l t i p l i c i t y - i n d e p e n d e n t L P . m a y b e
g e n e r a t e d f r o m a n y I . F . b y r e p l a c i n g f a n d i t s
d e r i v a t i v e s b y u a n d i t s d e r i v a t i v e s .
d . $ , g e n e r a t e d b y d i r e c t i n t e r p o l a t i o n , i s m u l t i -
n , s
p l i c i t y d e p e n d e n t ^ b u t n o t o f l i n e a r o r d e r i f
m < r = s ( n + 1 ) .
I n t h i s s e c t i o n w e s h a l l m a k e s o m e g e n e r a l r e m a r k s ,
F i r s t w e s u g g e s t t h e f o l l o w i n g
^ C O N J E C T U R E . I t i s i m p o s s i b l e t o c o n s t r u c t a n o p t i m a l
I . F . w h i c h d o e s n o t d e p e n d e x p l i c i t l y o n m a n d w h i c h i s
m u l t i p l i c i t y - i n d e p e n d e n t .
7 . 7 - 2
We h a v e e x c l u d e d e x p l i c i t d e p e n d e n c e o n m b e c a u s e
o f b a n d w e h a v e r e s t r i c t e d o u r s e l v e s t o o p t i m a l I . F . b e c a u s e
o f c .
T h e f o l l o w i n g i n c o r r e c t d i s c u s s i o n b y B o d e w i g [ 7 . 7 - 1 ]
i l l u s t r a t e s t h e p i t f a l l s w h i c h o n e m a y e n c o u n t e r i n t h e s t u d y
o f m u l t i p l e r o o t s . B o d e w i g a r g u e s a s f o l l o w s : S i n c e w e
r e q u i r e c p ( a ) = a, w e t a k e
cp(x) = x - f ( x ) H ( x ) .
A n e c e s s a r y c o n d i t i o n t h a t cp b e o f s e c o n d o r d e r i s t h a t
cp/(a) = 0 = 1 - f'(a ) H(a) - f ( a ) H ' ( a ) . ( 7 - 5 * 0
S i n c e f ( a ) a n d f ' ( a ) a r e b o t h z e r o f o r m > 1 , B o d e w i g r e a s o n s
t h a t t h e s e c o n d a n d t h i r d t e r m s i n ( 7 - 5 4 ) v a n i s h i f a i s a
m u l t i p l e r o o t . T h i s i g n o r e s t h e p o s s i b i l i t y t h a t H ( x ) h a s a
s i n g u l a r i t y a t x = a . I n N e w t o n ' s I . F . , f o r e x a m p l e ,
H ( x ) = l / f ' ( x ) . H e n c e f ( a ) H ' ( a ) = (l-m,)7m w h i l e
f ' ( a ) H ( a ) = 1 a n d n o n e o f t h e t e r m s o f ( 7 - 5 * 0 v a n i s h w h e n
m > 1 . T a k i n g H ( x ) = m / f ' ( x ) o r H ( x ) = l / [ f ' ( x ) u ' ( x ) ] s h o w s
t h a t B o d e w i g ' s c o n c l u s i o n t h a t cp ' ( a ) ^ 0 f o r m u l t i p l e r o o t s
i s i n c o r r e c t .
T h e a b o v e r e a s o n i n g l e d t o d i f f i c u l t i e s b e c a u s e t h e
c o n d i t i o n cp(a) = a , w h i c h i m p l i e s f ( x ) H ( x ) 0 , p e r m i t s H ( x )
"1 m
t o b e o f 0 [ ( x - a ) " ] . S i n c e u ( x ) = 0 ( x - a ) , w e c a n a v o i d t h e
d i f f i c u l t y b y t a k i n g
7 . 7 - 3
c p ( x ) = x - u ( x ) h ( x ) .
T h e n
< p ( x ) - a = x - a - + 0 [ ( x - a ) 2 ] | h ( x )
+ h ( x ) 0 [ ( x - a ) 2 ] .
L e t h b e c o n t i n u o u s i n a c l o s e d i n t e r v a l a b o u t a . T h e n cp i s
o f s e c o n d o r d e r i f a n d o n l y i f
h ( x ) - m = O ( x - a ) . ( 7 - 5 5 )
I n p a r t i c u l a r , w e r e q u i r e
h ( a ) = ra. ( 7 - 5 6 )
I f h ' i s c o n t i n u o u s , t h e n ( 7 - 5 6 ) i m p l i e s ( 7 " 5 5 ) . E q u a t i o n ( 7 - 5 5 )
i s e q u i v a l e n t t o
h ( xu | x j m - D ^ O . ( 7 - 5 7 )
W e s u m m a r i z e o u r r e s u l t s i n
= ( x - a ) m
7.7-4
T H E O R E M 7-6. L e t c p ( x ) = x - u ( x ) h ( x ) „ L e t a b e a
r o o t o f m u l t i p l i c i t y m . I f h i s c o n t i n u o u s i n a c l o s e d i n t e r
v a l a b o u t a, t h e n cp i s o f s e c o n d o r d e r i f a n d o n l y i f
h ( x ) - m n / n
I f h ' i s c o n t i n u o u s t h e n cp i s o f s e c o n d o r d e r i f a n d o n l y i f
h ( a ) = m .
F o r I . F . o f s e c o n d o r d e r , t h e C o n j e c t u r e w h i c h w e
s t a t e d e a r l i e r i n t h i s s e c t i o n i s e q u i v a l e n t t o t h e f o l l o w i n g
L e t c p ( x ) = x - u ( x ) h ( x ) , W h e r e h d o e s n o t d e p e n d u p o n a n y
d e r i v a t i v e s o f f h i g h e r t h a n t h e f i r s t a n d d o e s n o t d e p e n d
e x p l i c i t l y o n m . L e t h ( x ) b e a c o n t i n u o u s l y d i f f e r e n t i a b l e
f u n c t i o n o f x . T h e n i t i s i m p o s s i b l e t h a t h ( a ) = m f o r a l l f
w h o s e z e r o s h a v e m u l t i p l i c i t y m .
I n S e c t i o n 7-8 w e c o n s t r u c t a n h w h i c h d e p e n d s o n l y
o n f a n d f ' a n d f o r w h i c h h - * m . B u t t h i s h i s n o t c o n t i n u
o u s l y d i f f e r e n t i a b l e a t a .
7 . 8 - 1
7 . 8 A n I . F . O f I n c o m m e n s u r a t e O r d e r
L E M M A 7 - 2 . L e t f ( m + 1 ) b e c o n t i n u o u s i n t h e n e i g h b o r
h o o d o f a z e r o a o f m u l t i p l i c i t y m . L e t
T h e n h ( x ) -»• m .
P R O O F . L e t
f ( x ) = ( x - a ) m g ( x ) . ( 7 - 5 8 )
T h e n
g ( x ) - g ( a ) - f ( mm
} , ( a ) t 0 .
L e t
a ( x ) . ( x - a ) ^ f | f .
T h e n G ( x ) = O ( x - a ) a n d
u ( x ) = * = g . ( 7 - 5 9 )
F r o m ( 7 - 5 8 ) a n d ( 7 - 5 9 )
, x _ l n l f ( x ) I _ m l n l x - a l + l n | g | _^ m
w " l n | u ( x ) I ~ l n l x - a - l n | m + G | '
7 . 8 - 2
L E M M A 7 - 3 . L e t f a n d h b e a s i n L e m m a 7 - 2 . T h e n
u h ' - * 0 .
P R O O F . We s h a l l a s s u m e f o r s i m p l i c i t y t h a t f a n d
f ! ' a r e p o s i t i v e ; t h e r e s u l t i s t r u e i n g e n e r a l . T h e n f o r
x £ a,
h,(„) - 1 , _ m f ( x ) u ' ( x )
< * > ~ u l n [ u ( x ) j { l n [ u ( x ) ] ) 2 u ( x ) '
H e n c e
( \ 1 h ( x ) u ' ( x ) '
» ( x ) h ' ( x ) - l n [ u ( x ) ] " l n f u ( x ) j -
T h e f a c t t h a t h - + m , u ' l / m , l n [ u ] -oo c o m p l e t e s t h e p r o o f
L E M M A 7 - 4 . L e t f a n d h b e d e f i n e d a s I n L e m m a 7 - 2 .
L e t
c p ( x ) = x - u ( x ) h ( x ) . ( 7 - 6 0 )
T h e n cp' ( x ) 0 .
P R O O F . F o r x ^ a , h a n d h e n c e cp a r e d i f f e r e n t i a b l e
a n d
q ) ' ( x ) = 1 - u ' ( x ) h ( x ) - u ( x ) h ' ( x ) .
N o t e t h a t h - + m f r o m L e m m a 7 - 2 , u h ' 0 f r o m L e m m a 7 - 3 , a n d
u ' l / m ; t h e c o n c l u s i o n f o l l o w s I m m e d i a t e l y .
1
7.8-3
T H E O R E M 7 - 7 . L e t h a n d f s a t i s f y t h e c o n d i t i o n s o f
L e m m a 7 - 2 a n d l e t cp = x - u h . T h e n
- l n ( m f ( m ) ( q )
m •
1 / m .
( 7 - 6 1 )
P R O O F . S i n c e
cp = x - u h = x - u I n I f l n | u | '
w e h a v e
c p - q
x - q I n I x - a = I n I x - q I 1 _ I n l f
x - q I n u
R e c a l l t h a t
n _ l n | f I m m I n I x - q I + l n l g l
l n | u | I n x - q | - l n | r a + G ' u " m + G
x - q
We h a v e s h o w n t h a t h , w h i c h d e p e n d s o n l y o n f
a n d f , h a s t h e p r o p e r t y t h a t h - * m . S i n c e cp' 0 , o n e m i g h t
h o p e t o c o n c l u d e t h a t cp i s s e c o n d o r d e r b u t t h i s i s n o t t h e
c a s e . I t m a y b e s h o w n t h a t ( h - m ) / u - + o o . T h e n a n a p p l i c a t i o n
o f T h e o r e m 7 - 6 s h o w s t h a t cp i s n o t s e c o n d o r d e r . I n s t e a d w e
h a v e
7 . 8 - 4
w h e r e
f = ( x - a ) m g , G = ( x - a ) ^ .
T h e f a c t t h a t
g ^ i ^ f l l , Q . o ( x - a ) & m i ^ — \ /
p e r m i t s t h e c o m p l e t i o n o f t h e p r o o f .
T h i s t h e o r e m s h o w s t h a t t h e o r d e r o f cp i s i n c o m
m e n s u r a t e w i t h t h e o r d e r s c a l e d e f i n e d b y ( 1 - 1 4 ) . O b s e r v e
t h a t i f e i s a n a r b i t r a r y p r e a s s i g n e d p o s i t i v e n u m b e r , t h e n
x - a 9
c p - a
x - a 1+e
We c a n w r i t e ( 7 - 6 1 ) a s
c p - q
x - a C = -
l n ^ m f ( m ) ( a )
1 / m
l n ^ m m i
l n | x - a |
T h u s cp c o n v e r g e s l i n e a r l y a n d i t s a s y m p t o t i c e r r o r c o n s t a n t
g o e s l o g a r i t h m i c a l l y t o z e r o . I t i s c l e a r t h a t t h e s e q u e n c e
o f x 1 g e n e r a t e d b y cp c o n v e r g e s i f x Q i s s u f f i c i e n t l y c l o s e t o
I n p a r t i c u l a r , l e t f ( x ) = ( x - a ) m . T h e n C = - l n ( m ) / l n | x - a |
a n d C < 1 i f | x - a | < l / m .
f
7 - 8 - 5
h = l n l f 1 h - m l f X 1 2 l n | u | + l n | h 1 | ^ n l " I n j u r
o r m o r e g e n e r a l l y
h l n | f
n i + l " l n j u j + l n j l i j ' 1 " 0 > - ^ - - - ; h0 - L
T h i s , i n t u r n , s u g g e s t s i m p l i c i t l y d e f i n i n g h. b y
^ = l n | u j n + f l n | i i | • ( 7 - 6 2 )
F o r f = ( x - a ) m , ( 7 - 6 2 ) m a y b e d e r i v e d f r o m a n o t h e r
p o i n t o f v i e w . W e h a v e u = ( x - a ) / m a n d
( m u ) m = ( x - a ) m = f ,
a n d h e n c e
m - I n 1 f 1 ~ l n | u | + l n n T
T h e a n a l y s i s o f t h e o r d e r o f I . F . o f t h e f o r m cp = x - u h ± a n d
cp = x - uti h a s n o t b e e n c a r r i e d o u t .
T h e f o l l o w i n g g e n e r a l i z a t i o n s u g g e s t s i t s e l f .
R e c a l l t h a t
h = l n f 1 = m l n l x - a l + l n l g l
I n u | l n | x - a | - l n | m + G | '
w h e r e l n | m + G | - » I n m . S i n c e h - + m , t h i s s u g g e s t s t h a t w e c a n
a c c e l e r a t e t h e c o n v e r g e n c e o f h t o m b y d e f i n i n g
7 . 8 - 6
A f t e r f a n d f ' h a v e b e e n c a l c u l a t e d , i t i s n o t
e x p e n s i v e t o c a l c u l a t e h = l n | f | / l n | u | . I n a g e n e r a l r o o t -
f i n d i n g r o u t i n e i t m i g h t b e w o r t h w h i l e t o a l w a y s c a l c u l a t e h
A f t e r t h e l i m i t o f h h a s b e e n d e t e r m i n e d t o t h e n e a r e s t
i n t e g e r , t h e r o u t i n e c a n s w i t c h t o <5 2 = x - m u .