A Simple Improved-Accuracy Normal Approximation for $\chi^2$


Austral. J. Statist., 30A, 1988, 160-167


TOBY LEWIS

School of Mathematics and Physics, University of East Anglia, Norwich NR4 7TJ, U.K.

Summary

The following approximation for $\chi^2$ with $\nu$ degrees of freedom is presented: $1/[1 - \tfrac16\ln(\chi^2/\nu)]$ is distributed approximately normally with mean $1 - (9\nu)^{-1}$ and standard deviation $(18\nu)^{-1/2} + (18\nu)^{-3/2}$. For use without resort to computer capability it is shown to compare favourably with various existing approximations, on grounds either of simplicity or accuracy or both. There is little to be gained from modifying this simple formula in order to secure higher-order agreement with the Cornish-Fisher expansion. The formula is readily adapted to give a simple normal approximation for noncentral $\chi^2$.

1. Introduction

The literature contains a number of approximations to $\chi^2$ in terms of normal random variables. We denote the degrees of freedom of $\chi^2$ by $\nu$, and write $\tfrac12\nu = m$; we denote a N(0,1) random variable by $X$; and we use the symbol $\approx$ to mean “is distributed approximately as”. The integrated upper-tail probability corresponding to a value of a random variable we denote by $p$; e.g., $p = 0.975$ for $X = -1.96$ and $p = 0.025$ for $X = +1.96$.

The simplest and most familiar approximations are Fisher’s:

implying, for sufficiently large $m$,

and Wilson & Hilferty’s (which we shall call “WH”):

$$(\chi^2/\nu)^{1/3} \approx N\big(1 - (2/9\nu),\ (2/9\nu)\big)$$

or equivalently

$$(\chi^2/\nu)^{1/3} \approx 1 - (2/9\nu) + (2/9\nu)^{1/2} X, \tag{1}$$

implying

$$\tfrac12\chi^2 = m + Xm^{1/2} + \tfrac13(X^2 - 1) + \tfrac1{27}(X^3 - 6X)m^{-1/2} + O(m^{-1}).$$

Fisher’s approximation is omitted from the detailed comparisons in this paper, since its error is far greater than that of any of the other approximations to be discussed.
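As an illustration of how the Wilson-Hilferty form (1) is used in practice, the following minimal Python sketch computes an approximate percentile. The function name `wh_percentile` is hypothetical, and the use of `statistics.NormalDist` to obtain $X$ from the upper-tail probability $p$ is an assumption of this sketch, not part of the paper.

```python
from statistics import NormalDist


def wh_percentile(nu: float, p: float) -> float:
    """Wilson-Hilferty approximation to the chi-squared point whose
    upper-tail probability is p (hypothetical helper, not from the paper)."""
    # Paper's convention: p = 0.975 corresponds to X = -1.96.
    x = NormalDist().inv_cdf(1.0 - p)
    t = 2.0 / (9.0 * nu)
    cube_root = 1.0 - t + (t ** 0.5) * x  # approximates (chi2 / nu) ** (1/3)
    return nu * cube_root ** 3


for p in (0.9, 0.5, 0.1, 0.01):
    print(p, round(wh_percentile(4, p), 3))
```

The printed values can be compared with the WH rows of Table 1a. For very small $\nu$ and $p$ near 1 the cube-root quantity becomes negative, which corresponds to the "-ve" entries in that table.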

Severo & Zelen (1960) gave the following approximation (“SZ”):

$$(\chi^2/\nu)^{1/3} \approx 1 - (2/9\nu) + (2/9\nu)^{1/2}[X - (A/\nu)] \tag{2}$$

where

this implies that

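These and other approximations may be compared with the Cornish-Fisher expansion for $\chi^2$ in powers of $m^{-1/2}$, which is (see, e.g., Goldberg & Levine (1946) or Lewis (1953)):

$$\tfrac12\chi^2 = m + Xm^{1/2} + \tfrac13(X^2 - 1) + \tfrac1{36}(X^3 - 7X)m^{-1/2} - \tfrac1{810}(3X^4 + 7X^2 - 16)m^{-1} + \tfrac1{38880}(9X^5 + 256X^3 - 433X)m^{-3/2} + \cdots \tag{4}$$

Clearly, for large $m$ (large $\nu$), Fisher’s approximation is accurate to order $m^{1/2}$, WH to order 1, and SZ to order $m^{-1/2}$.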

Taking the formula (4) as far as the term in $m^{-3/2}$ we get the six-term Cornish-Fisher approximation for $\chi^2$, which we shall denote by “CF”. (Note that this is incorrectly called “five-term” in Greenwood & Hartley (1962) p.147, though the adjoining description of Peiser’s approximation as “four-term Cornish-Fisher” is correct. Note also the following corrections required to the CF percentile values in Goldberg & Levine (1946), Table 4: 7.4351 for 7.4020 ($\nu = 20$, $p = 0.995$); 39.9979 for 40.0309 ($\nu = 20$, $p = 0.005$).)
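The six-term approximation CF can be sketched in Python as below. This assumes the standard Cornish-Fisher coefficients for $\chi^2$ in powers of $m^{-1/2}$ (with $m = \nu/2$), truncated after the $m^{-3/2}$ term; the function name `cf_percentile` is hypothetical.

```python
from statistics import NormalDist


def cf_percentile(nu: float, p: float) -> float:
    """Six-term Cornish-Fisher approximation to the chi-squared point with
    upper-tail probability p (illustrative sketch; name is hypothetical)."""
    x = NormalDist().inv_cdf(1.0 - p)  # p = 0.975 corresponds to X = -1.96
    m = nu / 2.0
    half_chi2 = (
        m
        + x * m ** 0.5
        + (x ** 2 - 1.0) / 3.0
        + (x ** 3 - 7.0 * x) / 36.0 * m ** -0.5
        - (3.0 * x ** 4 + 7.0 * x ** 2 - 16.0) / 810.0 / m
        + (9.0 * x ** 5 + 256.0 * x ** 3 - 433.0 * x) / 38880.0 * m ** -1.5
    )
    return 2.0 * half_chi2


print(round(cf_percentile(8, 0.999), 3), round(cf_percentile(1, 0.5), 3))
```

The two printed values can be checked against the CF entries of Table 1a for $\nu = 8$, $p = 0.999$ and $\nu = 1$, $p = 0.5$.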

In a major study, Zar (1978) considered eleven approximation formulae for $\chi^2$ in terms of $X$ and $\nu$, and compared their accuracy at fifteen different significance levels ranging from $p = 0.999$ to $p = 0.001$. His eleven formulae included Fisher’s approximation, WH, SZ and CF; also “ESZ”, “EWH” and “PWH”. ESZ denotes an empirically derived improvement of the Severo-Zelen approximation (2), (3), where $A$ in (2) is replaced by a tabulated quantity $K$ (Zar (1978), Table 3). EWH is the extended Wilson-Hilferty approximation due to Goldstein (1973):

$$
\begin{aligned}
(\chi^2/\nu)^{1/3} \approx{} & \left[1 - \frac{2}{9\nu} + \frac{4X^4 + 16X^2 - 28}{1215\nu^2} + \frac{8X^6 + 720X^4 + 3216X^2 + 2904}{229635\nu^3}\right] \\
& + (2/\nu)^{1/2}\left[\frac{X}{3} - \frac{X^3 - 3X}{162\nu} - \frac{3X^5 + 40X^3 + 45X}{5832\nu^2} + \frac{301X^7 - 1519X^5 - 32769X^3 - 79349X}{7873200\nu^3}\right];
\end{aligned}
\tag{5}
$$

and PWH, a polynomial extension of WH, is a rather longer formula than (5) expressing $(\chi^2/\nu)^{1/3}$ as a sixth degree polynomial in $X$ with terms of the form $\nu^{-s/2}(a + b\nu^{-1} + c\nu^{-2})X^s$ ($s = 0, 1, \ldots, 6$).
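For readers who want to evaluate the extended Wilson-Hilferty formula (5) numerically, here is a hedged Python sketch. The coefficients are transcribed from (5), with the leading coefficient of the $X^6$ term taken as 8, which is what matching powers of $X/\nu^{1/2}$ in the cube-root expansion requires; the function name `ewh_percentile` is hypothetical.

```python
from statistics import NormalDist


def ewh_percentile(nu: float, p: float) -> float:
    """Extended Wilson-Hilferty (Goldstein) approximation to the chi-squared
    point with upper-tail probability p. Illustrative sketch only."""
    x = NormalDist().inv_cdf(1.0 - p)  # p = 0.975 corresponds to X = -1.96
    # Terms with even powers of X (first bracket of the formula).
    even = (
        1.0
        - 2.0 / (9.0 * nu)
        + (4 * x**4 + 16 * x**2 - 28) / (1215.0 * nu**2)
        + (8 * x**6 + 720 * x**4 + 3216 * x**2 + 2904) / (229635.0 * nu**3)
    )
    # Terms with odd powers of X (second bracket, multiplied by sqrt(2/nu)).
    odd = (
        x / 3.0
        - (x**3 - 3 * x) / (162.0 * nu)
        - (3 * x**5 + 40 * x**3 + 45 * x) / (5832.0 * nu**2)
        + (301 * x**7 - 1519 * x**5 - 32769 * x**3 - 79349 * x)
        / (7873200.0 * nu**3)
    )
    cube_root = even + (2.0 / nu) ** 0.5 * odd
    return nu * cube_root ** 3


print(round(ewh_percentile(8, 0.5), 3), round(ewh_percentile(8, 0.01), 3))
```

The printed values can be set against the exact percentiles 7.344 and 20.090 for $\nu = 8$ listed in Table 1a.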

Zar's main conclusions were as follows. For routine purposes, e.g. using a hand calculator, WH is fairly good at most $p$ values, unless $\nu$ is very small; if greater accuracy is desired, the best approximations without resorting to computer capability are obtained by using CF or ESZ. Where computer capability is not limited, still greater accuracy can be achieved by using EWH or PWH.

2. A New Approximation for $\chi^2$

We present in this paper an approximation to $\chi^2$, to the best of our knowledge new, having the simplicity of WH but with higher accuracy. It is as follows:

$$1/[1 - \tfrac16\ln(\chi^2/\nu)] \approx N(\mu, \sigma^2) \tag{6}$$

where $\mu = 1 - (1/9\nu)$, $\sigma = (18\nu)^{-1/2} + (18\nu)^{-3/2}$.

Equivalently,

$$1/[1 - \tfrac16\ln(\chi^2/\nu)] \approx 1 - (1/9\nu) + [(18\nu)^{-1/2} + (18\nu)^{-3/2}]X. \tag{7}$$

We denote this by “NA” (new approximation). Equation (7) implies that

$$\tfrac12\chi^2 \approx m + Xm^{1/2} + \tfrac13(X^2 - 1) + \tfrac1{36}(X^3 - 7X)m^{-1/2} - \tfrac1{216}(X^4 + 2X^2 - 8)m^{-1} + \cdots,$$

agreeing with the Cornish-Fisher expansion (4) to order $m^{-1/2}$.
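To obtain a percentile from NA, write $Y = \mu + \sigma X$ and solve $1/[1 - \tfrac16\ln(\chi^2/\nu)] = Y$ for $\chi^2$, which gives $\chi^2 \approx \nu\exp\{6(1 - 1/Y)\}$. The Python sketch below does exactly this. The function name `na_percentile` and the guard against non-positive $Y$ are assumptions of this sketch, not part of the paper.

```python
from math import exp
from statistics import NormalDist


def na_percentile(nu: float, p: float) -> float:
    """NA approximation to the chi-squared point with upper-tail
    probability p (illustrative sketch; name is hypothetical)."""
    x = NormalDist().inv_cdf(1.0 - p)  # p = 0.975 corresponds to X = -1.96
    mu = 1.0 - 1.0 / (9.0 * nu)
    sigma = (18.0 * nu) ** -0.5 + (18.0 * nu) ** -1.5
    y = mu + sigma * x  # approximates 1 / [1 - (1/6) ln(chi2 / nu)]
    if y <= 0.0:
        # Assumption of this sketch: the inversion is only meaningful for y > 0.
        raise ValueError("mu + sigma * X must be positive")
    return nu * exp(6.0 * (1.0 - 1.0 / y))


for p in (0.99, 0.5, 0.01):
    print(p, round(na_percentile(4, p), 3))
```

For $\nu = 4$ the output can be compared with the NA row of Table 1a (0.285, 3.370 and 13.290 at $p$ = 0.99, 0.5 and 0.01).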

3. Comparisons of Accuracy

We show in Table 1a some exact percentiles of $\chi^2$, together with the approximate values given by NA and by the main competitors indicated by Zar (1978) as not requiring resort to computer capability, viz. WH and CF. (For a comparison with ESZ, which was also recommended by Zar as simple to use but which requires the use of an ancillary table, see Table 2.) The seven percentiles tabulated in each case are for $p$ = 0.999, 0.99, 0.9, 0.5, 0.1, 0.01, and 0.001; the values of $\nu$ are 1, 2, 4, and 8.

Zar (1978) defined the relative error (“RE”) of any approximate value as

RE = (Approximate value - Exact value)/(Exact value).

(For example, the lower 1% point of $\chi^2$ on 4 df is 0.297 and the approximate value given by NA is 0.285 (see Table 1a), so RE = (0.285 - 0.297)/(0.297) = -0.04.) Zar effected his comparisons of accuracy by tabulating (Table 2 of his paper) the minimum degrees of freedom $\nu$ necessary to achieve a specified value (0.01, 0.005, 0.001, 0.0005) of |RE|. In our Table 2 below, which is adapted from Zar’s Table 2, we extend Zar’s comparisons to our present approximation NA, retaining from the 11 approximation formulae in Zar’s table just the 5 recommended “best buys” WH, CF, ESZ (the “simple” group) and EWH, PWH (requiring computer capability). We have reduced the 15 $p$ values in Zar’s table to 11 by omission of 0.995, 0.975, 0.025 and 0.005, and the 4 specified levels of |RE| to 3 by omission of 0.0005.
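The relative-error calculation in the worked example can be reproduced with a few lines of Python; the function name `relative_error` is a hypothetical helper introduced here for illustration.

```python
def relative_error(approx: float, exact: float) -> float:
    """Relative error RE = (approximate - exact) / exact."""
    return (approx - exact) / exact


# Lower 1% point of chi-squared on 4 df: exact 0.297, NA value 0.285.
print(f"{relative_error(0.285, 0.297):.2f}")  # prints -0.04
```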

TABLE 1a
Comparative table of various approximate and exact percentiles of $\chi^2$

| $\nu$ | Formula | $p$ = 0.999 | 0.99 | 0.9 | 0.5 | 0.1 | 0.01 | 0.001 |
|---|---|---|---|---|---|---|---|---|
| 1 | Exact | 0.00000157 | 0.000157 | 0.0158 | 0.455 | 2.706 | 6.635 | 10.828 |
| 1 | NA | 8 × 10⁻²⁰ | 0.000002 | 0.0108 | 0.472 | 2.807 | 6.766 | 10.811 |
| 1 | WH | -ve | -ve | 0.0052 | 0.471 | 2.639 | 6.586 | 11.157 |
| 1 | CF | -ve | -ve | 0.1354 | 0.412 | 2.686 | 6.811 | 11.360 |
| 2 | Exact | 0.00200 | 0.0201 | 0.211 | 1.386 | 4.605 | 9.210 | 13.816 |
| 2 | NA | 0.00043 | 0.0136 | 0.205 | 1.405 | 4.657 | 9.257 | 13.763 |
| 2 | WH | -ve | 0.0029 | 0.197 | 1.405 | 4.559 | 9.221 | 14.133 |
| 2 | CF | -ve | 0.0773 | 0.237 | 1.373 | 4.602 | 9.263 | 13.967 |
| 4 | Exact | 0.0908 | 0.297 | 1.064 | 3.357 | 7.779 | 13.277 | 18.467 |
| 4 | NA | 0.0770 | 0.285 | 1.063 | 3.370 | 7.804 | 13.290 | 18.423 |
| 4 | WH | 0.0403 | 0.249 | 1.060 | 3.370 | 7.747 | 13.306 | 18.724 |
| 4 | CF | 0.0900 | 0.320 | 1.068 | 3.353 | 7.779 | 13.292 | 18.508 |
| 8 | Exact | 0.857 | 1.646 | 3.490 | 7.344 | 13.362 | 20.090 | 26.125 |
| 8 | NA | 0.839 | 1.633 | 3.491 | 7.352 | 13.373 | 20.093 | 26.096 |
| 8 | WH | 0.764 | 1.597 | 3.493 | 7.352 | 13.340 | 20.121 | 26.318 |
| 8 | CF | 0.863 | 1.653 | 3.490 | 7.343 | 13.362 | 20.094 | 26.135 |

NA = new approximation (equations (6)); WH = Wilson-Hilferty approximation (equation (1)); CF = six-term Cornish-Fisher approximation (equation (4)).


Tables 1a and 2 indicate that the simple approximation formula NA presented here is by and large more accurate than Wilson & Hilferty's approximation. For many practical purposes, when a level of accuracy corresponding to |RE| = 0.01 or 0.005 is adequate, NA compares reasonably in performance with the six-term Cornish-Fisher formula and with the empirically modified Severo-Zelen approximation.

TABLE 1b

| $\nu$ | Formula | $p$ = 0.999 | 0.99 | 0.9 | 0.5 | 0.1 | 0.01 | 0.001 |
|---|---|---|---|---|---|---|---|---|
| 1 | MNA | 1 × 10⁻¹³ | 0.000003 | 0.0091 | 0.432 | 2.702 | 6.933 | 11.988 |
| 2 | MNA | 0.00064 | 0.0142 | 0.200 | 1.378 | 4.610 | 9.324 | 14.230 |
| 4 | MNA | 0.0809 | 0.287 | 1.058 | 3.354 | 7.783 | 13.318 | 18.610 |
| 8 | MNA | 0.846 | 1.640 | 3.487 | 7.344 | 13.363 | 20.105 | 26.173 |

MNA = new approximation with Severo-Zelen type modification (equation (10)).

4. A Severo-Zelen Type Modification of NA?

As noted above, Wilson & Hilferty's approximation (1) and Severo & Zelen's approximation (2) agree with the Cornish-Fisher expansion (4) to orders 1 and $m^{-1/2}$ respectively. Essentially, SZ is a straightforward modification of WH by addition of a suitably constructed extra term. On the whole SZ gives higher numerical accuracy than WH (see Zar (1978)).

The question arises as to whether the new approximation (7), which agrees with CF to order $m^{-1/2}$, could usefully be modified in a similar way so as to agree with CF to order $m^{-1}$. One must not expect too much from this enterprise, since CF itself, taken as far as the term in $m^{-3/2}$, has its limitations (see Tables 1a and 2) and gives some poor results along the lower tail for small values of $\nu$. However, it is of interest to explore the possibility.

Let us modify (7) in the spirit of Severo & Zelen's approach by writing

$$1/[1 - \tfrac16\ln(\chi^2/\nu)] \approx 1 - (1/9\nu) + [(18\nu)^{-1/2} + (18\nu)^{-3/2}]X + a\nu^{-2}, \tag{8}$$

where $a$ is a quantity of order 1 which remains to be determined. Equation (8) implies that

$$\tfrac12\chi^2 \approx m + Xm^{1/2} + \tfrac13(X^2 - 1) + \tfrac1{36}(X^3 - 7X)m^{-1/2} - [\tfrac1{216}(X^4 + 2X^2 - 8) - \tfrac32 a]m^{-1} + O(m^{-3/2}). \tag{9}$$

For agreement of (9) with (4) to order $m^{-1}$ we choose $a$ so that

$$\tfrac1{216}(X^4 + 2X^2 - 8) - \tfrac32 a = \tfrac1{810}(3X^4 + 7X^2 - 16).$$


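TABLE 2
Accuracy of various approximation formulae. The quantity tabulated is the minimum $\nu$ necessary to achieve the specified level of |RE|.

| \|RE\| | Formula | $p$ = 0.999 | 0.99 | 0.95 | 0.9 | 0.75 | 0.5 | 0.25 | 0.1 | 0.05 | 0.01 | 0.001 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.01 | NA | 11 | 7 | 4 | 3 | 2 | 3 | 3 | 3 | 2 | 2 | 1 |
| 0.01 | WH | 25 | 14 | 6 | 4 | 4 | 3 | 1 | 2 | 2 | 1 | 6 |
| 0.01 | CF | 8 | 7 | 5 | 4 | 2 | 2 | 2 | 1 | 1 | 2 | 3 |
| 0.01 | ESZ | 9 | 6 | 5 | 4 | 2 | 1 | 2 | 2 | 1 | 1 | 1 |
| 0.01 | EWH | 6 | 4 | 3 | 2 | 1 | 1 | 2 | 2 | 2 | 2 | 2 |
| 0.01 | PWH | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 2 | 2 |
| 0.005 | NA | 14 | 9 | 5 | 3 | 4 | 4 | 4 | 4 | 3 | 3 | 1 |
| 0.005 | WH | 36 | 19 | 8 | 4 | 6 | 4 | 2 | 4 | 4 | 2 | 12 |
| 0.005 | CF | 9 | 8 | 6 | 4 | 2 | 3 | 2 | 3 | 2 | 3 | 3 |
| 0.005 | ESZ | 9 | 7 | 6 | 5 | 3 | 1 | 2 | 2 | 1 | 1 | 1 |
| 0.005 | EWH | 7 | 5 | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 |
| 0.005 | PWH | 4 | 3 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 2 | 3 |
| 0.001 | NA | 26 | 15 | 7 | 4 | 8 | 9 | 8 | 8 | 7 | 5 | 9 |
| 0.001 | WH | 87 | 43 | 14 | 5 | 15 | 9 | 9 | 12 | 9 | 14 | 47 |
| 0.001 | CF | 13 | 12 | 8 | 7 | 4 | 5 | 4 | 3 | 3 | 5 | 6 |
| 0.001 | ESZ | 12 | 10 | 7 | 6 | 4 | 2 | 4 | 5 | 1 | 2 | 4 |
| 0.001 | EWH | 9 | 6 | 4 | 3 | 2 | 2 | 3 | 3 | 3 | 3 | 4 |
| 0.001 | PWH | 5 | 4 | 2 | 3 | 4 | 3 | 3 | 2 | 2 | 3 | 3 |

The entries in this table for approximation formulae WH, CF, ESZ, EWH and PWH have been taken from Table 2 of Zar (1978).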

With this value of $a$, (8) takes the following form, which we denote by “MNA” (modified NA):

$$1/[1 - \tfrac16\ln(\chi^2/\nu)] \approx 1 - (1/9\nu) + [(18\nu)^{-1/2} + (18\nu)^{-3/2}]X + (3X^4 + 2X^2 - 56)/(4860\nu^2), \tag{10}$$

where, as usual, $X$ denotes a N(0,1) random variable.

The comparisons of accuracy in Table 1a are extended to MNA in Table 1b, which shows the approximate values given by MNA for the percentiles of $\chi^2$ at the same values of $p$ and $\nu$ as in Table 1a. For values of $p$ near 1, MNA performs rather better than the unmodified NA, but for values of $p$ near 0 it is noticeably less accurate. Not unexpectedly, the agreement with the Cornish-Fisher expansion to the further term in $m^{-1}$ has not guaranteed an improvement in the numerical results produced by the simple approximation.

Similar results are found when (8) is extended by a further term of the form $b\nu^{-5/2}$, but details will not be given here.
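The modified approximation MNA can be sketched in Python as follows. The value $a = (3X^4 + 2X^2 - 56)/4860$ used here is what results from solving the order-$m^{-1}$ matching condition between the expansion implied by (8) and the Cornish-Fisher expansion; the function name `mna_percentile` and the positivity guard are assumptions of this sketch.

```python
from math import exp
from statistics import NormalDist


def mna_percentile(nu: float, p: float) -> float:
    """MNA approximation (NA plus an a / nu**2 correction) to the chi-squared
    point with upper-tail probability p. Illustrative sketch only."""
    x = NormalDist().inv_cdf(1.0 - p)  # p = 0.975 corresponds to X = -1.96
    mu = 1.0 - 1.0 / (9.0 * nu)
    sigma = (18.0 * nu) ** -0.5 + (18.0 * nu) ** -1.5
    # Correction chosen to match the Cornish-Fisher term of order 1/m.
    a = (3.0 * x ** 4 + 2.0 * x ** 2 - 56.0) / 4860.0
    y = mu + sigma * x + a / nu ** 2
    if y <= 0.0:
        # Assumption of this sketch: the inversion needs y > 0.
        raise ValueError("transformed value must be positive")
    return nu * exp(6.0 * (1.0 - 1.0 / y))


for nu in (1, 2, 4, 8):
    print(nu, round(mna_percentile(nu, 0.5), 3))
```

The printed medians can be compared with the $p = 0.5$ column of Table 1b.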

5. A Normal Approximation for Noncentral $\chi^2$

We denote noncentral $\chi^2$ with $\nu$ degrees of freedom and noncentrality parameter $\lambda$ by $\chi'^2(\nu, \lambda)$. Many approximations for noncentral $\chi^2$ have been proposed, including a number of approximations in terms of central $\chi^2$, of which the two simplest are due respectively to Patnaik and to Pearson, and a number of normal approximations. A detailed account is given in Johnson & Kotz (1970), pages 139-143. Johnson & Kotz recommend Pearson’s approximation in terms of $\chi^2$ as the only reasonably simple approximation which gives reliable results over a wide range of values of $\lambda$. It is as follows (Pearson (1959)):

$$\chi'^2(\nu, \lambda) \approx b + c\,\chi^2(f), \tag{11}$$

where $\chi^2(f)$ denotes (central) $\chi^2$ with $f$ degrees of freedom, and where

$$b = -\lambda^2/(\nu + 3\lambda), \quad c = (\nu + 3\lambda)/(\nu + 2\lambda), \quad f = (\nu + 2\lambda)^3/(\nu + 3\lambda)^2. \tag{12}$$

This suggests that the present normal approximation (7) can be applied to $\chi^2(f)$ in (11) to give a normal approximation for $\chi'^2(\nu, \lambda)$. We have

$$1/[1 - \tfrac16\ln\{\chi^2(f)/f\}] \approx N(\mu_f, \sigma_f^2),$$

where

$$\mu_f = 1 - (1/9f), \quad \sigma_f = (18f)^{-1/2} + (18f)^{-3/2}. \tag{13}$$

Replacing $\chi^2(f)/f$ by $[\chi'^2(\nu, \lambda) - b]/(cf)$ we get the following formula “NCNA” (new approximation to noncentral $\chi^2$):

$$1/[1 - \tfrac16\ln\{(\chi'^2(\nu, \lambda) - b)/(cf)\}] \approx N(\mu_f, \sigma_f^2), \tag{14}$$

where $\mu_f$, $\sigma_f$ are given by (13) with $f = (\nu + 2\lambda)^3/(\nu + 3\lambda)^2$.

Johnson & Kotz (1970), page 142, Table 1, give the errors of seven different approximations to the 0.95 and 0.05 percentiles of $\chi'^2(\nu, \lambda)$, including Pearson’s approximation (11), (12), for $\nu$ = 2, 4, 7; $\lambda$ = 1, 4, 16, 25. The errors of Pearson’s approximation are less than 0.05 in absolute value in 20 of the 24 cases, the absolute errors in the other 4 cases being 0.06, 0.06, 0.09, 0.12. When the corresponding 24 percentile values are calculated from our NCNA approximation (14), their absolute errors are essentially the same as those of the Pearson approximation, differing from them by only 0.00 or 0.01 in 21 cases, 0.02 in 2 cases and 0.03 (NCNA error 0.01 versus Pearson error 0.04) in the remaining one. Evidently NCNA could well be useful as a simple normal approximation to noncentral $\chi^2$, though one would first need to assess its performance for $p$ values other than 0.95 and 0.05.
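A percentile under NCNA follows by combining Pearson's constants $b$, $c$, $f$ with the NA transformation applied to $\chi^2(f)$: with $Y = \mu_f + \sigma_f X$, solving for the noncentral variable gives $\chi'^2 \approx b + cf\exp\{6(1 - 1/Y)\}$. The Python sketch below implements this; the function name `ncna_percentile` and the positivity guard are assumptions of this sketch, and no percentile values from Johnson & Kotz are reproduced here.

```python
from math import exp
from statistics import NormalDist


def ncna_percentile(nu: float, lam: float, p: float) -> float:
    """NCNA approximation to the point of noncentral chi-squared (nu df,
    noncentrality lam) with upper-tail probability p. Illustrative sketch."""
    x = NormalDist().inv_cdf(1.0 - p)  # p = 0.975 corresponds to X = -1.96
    # Pearson's three constants.
    b = -(lam ** 2) / (nu + 3.0 * lam)
    c = (nu + 3.0 * lam) / (nu + 2.0 * lam)
    f = (nu + 2.0 * lam) ** 3 / (nu + 3.0 * lam) ** 2
    # NA applied to central chi-squared with f degrees of freedom.
    mu_f = 1.0 - 1.0 / (9.0 * f)
    sigma_f = (18.0 * f) ** -0.5 + (18.0 * f) ** -1.5
    y = mu_f + sigma_f * x
    if y <= 0.0:
        # Assumption of this sketch: the inversion needs y > 0.
        raise ValueError("mu_f + sigma_f * X must be positive")
    return b + c * f * exp(6.0 * (1.0 - 1.0 / y))


print(round(ncna_percentile(4, 4, 0.95), 2), round(ncna_percentile(4, 4, 0.05), 2))
```

With $\lambda = 0$ the constants reduce to $b = 0$, $c = 1$, $f = \nu$, so the function falls back to the central NA approximation, which gives a simple consistency check.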

Acknowledgements

I am most grateful to the referee for valuable comments and for suggesting the developments to my first draft which have been effected in Sections 4 and 5.

References

GOLDBERG, H. & LEVINE, H. (1946). Approximate formulas for the percentage points and normalization of t and $\chi^2$. Ann. Math. Statist. 17, 216-225.

GOLDSTEIN, R.B. (1973). Chi-square quantiles. Algorithm 451. Commun. Assoc. Comp. Mach. 16, 482-485.

GREENWOOD, J.A. & HARTLEY, H.O. (1962). Guide to Tables in Mathematical Statistics. Princeton: Princeton University Press.

JOHNSON, N.L. & KOTZ, S. (1970). Continuous Univariate Distributions - 2. Boston: Houghton Mifflin Company.

LEWIS, T. (1953). 99.9 and 0.1% points of the $\chi^2$ distribution. Biometrika 40, 421-426.

PEARSON, E.S. (1959). Note on an approximation to the distribution of non-central $\chi^2$. Biometrika 46, 364.

SEVERO, N.C. & ZELEN, M. (1960). Normal approximation to the chi-square and non-central F probability functions. Biometrika 47, 411-416.

ZAR, J.H. (1978). Approximations for the percentage points of the chi-squared distribution. Appl. Statist. 27, 280-290.

Received 9 July 1987; revised 30 September 1987