saddlepoint approximation in the linear structural relationship model


Communications in Statistics - Simulation and Computation

Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lssp20

Saddlepoint approximation in the linear structural relationship model

Spiridon Penev
Department of Statistics, School of Mathematics, University of New South Wales, Sydney, NSW 2052, Australia

Published online: 27 Jun 2007.

To cite this article: Spiridon Penev (1995) Saddlepoint approximation in the linear structural relationship model, Communications in Statistics - Simulation and Computation, 24:2, 349-366, DOI: 10.1080/03610919508813246

To link to this article: http://dx.doi.org/10.1080/03610919508813246


COMMUN. STATIST.-SIMULA., 24(2), 349-366 (1995)

SADDLEPOINT APPROXIMATION IN THE LINEAR STRUCTURAL RELATIONSHIP MODEL

Spiridon PENEV

Department of Statistics, School of Mathematics, University of New South Wales, Sydney, NSW 2052, Australia

Key words and phrases: saddlepoint approximation; linear structural relationship; maximum likelihood estimator; M-estimator; marginal distribution.

ABSTRACT

It is shown that the joint maximum likelihood estimator of the slope and intercept of the regression line in the classical (known error-variance ratio) linear structural relationship model can be represented as the solution of a two-dimensional M-equation. It is therefore possible to use a general saddlepoint approximation for multidimensional M-equations. Under normality assumptions we express the solution of the implicit multivariate "centering equation" in explicit form, which saves considerable computing time. By numerically integrating out an unwanted variable one can also find the saddlepoint approximation for the slope estimator. Numerical examples illustrate the efficiency of the approximation.

Copyright © 1995 by Marcel Dekker, Inc.

1. INTRODUCTION


The purpose of this paper is to derive a saddlepoint approximation for the distribution of the estimators of the main parameters of interest in the linear structural relationship model. This model has been studied for more than one hundred years; its history begins with the paper by Adcock (1877). Some aspects of the model, such as identifiability, asymptotic and small-sample properties of estimators, and robustness, have received renewed interest over the last decade. These aspects are reviewed in the monograph by Fuller (1987).

Let $n$ be the sample size. The proof of the joint asymptotic normality of the $n^{1/2}$-normed maximum likelihood estimators of the slope and intercept is given in Fuller (1987), p. 32. As is well known (Barndorff-Nielsen and Cox (1989)), the saddlepoint technique usually gives more accurate approximations, which remain good down to very small sample sizes. Although originally derived by Daniels (1954) for approximating the distribution of the sample mean, the saddlepoint approximation method has been successfully extended to M-estimators (Field (1982)) and to some other special non-linear statistics by Daniels and Young (1991). The M-estimators form a very large class, and hence the generalization from the sample mean to M-estimators is a very significant step. The price of this generalization is, however, that the saddlepoint approximation usually cannot be computed explicitly: for every fixed argument value one has to solve a complicated implicit non-linear equation system iteratively. As pointed out in Field and Ronchetti (1990) (see page 109), solving such a set of non-linear systems dominates the computational effort. Moreover, root-finding procedures often seem to fail in two- or higher-dimensional problems, at least at some points of the grid.

Another obstacle to the practical implementation of the method is that the implicit solution is given in terms of the cumulant generating function of the observations, which is rarely known in practice. Yet there are cases where, under special distributional assumptions on the observations (e.g. normality), the implicit equations have explicit solutions. This allows a considerable saving in computation. Such cases are of interest for practical applications because for them the cumulant generating function is also known. The problem of finding the saddlepoint approximation of the distribution of the slope and intercept estimators in the classical linear structural relationship model falls into this category.

Let us point out that, in view of the normality assumption and the fact that canonically parameterized exponential family models have closed-form solutions for their saddlepoint equations, it is not too surprising that an explicit solution of the implicit equation system can be found. On the other hand, the presence of the measurement error makes the problem hard enough, and no obvious representation of this model as a canonically parameterized exponential family model is available. Our numerical results show that the approximation we obtain significantly outperforms the usual normal-based asymptotic approximation for small sample sizes $n \le 15$.

2. BRIEF MODEL DESCRIPTION

The linear structural relationship model is given by the following system of equations:

$$y_i = \beta_0 + \beta_1 \xi_i + e_i \tag{1}$$

$$x_i = \xi_i + v_i, \quad i = 1, 2, \ldots, n \tag{2}$$

where $n$ is the sample size. Here $y_i$ and $x_i$ ($i = 1, 2, \ldots, n$) are the recorded observations of the dependent and independent variables. Their "true" scores are $\beta_0 + \beta_1 \xi_i$ and $\xi_i$, respectively. The random error disturbances in the dependent and the independent variables are $e_i$ and $v_i$, respectively. The notion of a linear structural relationship in the model given by equations (1) and (2) refers to the situation where $\xi_1, \xi_2, \ldots, \xi_n$ are independent and identically distributed random variables (e.g. Kendall and Stuart (1979, Chapter 29)).


We shall assume independence and normal distributions for $v$, $e$ and $\xi$, with zero means for $v$ and $e$ and variances $\sigma_v^2$ and $\sigma_e^2$, respectively, and mean $\mu$ and variance $\sigma_\xi^2$ for $\xi$. If all the parameters $\beta_0$, $\beta_1$, $\mu$, $\sigma_\xi^2$, $\sigma_v^2$ and $\sigma_e^2$ were unknown, then the most important parameter of interest, the regression slope $\beta_1$, would be non-identifiable (see Reiersøl (1950)). In the remaining sections of this paper we shall focus on the classical way of resolving the non-identifiability problem, which is to assume that the ratio $\lambda = \sigma_e^2/\sigma_v^2$ is known. The most interesting case is $\lambda = 1$, for otherwise the observations and the parameters can be suitably scaled to achieve this. From now on we shall assume that $\lambda = 1$. Let us note that $\lambda = 1$ is a very common case in studies concerning the validity of the Law of Initial Value (see Jin (1992), page 178). The Gaussian distributional assumption also seems reasonable for such studies.

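The maximum likelihood estimators of $\beta_0$ and $\beta_1$ can be shown to be the intercept and the slope of the line in the plane that minimizes the sum of the squared perpendicular distances of the observed points $[x_i, y_i]'$ to the fitted line $\eta(\xi) = \beta_0 + \beta_1 \xi$, i.e.

$$[\hat\beta_0, \hat\beta_1]' = \arg\min_{\beta_0, \beta_1} \sum_{i=1}^n \frac{(y_i - \beta_0 - \beta_1 x_i)^2}{1 + \beta_1^2} \tag{3}$$

These estimators can be calculated from (3) as functions of the empirical means and covariances. If $s_{xx}$, $s_{yy}$ and $s_{xy}$ are the empirical variances of $X$ and $Y$ and their empirical covariance, respectively, then

$$\hat\beta_1 = \frac{s_{yy} - s_{xx} + \sqrt{(s_{yy} - s_{xx})^2 + 4 s_{xy}^2}}{2 s_{xy}}$$

Next, $\hat\beta_0$ can be found using the value of $\hat\beta_1$ obtained and the empirical means $\bar x$, $\bar y$ of $X$ and $Y$:

$$\hat\beta_0 = \bar y - \hat\beta_1 \bar x$$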



The maximum likelihood estimators of the remaining parameters are given by $\hat\mu = \bar x$, $\hat\sigma_\xi^2 = s_{xy}/\hat\beta_1$, $\hat\sigma_e^2 = \hat\sigma_v^2 = s_{yy} - (\hat\beta_1)^2 \hat\sigma_\xi^2$. (Note that to obtain the maximum likelihood estimators, one should use $1/n$ rather than $1/(n-1)$ as a normalizing constant when computing $s_{xx}$, $s_{yy}$ and $s_{xy}$.)
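The estimators of this section are simple functions of the empirical moments. The following minimal Python sketch shows one way they could be evaluated. It assumes NumPy, the $1/n$ normalization, and the standard closed form of the orthogonal-regression slope, $\hat\beta_1 = [s_{yy} - s_{xx} + \sqrt{(s_{yy} - s_{xx})^2 + 4 s_{xy}^2}]/(2 s_{xy})$. The function name `ml_estimates` and the dictionary keys are illustrative choices, not notation from the paper.

```python
import numpy as np

def ml_estimates(x, y):
    """Sketch of the ML estimators in the equal-error-variance structural model."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xbar, ybar = x.mean(), y.mean()
    # Empirical moments with the 1/n normalization required for maximum likelihood.
    sxx = np.mean((x - xbar) ** 2)
    syy = np.mean((y - ybar) ** 2)
    sxy = np.mean((x - xbar) * (y - ybar))
    if sxy == 0.0:
        # Degenerate case (probability zero): take slope 0, so the intercept is ybar.
        b1 = 0.0
    else:
        d = syy - sxx
        # Assumed closed form: the root of sxy*b^2 - (syy - sxx)*b - sxy = 0
        # that has the same sign as sxy.
        b1 = (d + np.sqrt(d * d + 4.0 * sxy * sxy)) / (2.0 * sxy)
    b0 = ybar - b1 * xbar
    var_xi = sxy / b1 if b1 != 0.0 else float("nan")
    var_err = syy - b1 * sxy  # same as syy - b1**2 * var_xi when b1 != 0
    return {"beta0": b0, "beta1": b1, "mu": xbar,
            "var_xi": var_xi, "var_err": var_err}

# Points lying exactly on y = 3 + 1.5 x: the line is recovered and the
# estimated error variance is zero.
x_demo = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
est = ml_estimates(x_demo, 3.0 + 1.5 * x_demo)
```

With noise-free points the perpendicular distances all vanish, so the sketch returns the generating line, which is a convenient sanity check before applying it to real data.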

Let us point out that the non-linear nature of the objective function in (3) gives rise to non-normally distributed estimators $\hat\beta_0$ and $\hat\beta_1$. As our simulations show, the deviation from normality is noticeable, especially for small sample sizes such as $n = 5$ to $n = 15$.

The key observation that allows the computation of a saddlepoint approximation for the joint density of the vector of slope and intercept estimators is that this statistic is the solution of a two-dimensional M-equation. Let us define the vector function $\psi(y, x, \beta_0, \beta_1) = [\psi_1(y, x, \beta_0, \beta_1), \psi_2(y, x, \beta_0, \beta_1)]'$ whose components are obtained from the partial derivatives of the objective function in (3) with respect to $\beta_0$ and $\beta_1$. Then the vector statistic $[\hat\beta_0, \hat\beta_1]'$ is the solution of the two-dimensional equation system

$$\sum_{i=1}^n \psi(y_i, x_i, \hat\beta_0, \hat\beta_1) = 0 \tag{6}$$

Moreover, if we assume that $\hat\beta_1$ and $s_{xy}$ should have the same sign (which is quite a natural assumption), then this solution is even unique. Note that the case $s_{xy} = 0$ causes some additional difficulties. But, first, such a situation occurs with zero probability and, secondly, in such a case it should be assumed that $\hat\beta_1 = 0$ and then $\hat\beta_0 = \bar y$.
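As a numerical illustration of the M-equation property, the sketch below evaluates one possible choice of $\psi$ at the closed-form estimates and checks that both component sums vanish. The particular form used here, $\psi_1 = y - \beta_0 - \beta_1 x$ and $\psi_2 = (y - \beta_0 - \beta_1 x)(x + \beta_1 (y - \beta_0))$, is an assumption: it is what one obtains by differentiating the perpendicular-distance criterion and dropping positive factors, and the sign and scaling used in the paper may differ. The helper names and the simulated data are hypothetical.

```python
import numpy as np

def fit_line(x, y):
    """Closed-form orthogonal-regression intercept and slope (1/n moments)."""
    xbar, ybar = x.mean(), y.mean()
    sxx = np.mean((x - xbar) ** 2)
    syy = np.mean((y - ybar) ** 2)
    sxy = np.mean((x - xbar) * (y - ybar))
    d = syy - sxx
    b1 = (d + np.sqrt(d * d + 4.0 * sxy * sxy)) / (2.0 * sxy)
    return ybar - b1 * xbar, b1

def psi(y, x, b0, b1):
    """Assumed psi: gradient of the criterion up to sign and positive factors."""
    r = y - b0 - b1 * x
    return np.vstack([r, r * (x + b1 * (y - b0))])

def m_equation_sums(x, y):
    """Component sums of psi evaluated at the fitted intercept and slope."""
    b0, b1 = fit_line(x, y)
    return psi(y, x, b0, b1).sum(axis=1)

# Simulated sample of size 5 from the Gaussian model (illustrative values).
rng = np.random.default_rng(1)
xi = rng.normal(1.0, np.sqrt(2.5), size=5)
x = xi + rng.normal(0.0, 0.5, size=5)
y = 3.0 + 1.5 * xi + rng.normal(0.0, 0.5, size=5)
sums = m_equation_sums(x, y)
print(sums)  # both entries should be zero up to rounding error
```

The first sum vanishes because the fitted line passes through the point of means; the second reduces to the quadratic equation in the slope that the closed-form estimator solves.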


3. THE SADDLEPOINT APPROXIMATION FOR GENERAL MULTIVARIATE M-STATISTICS

Daniels's original idea of a saddlepoint approximation of the density of the mean was first extended to one-dimensional M-estimators of location by Field and Hampel (1982). Later, Field (1982) used different reasoning to show that the density of general multidimensional M-estimators can be approximated. His idea was to write the M-estimator locally as a mean and then to use the saddlepoint approximation for the mean. The approach of Field is closely related to tilted Edgeworth expansions. In the remaining sections we adopt the notation used in the monograph by Field and Ronchetti (1990). To approximate the density of the M-estimator $[\hat\beta_0, \hat\beta_1]'$ at some fixed point $t = [t_0, t_1]'$, one first has to recenter the original density of the observations $[y, x]'$ by embedding it into an exponential family. For a fixed value $t = [t_0, t_1]'$ the new density (the conjugate density) has the form:

$$h_t(y, x) = c(t) \exp\{\alpha_1(t) \psi_1(y, x, t) + \alpha_2(t) \psi_2(y, x, t)\} f(y, x) \tag{7}$$

where

$$c^{-1}(t) = \iint \exp\{\alpha_1(t) \psi_1(y, x, t) + \alpha_2(t) \psi_2(y, x, t)\} f(y, x)\, dy\, dx \tag{8}$$

The original density $f(y, x)$ in the model considered is bivariate normal with mean vector $[\beta_0 + \beta_1 \mu, \mu]'$ and covariance matrix

$$B = \begin{pmatrix} \beta_1^2 \sigma_\xi^2 + \sigma_e^2 & \beta_1 \sigma_\xi^2 \\ \beta_1 \sigma_\xi^2 & \sigma_\xi^2 + \sigma_v^2 \end{pmatrix}$$

The functions $\alpha_1(t)$ and $\alpha_2(t)$ should be chosen in such a way that the equalities

$$\iint \psi_j(y, x, t) \exp\{\alpha_1(t) \psi_1(y, x, t) + \alpha_2(t) \psi_2(y, x, t)\} f(y, x)\, dy\, dx = 0 \tag{9}$$

hold for $j = 1, 2$.
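To make the centering step concrete, the sketch below solves the two centering equations numerically for the Gaussian model, which is the iterative route that an explicit solution is meant to avoid. Several things here are assumptions rather than the paper's method: $\psi_1 = y - t_0 - t_1 x$ and $\psi_2 = (y - t_0 - t_1 x)(x + t_1 (y - t_0))$ (sign and scaling may differ from the paper), the use of SciPy's `fsolve`, and all function names. Because the assumed $\psi$ is at most quadratic in $(y, x)$, the tilted density is again Gaussian, so its moments and $c^{-1}(t)$ have closed forms in terms of a precision matrix.

```python
import numpy as np
from scipy.optimize import fsolve

def model_moments(beta0, beta1, mu, var_xi, var_err):
    """Mean vector and covariance matrix B of (y, x); both error variances equal var_err."""
    mean = np.array([beta0 + beta1 * mu, mu])
    B = np.array([[beta1 ** 2 * var_xi + var_err, beta1 * var_xi],
                  [beta1 * var_xi, var_xi + var_err]])
    return mean, B

def tilted_moments(alpha, t, mean, B):
    """Mean and covariance of w = (y - t0, x) under the tilted density, and c^{-1}(t)."""
    t0, t1 = t
    a = np.array([1.0, -t1])                        # assumed psi1 = a'w
    Q = np.array([[t1, 0.5 * (1.0 - t1 ** 2)],
                  [0.5 * (1.0 - t1 ** 2), -t1]])    # assumed psi2 = w'Qw
    m = mean - np.array([t0, 0.0])                  # mean of w under f
    B_inv = np.linalg.inv(B)
    P = B_inv - 2.0 * alpha[1] * Q                  # precision matrix after tilting
    if np.min(np.linalg.eigvalsh(P)) <= 0.0:
        raise ValueError("tilted density is not integrable for this alpha")
    S = np.linalg.inv(P)
    ell = B_inv @ m + alpha[0] * a
    mean_h = S @ ell
    log_c_inv = (0.5 * (np.log(np.linalg.det(S)) - np.log(np.linalg.det(B)))
                 + 0.5 * (ell @ S @ ell) - 0.5 * (m @ B_inv @ m))
    return mean_h, S, a, Q, float(np.exp(log_c_inv))

def centering_residual(alpha, t, mean, B):
    """Expected values of psi1 and psi2 under the tilted density (should both be 0)."""
    mean_h, S, a, Q, _ = tilted_moments(alpha, t, mean, B)
    return np.array([a @ mean_h, np.trace(Q @ S) + mean_h @ Q @ mean_h])

def solve_centering(t, mean, B, start=(0.0, 0.0)):
    """Numerical root of the centering equations; raises if the residual stays large."""
    alpha = fsolve(centering_residual, np.asarray(start, dtype=float),
                   args=(t, mean, B))
    if np.max(np.abs(centering_residual(alpha, t, mean, B))) > 1e-8:
        raise RuntimeError("centering equations were not solved")
    return alpha

# At the true parameter point no tilting is needed: alpha = 0 and c^{-1} = 1.
mean, B = model_moments(3.0, 1.5, 1.0, 2.5, 0.25)
alpha = solve_centering((3.0, 1.5), mean, B)
c_inv = tilted_moments(alpha, (3.0, 1.5), mean, B)[4]
```

A root finder of this kind has to be restarted at every grid point and can step outside the region where the tilted density is integrable, which illustrates why a closed-form solution of the centering equations is attractive.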


Having found the solutions $\alpha_1(t)$ and $\alpha_2(t)$ from (9), one can substitute them into (7) to obtain the tilted density. Then one can get the determinants of the $2 \times 2$ matrices $A = E\{\partial \psi(y, x, t)/\partial t\}$ and $\Sigma = E\{\psi(y, x, t) \psi(y, x, t)'\}$. Note that all expected values are computed with respect to the conjugate density $h_t(y, x)$. After all these preparations one is able to compute the saddlepoint approximation for the joint density $p(t_0, t_1)$ of the statistic $[\hat\beta_0, \hat\beta_1]'$ at the point $t = [t_0, t_1]'$ as follows (e.g. Field (1982), Theorem 1, or Field and Ronchetti (1990), Theorem 4.5):

$$p(t_0, t_1) = \frac{n}{2\pi}\, c^{-n}(t_0, t_1)\, |\det A|\, |\det \Sigma|^{-1/2}\, [1 + O(1/n)] \tag{10}$$
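Once $c^{-1}(t)$, $\det A$ and $\det \Sigma$ are available at a point $t$, the leading term of the approximation is a one-line computation. The sketch below assumes the two-dimensional form $\frac{n}{2\pi} c^{-n} |\det A| (\det \Sigma)^{-1/2}$ and works on the log scale, since $c^{-n}$ can underflow far from the centre. The function name and argument names are hypothetical.

```python
import math

def saddlepoint_density(n, c_inv, det_A, det_Sigma):
    """Leading term (n / 2 pi) * c^{-n} * |det A| * (det Sigma)^{-1/2} at one point t.

    c_inv is the value c^{-1}(t) > 0, so c^{-n}(t) equals c_inv ** n.
    """
    if c_inv <= 0.0 or det_Sigma <= 0.0:
        raise ValueError("c_inv and det_Sigma must be positive")
    if det_A == 0.0:
        return 0.0
    log_value = (math.log(n / (2.0 * math.pi))
                 + n * math.log(c_inv)
                 + math.log(abs(det_A))
                 - 0.5 * math.log(det_Sigma))
    return math.exp(log_value)

# Illustrative inputs only: n = 5, c^{-1} = 1, det A = 2, det Sigma = 4.
value = saddlepoint_density(5, 1.0, 2.0, 4.0)
```

In practice the function would be called at every point of the $(t_0, t_1)$ grid and the resulting surface renormalized numerically.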

4. CALCULATING THE MATRICES A AND Σ

For the approximation (10) to hold, some conditions should be satisfied (see p. 62 of Field and Ronchetti (1990)). Because of the normality assumptions these conditions are satisfied. Now denote by $E_h(\cdot)$ the expected value with respect to the shifted density $h_t(y, x)$ and by $\sigma_{11}$, $\sigma_{22}$ and $\sigma_{12}$ the variance of $Y$, the variance of $X$ and the covariance of $X$ and $Y$ computed with respect to the same density. Note that their dependence on $t$ has been suppressed in the notation. Then after some easy transformations it can be seen that the two equations in (9) are equivalent to the following relations:

$$E_h(y) = t_0 + t_1 E_h(x) \tag{11}$$

$$t_1 (\sigma_{11} - \sigma_{22}) + (1 - t_1^2) \sigma_{12} = 0 \tag{12}$$

Next we observe that because of the special form of the functions $\psi_1$ and $\psi_2$ in our problem (containing no more than second-degree terms in $x$ and $y$), it could be expected that the shifted density $h_t(y, x)$ is again normal. Let $b = E_h(x)$. Then for every fixed $t$ this density should be proportional to the expression

$$\exp\left\{-\frac{\sigma_{22} (y - t_0 - t_1 b)^2 - 2 \sigma_{12} (y - t_0 - t_1 b)(x - b) + \sigma_{11} (x - b)^2}{2 (\sigma_{11} \sigma_{22} - \sigma_{12}^2)}\right\} \tag{13}$$

The coefficients of $x^2$, $y^2$, $xy$, $x$ and $y$ in the exponents in (7) and (13) have to be equal. This leads to the following equation system:

Here we have substituted $\sigma_\xi^2 \sigma_v^2 (1 + \beta_1^2) + \sigma_v^4$ for the determinant $\det(B)$. The six equations (12) and (14)-(18) can be solved with respect to the six unknown quantities $b$, $\sigma_{11}$, $\sigma_{22}$, $\sigma_{12}$, $\alpha_1$ and $\alpha_2$. Details about the way to find the solution are available from the author. Here we shall only give the final result.

If $t_1 \ne 0$, then we can find the quantities involved recursively by first computing


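Then $\sigma_{12}$ can be substituted to get the value of $\sigma_{11}$:

and of $\sigma_{22}$:

The result for $\alpha_2$ can then be obtained as follows:

where $\sigma_{11} \sigma_{22} - \sigma_{12}^2 > 0$ is the determinant of the covariance matrix of the shifted density. Next the solution for $b$ can be obtained:

The final step is to substitute the value of $b$ to get $\alpha_1$:

Note that special care is needed for the case $t_1 = 0$, as then the results obtained for $\sigma_{11}$, $\sigma_{22}$ and $\alpha_2$ are undetermined. But in this case we see from equation (12) that $\sigma_{12} = 0$. Then (15) gives $\sigma_{11} = \det(B)/(\sigma_\xi^2 + \sigma_v^2)$, (14) gives $\sigma_{22} = \det(B)/(\beta_1^2 \sigma_\xi^2 + \sigma_v^2)$ and (16) gives $\alpha_2 = \beta_1 \sigma_\xi^2/\det(B)$. Further on we obtain from (17) that $b = \sigma_{22} (t_0 \beta_1 \sigma_\xi^2 - \beta_0 \beta_1 \sigma_\xi^2 + \mu \sigma_v^2)/\det(B)$ and from (18) that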


All expressions we get in the case $t_1 = 0$ can easily be seen to be just the limits of the corresponding general formulae for $t_1 \ne 0$ when $t_1 \to 0$.

Hence the implicit equation system (9) has an explicit solution. Utilizing this solution we can also find explicit expressions for the matrices $A$ and $\Sigma$. The matrix $A$ has the form

All expected values can easily be computed. The final result is:

$$\det A = \frac{1 + t_1^2}{t_1}\, \sigma_{12}$$

In addition, the case $t_1 = 0$ needs special consideration. Using equation (19) we see that the limit of $\det A$ when $t_1 \to 0$ is

Similar but more involved computations help us to calculate the expected values of the elements of the matrix $\Sigma$. Finding $E_h(\psi_1 \psi_2)$ and $E_h(\psi_2^2)$ requires knowledge of higher-order moments of the random variables $Y$ and $X$ with respect to the shifted density $h_t(y, x)$. The normality assumption is essential for this part. The details of the computation are available from the author. The final result is:

$$\det \Sigma = (t_1^2 \sigma_{22} - 2 t_1 \sigma_{12} + \sigma_{11}) \left[(2 \sigma_{12}^2 + \sigma_{11} \sigma_{22})(t_1^4 - 4 t_1^2 + 1) + 3 t_1^2 (\sigma_{11}^2 + \sigma_{22}^2) + 6 t_1 (1 - t_1^2) \sigma_{12} (\sigma_{11} - \sigma_{22})\right]$$

It can be shown that $\Sigma$ is non-degenerate, i.e. $\det \Sigma > 0$ holds.
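Closed forms of this kind are easy to get wrong, so a Monte Carlo cross-check can be useful. The sketch below samples from a bivariate normal "shifted" density whose moments satisfy the centering relations $E_h(y) = t_0 + t_1 E_h(x)$ and $t_1(\sigma_{11} - \sigma_{22}) + (1 - t_1^2)\sigma_{12} = 0$, and compares sample versions of $\det A$ and $\det \Sigma$ with candidate closed forms. Everything specific is an assumption made for illustration: the form of $\psi$ (residual $r = y - t_0 - t_1 x$ and $\psi_2 = r\,(x + t_1 (y - t_0))$), the candidate expressions $\det A = (1 + t_1^2)\sigma_{12}/t_1$ and the $\det \Sigma$ expression coded below (its last term in particular is a reconstruction), the numerical values, and all function names.

```python
import numpy as np

def det_A_formula(t1, s12):
    """Candidate closed form for det A (assumed, valid for t1 != 0)."""
    return (1.0 + t1 ** 2) / t1 * s12

def det_Sigma_formula(t1, s11, s22, s12):
    """Candidate closed form for det Sigma (assumed reconstruction)."""
    first = t1 ** 2 * s22 - 2.0 * t1 * s12 + s11
    second = ((2.0 * s12 ** 2 + s11 * s22) * (t1 ** 4 - 4.0 * t1 ** 2 + 1.0)
              + 3.0 * t1 ** 2 * (s11 ** 2 + s22 ** 2)
              + 6.0 * t1 * (1.0 - t1 ** 2) * s12 * (s11 - s22))
    return first * second

def monte_carlo_dets(t0, t1, b, s11, s22, s12, size, seed=0):
    """Sample-based det A and det Sigma under a normal density for (y, x)."""
    rng = np.random.default_rng(seed)
    mean = [t0 + t1 * b, b]
    cov = [[s11, s12], [s12, s22]]
    y, x = rng.multivariate_normal(mean, cov, size=size).T
    r = y - t0 - t1 * x                 # psi1
    s = x + t1 * (y - t0)               # psi2 = r * s
    psi = np.vstack([r, r * s])
    Sigma = psi @ psi.T / size
    # Sample means of the partial derivatives of psi with respect to (t0, t1).
    A = np.array([[-1.0, -np.mean(x)],
                  [np.mean(-s - t1 * r), np.mean(-x * s + r * (y - t0))]])
    return np.linalg.det(A), np.linalg.det(Sigma)

# Illustrative moments; s11 is chosen so that the second centering relation holds.
t0, t1, b, s22, s12 = 0.3, 0.8, 0.7, 1.0, 0.5
s11 = s22 - (1.0 - t1 ** 2) * s12 / t1
mc_det_A, mc_det_Sigma = monte_carlo_dets(t0, t1, b, s11, s22, s12, size=400_000)
print(mc_det_A, det_A_formula(t1, s12))
print(mc_det_Sigma, det_Sigma_formula(t1, s11, s22, s12))
```

Agreement within Monte Carlo error does not prove the formulas, but a clear disagreement would flag an algebra or transcription error before the expressions are used on a dense grid.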

5. SIMULATION STUDY

Looking at formula (10), we see that the only quantity which remains to be computed is the function $c^{-n}(t_0, t_1)$. For every fixed value of $t$ we have to evaluate the double integral in (8). As the integrand is known, this may be done numerically. The corresponding NAG Library routine D01DAF has proven reliable for this two-dimensional integration. The normalizing constant should also be determined numerically in such a way that the density approximation integrates to one. This renormalized saddlepoint density gives a further improvement (Field and Ronchetti (1990)).

Our numerical calculations show the effectiveness of the saddlepoint approximation for the joint density for sample sizes $n \le 15$ as compared to the normal asymptotic approximation. The advantage then disappears quickly as the sample size increases. For the sake of illustration we fit the joint density of the maximum likelihood estimators of the parameters $\beta_0 = 3$ and $\beta_1 = 1.5$ of the model (1)-(2) with $n = 5$, $\mu = 1$, $\sigma_\xi^2 = 2.5$ and $\sigma_v^2 = \sigma_e^2 = 0.25$. Figure 1 shows the asymptotic normal approximation for the density of the vector $[\hat\beta_0, \hat\beta_1]'$. It has mean $[\beta_0, \beta_1]'$ and a covariance matrix

where $\Gamma = [(1 + \beta_1^2) \sigma_\xi^2 + \sigma_e^2]\, \sigma_e^2/\sigma_\xi^4$ (compare Fuller (1987), p. 32). Figure 2 shows the renormalized saddlepoint approximation for the same density. Figure 3 shows the "true" (based on 600 000 simulations) density. The graphs show the contours of the density surfaces. In addition, a pseudocolor plot is superimposed on the same graphs to enable better viewing. The density values are calculated on a grid of 50x50 equidistant points around the parameter vector $[3, 1.5]'$. Numerical values of the density approximations show a significant overall advantage of the saddlepoint approximation as compared to the asymptotic normal approximation. A careful study of the density surfaces in Figures 1, 2 and 3 confirms this advantage.
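A simulated reference density of this kind can be produced along the following lines. This is a minimal sketch under stated assumptions, not the code used for the paper: it draws samples from the Gaussian model with equal error variances, applies the closed-form orthogonal-regression estimator with $1/n$ moments to each sample, and bins the estimates on a rectangular grid. The function names, the grid limits, the seed and the reduced number of replications are all illustrative.

```python
import numpy as np

def simulate_estimates(n, beta0, beta1, mu, var_xi, var_err, reps, seed=0):
    """Draw `reps` samples of size n and return the (intercept, slope) estimates."""
    rng = np.random.default_rng(seed)
    xi = rng.normal(mu, np.sqrt(var_xi), size=(reps, n))
    x = xi + rng.normal(0.0, np.sqrt(var_err), size=(reps, n))
    y = beta0 + beta1 * xi + rng.normal(0.0, np.sqrt(var_err), size=(reps, n))
    xbar, ybar = x.mean(axis=1), y.mean(axis=1)
    xc, yc = x - xbar[:, None], y - ybar[:, None]
    sxx = (xc ** 2).mean(axis=1)
    syy = (yc ** 2).mean(axis=1)
    sxy = (xc * yc).mean(axis=1)
    d = syy - sxx
    # Assumed closed form of the slope estimator (sxy = 0 has probability zero).
    b1 = (d + np.sqrt(d * d + 4.0 * sxy * sxy)) / (2.0 * sxy)
    b0 = ybar - b1 * xbar
    return b0, b1

def histogram_density(b0, b1, b0_edges, b1_edges):
    """Histogram estimate of the joint density; mass outside the grid is not redistributed."""
    counts, _, _ = np.histogram2d(b0, b1, bins=[b0_edges, b1_edges])
    cell_area = np.outer(np.diff(b0_edges), np.diff(b1_edges))
    return counts / (b0.size * cell_area)

# Parameter values of the illustration; far fewer replications than in the paper.
b0_hat, b1_hat = simulate_estimates(5, 3.0, 1.5, 1.0, 2.5, 0.25, reps=20_000)
b0_edges = np.linspace(1.0, 5.0, 51)
b1_edges = np.linspace(0.5, 2.5, 51)
density = histogram_density(b0_hat, b1_hat, b0_edges, b1_edges)
```

Dividing by the total number of replications, rather than by the number of points that fall inside the grid, keeps the histogram comparable with a density approximation evaluated on the same cells.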

FIG. 1. The asymptotic normal approximation for the joint density of the vector $[\hat\beta_0, \hat\beta_1]'$ with $n = 5$, $\beta_0 = 3$, $\beta_1 = 1.5$, $\mu = 1$, $\sigma_\xi^2 = 2.5$ and $\sigma_v^2 = \sigma_e^2 = 0.25$.

The normal approximation is much more stretched, whereas the saddlepoint approximation almost coincides with the true non-symmetric density surface for all values of the argument $[t_0, t_1]'$. Note that the possibility of avoiding iterations by explicitly solving the "centering equations" (9) makes it easy to obtain density approximations on a dense grid of values $[t_0, t_1]'$ and hence to obtain very accurate contours of the confidence regions for $[\beta_0, \beta_1]'$. The unknown parameters $\sigma_v^2 = \sigma_e^2$, $\mu$ and $\sigma_\xi^2$ involved in the construction can be replaced by their estimators discussed in Section 2.

FIG. 2. The saddlepoint approximation for the joint density of the vector $[\hat\beta_0, \hat\beta_1]'$ with $n = 5$, $\beta_0 = 3$, $\beta_1 = 1.5$, $\mu = 1$, $\sigma_\xi^2 = 2.5$ and $\sigma_v^2 = \sigma_e^2 = 0.25$.

In most applications the density of the estimator $\hat\beta_1$ of the slope is of primary interest. To obtain it, one has to integrate out the unwanted variable $t_0$ in the bivariate density approximation. This can be done either numerically or by a Laplace approximation. As pointed out by Daniels and Young (1991), numerical integration seems to be the safest procedure. We applied the numerical integration method after first computing the joint density on a grid of 50x50 equidistant points. The results for $n = 5$ are shown in Figure 4.
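The two numerical steps mentioned in this section, renormalizing the joint approximation so that it integrates to one and then integrating out $t_0$, can be sketched on a rectangular grid as follows. The trapezoidal rule is an assumption made here for simplicity (the text does not say which quadrature rule was used for this step), the Gaussian bump is only a stand-in for a saddlepoint surface, and the function names are hypothetical.

```python
import numpy as np

def trapezoid(values, grid, axis):
    """Trapezoidal rule along one axis (written out to avoid NumPy version differences)."""
    values = np.moveaxis(np.asarray(values, dtype=float), axis, -1)
    widths = np.diff(np.asarray(grid, dtype=float))
    return np.sum(0.5 * (values[..., 1:] + values[..., :-1]) * widths, axis=-1)

def renormalize_and_marginalize(density, t0_grid, t1_grid):
    """Rescale a joint density on a (t0, t1) grid to integrate to one, then integrate out t0."""
    total = trapezoid(trapezoid(density, t1_grid, axis=1), t0_grid, axis=0)
    joint = np.asarray(density, dtype=float) / total
    marginal_t1 = trapezoid(joint, t0_grid, axis=0)
    return joint, marginal_t1

# Toy stand-in for a saddlepoint surface: an unnormalized Gaussian bump on a 50x50 grid.
t0_grid = np.linspace(-6.0, 6.0, 50)
t1_grid = np.linspace(-6.0, 6.0, 50)
T0, T1 = np.meshgrid(t0_grid, t1_grid, indexing="ij")
surface = 7.3 * np.exp(-0.5 * (T0 ** 2 + T1 ** 2))
joint, marginal = renormalize_and_marginalize(surface, t0_grid, t1_grid)
```

For the toy surface the recovered marginal should be close to the standard normal density, which gives a quick check that the grid is wide and fine enough before the same routine is applied to an actual approximation.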

Note that the saddlepoint approximation clearly outperforms the normal approximation, especially in the tails (which are most important for constructing confidence intervals). The saddlepoint approximation captures the skewness of the true density very accurately. This means that for small $n$, the confidence intervals for $\beta_1$ based on the saddlepoint approximation will have better coverage accuracy than those based on asymptotic normality. Numerical values for the density approximations for $n = 5$ are given in Table 1.

FIG. 3. The "true" (based on 500 000 simulations) density of the vector $[\hat\beta_0, \hat\beta_1]'$ with $n = 5$, $\beta_0 = 3$, $\beta_1 = 1.5$, $\mu = 1$, $\sigma_\xi^2 = 2.5$ and $\sigma_v^2 = \sigma_e^2 = 0.25$.

6. CONCLUSION

This paper derives the explicit solution to the implicit "centering equation" for the saddlepoint approximation of the density. The model considered is the classical (known error-variance ratio) univariate Gaussian linear structural relationship model and the estimators are the maximum likelihood estimators of slope and intercept.

FIG. 4. "True" marginal density of the slope estimator (dashed line), the saddlepoint approximation (solid line) and the asymptotic normal approximation (dash-dotted line), with $n = 5$, $\beta_0 = 3$, $\beta_1 = 1.5$, $\mu = 1$, $\sigma_\xi^2 = 2.5$ and $\sigma_v^2 = \sigma_e^2 = 0.25$.

TABLE I. Marginal density of the slope estimator for $n = 5$, $\beta_1 = 1.5$, $\beta_0 = 3$, $\mu = 1$, $\sigma_\xi^2 = 2.5$ and $\sigma_v^2 = \sigma_e^2 = 0.25$.


The approach discussed in Sections 2, 3 and 4 can be generalized to the multidimensional case. In §29.24 of Kendall and Stuart (1979) the general model

$$\sum_{j=0}^k \alpha_j \xi_j = 0 \tag{22}$$

with $\xi_0 = X_0 = 1$ is considered. Here, apart from $\xi_0$, the other $\xi$'s are subject to errors, so that one observes $X$ given by

$$x_{ij} = \xi_{ij} + \delta_{ij}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, k \tag{23}$$

One assumes that the $\delta$'s are normally distributed with zero mean, independently of $\xi$ and of each other, and that all $\delta$'s have the same unknown variance. Maximization of the likelihood function leads to an easily interpretable geometric minimization problem of the kind discussed in (3), and the solution is obtained when the $\xi$'s are the feet of the perpendiculars from the $X$'s onto the hyperplane given in (22). Utilizing this fact and imposing the constraint $\sum_{i=0}^k \alpha_i^2 = 1$, one can then see that the maximum likelihood estimator $[\hat\alpha_0, \hat\alpha_1, \ldots, \hat\alpha_k]'$ is a solution of an equation system

The Lagrange multiplier appearing in the system (24) can be determined in advance as the smallest eigenvalue of a certain positive definite matrix (details can be found in §29.25 of the above monograph). The system (24) can be seen as a generalization of the system (6). One can write down the centering equations which generalize (9) and find their explicit solution. The calculations are rather messy, though.

Finally, let us point out that the maximum likelihood estimators for slope and intercept are not qualitatively robust. The density approximations we derive are very accurate when the Gaussian assumptions for the model hold. This is the merit of the saddlepoint approximation. If these assumptions are in doubt, we suggest using robust estimators (as developed in Cheng and Van Ness (1992)).

ACKNOWLEDGEMENTS

This work was supported by the Australian Research Council. The

author is indebted to Prof. E. Ronchetti for his valuable comments.

BIBLIOGRAPHY

Adcock, R. J. (1877). "Note on the method of least squares," Analyst, 4, 183-184.

Barndorff-Nielsen, O. E. and Cox, D. R. (1989). Asymptotic Techniques for Use in Statistics. Chapman & Hall, London.

Bickel, P. J. and Ritov, Y. (1987). "Efficient estimation in the errors-in-variables model," Annals of Statistics, 15, 513-540.

Cheng, Chi-Lun and Van Ness, J. W. (1992). "Generalized M-estimators for errors-in-variables regression," Annals of Statistics, 20, 385-397.

Daniels, H. E. and Young, G. A. (1991). "Saddlepoint approximation for the studentized mean, with an application to the bootstrap," Biometrika, 78, 169-179.

Daniels, H. E. (1954). "Saddlepoint approximations in statistics," Ann. Math. Statistics, 25, 631-650.

Field, C. A. (1982). "Small sample asymptotic expansions for multivariate M-estimates," Annals of Statistics, 10, 672-689.

Field, C. A. and Hampel, F. R. (1982). "Small sample asymptotic distributions of M-estimators of location," Biometrika, 69, 29-46.

Field, C. A. and Ronchetti, E. (1990). Small Sample Asymptotics, IMS Lecture Notes - Monograph Series, Vol. 13.

Fuller, W. A. (1987). Measurement Error Models, Wiley, New York.

Jin, P. (1992). "Toward a Reconceptualization of the Law of Initial Value," Psychological Bulletin, Vol. 111, No. 1, 176-184.

Kendall, M. G. and Stuart, A. (1979). The Advanced Theory of Statistics, Vol. 2: Inference and Relationship (Fourth Edition), Griffin, London.


Reiersøl, O. (1950). "Identifiability of a linear relation between variables which are subject to error," Econometrica, 18, 375-389.

Received August, 1994; Revised October, 1994.
