saddlepoint approximation in the linear structural relationship model
Communications in Statistics - Simulation and Computation
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lssp20
Saddlepoint approximation in the linear structural relationship model
Spiridon Penev, Department of Statistics, School of Mathematics, University of New South Wales, Sydney, NSW, 2052, Australia
Published online: 27 Jun 2007.
To cite this article: Spiridon Penev (1995) Saddlepoint approximation in the linear structural relationship model, Communications in Statistics - Simulation and Computation, 24:2, 349-366, DOI: 10.1080/03610919508813246
To link to this article: http://dx.doi.org/10.1080/03610919508813246
COMMUN. STATIST.-SIMULA., 24(2), 349-366 (1995)
SADDLEPOINT APPROXIMATION IN THE LINEAR STRUCTURAL RELATIONSHIP MODEL
Spiridon PENEV
Department of Statistics, School of Mathematics, University of New South Wales, Sydney,2052 NSW
Australia
Key words and phrases: Saddlepoint approximation; Linear structural
relationship; Maximum likelihood estimator; M-estimator; marginal
distribution.
ABSTRACT
It is shown that the joint maximum likelihood estimator of slope
and intercept of the regression line in the classical (known
error-variance ratio) linear structural relationship model can be
represented as a solution of a two-dimensional M-equation. Therefore, it
is possible to use a general saddlepoint approximation for
multidimensional M-equations. Under normality assumptions we express the
solution of the implicit multivariate "centering equation" in an
explicit form. This allows a considerable saving of computing time. By
numerically integrating out an unwanted variable one is also able to
find the saddlepoint approximation for the slope-estimator. Numerical
examples illustrate the efficiency of the approximation.
Copyright © 1995 by Marcel Dekker, Inc.
1. INTRODUCTION
The purpose of this paper is to derive a saddlepoint
approximation for the distribution of the main parameters of interest in
the linear structural relationship model. This model has been studied
for more than one hundred years. Its history begins with the paper by
Adcock (1877). Some new aspects of the model like identifiability,
asymptotic and small-sample properties of estimators, and robustness,
have received interest again over the last decade. These aspects are
reviewed in the monograph Fuller (1987).
Let n be the sample size. The proof of the joint asymptotically
normal distribution of the $n^{1/2}$-normed maximum likelihood estimators
of the slope and intercept is given in Fuller (1987), p. 32. As is well
known (Barndorff-Nielsen and Cox (1989)), the saddlepoint technique
usually gives more accurate asymptotic approximations. These
approximations are good down to very small sample sizes. Although
originally derived by Daniels (1954) for approximating the distribution
of the sample mean, the saddlepoint approximation method has been
successfully extended to M-estimators (Field (1982)) and to some other
special non-linear statistics by Daniels and Young (1991). The
M-estimators form a very large class and hence the generalization from
the sample mean to the case of M-estimators turns out to be a very
significant step. The price of this generalization is, however, that
usually the saddlepoint approximation cannot be computed explicitly.
For every fixed argument value one has to iteratively solve a
complicated implicit non-linear equation system. As pointed out in Field
and Ronchetti (1990) (see page 109), in terms of computational effort,
solving such a set of non-linear systems is the dominant feature.
Moreover, the root finding procedures seem to fail often in two- or
more-dimensional problems, at least at some points of the grid.
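To make the idea concrete in the simplest setting treated by Daniels (1954), the following sketch evaluates the saddlepoint density of the mean of n independent standard exponential variables, a textbook case in which the saddlepoint equation has a closed-form root. The choice of the exponential distribution and all function names are illustrative assumptions and are not part of the model studied in this paper.

```python
import math

def saddlepoint_mean_exp(x, n):
    """Saddlepoint density of the mean of n iid Exp(1) variables at x > 0."""
    # Cumulant generating function K(s) = -log(1 - s); K'(s) = x has the root below.
    s_hat = 1.0 - 1.0 / x
    K = -math.log(1.0 - s_hat)
    K2 = 1.0 / (1.0 - s_hat) ** 2          # K''(s_hat)
    return math.sqrt(n / (2.0 * math.pi * K2)) * math.exp(n * (K - s_hat * x))

def exact_mean_exp(x, n):
    """Exact density of the mean: Gamma with shape n and rate n."""
    return n ** n * x ** (n - 1) * math.exp(-n * x) / math.gamma(n)

def normal_mean_exp(x, n):
    """Asymptotic normal density of the mean (mean 1, variance 1/n)."""
    return math.sqrt(n / (2.0 * math.pi)) * math.exp(-0.5 * n * (x - 1.0) ** 2)

for x in (0.3, 1.0, 2.5):
    print(x, saddlepoint_mean_exp(x, 5), exact_mean_exp(x, 5), normal_mean_exp(x, 5))
```

In this example the ratio of the saddlepoint density to the exact density does not depend on x, so renormalizing would make the approximation exact, whereas the normal approximation is poor in the right tail already for n = 5.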
Another obstacle on the way of practical implementation of the
method is that the implicit solution is given in terms of the cumulant
generating function of the observations, which is rarely known in
practice. Yet, there are cases where under special distribution
assumptions (e.g. normality) on the observations, the implicit equations
have explicit solutions. This allows a considerable saving in
computation. Such cases are of interest for practical applications
because for them, the cumulant generating function is also known. The
problem of finding the saddlepoint approximation of the distribution of
the slope and the intercept in the classical linear structural
relationship model falls into this category.
Let us point out that in view of the normality assumption and the
fact that canonically parameterized exponential family models have
closed form solutions for their saddlepoint equations, it is not too
surprising that an explicit solution of the implicit equations system
can be found. On the other hand, however, the presence of the
measurement error makes the problem hard enough and no obvious
presentation for this model as a canonically parameterized exponential
family model is possible. Our numerical results show that the
approximation we get significantly outperforms the usual normal-based
asymptotic approximation for small sample sizes $n \le 15$.
2. BRIEF MODEL DESCRIPTION
The linear structural relationship model is given by the
following system of equations:
$$y_i = \beta_0 + \beta_1 \xi_i + e_i \qquad (1)$$
$$x_i = \xi_i + v_i, \quad i = 1, 2, \ldots, n \qquad (2)$$
where n is the sample size. Here $y_i$ and $x_i$ ($i = 1, 2, \ldots, n$) are the
recorded observations of the dependent and independent variables. Their
"true" scores are $\beta_0 + \beta_1 \xi_i$ and $\xi_i$, respectively. The random error
disturbances in the dependent and the independent variables are $e_i$ and
$v_i$, respectively. The notion of a linear structural relationship in the
model given by the equations (1) and (2) refers to the situation where
$\xi_i$, $i = 1, 2, \ldots, n$, are independent and identically distributed random
variables (e.g. Kendall and Stuart (1979, Chapter 29)).
We shall assume independence and normal distributions for $v$, $e$
and $\xi$, with zero marginal means for $v$ and $e$ and variances $\sigma_v^2$ and
$\sigma_e^2$, respectively, and mean $\mu$ and variance $\sigma_\xi^2$ for $\xi$. If all the
parameters $\beta_0$, $\beta_1$, $\mu$, $\sigma_\xi^2$, $\sigma_e^2$ and $\sigma_v^2$ were unknown, then the most
important parameter of interest, the regression slope $\beta_1$, would be
non-identifiable (see Reiersol (1950)). In the remaining sections of this
paper we shall focus on the classical way of resolving the
non-identifiability problem. This is to assume that the ratio
$\lambda = \sigma_e^2/\sigma_v^2$ is known. The most interesting case is when $\lambda = 1$, for
otherwise the observations and the parameters can be suitably scaled to
achieve this. From now on we shall assume that $\lambda = 1$. Let us note that
$\lambda = 1$ is a very common case in the studies concerning the validity of
the Law of Initial Value (see Jin (1992), page 178). Also the Gaussian
distributional assumption seems to be reasonable for such studies.
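The maximum likelihood estimators of $\beta_0$ and $\beta_1$ can be shown to be
the intercept and the slope of that line in the plane which minimizes
the sum of the squared perpendicular distances of the observed points
$[x_i, y_i]'$ to the fitted line $\eta(\xi) = \beta_0 + \beta_1 \xi$, i.e.
$$[\hat\beta_0, \hat\beta_1]' = \arg\min_{\beta_0,\beta_1} \sum_{i=1}^{n} \frac{(y_i - \beta_0 - \beta_1 x_i)^2}{1 + \beta_1^2} \qquad (3)$$
Those estimators can be calculated from (3) as functions of the
empirical means and covariances. If $s_{xx}$, $s_{yy}$, $s_{xy}$ are the empirical
variances of X, Y and their empirical covariance, respectively, then
$$\hat\beta_1 = \frac{s_{yy} - s_{xx} + \sqrt{(s_{yy} - s_{xx})^2 + 4 s_{xy}^2}}{2 s_{xy}} \qquad (4)$$
Next, $\hat\beta_0$ can be found using the value of $\hat\beta_1$ obtained and the
empirical means $\bar x$, $\bar y$ of X and Y:
$$\hat\beta_0 = \bar y - \hat\beta_1 \bar x \qquad (5)$$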
The maximum likelihood estimators of the remaining parameters are
given by $\hat\mu = \bar x$, $\hat\sigma_\xi^2 = s_{xy}/\hat\beta_1$, $\hat\sigma_e^2 = \hat\sigma_v^2 = s_{yy} - (\hat\beta_1)^2 \hat\sigma_\xi^2$. (Note that to obtain the
maximum likelihood estimators, one should use $1/n$ rather than $1/(n-1)$
as a normalizing constant when computing $s_{xx}$, $s_{yy}$ and $s_{xy}$.)
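The estimators of this section can be computed directly from the first and second sample moments. The sketch below assumes $\lambda = 1$ and uses the standard closed form of the slope that minimizes the sum of squared perpendicular distances; the function name and the keys of the returned dictionary are illustrative choices.

```python
import numpy as np

def structural_mle(x, y):
    """Maximum likelihood estimates in the structural model with equal error variances."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xbar, ybar = x.mean(), y.mean()
    # Empirical (co)variances with the 1/n normalization required for the MLE.
    sxx = np.mean((x - xbar) ** 2)
    syy = np.mean((y - ybar) ** 2)
    sxy = np.mean((x - xbar) * (y - ybar))
    # Slope of the orthogonal-regression line; undefined when sxy == 0,
    # an event of probability zero under the model.
    beta1 = (syy - sxx + np.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    beta0 = ybar - beta1 * xbar
    var_xi = sxy / beta1
    var_err = syy - beta1 ** 2 * var_xi      # common value of the two error variances
    return {"beta0": beta0, "beta1": beta1, "mu": xbar,
            "var_xi": var_xi, "var_err": var_err}

# Points lying exactly on y = 3 + 1.5 x are reproduced without error.
print(structural_mle([0, 1, 2, 3, 4], [3.0, 4.5, 6.0, 7.5, 9.0]))
```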
Let us point out that the non-linear nature of the objective
function in (3) gives rise to non-normally distributed estimators $\hat\beta_0$
and $\hat\beta_1$. As our simulations show, the deviation from normality is
noticeable, especially for small sample sizes such as n=5 to n=15.
The key observation which allows the computation of a saddlepoint
approximation for the joint density of the vector statistic of the
slope and the intercept estimators discussed is that this statistic is
the solution of a two-dimensional M-equation. Let us define the vector
function
Then the vector statistic $[\hat\beta_0, \hat\beta_1]'$ is the solution in $t$ of the
two-dimensional equation system
$$\sum_{i=1}^{n} \psi(y_i, x_i, t) = 0 \qquad (6)$$
Moreover, if we assume that $\hat\beta_1$ and $s_{xy}$ should have the same sign
(which is quite a natural assumption) then this solution is even
unique. Note that in the case where $s_{xy} = 0$ some additional difficulties
arise. But first, such a situation occurs with zero probability and
secondly, in such a case it should be assumed that $\hat\beta_1 = 0$ and then $\hat\beta_0 = \bar y$.
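The M-equation property can be checked numerically. Differentiating the sum of squared perpendicular distances with respect to intercept and slope suggests one admissible choice of the vector function, $\psi_1 = y - t_0 - t_1 x$ and $\psi_2 = (y - t_0 - t_1 x)(x + t_1 (y - t_0))$. This sign and scaling convention is an assumption made for the sketch below (any non-zero rescaling of the components defines the same estimator), and the data values are made up for illustration.

```python
import numpy as np

def psi(y, x, t0, t1):
    """Assumed score-type vector function; rows are psi_1 and psi_2."""
    r = y - t0 - t1 * x                      # vertical residual
    return np.array([r, r * (x + t1 * (y - t0))])

def orthogonal_fit(x, y):
    """Intercept and slope minimizing the sum of squared perpendicular distances."""
    xc, yc = x - x.mean(), y - y.mean()
    sxx, syy, sxy = np.mean(xc ** 2), np.mean(yc ** 2), np.mean(xc * yc)
    b1 = (syy - sxx + np.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    return y.mean() - b1 * x.mean(), b1, sxy

# Hypothetical sample of size n = 5.
x = np.array([0.1, 1.2, 1.9, 3.2, 3.8])
y = np.array([3.2, 4.6, 6.1, 7.5, 9.0])
b0, b1, sxy = orthogonal_fit(x, y)
score = psi(y, x, b0, b1).sum(axis=1)        # should vanish at the estimate
print(b0, b1, score)
```

The root of the quadratic in the slope that is selected here is the one sharing the sign of the empirical covariance, in line with the uniqueness remark above.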
3. THE SADDLEPOINT APPROXIMATION FOR GENERAL MULTIVARIATE M-STATISTICS
Daniels's original idea of saddlepoint approximation of the
density of the mean was first extended to the case of one-dimensional
M-estimators of location by Field and Hampel (1982). Later Field (1982)
used different reasoning to show that the density of general
multidimensional M-estimators can be approximated. His idea was to write
the M-estimator locally as a mean and then to use the saddlepoint
approximation for the mean. The approach of Field is closely related to
the tilted Edgeworth expansions. In the remaining sections we adopt the
notation used in the monograph Field and Ronchetti (1990). To approximate
the density of the M-estimator $[\hat\beta_0, \hat\beta_1]'$ at some fixed point
$t = [t_0, t_1]'$, one first has to recenter the original density of the
observations $[y, x]'$ by embedding it into an exponential family. For a
fixed value $t = [t_0, t_1]'$ the new density (the conjugate density) has
the form:
$$h_t(y,x) = c(t) \exp\Bigl\{\sum_{j=1}^{2} \alpha_j(t)\, \psi_j(y,x,t)\Bigr\} f(y,x) \qquad (7)$$
where
$$c^{-1}(t) = \iint \exp\Bigl\{\sum_{j=1}^{2} \alpha_j(t)\, \psi_j(y,x,t)\Bigr\} f(y,x)\, dy\, dx \qquad (8)$$
The original density $f(y,x)$ in the model considered is bivariate normal
with parameters: mean vector equal to $[\beta_0 + \beta_1 \mu, \mu]'$ and covariance matrix
$$B = \begin{pmatrix} \beta_1^2 \sigma_\xi^2 + \sigma_e^2 & \beta_1 \sigma_\xi^2 \\ \beta_1 \sigma_\xi^2 & \sigma_\xi^2 + \sigma_v^2 \end{pmatrix}$$
The functions $\alpha_1(t)$ and $\alpha_2(t)$ should be chosen in such a way that
the following equalities
$$\iint \psi_j(y,x,t) \exp\Bigl\{\sum_{i=1}^{2} \alpha_i(t)\, \psi_i(y,x,t)\Bigr\} f(y,x)\, dy\, dx = 0 \qquad (9)$$
should hold for j=1,2.
Having found the solutions $\alpha_1(t)$ and $\alpha_2(t)$ from (9), one can
substitute them into (7) to obtain the tilted density. Then one can get
the determinants of the 2x2 matrices $A = E\{\partial\psi(y,x,t)/\partial t\}$ and
$\Sigma = E\{\psi(y,x,t)\,\psi(y,x,t)'\}$.
Note that all expected values are computed with respect to the
conjugate density $h_t(y,x)$. After all these preparations one is able to
compute the saddlepoint approximation for the joint density $p(t_0,t_1)$ of
the statistics $[\hat\beta_0, \hat\beta_1]'$ at the point $t = [t_0, t_1]'$ as follows (e.g.
Field (1982), Theorem 1 or Field and Ronchetti (1990), Theorem 4.5):
$$p(t_0,t_1) = \frac{n}{2\pi}\, c^{-n}(t_0,t_1)\, |\det A|\, |\det \Sigma|^{-1/2}\, [1 + O(1/n)] \qquad (10)$$
4. CALCULATING THE MATRICES A AND Σ
For the approximation (10) to hold some conditions should be
satisfied (see p. 62 of Field and Ronchetti (1990)). Because of the
normality assumptions these conditions are satisfied. Now denote by
$E_h(\cdot)$ the expected value with respect to the shifted density $h_t(y,x)$
and by $\sigma_{11}$, $\sigma_{22}$ and $\sigma_{12}$ the variance of Y, the variance of X and the
covariance of X and Y computed with respect to the same density. Note
that their dependence on t has been suppressed in the notation. Then
after some easy transformations it can be seen that the two equations
in (9) are equivalent to the following relations:
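As a worked illustration of how such relations arise, suppose the vector function is taken as $\psi_1 = y - t_0 - t_1 x$ and $\psi_2 = (y - t_0 - t_1 x)(x + t_1 (y - t_0))$, which is an assumption about sign and scaling rather than a quotation of the original display. Setting the tilted expectations of both components to zero then gives:

```latex
\begin{align*}
E_h(\psi_1) = 0 \;&\Longrightarrow\; E_h(y) = t_0 + t_1 E_h(x), \\
E_h(\psi_2) = \operatorname{Cov}_h\bigl(y - t_1 x,\; x + t_1 y\bigr)
            &= (1 - t_1^2)\,\sigma_{12} + t_1(\sigma_{11} - \sigma_{22}) = 0 .
\end{align*}
```

The second line uses the first one: once $y - t_0 - t_1 x$ has zero mean under $h_t$, the expectation of the product reduces to a covariance. For $t_1 = 0$ the second relation forces $\sigma_{12} = 0$, which agrees with the special case discussed later in this section.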
Next we observe that because of the special form of the functions
$\psi_1$ and $\psi_2$ in our problem (containing no more than a second degree of x
and y), it could be expected that the shifted density $h_t(y,x)$ is again
normal. Let $b = E_h(x)$. Then for every fixed t this density should be
proportional to the expression
The coefficients in front of the powers $x^2$, $y^2$, $xy$, $x$ and $y$
in the exponents in (7) and (13) have to be equal. This leads to the
following equation system:
Here we have substituted $\Delta = \sigma_v^2 \sigma_\xi^2 (1 + \beta_1^2) + \sigma_v^4$ for the determinant
det(B). The six equations (12) and (14)-(18) can be solved with respect
to the six unknown quantities $b$, $\sigma_{11}$, $\sigma_{22}$, $\sigma_{12}$, $\alpha_1$ and $\alpha_2$. Details about
the way to find the solution are available from the author. Here we
shall only give the final result.
If $t_1 \neq 0$ then we can find recurrently the quantities involved by
first computing
Then $\sigma_{12}$ can be substituted to get the values of $\sigma_{11}$:
and of $\sigma_{22}$:
The result for $\alpha_2$ can be obtained then as follows:
where the quantity $\sigma_{11}\sigma_{22} - \sigma_{12}^2 > 0$ enters. Next the solution for b can be obtained:
The final step is to substitute the value of b to get $\alpha_1$:
Note that special care is needed for the case $t_1 = 0$, as then the
results obtained for $\sigma_{11}$, $\sigma_{22}$ and $\alpha_2$ are undetermined. But in this case
we see from equation (12) that $\sigma_{12} = 0$. Then (15) gives $\sigma_{11} = \Delta/(\sigma_\xi^2 + \sigma_v^2)$,
(14) gives $\sigma_{22} = \Delta/(\beta_1^2 \sigma_\xi^2 + \sigma_e^2)$ and (16) gives $\alpha_2 = \beta_1 \sigma_\xi^2/\Delta$. Further on we
obtain from (17) that $b = \sigma_{22}(t_0 \beta_1 \sigma_\xi^2 - \beta_0 \beta_1 \sigma_\xi^2 + \mu \sigma_e^2)/\Delta$ and from (18) that
All expressions we get in case $t_1 = 0$ can be easily seen to be just the
limits of the corresponding general formulae for $t_1 \neq 0$ when $t_1 \to 0$.
Hence the implicit equation system (9) has an explicit solution.
Utilizing this solution we can also find explicit expressions for the
matrices A and Σ. The matrix A has the form
All expected values can easily be computed. The final result is:
$\det A = \frac{1 + t_1^2}{t_1}\, \sigma_{12}$. In addition, the case $t_1 = 0$ needs a special consideration.
Using equation (19) we see that the limit of det A when $t_1 \to 0$ is
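The value of det A can be cross-checked by a short calculation. The sketch assumes $\psi_1 = r$ and $\psi_2 = r\,u$ with $r = y - t_0 - t_1 x$ and $u = x + t_1 (y - t_0)$ (a sign and scaling convention chosen for illustration), and uses the centering conditions $E_h(r) = 0$ and $\operatorname{Cov}_h(r,u) = (1 - t_1^2)\sigma_{12} + t_1(\sigma_{11} - \sigma_{22}) = 0$:

```latex
\begin{align*}
A &= E_h\begin{pmatrix} -1 & -x \\ -u - t_1 r & -x u + r (y - t_0) \end{pmatrix}, \\
\det A &= \operatorname{Cov}_h(x,u) - \operatorname{Cov}_h(r,y)
        = \sigma_{22} - \sigma_{11} + 2 t_1 \sigma_{12}
        = \frac{1 + t_1^2}{t_1}\,\sigma_{12} \quad (t_1 \neq 0).
\end{align*}
```

The last equality substitutes $\sigma_{22} - \sigma_{11} = (1 - t_1^2)\sigma_{12}/t_1$. In the middle form nothing is divided by $t_1$, so at $t_1 = 0$, where $\sigma_{12} = 0$, the determinant is simply $\sigma_{22} - \sigma_{11}$ evaluated with the $t_1 = 0$ values of the two variances.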
Similar but more involved computations help us to calculate the
expected values of the elements of the matrix Σ. Finding $E_h(\psi_1 \psi_2)$ and
$E_h(\psi_2^2)$ requires the knowledge of higher order moments of the random
variables Y and X with respect to the shifted density $h_t(y,x)$. The
normality assumption is essentially used for this part. The details of
the computation are available from the author. The final result is:
det l= (tig2-2tla12+oll)[(20~2+011022)(t~-41) +3t;(G;]+ 0:2)
It can be shown that Σ is non-degenerate, i.e. det Σ > 0 holds.
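The pieces of Sections 3 and 4 can be assembled numerically as follows. This is only a sketch under stated assumptions: it takes $\psi_1 = r$, $\psi_2 = r\,u$ with $r = y - t_0 - t_1 x$ and $u = x + t_1 (y - t_0)$, it solves the centering equations by a plain Newton iteration with a finite-difference Jacobian instead of the explicit formulae of this section, and it uses the facts that under this $\psi$ the tilted density is bivariate normal, that $\det A = \sigma_{22} - \sigma_{11} + 2 t_1 \sigma_{12}$, and that $\det \Sigma = \operatorname{Var}_h(r)^2 \operatorname{Var}_h(u)$ once the centering conditions hold. All function names are hypothetical.

```python
import numpy as np

def model_moments(beta0, beta1, mu, var_xi, var_err):
    """Mean vector and covariance matrix of (y, x) when both error variances equal var_err."""
    m = np.array([beta0 + beta1 * mu, mu])
    B = np.array([[beta1 ** 2 * var_xi + var_err, beta1 * var_xi],
                  [beta1 * var_xi, var_xi + var_err]])
    return m, B

def tilted_moments(alpha, t, m, B):
    """Mean, covariance and log c^{-1}(t) of the tilted normal density for given alpha."""
    t0, t1 = t
    a1, a2 = alpha
    # alpha1*psi1 + alpha2*psi2 = 0.5 z'Qz + q'z + k with z = (y, x).
    Q = a2 * np.array([[2 * t1, 1 - t1 ** 2], [1 - t1 ** 2, -2 * t1]])
    q = a1 * np.array([1.0, -t1]) + a2 * np.array([-2 * t0 * t1, -(1 - t1 ** 2) * t0])
    k = -a1 * t0 + a2 * t1 * t0 ** 2
    Binv = np.linalg.inv(B)
    P = Binv - Q                             # tilted precision; must stay positive definite
    S = np.linalg.inv(P)
    eta = Binv @ m + q
    log_cinv = (0.5 * (eta @ S @ eta - m @ Binv @ m) + k
                - 0.5 * np.log(np.linalg.det(B) * np.linalg.det(P)))
    return S @ eta, S, log_cinv

def centering(alpha, t, m, B):
    """Tilted expectations of psi_1 and psi_2; both must vanish at the solution."""
    mean, S, _ = tilted_moments(alpha, t, m, B)
    t0, t1 = t
    a, c = np.array([1.0, -t1]), np.array([t1, 1.0])
    Er = a @ mean - t0
    Eu = c @ mean - t1 * t0
    return np.array([Er, a @ S @ c + Er * Eu])

def solve_alpha(t, m, B, tol=1e-12, max_iter=50, h=1e-6):
    """Newton iteration for the centering equations, started at alpha = 0."""
    alpha = np.zeros(2)
    for _ in range(max_iter):
        g = centering(alpha, t, m, B)
        if np.max(np.abs(g)) < tol:
            break
        J = np.column_stack([(centering(alpha + h * e, t, m, B) - g) / h
                             for e in np.eye(2)])
        alpha = alpha - np.linalg.solve(J, g)
    return alpha

def saddlepoint_density(t, n, m, B):
    """Unnormalized saddlepoint density at t, leading term of the approximation."""
    alpha = solve_alpha(t, m, B)
    _, S, log_cinv = tilted_moments(alpha, t, m, B)
    t1 = t[1]
    s11, s12, s22 = S[0, 0], S[0, 1], S[1, 1]
    det_A = s22 - s11 + 2 * t1 * s12
    var_r = s11 - 2 * t1 * s12 + t1 ** 2 * s22
    var_u = s22 + 2 * t1 * s12 + t1 ** 2 * s11
    return n / (2 * np.pi) * np.exp(n * log_cinv) * abs(det_A) / np.sqrt(var_r ** 2 * var_u)

m, B = model_moments(3.0, 1.5, 1.0, 2.5, 0.25)
print(saddlepoint_density((3.0, 1.5), 5, m, B), saddlepoint_density((3.0, 1.4), 5, m, B))
```

At t equal to the true parameter vector no tilting is needed, so the iteration stops immediately with $\alpha = 0$ and $c(t) = 1$. Away from that point the explicit solution described in this section would replace the Newton loop and remove the risk of a failed root search.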
5. SIMULATION STUDY
Looking at formula (10), we see that the only quantity which
remains to be computed is the function $c^{-n}(t_0,t_1)$. For every fixed
value of t we have to evaluate the double integral in (8). As the
integrand is known, this may be done numerically. The corresponding
NAG Library routine D01DAF has proven to be safe enough to perform a
reliable two-dimensional integration. Also the normalizing constant
should be determined numerically in such a way that the density
approximation integrates to one. This renormalized saddlepoint density
gives an even better improvement (Field and Ronchetti (1990)).
Our numerical calculations show the effectiveness of the
saddlepoint approximation for the joint density for sample sizes $n \le 15$
as compared to the normal asymptotic approximation. The advantage
then disappears quickly as the sample size increases. For the sake
of illustration we fit the joint density of the maximum likelihood
estimators of the parameters $\beta_0 = 3$ and $\beta_1 = 1.5$ of the model (1)-(2) with
n=5, $\mu = 1$, $\sigma_\xi^2 = 2.5$ and $\sigma_v^2 = \sigma_e^2 = 0.25$. Figure 1 shows the asymptotic normal
approximation for the density of the vector $[\hat\beta_0, \hat\beta_1]'$. It has mean
$[\beta_0, \beta_1]'$ and a covariance matrix in which
$\gamma = [(1 + \beta_1^2)\sigma_\xi^2 + \sigma_e^2] \cdot \sigma_e^2/\sigma_\xi^4$ (compare Fuller (1987), p. 32). Figure 2 shows the
renormalized saddlepoint approximation for the same density. Figure 3
shows the "true" (based on 600 000 simulations) density. The graphs
show the contours of the density surfaces. In addition, a pseudocolor
plot is superimposed on the same graphs to enable better viewing. The
density values are calculated on a grid of 50x50 equidistant points
around the vector parameter [3,1.5]'. Numerical values of the density
approximations show a significant overall advantage of the saddlepoint
approximation as compared to the asymptotic normal approximation. A
careful study of the density surfaces in Figures 1, 2 and 3 confirms
this advantage.
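A simulated reference density of the kind described above can be produced along the following lines. This is a minimal sketch: the number of replications is kept far smaller than in the study, and the plotting window, the seed and the function name are assumptions made for the example.

```python
import numpy as np

def simulate_mle(n, beta0, beta1, mu, var_xi, var_err, reps, seed=0):
    """Draw `reps` samples of size n from the model and return intercept and slope MLEs."""
    rng = np.random.default_rng(seed)
    xi = rng.normal(mu, np.sqrt(var_xi), size=(reps, n))
    x = xi + rng.normal(0.0, np.sqrt(var_err), size=(reps, n))
    y = beta0 + beta1 * xi + rng.normal(0.0, np.sqrt(var_err), size=(reps, n))
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    sxx, syy, sxy = (xc ** 2).mean(axis=1), (yc ** 2).mean(axis=1), (xc * yc).mean(axis=1)
    b1 = (syy - sxx + np.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    b0 = y.mean(axis=1) - b1 * x.mean(axis=1)
    return b0, b1

b0, b1 = simulate_mle(5, 3.0, 1.5, 1.0, 2.5, 0.25, reps=20000)
# Histogram estimate of the joint density on a 50x50 grid around [3, 1.5]'.
hist, edges0, edges1 = np.histogram2d(b0, b1, bins=50,
                                      range=[[1.5, 4.5], [0.5, 2.5]], density=True)
print(np.median(b0), np.median(b1), hist.max())
```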
The normal approximation is much more stretched whereas the
saddlepoint approximation almost coincides with the true non-symmetric
density surface for all values of the argument $[t_0, t_1]'$. Note that the
possibility to avoid iterations by explicitly solving the "centering
FIG. 1. The asymptotic normal approximation for the joint density
of the vector $[\hat\beta_0, \hat\beta_1]'$ with n=5, $\beta_0 = 3$, $\beta_1 = 1.5$, $\mu = 1$, $\sigma_\xi^2 = 2.5$ and $\sigma_v^2 = \sigma_e^2 = 0.25$.
equations" (9) makes it easy to obtain density approximations on a
dense grid of values $[t_0, t_1]'$ and hence, to obtain very accurate
contours of the confidence regions for $[\beta_0, \beta_1]'$. The unknown parameters
$\sigma_v^2 = \sigma_e^2$, $\mu$ and $\sigma_\xi^2$ involved in the construction can be substituted by
their estimators discussed in Section 2.
In most applications the density of the estimator $\hat\beta_1$ of the slope
is of primary interest. To obtain it, one has to integrate out the
unwanted variable $t_0$ in the bivariate density approximation. This can
FIG. 2. The saddlepoint approximation for the joint density of
the vector $[\hat\beta_0, \hat\beta_1]'$ with n=5, $\beta_0 = 3$, $\beta_1 = 1.5$, $\mu = 1$, $\sigma_\xi^2 = 2.5$ and $\sigma_v^2 = \sigma_e^2 = 0.25$.
be done either numerically or by a Laplace approximation. As pointed
out by Daniels and Young (1991), the numerical integration seems to be
the safest procedure. We applied the numerical integration method after
first computing the joint density on a grid of 50x50 equidistant
points. The results for n=5 are shown in Figure 4.
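The renormalization and the numerical integration over the unwanted variable can be sketched on a grid as follows. The trapezoidal rule is used here as a stand-in for whichever quadrature was applied in the study, and the gridded input is a made-up unnormalized Gaussian surface rather than an actual saddlepoint approximation; the function names are illustrative.

```python
import numpy as np

def trapezoid(values, grid, axis):
    """Composite trapezoidal rule along one axis of a gridded function."""
    values = np.moveaxis(values, axis, -1)
    widths = np.diff(grid)
    return np.sum(0.5 * (values[..., 1:] + values[..., :-1]) * widths, axis=-1)

def renormalize_and_marginalize(density, t0_grid, t1_grid):
    """Rescale a joint density on a (t0, t1) grid to integrate to one, then integrate out t0."""
    marginal = trapezoid(density, t0_grid, axis=0)    # function of t1 only
    total = trapezoid(marginal, t1_grid, axis=0)      # overall normalizing constant
    return density / total, marginal / total

# Illustrative 50x50 grid and an unnormalized surface standing in for p(t0, t1).
t0_grid = np.linspace(0.0, 6.0, 50)
t1_grid = np.linspace(0.5, 2.5, 50)
T0, T1 = np.meshgrid(t0_grid, t1_grid, indexing="ij")
raw = 3.0 * np.exp(-(T0 - 3.0) ** 2 / (2 * 0.25) - (T1 - 1.5) ** 2 / (2 * 0.04))
joint, slope_marginal = renormalize_and_marginalize(raw, t0_grid, t1_grid)
print(trapezoid(slope_marginal, t1_grid, 0))
```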
Note that the saddlepoint approximation clearly outperforms the
normal approximation, especially in the tails (which are most important
for constructing confidence intervals). The saddlepoint approximation
captures the skewness of the true density very accurately. This means
that for small n, the confidence intervals for $\beta_1$ based on the
FIG. 3. The "true" (based on 500 000 simulations) density of the
vector $[\hat\beta_0, \hat\beta_1]'$ with n=5, $\beta_0 = 3$, $\beta_1 = 1.5$, $\mu = 1$, $\sigma_\xi^2 = 2.5$ and $\sigma_v^2 = \sigma_e^2 = 0.25$.
saddlepoint approximation will have a better coverage accuracy as
compared to those based on asymptotic normality. Numerical values for
the density approximations for n=5 are given in Table 1.
6. CONCLUSION
This paper derives the explicit solution to the implicit
"centering equation" for the saddlepoint approximation of the density.
The model considered is the classical (known error- variance ratio)
univariate Gaussian linear structural relationship model and the
estimators are the maximum- likelihood estimators of slope and
intercept.
FIG. 4. "True" marginal density of the slope-estimator, dashed line, the saddlepoint approximation, solid line, and the asymptotic normal approximation, dash-dotted line, with n=5, $\beta_0 = 3$, $\beta_1 = 1.5$, $\mu = 1$, $\sigma_\xi^2 = 2.5$ and $\sigma_v^2 = \sigma_e^2 = 0.25$.
TABLE 1. Marginal density of the slope-estimator for n=5, $\beta_1 = 1.5$, $\beta_0 = 3$, $\mu = 1$, $\sigma_\xi^2 = 2.5$ and $\sigma_v^2 = \sigma_e^2 = 0.25$.
The approach discussed in Sections 2, 3 and 4 can be generalized
to the multi-dimensional case. In §29.24 of Kendall and Stuart (1979)
the general model
$$\sum_{j=0}^{k} \alpha_j \xi_j = 0 \qquad (22)$$
with $\xi_0 = X_0 = 1$ is considered. Here, apart from $\xi_0$, the other $\xi$'s are
subject to errors, so that one observes X given by
$$x_{ij} = \xi_{ij} + \delta_{ij}, \quad i = 1, 2, \ldots, n;\; j = 1, 2, \ldots, k \qquad (23)$$
One assumes that the $\delta$'s are normally distributed with zero mean,
independently of $\xi$ and of each other, and that all $\delta$'s have the same
unknown variance. Maximization of the likelihood function leads to an
easily interpretable geometric minimization problem of the kind
discussed in (3), and the solution is obtained when the $\xi$'s are the feet
of the perpendiculars from the X's on to the hyperplane given in (22).
Utilizing this fact and imposing the constraint $\sum_{i=0}^{k} \alpha_i^2 = 1$, one can
then see that the maximum likelihood estimator $[\hat\alpha_0, \hat\alpha_1, \ldots, \hat\alpha_k]'$ is a
solution of an equation system
Here the multiplier appearing in (24) is a Lagrange multiplier and can be
determined in advance as the smallest eigenvalue of a certain positive
definite matrix (details can be found in §29.25 of the above monograph).
The system (24) can be seen as a generalization of the system (6). One
can write down the centering equations which generalize (9) and can find
their explicit solution. The calculations are rather messy, though.
Finally, let us point out that the maximum likelihood estimators
for slope and intercept are not qualitatively robust. The density
approximations we derive are very accurate when the Gaussian
assumptions for the model hold. This is the merit of the saddlepoint
approximation. If these assumptions are under suspicion, we suggest
using robust estimators (as developed in Cheng and Van Ness (1992)).
ACKNOWLEDGEMENTS
This work was supported by the Australian Research Council. The
author is indebted to Prof. E. Ronchetti for his valuable comments.
BIBLIOGRAPHY
Adcock, R.J. (1877). "Note on the method of least squares," Analyst, 4, 183-184.
Barndorff-Nielsen, O.E. and Cox, D.R. (1989). Asymptotic Techniques for Use in Statistics. Chapman & Hall, London.
Bickel, P.J. and Ritov, Y. (1987). "Efficient estimation in the errors-in-variables model," Annals of Statistics, 15, 513-540.
Cheng, Chi-Lun and Van Ness, J.W. (1992). "Generalized M-estimators for errors-in-variables regression," Annals of Statistics, 20, 385-397.
Daniels, H.E. and Young, G.A. (1991). "Saddlepoint approximation for the studentized mean, with an application to the bootstrap," Biometrika, 78, 169-179.
Daniels, H.E. (1954). "Saddlepoint approximations in statistics," Ann. Math. Statistics, 25, 631-650.
Field, C.A. (1982). "Small sample asymptotic expansions for multivariate M-estimates," Annals of Statistics, 10, 672-689.
Field, C.A. and Hampel, F.R. (1982). "Small sample asymptotic distributions of M-estimators of location," Biometrika, 69, 29-46.
Field, C.A. and Ronchetti, E. (1990). Small Sample Asymptotics, IMS Lecture Notes - Monograph Series, Vol. 13.
Fuller, W.A. (1987). Measurement Error Models, Wiley, New York.
Jin, P. (1992). "Toward a Reconceptualization of the Law of Initial Value," Psychological Bulletin, Vol. 111, No. 1, 176-184.
Kendall, M.G. and Stuart, A. (1979). The Advanced Theory of Statistics, Vol. 2: Inference and Relationship (Fourth Edition), Griffin, London.