Estimating a spatial autoregressive model with an endogenous
spatial weight matrix
Xi Qu*, Lung-fei Lee
The Ohio State University
October 29, 2012
Abstract
The spatial autoregressive (SAR) model is a standard tool for analyzing data with spatial correlation.
Conventional estimation methods rely on the key assumption that the spatial weight matrix W is strictly
exogenous, which is likely to be violated in empirical analyses. This paper presents the specification and
estimation of the SAR model with an endogenous spatial weight matrix. The outcome equation is a
cross-sectional SAR model where W has a model structure on its entries. Endogeneity of W comes from
the correlation between the error terms in its entries and the disturbances in the SAR outcome equation. We
explore three estimation methods for this model with an endogenous spatial weight matrix: the two-stage
instrumental variable (2SIV) method, the maximum likelihood (ML) approach, and the generalized method
of moments (GMM). We then establish the consistency of the estimators from these methods and derive
their asymptotic distributions. Finite sample properties of these estimators are investigated by Monte
Carlo simulations.
Keywords: Spatial autoregressive model; Endogenous spatial weight matrix; Estimation methods
1 Introduction
The SAR model is of great interest to economists because it has a game structure and can be interpreted as
a reaction function. It is widely used in spatial econometrics and for modeling social networks. In spatial
econometrics, the SAR model has been applied to cases where the outcome of a spatial unit at one location
depends on those of its neighbors. The corresponding spatial weight matrix is a measure of distance between
different locations. Consequently, the spatial dependence parameter provides a multiplier for the spillover
effect. SAR models can also be used to model social networks. For example, a student's behavior (such as
smoking or academic achievement) can be directly affected by his/her friends' behaviors. The weight matrix
*Corresponding author: Department of Economics, Ohio State University, 310 Arps Hall, Columbus, OH 43210, [email protected].
can then be constructed by using the friendship relation, and the network (spatial) dependence parameter can
be interpreted as the strength of peer effects. Since spillover and peer effects are important to governments
for school policies, how to consistently estimate the spatial dependence parameter has drawn econometricians'
attention.
Spatial econometrics has well-established estimation procedures for the SAR model with an exogenous
spatial weight matrix: see, for example, Ord (1975) and Lee (2004) for the MLE, Anselin (1980) and
Kelejian and Prucha (1998, 1999) for the IV methods, and Lee (2007), Lee and Liu (2010), Lin and Lee
(2010), and Liu et al. (2010) for the GMM. Consistency and asymptotic normality of these estimators rely
on the key assumption that the spatial weight matrix is strictly exogenous. This may be true when spatial
weights are constructed using predetermined geographic distances, for example, between different cities or
countries. However, if an "economic distance" such as relative GDP or trade volume is used to construct the
weight matrix, it is quite likely that these elements are correlated with the final outcome. Similarly, in
the social network framework, some unobserved characteristics may affect both the friendship relationship
and behavioral outcomes (Hsieh and Lee 2011). Even in the pure physical distance case, the spatial weight
matrix may be endogenous for individuals if the location involves a decision process, as when we consider
neighborhood choices or the migration of an individual. Therefore, in empirical work, the exogenous spatial
weight assumption is likely to be violated.
Many researchers have recognized this problem, but owing to the complications engendered by spatial models
with an endogenous spatial weight matrix, little research has been done along this line. Pinkse and Slade
(2009) pointed out future directions for spatial econometrics. Among the several problems they
emphasized, endogeneity of spatial weights was an important one. They concluded that "many of these are
still waiting for good solutions" and that the endogeneity problem "can admittedly be challenging."
In this paper, we attempt to tackle the endogenous weights issue. By making explicit assumptions on
the source of endogeneity, we obtain two sets of equations: one is the SAR outcome equation, and the other
is the equation for the entries of the spatial weight matrix. The disturbances in the SAR outcome equation
and the error terms in the entry equation can be correlated. When the correlation coefficient between them
is nonzero, it introduces endogeneity into the spatial weights. We focus on estimation issues for this type
of SAR model. Assuming a jointly normal distribution, conditional on the sources of the spatial weights, the
disturbances in the SAR outcome equation provide unobserved control variables. Exploiting these unobservable
control variables for the endogeneity in the outcome SAR equation, we consider three estimation methods. The
first estimation method is a two-stage instrumental variables (2SIV) approach: in the first stage,
we consistently estimate the entry equation; in the second stage, we substitute the residuals of the entry
equation into the outcome equation to replace the unobserved control variables and then use standard
IV methods to estimate the SAR outcome equation. The second method, the ML approach, is based on
the joint normal distribution function, so all the parameters can be jointly estimated via the likelihood
function of the equation system. The third method is a GMM approach, in which an outcome equation with
control variables for endogeneity introduces additional quadratic moments for identification, and the best
moment conditions can be found.
The main task of this paper is to show the consistency and asymptotic normality of these three estimators.
The estimates involve statistics with linear-quadratic forms of disturbances, but the quadratic matrix involves
the spatial weight matrix. The difficulty arises from the fact that the entries of the quadratic matrix are
nonlinear functions of disturbances, so those statistics are not really of quadratic form with nonstochastic
quadratic matrices. Therefore, the standard results for linear-quadratic forms do not directly apply to the
situation here. For the asymptotic theory to apply, we assume that each agent can be directly affected by
its physical neighbors, but the magnitude of a neighbor's effect depends on the endogenous entries.
The rest of this paper is organized as follows. In Section 2, we present the model specification of the
outcome equation and the entries of its spatial weight matrix. In Section 3, we explain the 2SIV and ML
estimation methods for this model. Consistency and asymptotic normality of estimates from these methods
are derived in Sections 4 and 5. In Section 6, we relax some of the assumptions used and also
briefly discuss the best GMM estimation with linear and quadratic moments. In Section 7, Monte Carlo
simulations are employed to investigate finite sample properties and compare the performance of the proposed
estimation methods with traditional methods. The log likelihood, its first and second derivatives, and some
related expressions are collected in Appendix A. Proofs for all the lemmas, propositions, and theorems are
given in Appendix B.
2 The model
2.1 Model specification
Following Jenish and Prucha (2009), we consider spatial processes located on a (possibly) unevenly spaced
lattice $D \subset \mathbb{R}^d$, $d \geq 1$. The asymptotic methods we employ are increasing-domain asymptotics: growth of the
sample is ensured by an unbounded expansion of the sample region. The lattice satisfies the following basic
assumption.

Assumption 1 The lattice $D \subset \mathbb{R}^d$, $d \geq 1$, is countably infinite. All elements in $D$ are located at distances
of at least $\rho_0 > 0$ from each other, i.e., $\forall i, j \in D: \rho(i, j) \geq \rho_0$; w.l.o.g. we assume that $\rho_0 = 1$.
Let $\{(\varepsilon_{i,n}, v_{i,n}); i \in D_n, n \in \mathbb{N}\}$ be a triangular double array of real random variables defined on a
probability space $(\Omega, \mathcal{F}, P)$, where the index set $D_n$ is a finite subset of $D$, and $D$ satisfies Assumption 1.
The variables $Z_n$, from which the spatial weights are constructed, are generated by

$$Z_n = X_n\Gamma + \varepsilon_n, \qquad (2.1)$$

where $X_n$ is an $n \times k$ matrix with elements $\{x_{i,n}; i \in D_n, n \in \mathbb{N}\}$ that are deterministic and bounded in
absolute value for all $i$ and $n$, $\Gamma$ is a $k \times p$ matrix of coefficients, $\varepsilon_n = (\varepsilon_{1,n}, \dots, \varepsilon_{n,n})'$ is an $n \times p$ matrix of
disturbances with $\varepsilon_{i,n} = (\varepsilon_{i1}, \dots, \varepsilon_{ip})'$ being $p$-dimensional column vectors, and $Z_n = (z_1, \dots, z_n)'$ is an $n \times p$
matrix with $z_i = (z_{i1}, \dots, z_{ip})'$. $W_n = (w_{ij,n})$ is an $n \times n$ non-negative matrix with its elements constructed
by $Z_n$: $w_{ij,n} = h_{ij}(Z_n)$ for $i, j = 1, \dots, n$.$^1$ $Y_n = (y_{1,n}, \dots, y_{n,n})'$ is an $n \times 1$ vector from a cross-sectional SAR
model specified as

$$Y_n = \lambda W_n Y_n + X_n\beta + V_n, \qquad (2.2)$$

where $V_n = (v_{1,n}, \dots, v_{n,n})'$, $\lambda$ is a scalar, and $\beta = (\beta_1, \dots, \beta_k)'$ is a $k \times 1$ vector of coefficients.
2.2 Model interpretation
We consider $n$ agents in an area, each endowed with a predetermined location $i$. Any two agents are separated
by a distance of at least 1. Due to spillover effects, each agent $i$ has an outcome $y_{i,n}$ directly
affected by its neighbors' outcomes $y_{j,n}$'s. The outcome equation is $y_{i,n} = \lambda \sum_{j \neq i} w_{ij,n} y_{j,n} + x_{i,n}'\beta + v_{i,n}$,
where the spatial weight $w_{ij,n}$ is a measure of the relationship between agents $i$ and $j$, and the spatial coefficient
$\lambda$ provides a multiplier for the spillover effects. However, the spatial weight $w_{ij,n}$ is not predetermined but
depends on some observable variable $Z_n$. We can think of $z_{i,n}$ as economic variables at location $i$, such
as GDP, consumption, the economic growth rate, etc.
This specification introduces endogeneity into the spatial weight matrix and has been used in the
literature. For example, Anselin and Bera (1997) stated that, in economic applications, the use of weights
based on "economic" distance has been suggested, and they provided several examples. In Case et al. (1993),
weights (before row normalization) of the form $w_{ij} = 1/(z_i - z_j)$ were specifically suggested, where $z_i$ and
$z_j$ are observations on "meaningful" socioeconomic characteristics. Conway and Rork (2004) used
migration flow data to construct a spatial weight matrix. Another example is Crabbé and Vandenbussche
(2008), where, in addition to the physical distance, spatial weight matrices were constructed from the inverse
trade share and the inverse distance between GDP per capita.
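As a concrete illustration of such constructions, the following sketch builds a weight matrix from simulated locations and a scalar economic variable. The similarity function $g(z_i, z_j) = 1/(1 + |z_i - z_j|)$, the cutoff $k_0 = 1.5$, and all numerical values are hypothetical choices for illustration; the paper leaves the entry functions $h_{ij}$ generic.

```python
import numpy as np

def build_weights(coords, z, k0=1.5, row_normalize=True):
    """Sketch of a weight matrix whose entries combine a physical cutoff with an
    economic similarity: w*_ij = g(z_i, z_j) * 1{dist(i, j) <= k0}. The choice
    g(z_i, z_j) = 1/(1 + |z_i - z_j|) and the cutoff k0 are hypothetical."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    W = 1.0 / (1.0 + np.abs(z[:, None] - z[None, :]))   # economic similarity
    W[dist > k0] = 0.0            # only physical neighbors get nonzero weight
    np.fill_diagonal(W, 0.0)      # no self-influence
    if row_normalize:             # row-normalized variant
        rs = W.sum(axis=1, keepdims=True)
        rs[rs == 0.0] = 1.0       # isolated units keep a zero row
        W = W / rs
    return W

rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(50, 2))   # locations in R^2
z = rng.normal(size=50)                     # an "economic" variable per location
W = build_weights(coords, z)
```

Because the entries of W are functions of z, any correlation between z's innovations and the outcome disturbances makes W endogenous, which is the situation formalized in the next subsection.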
2.3 Source of endogeneity
In the following analysis, we suppress the subscript n in the triangular array for convenience. The endogeneity
ofWn comes from the correlation between vi and �i, which is assumed to be multivariate normally distributed,
i.e., (vi; �i) � i:i:d N 0;
�2v �0v�
�v� �"
!!; where �2v is a scalar variance, covariance �v� = (�v�1 ; :::�v�p)
0 is
a p dimensional vector, and �" is a p � p matrix. If �v� is zero, then the spatial weight matrix Wn can be
strictly exogenous and we can apply conventional methodology of SAR models for estimation. However, in
general, �v� is not zero and Wn becomes an endogenous spatial weights matrix. Thus, Wn will be correlated
with the disturbances in Vn. Under the joint normality assumption, conditional on "n, we have
vij�i � N(�0v���1" �i; �2v � �0v���1" �v�)
$^1$Here we simplify the notation by regarding the subscripts $i$ and $j$ as integer values indicating entries in a vector or matrix, even though $i$ and $j$ refer formally, as in Assumption 1, to locations in the lattice $D$ contained in the $d$-dimensional Euclidean space $\mathbb{R}^d$.
and (2.2) becomes

$$Y_n = \lambda W_n Y_n + X_n\beta + (Z_n - X_n\Gamma)\delta + \xi_n, \qquad (2.3)$$

where $\delta = \Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}$ is a $p$-dimensional column vector of parameters, $\xi_n \sim N(0, \sigma_\xi^2 I_n)$ with variance
$\sigma_\xi^2 = \sigma_v^2 - \sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}$, and $\xi_n$ is independent of $\varepsilon_n$ in equation (2.1). Our subsequent asymptotic analysis
will rely mainly on equation (2.3), where the added variables $(Z_n - X_n\Gamma)$ are control variables that control for the
endogeneity of $W_n$.
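The control-function decomposition above can be checked numerically. The sketch below, for the scalar case $p = 1$ with hypothetical parameter values, draws correlated pairs $(v_i, \varepsilon_i)$ and verifies that $\xi_i = v_i - \delta\varepsilon_i$, with $\delta = \sigma_{v\varepsilon}/\sigma_\varepsilon^2$, is uncorrelated with $\varepsilon_i$ and has variance $\sigma_\xi^2 = \sigma_v^2 - \sigma_{v\varepsilon}^2/\sigma_\varepsilon^2$.

```python
import numpy as np

# Draw (v_i, eps_i) i.i.d. jointly normal (p = 1; all numbers hypothetical) and
# check the control-function decomposition v_i = delta * eps_i + xi_i.
rng = np.random.default_rng(1)
n = 2000
sigma_v2, sigma_eps2, sigma_veps = 1.0, 1.0, 0.6
cov = np.array([[sigma_v2, sigma_veps],
                [sigma_veps, sigma_eps2]])
draws = rng.multivariate_normal([0.0, 0.0], cov, size=n)
v, eps = draws[:, 0], draws[:, 1]

delta = sigma_veps / sigma_eps2                    # delta = Sigma_eps^{-1} sigma_veps
xi = v - delta * eps                               # implied xi_i
sigma_xi2 = sigma_v2 - sigma_veps**2 / sigma_eps2  # theoretical Var(xi)
```

In a large sample, the empirical covariance of xi with eps is near zero and the empirical variance of xi is near sigma_xi2, which is exactly what makes $(Z_n - X_n\Gamma)$ a valid control variable in (2.3).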
3 Estimation methods
3.1 The two-stage IV estimation
In the first stage, we estimate $Z_n = X_n\Gamma + \varepsilon_n$ by the ordinary least squares (OLS) method, so $\hat{\Gamma} = (X_n'X_n)^{-1}X_n'Z_n$.
Then, in the second stage, substituting $\hat{\Gamma}$ for $\Gamma$ in (2.3), we have

$$Y_n = \lambda W_n Y_n + X_n\beta + (Z_n - X_n\hat{\Gamma})\delta + \hat{\xi}_n, \qquad (3.1)$$

where $\hat{\xi}_n = \xi_n + X_n(\hat{\Gamma} - \Gamma)\delta = \xi_n + P_n\varepsilon_n\delta$ with $P_n = X_n(X_n'X_n)^{-1}X_n'$. Since $Z_n - X_n\hat{\Gamma} = P_n^{\perp}Z_n = P_n^{\perp}\varepsilon_n$
with $P_n^{\perp} = I_n - P_n$, (3.1) can be rewritten explicitly as

$$Y_n = (W_nY_n, X_n, P_n^{\perp}Z_n)\theta + (\xi_n + P_n\varepsilon_n\delta), \qquad (3.2)$$

where $\theta = (\lambda, \beta', \delta')'$. For estimation, with the control variables $(Z_n - X_n\Gamma)$ added in (2.3), or $P_n^{\perp}Z_n$ in
(3.2), $W_n$ can be treated as predetermined or exogenous. However, $W_nY_n$ remains endogenous in (2.3)
and (3.2), so for an IV estimation we need instruments for $W_nY_n$. Let $Q_n$ be an $n \times m$ matrix of IVs; then
an IV estimator of $\theta$ with $Q_n$ is

$$\hat{\theta} = [(W_nY_n, X_n, P_n^{\perp}Z_n)'Q_n(Q_n'Q_n)^{-1}Q_n'(W_nY_n, X_n, P_n^{\perp}Z_n)]^{-1}(W_nY_n, X_n, P_n^{\perp}Z_n)'Q_n(Q_n'Q_n)^{-1}Q_n'Y_n.$$
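The two stages can be sketched in a few lines of linear algebra. The simulation design below (a line of agents where only adjacent units are neighbors, with an inverse-similarity weight and all parameter values hypothetical) is ours for illustration; the estimator itself follows the formula above with the instrument choice $Q_n = [W_nX_n, X_n, P_n^{\perp}Z_n]$ used later in Section 4.1.

```python
import numpy as np

def tsiv(Y, X, Z, W):
    """2SIV sketch: first stage regresses Z on X by OLS; second stage runs IV on
    (3.2) with regressors (WY, X, Pperp Z) and instruments Q = [WX, X, Pperp Z]."""
    n = len(Y)
    P = X @ np.linalg.solve(X.T @ X, X.T)          # projector onto col(X)
    Pperp = np.eye(n) - P
    R = np.column_stack([W @ Y, X, Pperp @ Z])     # regressors of (3.2)
    Q = np.column_stack([W @ X, X, Pperp @ Z])     # instrument matrix
    A = R.T @ Q @ np.linalg.solve(Q.T @ Q, Q.T)    # R'Q(Q'Q)^{-1}Q'
    return np.linalg.solve(A @ R, A @ Y)           # theta_hat = (lam, beta, delta)

# Simulated check with hypothetical parameter values (p = 1; only adjacent
# units on a line are neighbors, weight = inverse economic similarity)
rng = np.random.default_rng(2)
n = 1000
X = np.column_stack([np.ones(n), rng.uniform(-2, 2, n)])
eps = rng.normal(size=n)
Z = X @ np.array([0.5, 1.0]) + eps                 # entry equation (2.1)
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0 / (1.0 + abs(Z[i] - Z[i + 1]))
lam, beta, delta = 0.2, np.array([1.0, 1.0]), 0.6
V = delta * eps + 0.8 * rng.normal(size=n)         # disturbance correlated with eps
Y = np.linalg.solve(np.eye(n) - lam * W, X @ beta + V)
theta_hat = tsiv(Y, X, Z, W)                       # (lam, beta_0, beta_1, delta)
```

Note that omitting the control-variable column Pperp @ Z would reintroduce the endogeneity of W into the second stage; with it, the remaining endogeneity of WY is handled by the instruments WX.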
3.2 The maximum likelihood estimation
Denote � = (�; �0; vec(�)0; �2v; �0; �0v�)
0; where � is a J dimensional column vector of distinct parameters of
elements in �" (J � p(p+ 1)=2), the log likelihood function can be written as
lnLn(�jYn; Zn) = �n ln(2�)�n
2ln(�2v � �0v���1" �v�)�
n
2ln j�"j+ ln jSn(�)j
� 12vec(Zn �Xn�)0(��1" In)vec(Zn �Xn�)�
1
2(�2v � �0v���1" �v�)(3.3)
� [Sn(�)Yn �Xn� � (Zn �Xn�)��1" �v�]0[Sn(�)Yn �Xn� � (Zn �Xn�)��1" �v�];
where $S_n(\lambda) = I_n - \lambda W_n$ and $S_n = S_n(\lambda_0)$. The MLE $\hat{\theta}$ maximizes the log likelihood function. A necessary
condition is $\partial \ln L_n(\hat{\theta} \mid Y_n, Z_n)/\partial\theta = 0$, where the first-order derivatives of the log likelihood function are listed in
Appendix A.2.
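For the scalar case $p = 1$, the likelihood (3.3) collapses to a compact expression that can be evaluated directly; the sketch below is our illustrative implementation of that special case (the parameter ordering in theta is an assumption made for this sketch). When $\sigma_{v\varepsilon} = 0$, it separates into the usual SAR likelihood for the outcome equation plus a Gaussian regression likelihood for the entry equation.

```python
import numpy as np

def loglik(theta, Y, X, Z, W):
    """Evaluate the log likelihood (3.3) for the scalar case p = 1, with
    theta = (lam, beta (k), gamma (k), sigma_v2, sigma_eps2, sigma_veps)."""
    n, k = X.shape
    lam = theta[0]
    beta = theta[1:1 + k]
    gamma = theta[1 + k:1 + 2 * k]
    sigma_v2, sigma_eps2, sigma_veps = theta[1 + 2 * k:]
    sigma_xi2 = sigma_v2 - sigma_veps**2 / sigma_eps2     # conditional variance
    e = Z - X @ gamma                                     # entry-equation residual
    S = np.eye(n) - lam * W                               # S_n(lam)
    u = S @ Y - X @ beta - e * (sigma_veps / sigma_eps2)  # outcome residual
    logdet = np.linalg.slogdet(S)[1]
    return (-n * np.log(2 * np.pi)
            - 0.5 * n * np.log(sigma_xi2)
            - 0.5 * n * np.log(sigma_eps2)
            + logdet
            - 0.5 * (e @ e) / sigma_eps2
            - 0.5 * (u @ u) / sigma_xi2)
```

In practice one would maximize this function numerically over theta; the log-determinant term ln|S_n(lam)| is what distinguishes it from a seemingly-unrelated-regressions likelihood.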
4 Asymptotic properties of estimators: consistency
4.1 Key statistics
The 2SIV

For the 2SIV estimator $\hat{\theta}$,

$$\hat{\theta} - \theta_0 = [(W_nY_n, X_n, P_n^{\perp}Z_n)'Q_n(Q_n'Q_n)^{-1}Q_n'(W_nY_n, X_n, P_n^{\perp}Z_n)]^{-1}(W_nY_n, X_n, P_n^{\perp}Z_n)'Q_n(Q_n'Q_n)^{-1}Q_n'(\xi_n + P_n\varepsilon_n\delta_0),$$

where the subscript 0 on parameters denotes their true values. Here $W_nY_n = W_n(I_n - \lambda_0 W_n)^{-1}(X_n\beta_0 + \varepsilon_n\delta_0 + \xi_n) = G_n(X_n\beta_0 + \varepsilon_n\delta_0 + \xi_n)$, where $G_n(\lambda) = W_n(I_n - \lambda W_n)^{-1}$ and $G_n = G_n(\lambda_0)$. Therefore, for
$(W_nY_n, X_n, P_n^{\perp}Z_n)'Q_n(Q_n'Q_n)^{-1}Q_n'(W_nY_n, X_n, P_n^{\perp}Z_n)$, we have the quadratic terms with disturbances

$$(\varepsilon_n\delta_0 + \xi_n)'G_n'Q_n(Q_n'Q_n)^{-1}Q_n'G_n(\varepsilon_n\delta_0 + \xi_n), \quad (\varepsilon_n\delta_0 + \xi_n)'G_n'Q_n(Q_n'Q_n)^{-1}Q_n'P_n^{\perp}\varepsilon_n, \quad \text{and} \quad \varepsilon_n'P_n^{\perp}Q_n(Q_n'Q_n)^{-1}Q_n'P_n^{\perp}\varepsilon_n;$$

the linear terms

$$X_n'G_n'Q_n(Q_n'Q_n)^{-1}Q_n'G_n(\varepsilon_n\delta_0 + \xi_n), \quad X_n'G_n'Q_n(Q_n'Q_n)^{-1}Q_n'P_n^{\perp}\varepsilon_n, \quad \text{and} \quad X_n'Q_n(Q_n'Q_n)^{-1}Q_n'P_n^{\perp}\varepsilon_n;$$

and the other terms

$$X_n'G_n'Q_n(Q_n'Q_n)^{-1}Q_n'G_nX_n, \quad X_n'G_n'Q_n(Q_n'Q_n)^{-1}Q_n'X_n, \quad \text{and} \quad X_n'Q_n(Q_n'Q_n)^{-1}Q_n'X_n.$$

As for $(W_nY_n, X_n, P_n^{\perp}Z_n)'Q_n(Q_n'Q_n)^{-1}Q_n'(\xi_n + P_n\varepsilon_n\delta_0)$, we have the quadratic terms with disturbances

$$(\varepsilon_n\delta_0 + \xi_n)'G_n'Q_n(Q_n'Q_n)^{-1}Q_n'(\xi_n + P_n\varepsilon_n\delta_0) \quad \text{and} \quad \varepsilon_n'P_n^{\perp}Q_n(Q_n'Q_n)^{-1}Q_n'(\xi_n + P_n\varepsilon_n\delta_0),$$

and the linear terms

$$X_n'G_n'Q_n(Q_n'Q_n)^{-1}Q_n'(\xi_n + P_n\varepsilon_n\delta_0) \quad \text{and} \quad X_n'Q_n(Q_n'Q_n)^{-1}Q_n'(\xi_n + P_n\varepsilon_n\delta_0).$$
Each of these terms needs to be analyzed carefully. If we choose $Q_n = [W_nX_n, X_n, P_n^{\perp}Z_n]$, then

$$Q_n'Q_n = \begin{pmatrix} X_n'W_n'W_nX_n & X_n'W_n'X_n & X_n'W_n'P_n^{\perp}\varepsilon_n \\ \cdot & X_n'X_n & 0 \\ \cdot & \cdot & \varepsilon_n'P_n^{\perp}\varepsilon_n \end{pmatrix}$$

(a symmetric matrix, with the lower-triangular blocks implied), and many terms can be simplified. For example, $Q_n(Q_n'Q_n)^{-1}Q_n'P_n = P_n$, $Q_n(Q_n'Q_n)^{-1}Q_n'P_n^{\perp}Z_n = P_n^{\perp}Z_n = P_n^{\perp}\varepsilon_n$, and $\varepsilon_n'P_n^{\perp}Q_n(Q_n'Q_n)^{-1}Q_n'P_n^{\perp}\varepsilon_n = \varepsilon_n'P_n^{\perp}\varepsilon_n$.
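These simplifications are projection identities: col(X_n) and P_n^⊥Z_n both lie inside col(Q_n), so projecting them onto col(Q_n) leaves them unchanged. A quick numerical spot-check (with arbitrary simulated data) confirms them:

```python
import numpy as np

# Numerical spot-check of the projection identities used above.
rng = np.random.default_rng(4)
n, k, p = 40, 3, 1
X = rng.normal(size=(n, k))
eps = rng.normal(size=(n, p))
Z = X @ rng.normal(size=(k, p)) + eps            # Z = X Gamma + eps
W = rng.uniform(size=(n, n))
np.fill_diagonal(W, 0.0)

P = X @ np.linalg.solve(X.T @ X, X.T)            # P_n, projector onto col(X)
Pperp = np.eye(n) - P                            # P_n^perp
Q = np.column_stack([W @ X, X, Pperp @ Z])       # the IV choice above
PQ = Q @ np.linalg.solve(Q.T @ Q, Q.T)           # projector onto col(Q)
```

Since X and Pperp @ Z are columns of Q, PQ reproduces P and Pperp @ Z exactly, and the quadratic form in P_n^⊥ ε_n is unchanged by the intermediate projection.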
To summarize, for the 2SIV estimator, the terms we need to analyze are

$$\frac{1}{n}X_n'W_n'W_nX_n, \quad \frac{1}{n}X_n'W_n'X_n, \quad \frac{1}{n}X_n'G_n'W_nX_n, \quad \frac{1}{n}X_n'G_n'X_n, \quad \frac{1}{n}\varepsilon_n'G_n'\varepsilon_n, \quad \frac{1}{n}\varepsilon_n'G_n'X_n, \quad \frac{1}{n}X_n'W_n'G_n\varepsilon_n,$$
$$\frac{1}{n}X_n'G_n'G_n\varepsilon_n, \quad \frac{1}{n}\xi_n'G_n'\varepsilon_n, \quad \frac{1}{n}\xi_n'G_n'X_n, \quad \frac{1}{n}X_n'W_n'G_n\xi_n, \quad \frac{1}{n}X_n'W_n'\xi_n, \quad \text{and} \quad \frac{1}{n}\varepsilon_n'P_n^{\perp}\xi_n.$$
The MLE

To show the consistency of the MLE $\hat{\theta}$, we first need to show the uniform convergence of the log likelihood
function, i.e., the uniform convergence of the sample averages of $\ln|S_n(\lambda)|$, $\mathrm{vec}(Z_n - X_n\Gamma)'(\Sigma_\varepsilon^{-1} \otimes I_n)\mathrm{vec}(Z_n - X_n\Gamma)$, and $[S_n(\lambda)Y_n - X_n\beta - (Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}]'[S_n(\lambda)Y_n - X_n\beta - (Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}]$. Since $Z_n - X_n\Gamma = X_n(\Gamma_0 - \Gamma) + \varepsilon_n$, the convergence of $\frac{1}{n}\mathrm{vec}(Z_n - X_n\Gamma)'(\Sigma_\varepsilon^{-1} \otimes I_n)\mathrm{vec}(Z_n - X_n\Gamma)$ can be derived directly from
the linear and quadratic forms of $\varepsilon_n$. The complication comes from the last term in the likelihood function.
As $Y_n = S_n^{-1}(X_n\beta_0 + \varepsilon_n\delta_0 + \xi_n)$, we have

$$S_n(\lambda)Y_n - X_n\beta - (Z_n - X_n\Gamma)\delta = (S_n^{-1} - \lambda G_n)(X_n\beta_0 + \varepsilon_n\delta_0 + \xi_n) - X_n[\beta + (\Gamma_0 - \Gamma)\delta] - \varepsilon_n\delta,$$

so in the log likelihood function the terms we need to analyze are

$$\frac{1}{n}X_n'S_n^{-1\prime}S_n^{-1}X_n, \quad \frac{1}{n}X_n'S_n^{-1\prime}G_nX_n, \quad \frac{1}{n}X_n'G_n'G_nX_n, \quad \frac{1}{n}X_n'S_n^{-1\prime}X_n, \quad \frac{1}{n}X_n'G_n'X_n, \quad \frac{1}{n}X_n'S_n^{-1\prime}S_n^{-1}\varepsilon_n,$$
$$\frac{1}{n}X_n'G_n'G_n\varepsilon_n, \quad \frac{1}{n}X_n'S_n^{-1\prime}G_n\varepsilon_n, \quad \frac{1}{n}X_n'S_n^{-1\prime}\varepsilon_n, \quad \frac{1}{n}X_n'G_n'\varepsilon_n, \quad \frac{1}{n}\varepsilon_n'S_n^{-1\prime}S_n^{-1}\varepsilon_n, \quad \frac{1}{n}\varepsilon_n'S_n^{-1\prime}G_n\varepsilon_n, \quad \frac{1}{n}\varepsilon_n'G_n'G_n\varepsilon_n,$$
$$\frac{1}{n}X_n'S_n^{-1\prime}\xi_n, \quad \frac{1}{n}X_n'G_n'\xi_n, \quad \frac{1}{n}X_n'S_n^{-1\prime}S_n^{-1}\xi_n, \quad \frac{1}{n}X_n'G_n'G_n\xi_n, \quad \frac{1}{n}X_n'S_n^{-1\prime}G_n\xi_n, \quad \frac{1}{n}\xi_n'S_n^{-1\prime}\varepsilon_n, \quad \frac{1}{n}\xi_n'G_n'\varepsilon_n,$$
$$\frac{1}{n}\varepsilon_n'S_n^{-1\prime}S_n^{-1}\xi_n, \quad \frac{1}{n}\varepsilon_n'G_n'G_n\xi_n, \quad \frac{1}{n}\varepsilon_n'S_n^{-1\prime}G_n\xi_n, \quad \frac{1}{n}\xi_n'S_n^{-1\prime}S_n^{-1}\xi_n, \quad \frac{1}{n}\xi_n'S_n^{-1\prime}G_n\xi_n, \quad \text{and} \quad \frac{1}{n}\xi_n'G_n'G_n\xi_n.$$

For the special term $\frac{1}{n}\ln|S_n(\lambda)|$, we have the Taylor expansion

$$\frac{1}{n}\ln|S_n(\lambda)| = \frac{1}{n}\ln|I_n - \lambda W_n| = -\frac{1}{n}\sum_{i=1}^{n}\left[\sum_{l=1}^{\infty}\frac{\lambda^l}{l}(W_n^l)_{ii}\right].$$
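This expansion is the matrix identity $\ln\det(I - A) = -\sum_{l\geq 1} \mathrm{tr}(A^l)/l$ applied to $A = \lambda W_n$, valid when the series converges (ensured here by Assumption 2.3)). A short numerical check on an arbitrary scaled matrix:

```python
import numpy as np

# Check (1/n) ln|I - lam W| = -(1/n) sum_i sum_l (lam^l / l) (W^l)_ii numerically;
# W and lam are arbitrary illustrative choices with ||lam W|| < 1 by construction.
rng = np.random.default_rng(5)
n = 20
W = rng.uniform(size=(n, n))
np.fill_diagonal(W, 0.0)
W /= 2.0 * W.sum(axis=1).max()                  # scale so all row sums are <= 1/2
lam = 0.5

exact = np.linalg.slogdet(np.eye(n) - lam * W)[1] / n

series, Wl = 0.0, np.eye(n)
for l in range(1, 60):                          # truncate the infinite sum
    Wl = Wl @ W
    series -= (lam ** l / l) * np.trace(Wl) / n
```

With row sums of lam * W bounded by 1/4, the truncated tail is negligible and the two sides agree to high precision.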
In summary, for the consistency of the estimators, the terms we need to analyze can be written in the
general form

$$\frac{1}{n}X_n'M_n'X_n, \quad \frac{1}{n}\varepsilon_n'M_n\varepsilon_n, \quad \frac{1}{n}\xi_n'M_n\xi_n, \quad \frac{1}{n}\varepsilon_n'M_n\xi_n, \quad \frac{1}{n}X_n'M_n\xi_n, \quad \text{and} \quad \frac{1}{n}\sum_{i=1}^{n}\left[\sum_{l=1}^{\infty}\frac{\lambda^l}{l}(W_n^l)_{ii}\right],$$

where $M_n$ can be $W_n$, $G_n$, $S_n^{-1}$, $W_n'W_n$, $G_n'W_n$, $S_n^{-1\prime}S_n^{-1}$, $S_n^{-1\prime}G_n$, or $G_n'G_n$. To analyze these terms, we
need some additional assumptions and topological structures.
4.2 Assumptions and topological structures
Assumption 2 2.1) The parameter $\theta = (\lambda, \beta', \mathrm{vec}(\Gamma)', \sigma_v^2, \alpha', \sigma_{v\varepsilon}')'$ is in a compact set $\Theta$ in the Euclidean
space $\mathbb{R}^{k_\theta}$, where $k_\theta = k + 2 + kp + p + J$; $k$ is the dimension of $\beta$, $p$ is the dimension of $\sigma_{v\varepsilon}$, $kp$ is the
number of parameters in $\Gamma$, and $J$ is the number of distinct parameters in $\Sigma_\varepsilon(\alpha)$. In this set, $\sigma_v^2 > 0$ and
$\Sigma_\varepsilon(\alpha)$ is positive definite. The true parameter $\theta_0$ is contained in the interior of $\Theta$.

2.2) The matrix $S_n$ is nonsingular.

2.3) The sequence of matrices $\{W_n\}$ is bounded in row and column sum norms.$^2$ $\lambda$ satisfies $\sup_\lambda \|\lambda W_n\|_1 < 1$ and $\sup_\lambda \|\lambda W_n\|_\infty < 1$.

2.4) $\lim_{n\to\infty}(\frac{1}{n}X_n'X_n)^{-1}$ exists and is nonsingular.

2.5) $v_i$ and $\varepsilon_i$ have the joint normal distribution

$$(v_i, \varepsilon_i')' \sim i.i.d.\; N(0, \Sigma), \quad \text{where } \Sigma = \begin{pmatrix} \sigma_v^2 & \sigma_{v\varepsilon}' \\ \sigma_{v\varepsilon} & \Sigma_\varepsilon \end{pmatrix} \text{ is positive definite.}$$

2.6) The spatial weight $w_{ij} = 0$ if $\rho(i, j) > k_0$ for some $k_0 > 0$, i.e., if the distance between two locations
exceeds the threshold $k_0$, they cannot be neighbors. We focus on two structures of $W_n$ constructed from the
elements $w_{ij}^* = g(z_i, z_j)I(\rho(i, j) \leq k_0)$:

Case 1. Without row normalization: $w_{ij} = w_{ij}^* = g(z_i, z_j)I(\rho(i, j) \leq k_0)$;

Case 2. Row-normalized weight matrix: $w_{ij} = w_{ij}^* / \sum_j w_{ij}^*$.
From these constructions, we can see that the spatial weight considered in this paper is a combination of physical
distance and economic distance: whether agents are neighbors depends on the predetermined physical
distance $\rho(i, j)$, but the magnitude of neighbors' effects depends on the economic similarity $g(z_i, z_j)$. Denote
$B_i(h)$ as a ball centered at location $i$ with radius $h$. The difference between Cases 1 and 2 is that
in Case 1, $w_{ij}$ is related only to the two points within the ball $B_i(k_0)$ and hence to the values $z_i$ and $z_j$; but in
Case 2, $w_{ij}$ may be related to all of the finitely many points within the ball $B_i(k_0)$ and the corresponding values of $z$ at those
$^2$We say the sequence of $n \times n$ matrices $\{P_n\}$ is bounded in row and column sum norms if $\sup_{1\leq i\leq n,\, n\geq 1}\sum_{j=1}^n |p_{ij,n}| < \infty$ and $\sup_{1\leq i\leq n,\, n\geq 1}\sum_{j=1}^n |p_{ji,n}| < \infty$.
locations. Similarly to Case 1, $w_{ij}$ and $w_{kl}$ are uncorrelated if the distance between locations $i$ and $k$ exceeds
$2k_0$ (twice the threshold). This is a generalization of $m$-dependence in the time series literature to our
spatial setting. Therefore, the analyses of these two cases will be similar.
The assumed topological structure in Assumption 2.6) has the following implications:

Claim 1 There exists a constant $C < \infty$ such that for $h > 1$, there are at most $Ch^d$ elements within the ball
$B_i(h)$, where $d$ is the dimension of the location space of the units.

This is Lemma A.1(ii) in Jenish and Prucha (2009). We also have the following claim.

Claim 2 The $(i, j)$th element of the $b$-fold product $W_n^b$ of $W_n$ is nonzero only if $j \in B_i(bk_0)$. Furthermore,
the $(i, j)$th element of $W_n'^{a}W_n^{b}$ is nonzero only if $j \in B_i((a + b)k_0)$.
4.3 Probability convergence of key statistics

Terms in the key statistics have common features in the analysis of their convergence, so we would like to have
some general propositions that can be applied to those terms.

Definition 1 A finite dimensional square matrix $A$ is said to be a spatial $h$-dependence matrix of $\varepsilon_n$ if each
$(i, j)$th entry $a_{ij}$ of $A$ is a function of the $\varepsilon_s$'s with (location) $s \in B_i(h)$, and $a_{ij} = 0$ if $j \notin B_i(h)$.

Example 1 1.1) $A_n = W_n^l$, $l = 1, 2, \dots$:

$$a_{ij} = \sum_{j_1=1}^{n}\cdots\sum_{j_{l-1}=1}^{n} w_{ij_1}w_{j_1j_2}\cdots w_{j_{l-1}j}.$$

As $(w_{ij_1}, w_{j_1j_2}, \dots, w_{j_{l-1}j})$ with all nonzero elements is a "forward walk" of length $l$ from $i$ to $j$, this $A_n$ is
a spatial $lk_0$-dependence matrix of $\varepsilon_n$.

1.2) $A_n = W_n'^{l}$, $l = 1, 2, \dots$:

$$a_{ij} = \sum_{i_1=1}^{n}\cdots\sum_{i_{l-1}=1}^{n} w_{ji_1}w_{i_1i_2}\cdots w_{i_{l-1}i}.$$

As $(w_{ji_1}, w_{i_1i_2}, \dots, w_{i_{l-1}i})$ with all nonzero elements is a "backward walk" of length $l$ from $i$ to $j$, this
$A_n$ is a spatial $lk_0$-dependence matrix of $\varepsilon_n$.

1.3) $A_n = W_n'^{l}W_n^{m}$, $l, m = 1, 2, \dots$:

$$a_{ij} = \sum_{k=1}^{n}\left(\sum_{i_1=1}^{n}\cdots\sum_{i_{l-1}=1}^{n} w_{ki_1}w_{i_1i_2}\cdots w_{i_{l-1}i}\right)\left(\sum_{j_1=1}^{n}\cdots\sum_{j_{m-1}=1}^{n} w_{kj_1}w_{j_1j_2}\cdots w_{j_{m-1}j}\right).$$

This has a "backward walk" of length $l$ from $i$ to some $k$, followed by a "forward walk" of length $m$ from
that $k$ to $j$. This $A_n$ is a spatial $(l + m)k_0$-dependence matrix of $\varepsilon_n$.
Definition 2 A sequence of square matrices $\{B_n\}$ has an increasing-dependence geometric series expansion
if $B_n = \sum_{l=0}^{\infty} a_l A_{nl}$ under $\|\cdot\|_1$, where the $a_l$'s are scalar coefficients and $\{A_{nl}\}$ is a sequence of conformable
spatial $(l + s)k_0$-dependence matrices of $\varepsilon_n$, with $s$ a specified non-negative integer, such that $\{\|a_0 A_{n0}\|_1\}$
is uniformly bounded and $\|a_l A_{nl}\|_1 \leq c_0(l + c_1)^{s_1}c^l$ for all $l = 1, 2, \dots$, where $c_0 > 0$, $c_1$ and $s_1$ are some
positive constants, and $0 < c < 1$.

Under Assumption 2.3), we have the following examples:

Example 2 2.1) $B_n = G_n(\lambda) = \sum_{l=0}^{\infty} \lambda^l W_n^{l+1}$.

2.2) $B_n = G_n'(\lambda) = \sum_{l=0}^{\infty} \lambda^l W_n'^{(l+1)}$.

2.3) $B_n = \sum_{l=1}^{\infty} \frac{\lambda^l}{l} W_n^l$.

2.4) $B_n = G_n'(\lambda)W_n = \sum_{l=0}^{\infty} \lambda^l W_n'^{(l+1)}W_n$, where $A_l = W_n'^{(l+1)}W_n$ is $(l + 2)k_0$-dependent (with $s = 2$).

2.5) $B_n = G_n'(\lambda)G_n(\lambda) = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} \lambda^{i+j} W_n'^{(i+1)}W_n^{(j+1)} = \sum_{l=0}^{\infty} \lambda^l A_l$, where $A_l = \sum_{i=0}^{l} W_n'^{(i+1)}W_n^{(l-i+1)}$,
with $s = 2$.

2.6) $B_n = G_n'(\lambda)G_n(\lambda)G_n'(\lambda)G_n(\lambda) = \sum_{k=0}^{\infty}\sum_{m=0}^{\infty} \lambda^{k+m}\sum_{i=0}^{k}\sum_{j=0}^{m} W_n'^{(i+1)}W_n^{(k-i+1)}W_n'^{(j+1)}W_n^{(m-j+1)} = \sum_{l=0}^{\infty} \lambda^l A_l$, where $A_l = \sum_{k=0}^{l}\sum_{i=0}^{k}\sum_{j=0}^{l-k} W_n'^{(i+1)}W_n^{(k-i+1)}W_n'^{(j+1)}W_n^{(l-k-j+1)}$, with $s = 4$.
With these defined matrices, we would like to have some general convergence results for:

(i) $\varepsilon_n'A_n\varepsilon_n$, where $A_n$ is a spatial $lk_0$-dependence matrix of $\varepsilon_n$;

(ii) $\varepsilon_n'B_n\varepsilon_n$, where $B_n$ has an increasing-dependence series expansion with $a_l = \lambda^l$ for all $l = 0, 1, \dots$,
which has the property $\|a_l A_l\|_1 \leq c_0(l + c_1)^{s_1}c^l$ for $l = 1, 2, \dots$, with $c_0 > 0$, $c_1$ and $s_1$ some positive
constants, and $0 < c < 1$.

Proposition 1 Suppose that the $\varepsilon_i$'s in $\varepsilon_n = (\varepsilon_1, \dots, \varepsilon_n)'$ are i.i.d. with zero mean and a finite fourth moment,
and that the $n$-dimensional square matrix $A_n$ is spatial $h$-dependent of $\varepsilon_n$ with all entries $a_{ij}$
uniformly bounded. Then $|E(\frac{1}{n}\varepsilon_n'A_n\varepsilon_n)| = O(1)$ and $\frac{1}{n}[\varepsilon_n'A_n\varepsilon_n - E(\varepsilon_n'A_n\varepsilon_n)] = o_p(1)$.

Proposition 2 Suppose that $B_n$ has an increasing-dependence geometric series expansion, $B_n = \sum_{l=0}^{\infty} a_l A_{nl}$, with
$\{A_{nl}\}$ a sequence of spatial $(l + s)k_0$-dependence matrices of $\varepsilon_n$, and that the $\varepsilon_i$'s in $\varepsilon_n$ are i.i.d. with zero mean and
a finite fourth moment. Then $E(\frac{1}{n}|\varepsilon_n'B_n\varepsilon_n|) = O(1)$ and $\frac{1}{n}[\varepsilon_n'B_n\varepsilon_n - E(\varepsilon_n'B_n\varepsilon_n)] = o_p(1)$.

From the proofs of the above two propositions, we also have the convergence of terms with $\varepsilon_n$ replaced by a
vector of exogenous variables.

Corollary 1 Suppose that all the elements of the vectors of exogenous variables $x_n = (x_{n1}, \dots, x_{nn})'$
and $q_n = (q_{n1}, \dots, q_{nn})'$ are uniformly bounded for all $n$. Under the setting of the previous propositions,
$E(\frac{1}{n}|\varepsilon_n'B_nx_n|) = O(1)$, $E(\frac{1}{n}|q_n'B_nx_n|) = O(1)$, $\frac{1}{n}(\varepsilon_n'B_nx_n - E(\varepsilon_n'B_nx_n)) = o_p(1)$, and $\frac{1}{n}(q_n'B_nx_n - E(q_n'B_nx_n)) = o_p(1)$.
For our statistics of the general form $\frac{1}{n}\varepsilon_n'M_n\xi_n$, because $\xi_n$ is conditionally independent of $Z_n$ (and $X_n$),
we have $E(\xi_n \mid M_n) = 0$ and

$$\mathrm{Var}\left(\frac{1}{n}\varepsilon_n'M_n\xi_n\right) = E\left[\mathrm{Var}\left(\frac{1}{n}\varepsilon_n'M_n\xi_n \,\Big|\, Z_n\right)\right] + \mathrm{Var}\left[E\left(\frac{1}{n}\varepsilon_n'M_n\xi_n \,\Big|\, Z_n\right)\right] = \frac{1}{n}\sigma_\xi^2 E\left(\frac{1}{n}\varepsilon_n'M_nM_n'\varepsilon_n\right).$$

From Proposition 2, we have $E(\varepsilon_n'M_nM_n'\varepsilon_n) = O(n)$, so $\frac{1}{n}\varepsilon_n'M_n\xi_n$ is $o_p(1)$. Similar arguments can be
applied to show that $\frac{1}{n}X_n'M_n\xi_n = o_p(1)$, $E\ln|S_n(\lambda)| = O(n)$, and $\frac{1}{n}[\ln|S_n(\lambda)| - E\ln|S_n(\lambda)|] = o_p(1)$.
4.4 Consistency of estimators

As

$$\hat{\theta} - \theta_0 = [(W_nY_n, X_n, P_n^{\perp}Z_n)'Q_n(Q_n'Q_n)^{-1}Q_n'(W_nY_n, X_n, P_n^{\perp}Z_n)]^{-1}(W_nY_n, X_n, P_n^{\perp}Z_n)'Q_n(Q_n'Q_n)^{-1}Q_n'(\xi_n + P_n\varepsilon_n\delta_0)$$

and

$$\lim_{n\to\infty}\frac{1}{n}Q_n'(W_nY_n, X_n, P_n^{\perp}Z_n) = \lim_{n\to\infty}\frac{1}{n}[Q_n'G_n(X_n\beta_0 + \varepsilon_n\delta_0 + \xi_n), Q_n'X_n, Q_n'P_n^{\perp}\varepsilon_n] \xrightarrow{p} \lim_{n\to\infty}\frac{1}{n}[E(Q_n'G_n)X_n\beta_0 + E(Q_n'G_n\varepsilon_n)\delta_0, E(Q_n')X_n, E(Q_n'\varepsilon_n)],$$

to show the consistency of the 2SIV estimator, in addition to the convergence of each separate term, we
need some rank conditions.

Assumption 3 3.1) $\lim_{n\to\infty} E(\frac{1}{n}Q_n'Q_n)$ exists and is nonsingular;

3.2) $\lim_{n\to\infty}\frac{1}{n}[E(Q_n'G_n)X_n\beta_0 + E(Q_n'G_n\varepsilon_n)\delta_0, E(Q_n')X_n, E(Q_n'\varepsilon_n)]'[E(Q_n'G_n)X_n\beta_0 + E(Q_n'G_n\varepsilon_n)\delta_0, E(Q_n')X_n, E(Q_n'\varepsilon_n)]$
is nonsingular.

Proposition 3 Under Assumptions 1-3, the 2SIV estimator $\hat{\theta} \xrightarrow{p} \theta_0$.

Now, for the consistency of the MLE, we need additional assumptions.

Assumption 4 $\lim_{n\to\infty}\frac{1}{n}[X_n, E(G_n)X_n\beta_0 + E(G_n\varepsilon_n)\delta_0]'[X_n, E(G_n)X_n\beta_0 + E(G_n\varepsilon_n)\delta_0]$ exists and is nonsingular.

Lemma 1 (Unique maximizer) Under Assumptions 1, 2, and 4, $\theta_0$ is the unique maximizer of $\lim_{n\to\infty}\frac{1}{n}E\ln L_n(\theta)$.

This lemma, the uniform convergence $\sup_{\theta\in\Theta}\left|\frac{1}{n}\ln L_n(\theta) - E\frac{1}{n}\ln L_n(\theta)\right| \xrightarrow{p} 0$, and the equicontinuity
of $\lim_{n\to\infty}\frac{1}{n}E\ln L_n(\theta)$ together imply the consistency of the MLE.

Theorem 1 Under Assumptions 1, 2, and 4, the MLE $\hat{\theta} \xrightarrow{p} \theta_0$.
5 Asymptotic distributions of estimators

For the 2SIV estimator, we need a CLT for the form $\frac{1}{\sqrt{n}}(a'Q_n'\xi_n + b'X_n'\varepsilon_n\delta_0)$, where $a$ and $b$ are nonstochastic
matrices; specifically,

$$a = \left[S'\left(\mathrm{plim}_{n\to\infty}\frac{Q_n'Q_n}{n}\right)^{-1}S\right]^{-1}S'\left(\mathrm{plim}_{n\to\infty}\frac{Q_n'Q_n}{n}\right)^{-1} \quad \text{and} \quad b = \left[S'\left(\mathrm{plim}_{n\to\infty}\frac{Q_n'Q_n}{n}\right)^{-1}S\right]^{-1}T'\left(\lim_{n\to\infty}\frac{X_n'X_n}{n}\right)^{-1},$$

with $S = \lim_{n\to\infty}\frac{1}{n}[E(Q_n'G_n)X_n\beta_0 + E(Q_n'G_n\varepsilon_n)\delta_0, E(Q_n)'X_n, E(Q_n'\varepsilon_n)]$ and $T = \lim_{n\to\infty}\frac{1}{n}[X_n'E(G_n)X_n\beta_0 + X_n'E(G_n\varepsilon_n)\delta_0, X_n'X_n, 0]$.

For the MLE, from the mean value theorem we have

$$\sqrt{n}(\hat{\theta} - \theta_0) = -\left(\frac{1}{n}\frac{\partial^2\ln L_n(\tilde{\theta})}{\partial\theta\,\partial\theta'}\right)^{-1}\frac{1}{\sqrt{n}}\frac{\partial\ln L_n(\theta_0)}{\partial\theta},$$

where $\tilde{\theta}$ lies between $\hat{\theta}$ and $\theta_0$. Therefore, we need the uniform convergence of the second-order derivative
$\frac{1}{n}\frac{\partial^2\ln L(\theta)}{\partial\theta\,\partial\theta'}$ and the CLT for $\frac{1}{\sqrt{n}}\frac{\partial\ln L(\theta_0)}{\partial\theta}$, whose expressions can be found in Appendix A.3 and Appendix
A.2.

In general, we need to consider a CLT for the form

$$R_n = \frac{1}{\sqrt{n}}\left[A_n'\zeta_n + B_n'\xi_n + \xi_n'C_n\xi_n + \zeta_n'D_n\zeta_n + \xi_n'E_n\zeta_n - E(\xi_n'C_n\xi_n \mid \zeta_n) - E(\zeta_n'D_n\zeta_n)\right],$$

where $\zeta_n$ and $\xi_n$ are independent random variables with zero means and respective nonzero variances $\sigma_\zeta^2$
and $\sigma_\xi^2$, the $(\zeta_i, \xi_i)$ are identically distributed across $n$, $A_n$ and $D_n$ are an $n$-dimensional non-stochastic vector and
matrix, but the elements in $B_n$, $C_n$, and $E_n$ may contain $\zeta_n$.

Assumption 5 5.1) $E(\xi_i^{4+\delta_R})$ and $E(\zeta_i^{4+\delta_R})$ exist.

5.2) For the vectors $A_n$ and $B_n$, the elements are uniformly bounded; the matrices $C_n$, $D_n$, and $E_n$ are bounded in
both row and column sums.

5.3) Each of the stochastic matrices $C_n$ and $E_n$ is either a spatial $h$-dependence matrix of $\zeta_n$ or has an
increasing-dependence geometric series expansion.

5.4) The stochastic vector $B_n$ can be expressed as the product of a bounded non-stochastic vector and
a matrix that is either a spatial $h$-dependence matrix of $\zeta_n$ or has an increasing-dependence geometric series expansion.

Lemma 2 Denote $\sigma_{R_n}^2$ as the variance of $R_n$. Under Assumption 5, and assuming that $\sigma_{R_n}^2$ is bounded away
from zero,$^3$ we have $R_n/\sigma_{R_n} \xrightarrow{d} N(0, 1)$.

$^3$For example, if $\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^n\sum_{j\neq i}(d_{ij}^2 + d_{ji}^2) > 0$ or $\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^n\sum_{j\neq i}E(c_{ij}^2 + c_{ji}^2) > 0$, then $\sigma_{R_n}^2$ will be bounded
away from zero.
From this lemma, the asymptotic normality of the 2SIV estimator is straightforward:

$$\sqrt{n}(\hat{\theta} - \theta_0) \xrightarrow{d} N(0, \Sigma_{IV}),$$

where

$$\Sigma_{IV} = \sigma_{\xi 0}^2\left[S'\left(\mathrm{plim}_{n\to\infty}\frac{Q_n'Q_n}{n}\right)^{-1}S\right]^{-1} + \delta_0'\Sigma_{\varepsilon 0}\delta_0\left[S'\left(\mathrm{plim}_{n\to\infty}\frac{Q_n'Q_n}{n}\right)^{-1}S\right]^{-1}T'\left(\lim_{n\to\infty}\frac{X_n'X_n}{n}\right)^{-1}T\left[S'\left(\mathrm{plim}_{n\to\infty}\frac{Q_n'Q_n}{n}\right)^{-1}S\right]^{-1}.$$

As for the MLE, we have the following lemma and theorem.

Lemma 3 Under Assumptions 1, 2, and 4, the information matrix $I_\theta = \lim_{n\to\infty}E\left(\frac{1}{n}\frac{\partial\ln L_n(\theta_0)}{\partial\theta}\frac{\partial\ln L_n(\theta_0)}{\partial\theta'}\right)$ is
nonsingular.

The expression for the information matrix $I_\theta$ can be found in Appendix A.3.

Theorem 2 Under Assumptions 1, 2, and 4, $\sqrt{n}(\hat{\theta} - \theta_0) \xrightarrow{d} N(0, I_\theta^{-1})$.
6 Some extensions
6.1 The GMM estimation
With (2.3), the SAR equation implies that
$$Y_n = (I_n - \lambda W_n)^{-1}[X_n(\beta - \Gamma\delta) + Z_n\delta + \xi_n]. \qquad (6.1)$$
Because $W_n$ is a function of $Z_n$, and $\xi_n$ is conditionally independent of $Z_n$ and $X_n$, we have $E(\xi_n|Z_n, W_n, X_n) = E(\xi_n|Z_n, X_n) = 0$. This suggests additional moments for estimation, such as $E(Z_n'\xi_n) = 0$ and $E((W_nX_n)'\xi_n) = 0$. From (6.1), we have $W_nY_n = G_nX_n(\beta_0 - \Gamma_0\delta_0) + G_nZ_n\delta_0 + G_n\xi_n$. This suggests that the possible set of linear moments may consist of
$$E(X_n'\varepsilon_n) = 0, \quad E(X_n'\xi_n) = 0, \quad E((G_nX_n)'\xi_n) = 0, \quad E((G_nZ_n)'\xi_n) = 0.$$
In addition to the linear moments, as motivated by Lee (2007), the quadratic moment may be
$$E\Big(\xi_n'\Big[G_n - \frac{1}{n}\operatorname{tr}(G_n)I_n\Big]'\xi_n\Big) = 0.$$
The corresponding empirical moments are:
(1) $X_n'(Z_n - X_n\Gamma)$;
(2) $X_n'[Y_n - \lambda W_nY_n - X_n(\beta - \Gamma\delta) - Z_n\delta]$;
(3) $X_n'G_n'[Y_n - \lambda W_nY_n - X_n(\beta - \Gamma\delta) - Z_n\delta]$;
(4) $Z_n'G_n'[Y_n - \lambda W_nY_n - X_n(\beta - \Gamma\delta) - Z_n\delta]$; and
(5) $[Y_n - \lambda W_nY_n - X_n(\beta - \Gamma\delta) - Z_n\delta]'[G_n - \frac{1}{n}\operatorname{tr}(G_n)I_n]'[Y_n - \lambda W_nY_n - X_n(\beta - \Gamma\delta) - Z_n\delta]$.
By comparing these moments with the first order condition (FOC) of the ML approach, we can see that the FOC consists of linear combinations of these empirical moments. Thus, in place of the ML estimation, one may base estimation on these moments. Within the optimum GMM framework, the moments involving $G_n$ and the optimum distance matrix need to be feasibly estimated with an initial estimate $\hat\lambda$, i.e., using $\hat G_n = W_n(I_n - \hat\lambda W_n)^{-1}$ to replace $G_n$. For the same reason as in Lee (2007), these feasible IVs and quadratic matrix yield GMM estimators asymptotically equivalent to those based on the true $G_n$. With these moments, the optimum GMM estimators are asymptotically equivalent to the MLE.
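The stacked empirical moments (1)-(5) can be sketched numerically as follows for the scalar-$z$ case. This is an illustrative sketch, not the paper's code: the function name `gmm_moments` and the parameter packing $(\lambda, \beta, \gamma, \delta)$ are assumptions, and $G_n$ is replaced by its feasible version $\hat G_n = W_n(I_n - \hat\lambda W_n)^{-1}$ at an initial estimate.

```python
import numpy as np

def gmm_moments(par, y, x, z, w, lam_init):
    """Stack the empirical moments (1)-(5) at parameter value `par`.
    Packing (lam, beta..., gamma..., delta) is an illustrative assumption."""
    n, k = x.shape
    lam = par[0]
    beta = par[1:1 + k]
    gamma = par[1 + k:1 + 2 * k]
    delta = par[-1]
    g_hat = w @ np.linalg.inv(np.eye(n) - lam_init * w)   # feasible G_n
    eps = z - x @ gamma                                   # z-equation residual
    # outcome residual: Y - lam*W*Y - X(beta - gamma*delta) - Z*delta
    xi = y - lam * (w @ y) - x @ (beta - gamma * delta) - z * delta
    q = g_hat - (np.trace(g_hat) / n) * np.eye(n)         # quadratic matrix
    return np.concatenate([
        x.T @ eps,            # (1)
        x.T @ xi,             # (2)
        (g_hat @ x).T @ xi,   # (3)
        [(g_hat @ z) @ xi],   # (4), z scalar
        [xi @ q.T @ xi],      # (5)
    ]) / n

# toy evaluation on arbitrary data
rng = np.random.default_rng(0)
n, k = 50, 2
x = np.column_stack([np.ones(n), rng.standard_normal(n)])
w = rng.random((n, n)); np.fill_diagonal(w, 0.0)
w /= w.sum(axis=1, keepdims=True)
y, z = rng.standard_normal(n), rng.standard_normal(n)
par = np.array([0.2, 1.0, 1.0, 1.0, 0.8, 0.5])
m = gmm_moments(par, y, x, z, w, lam_init=0.2)
print(m.shape)  # (8,): k + k + k + 1 + 1 moments
```

A GMM estimator would then minimize a quadratic form in `m` over `par`, iterating on `lam_init`.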
This GMM approach is, in particular, valuable if we have more than one spatial weight matrix in the
outcome equation, say
$$Y_n = \lambda_1 W_{1n}Y_n + \lambda_2 W_{2n}Y_n + X_n\beta + V_n,$$
as the GMM and 2SIV estimation can be easily extended without additional computational complication.
6.2 Control function method
For simplicity, we assume $z$ is a scalar in this section. The model is the same, $Y_n = \lambda W_nY_n + X_n\beta + V_n$ with $Z_n = X_n\gamma + \varepsilon_n$. We want to relax the distributional assumptions on $(v_i, \varepsilon_i)$. $X_n$ is still strictly exogenous, i.e., $(v_i, \varepsilon_i)$ is independent of $X_n$. Instead of joint normality, we assume
$$E(v_i|X_n, Z_i) = E(v_i|X_n, \varepsilon_i) = E(v_i|\varepsilon_i) = \delta\varepsilon_i.$$
Then we have
$$E(Y_n|X_n, \varepsilon_n) = \lambda W_nE(Y_n|X_n, \varepsilon_n) + X_n\beta + \delta\varepsilon_n.$$
The ordinary control function approach would run an OLS regression of $Y_n$ on $W_nY_n$, $X_n$, and $\hat\varepsilon_n$, where $\hat\varepsilon_n$ represents the reduced-form residuals. Here, however, OLS does not work because, even after controlling for $\varepsilon_n$, endogeneity remains through $W_nY_n$. Hence, we run an IV regression of $Y_n$ on $W_nY_n$, $X_n$, and $\hat\varepsilon_n$, using $W_nX_n$ as an instrumental variable for $W_nY_n$. This yields the same estimation method as the 2SIV estimation, but it relaxes the joint normality assumption.
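The two-step procedure just described can be sketched as follows. This is a minimal sketch under the section's assumptions, not the paper's implementation: the function name, the coefficient packing, and the toy simulation design (which loosely follows the later Monte Carlo setup) are all illustrative.

```python
import numpy as np

def control_function_2siv(y, x, z, w):
    """Step 1: OLS of z on x gives residuals eps_hat.
    Step 2: 2SLS of y on (Wy, x, eps_hat), instrumenting Wy with Wx."""
    gamma_hat, *_ = np.linalg.lstsq(x, z, rcond=None)
    eps_hat = z - x @ gamma_hat
    wy = w @ y
    regressors = np.column_stack([wy, x, eps_hat])
    # instruments: W x (non-intercept columns, since W is row-normalized and
    # W times a constant reproduces the intercept), x, and the control eps_hat
    instruments = np.column_stack([w @ x[:, 1:], x, eps_hat])
    proj, *_ = np.linalg.lstsq(instruments, regressors, rcond=None)
    coef, *_ = np.linalg.lstsq(instruments @ proj, y, rcond=None)
    return coef, gamma_hat  # coef = (lam_hat, beta_hat..., control coef)

# simulate data roughly following the paper's Monte Carlo design
rng = np.random.default_rng(2)
n = 400
x = np.column_stack([np.ones(n), rng.standard_normal(n)])
ve = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=n)
v, eps = ve[:, 0], ve[:, 1]
z = x @ np.array([1.0, 0.8]) + eps
wd = (rng.random((n, n)) < 0.02).astype(float)
wd = np.triu(wd, 1); wd = wd + wd.T                 # symmetric contiguity
diff = np.abs(z[:, None] - z[None, :])
w = wd * np.where(diff > 0, 1.0 / np.maximum(diff, 1e-12), 0.0)
np.fill_diagonal(w, 0.0)
rs = w.sum(axis=1, keepdims=True); rs[rs == 0.0] = 1.0
w = w / rs                                          # row-normalize
y = np.linalg.solve(np.eye(n) - 0.2 * w, x @ np.array([1.0, 1.0]) + v)
coef, gamma_hat = control_function_2siv(y, x, z, w)
print(coef[0])  # lambda estimate (should be near the true 0.2)
```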
Instead of joint normality, if we maintain the independence of the disturbances �n and "n in (2.1) and
(2.2), the log likelihood function in (3.3) can be regarded as a quasi-likelihood function. The MLE would be a
quasi-MLE (QMLE). As the consistency of the QMLE depends only on the �rst two moments of disturbances,
the QMLE would be consistent. The asymptotic variance would be in a "sandwich" form, which will involve
additional third and fourth moments of �n and "n because relevant conditional and unconditional linear and
quadratic forms consist of those disturbances (see Lee (2004) for the QML of the SAR model, and White (1982)).
6.3 Spatial correlation in the economic distance
Still $z$ is a scalar, but now we consider the case with a spatial structure in the formation of $z$. The outcome equation is the same, $Y_n = \lambda W_nY_n + X_n\beta + V_n$. For the $z$ equation, however, there is an MA structure in the disturbance term: $Z_n = X_n\gamma + u_n$ with $u_n = \varepsilon_n - \rho_2 W_{dn}\varepsilon_n$, where $W_n$ has the same structure as in Assumption 2 and $W_{dn}$ is the spatial weight matrix constructed from predetermined physical distances, i.e., $W_{dn}(i,j) = I(d(i,j) \le k_0)$. The joint normality assumption still holds:
$$(v_i,\varepsilon_i) \sim i.i.d.\ N\Big(0, \begin{pmatrix}\sigma_v^2 & \sigma_{v\varepsilon}\\ \sigma_{v\varepsilon} & \sigma_\varepsilon^2\end{pmatrix}\Big).$$
With joint normality, the log likelihood function is
$$\begin{aligned}
\ln L_n(\theta|Y_n,Z_n) ={}& -n\ln(2\pi) - \frac{n}{2}\ln[\sigma_\varepsilon^2(\sigma_v^2 - \sigma_\varepsilon^2\delta^2)] + \ln|S_n(\lambda)| + \ln|(I_n - \rho_2 W_{dn})^{-1}| \\
&- \frac{1}{2\sigma_\varepsilon^2}(Z_n - X_n\gamma)'(I_n - \rho_2 W_{dn}')^{-1}(I_n - \rho_2 W_{dn})^{-1}(Z_n - X_n\gamma) \\
&- \frac{1}{2(\sigma_v^2 - \sigma_\varepsilon^2\delta^2)}\big[S_n(\lambda)Y_n - X_n\beta - \delta(I_n - \rho_2 W_{dn})^{-1}(Z_n - X_n\gamma)\big]'\big[S_n(\lambda)Y_n - X_n\beta - \delta(I_n - \rho_2 W_{dn})^{-1}(Z_n - X_n\gamma)\big].
\end{aligned}$$
As $Y_n = S_n^{-1}(X_n\beta_0 + \varepsilon_n\delta_0 + \xi_n)$,
$$\begin{aligned}
&S_n(\lambda)Y_n - X_n\beta - \delta(I_n - \rho_2 W_{dn})^{-1}(Z_n - X_n\gamma) \\
&\quad= (S_n^{-1} - \lambda G_n)(X_n\beta_0 + \varepsilon_n\delta_0 + \xi_n) - X_n\beta - \delta(I_n - \rho_2 W_{dn})^{-1}\big[X_n(\gamma_0 - \gamma) + (I_n - \rho_{20}W_{dn})\varepsilon_n\big].
\end{aligned}$$
Let $G_{dn}(\rho_2) = W_{dn}(I_n - \rho_2 W_{dn})^{-1}$ and $G_{dn} = G_{dn}(\rho_{20})$. Because $G_{dn}$ is exogenous and uniformly bounded, it adds no complication to our analysis, so the most complicated term in the log likelihood function is still $\frac{1}{n}\varepsilon_n'G_n(\lambda)'G_n(\lambda)\varepsilon_n$. The difference from the previous analysis is that the spatial weight in the outcome equation is now
$$w_{ij} = g(z_i,z_j)I(d(i,j) \le k_0) = f(u_i,u_j)I(d(i,j) \le k_0) \quad\text{with}\quad u_i = \varepsilon_i - \rho_2\sum_{k\in B_i(k_0)}\varepsilon_k.$$
Thus, $w_{ij}$ is a function of all $\varepsilon$'s within $B_i(2k_0)$. For two nonzero weights $w_{ij}$ and $w_{kl}$, if the distance between locations $i$ and $k$ exceeds $4k_0$, they contain no common $\varepsilon$'s and hence are independent. Therefore, the previous arguments for consistency and asymptotic normality still hold.
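The locality claim, that a weight $w_{ij}$ depends only on $\varepsilon$'s within $B_i(2k_0)$, can be checked numerically. The sketch below places units on a line; the particular $\rho_2$ value, the inverse-distance weight function, and the helper name `weights` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k0, rho2 = 40, 2, 0.3
loc = np.arange(n)                         # locations on a line
dist = np.abs(loc[:, None] - loc[None, :])
wd = (dist <= k0).astype(float)
np.fill_diagonal(wd, 0.0)                  # W_dn(i,j) = I(d(i,j) <= k0), i != j

def weights(eps):
    u = eps - rho2 * (wd @ eps)            # MA form u = (I - rho2 W_d) eps
    diff = np.abs(u[:, None] - u[None, :])
    # w_ij = f(u_i, u_j) I(d(i,j) <= k0), with f an inverse-distance choice
    return np.where((dist <= k0) & (dist > 0),
                    1.0 / np.maximum(diff, 1e-12), 0.0)

eps = rng.standard_normal(n)
w1 = weights(eps)
eps2 = eps.copy(); eps2[20] += 5.0         # shock a unit far from location 0
w2 = weights(eps2)
# w_{0j} uses only eps's within B_0(2 k0), untouched by the shock at 20
print(np.allclose(w1[0, :k0 + 1], w2[0, :k0 + 1]))  # prints True
```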
7 Monte Carlo Simulations
7.1 Data generating process
In this section, we evaluate four estimation methods of a spatial autoregressive model with an endogenous
Wn. The data generating process (DGP) is
$$Y_n = (I_n - \lambda W_n)^{-1}(X_n\beta + V_n),$$
where $x_i = (x_{i1}, x_{i2})'$ with $x_{i1}$ being the intercept term and $x_{i2} \sim N(0,1)$; $\beta_1 = \beta_2 = 1$. We construct the row-normalized $W_n = (w_{ij})$ in this way:
1. Generate bivariate normal random variables $(v_i, \varepsilon_i)$ i.i.d. $N\big(0, \big(\begin{smallmatrix}1 & \delta\\ \delta & 1\end{smallmatrix}\big)\big)$ as the disturbances in the outcome equation and the spatial weight equation.
2. Construct the spatial weight matrix as the Hadamard product $W_n = W_n^d \circ W_n^e$, i.e., $w_{ij} = w_{ij}^d w_{ij}^e$, where $W_n^d$ is a predetermined matrix based on geographic distance, with $w_{ij}^d = 1$ if the two locations are neighbors and $w_{ij}^d = 0$ otherwise, and $W_n^e$ is a matrix based on economic similarity, with $w_{ij}^e = 1/|z_i - z_j|$ if $i \neq j$ and $w_{ii}^e = 0$, where the elements of $z$ are generated by $z_i = 1 + 0.8x_{i2} + \varepsilon_i$.
3. Row-normalize Wn.
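The three construction steps above can be sketched in a short routine; the helper name `build_weight_matrix` and the toy random contiguity matrix are illustrative assumptions, not part of the paper's designs.

```python
import numpy as np

def build_weight_matrix(wd, z):
    """Steps 2-3 of the DGP: Hadamard product of a 0/1 geographic matrix wd
    with an economic-similarity matrix 1/|z_i - z_j|, then row-normalized."""
    diff = np.abs(z[:, None] - z[None, :])
    we = np.where(diff > 0, 1.0 / np.maximum(diff, 1e-12), 0.0)
    np.fill_diagonal(we, 0.0)              # w^e_ii = 0
    w = wd * we                            # Hadamard product
    rs = w.sum(axis=1, keepdims=True)
    rs[rs == 0.0] = 1.0                    # rows with no neighbors stay zero
    return w / rs

rng = np.random.default_rng(0)
n = 6
# toy symmetric contiguity matrix standing in for W^d (e.g., state contiguity)
wd = (rng.random((n, n)) < 0.5).astype(float)
wd = np.triu(wd, 1); wd = wd + wd.T
x2 = rng.standard_normal(n)
eps = rng.standard_normal(n)
z = 1 + 0.8 * x2 + eps                     # step 1: z_i = 1 + 0.8 x_i2 + eps_i
W = build_weight_matrix(wd, z)
rows = W.sum(axis=1)
print(np.allclose(rows[rows > 0], 1.0))    # prints True: rows sum to 1
```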
For the predetermined $W_n^d$, we use four examples. The first is the U.S. states spatial weight matrix $W_S$ ($49 \times 49$), based on the contiguity of the 48 contiguous states and D.C. The second is the Toledo spatial weight matrix $W_O$ ($98 \times 98$), based on the 5 nearest neighbors of 98 census tracts in Toledo, Ohio. The third is the Iowa "Adjacency" spatial weight matrix $W_A$ ($361 \times 361$), based on the adjacency of 361 school districts in Iowa in 2009. The last is the Iowa "County" spatial weight matrix $W_C$ ($361 \times 361$), based on whether the 361 school districts are in the same county in Iowa in 2009.
The four estimation methods we consider are the conventional IV, the 2-stage IV, the conventional MLE of the SAR model, and the MLE we proposed in Section 2.4. A conventional method refers to treating the spatial weight matrix as exogenous. We refer to these four methods as IV, 2SIV, SAR, and MLE in the tables. Here 2SIV and MLE treat $W_n$ as endogenous: 2SIV estimates the $z$ equation by OLS in the first stage and then estimates the conditional outcome equation $Y_n = \lambda W_nY_n + X_n\beta + (Z_n - X_n\hat\gamma)\delta + \hat\xi_n$ by the IV method, choosing $W_nX_n$ as an IV for $W_nY_n$; the MLE estimates all the coefficients $(\lambda, \beta', \gamma', \sigma_\varepsilon^2, \sigma_v^2, \delta)'$ jointly by maximizing the log likelihood function
$$\begin{aligned}
\ln L(\theta|Y_n,Z_n) ={}& -n\ln(2\pi) - \frac{n}{2}\ln[\sigma_\varepsilon^2\sigma_v^2(1-\delta^2)] + \ln|S_n(\lambda)| - \frac{1}{2\sigma_\varepsilon^2}(Z_n - X_n\gamma)'(Z_n - X_n\gamma) \\
&- \frac{1}{2\sigma_v^2(1-\delta^2)}[S_n(\lambda)Y_n - X_n\beta - (Z_n - X_n\gamma)\delta\sigma_v/\sigma_\varepsilon]'[S_n(\lambda)Y_n - X_n\beta - (Z_n - X_n\gamma)\delta\sigma_v/\sigma_\varepsilon];
\end{aligned}$$
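This likelihood can be transcribed directly into an objective function for a numerical optimizer. The sketch below is an illustrative transcription, with the function name and the flat parameter packing $(\lambda, \beta, \gamma, \sigma_\varepsilon^2, \sigma_v^2, \delta)$ assumed.

```python
import numpy as np

def sar_endog_loglik(theta, y, x, z, w):
    """Log likelihood above for scalar z; packing is
    (lam, beta (k), gamma (k), sig2_eps, sig2_v, delta)."""
    n, k = x.shape
    lam = theta[0]
    beta = theta[1:1 + k]
    gamma = theta[1 + k:1 + 2 * k]
    sig2_eps, sig2_v, delta = theta[1 + 2 * k:]
    s = np.eye(n) - lam * w                      # S_n(lam)
    eps = z - x @ gamma                          # z-equation residual
    # conditional outcome residual with E(v|eps) = delta*(sigma_v/sigma_eps)*eps
    xi = s @ y - x @ beta - eps * delta * np.sqrt(sig2_v / sig2_eps)
    sign, logdet = np.linalg.slogdet(s)
    if sign <= 0:
        return -np.inf                           # outside the parameter space
    return (-n * np.log(2 * np.pi)
            - 0.5 * n * np.log(sig2_eps * sig2_v * (1 - delta ** 2))
            + logdet
            - eps @ eps / (2 * sig2_eps)
            - xi @ xi / (2 * sig2_v * (1 - delta ** 2)))

# toy evaluation (the data need not be model-generated just to evaluate it)
rng = np.random.default_rng(1)
n = 8
x = np.column_stack([np.ones(n), rng.standard_normal(n)])
w = rng.random((n, n)); np.fill_diagonal(w, 0.0)
w /= w.sum(axis=1, keepdims=True)
y, z = rng.standard_normal(n), rng.standard_normal(n)
theta = np.array([0.3, 1.0, 1.0, 1.0, 0.8, 1.0, 1.0, 0.5])
ll = sar_endog_loglik(theta, y, x, z, w)
print(ll)
```

Maximizing this over `theta` (e.g., with any generic optimizer, imposing `|delta| < 1` and positive variances) yields the MLE used in the tables.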
the IV and SAR methods treat $W_n$ as exogenous and estimate only the outcome equation (the $z$ equation is not estimated). From these, we want to see how estimation that misspecifies the spatial weight matrix as exogenous behaves when the actual one is endogenous. We choose correlation coefficients $\delta = 0.2$, $0.5$, and $0.8$ to control for different magnitudes of endogeneity. We also set the spatial correlation to $\lambda = 0.2$ or $0.4$ to see how the spatial correlation parameter affects the estimates. 1000 replications are carried out for each setting.$^4$
7.2 Monte Carlo results
Table 1: Estimates from the U.S. states spatial weight matrix
WS(49)                       λ = 0.2                                          λ = 0.4
          λ̂        β̂1       β̂2       γ̂1       γ̂2        λ̂        β̂1       β̂2       γ̂1       γ̂2
δ = 0.2
IV       0.119    1.1325   0.9658                        0.2804   1.199    0.9717
        (0.2336) (0.3996) (0.1494)                      (0.3358) (0.5931) (0.185)
2SIV     0.1849   1.0278   0.9739   1.0013   0.8027      0.3728   1.0458   0.9811   1.003    0.8001
        (0.2361) (0.3968) (0.1486) (0.144)  (0.1388)    (0.3391) (0.592)  (0.1809) (0.1395) (0.1651)
SAR      0.1354   1.1077   0.9779                        0.3078   1.1589   0.9925
        (0.1278) (0.2537) (0.1425)                      (0.1367) (0.2865) (0.1632)
MLE      0.177    1.0415   0.9826   1.0013   0.8027      0.3548   1.0812   0.9965   1.003    0.8001
        (0.1338) (0.2564) (0.1428) (0.144)  (0.1388)    (0.1402) (0.2858) (0.1634) (0.1395) (0.1651)
δ = 0.5
IV      -0.0067   1.254    0.9614                        0.2285   1.2959   0.991
        (0.2595) (0.3745) (0.1722)                      (0.2334) (0.4437) (0.1626)
2SIV     0.1819   1.0213   0.9833   1.0028   0.8017      0.3948   1.0142   0.9886   1.0008   0.8084
        (0.235)  (0.3244) (0.166)  (0.1404) (0.1572)    (0.2021) (0.3787) (0.1607) (0.1409) (0.155)
SAR      0.0686   1.1607   0.9804                        0.2739   1.2176   1.0014
        (0.1358) (0.2383) (0.1594)                      (0.1262) (0.2727) (0.158)
MLE      0.1753   1.0287   0.991    1.0028   0.8017      0.3711   1.0538   0.9979   1.0008   0.8084
        (0.1293) (0.2133) (0.1595) (0.1404) (0.1572)    (0.1125) (0.2413) (0.1571) (0.1409) (0.155)
δ = 0.8
IV      -0.0756   1.3012   0.8888                        0.2135   1.3561   0.9659
        (0.2401) (0.354)  (0.1686)                      (0.1728) (0.3907) (0.1263)
2SIV     0.1859   1.0147   0.9937   0.9969   0.8046      0.3943   1.0098   1.0013   0.9978   0.8042
        (0.1929) (0.2806) (0.1529) (0.15)   (0.1326)    (0.0988) (0.2392) (0.1172) (0.1434) (0.1145)
SAR      0.0183   1.1971   0.9287                        0.2643   1.2592   0.9797
        (0.1269) (0.2351) (0.1378)                      (0.1175) (0.2928) (0.1178)
MLE      0.1874   1.0129   0.9952   0.9972   0.804       0.3881   1.0223   1.0029   0.9978   0.805
        (0.0892) (0.1894) (0.1438) (0.1557) (0.1373)    (0.0756) (0.2115) (0.1213) (0.1494) (0.1216)
Note: Observations n = 49; β1 = β2 = γ1 = 1 and γ2 = 0.8.
$^4$We also tried the DGP with other specifications of β and γ; the results are similar.
Table 2: Estimates from the Toledo spatial weight matrix
WO(98)                       λ = 0.2                                          λ = 0.4
          λ̂        β̂1       β̂2       γ̂1       γ̂2        λ̂        β̂1       β̂2       γ̂1       γ̂2
δ = 0.2
IV       0.107    1.124    0.989                         0.2916   1.1851   0.9719
        (0.1826) (0.2638) (0.1155)                      (0.1903) (0.3392) (0.1175)
2SIV     0.1969   1.0076   0.9967   1.0001   0.7974      0.382    1.036    0.9905   0.9974   0.8004
        (0.1854) (0.2618) (0.1148) (0.0991) (0.1147)    (0.1871) (0.3318) (0.1154) (0.1034) (0.1013)
SAR      0.1313   1.0914   0.9956                        0.3214   1.1365   0.9813
        (0.1042) (0.1733) (0.1119)                      (0.1012) (0.2051) (0.1071)
MLE      0.1839   1.0233   0.9998   1.0001   0.7974      0.3776   1.0432   0.9928   0.9974   0.8004
        (0.1087) (0.1732) (0.1122) (0.0991) (0.1147)    (0.1018) (0.2015) (0.1076) (0.1034) (0.1013)
δ = 0.5
IV      -0.0114   1.2686   0.955                         0.2088   1.3263   0.9808
        (0.1625) (0.2468) (0.1123)                      (0.1522) (0.2937) (0.1042)
2SIV     0.1896   1.0121   0.9938   1.0009   0.7982      0.3869   1.0244   0.9961   1.0022   0.8
        (0.1419) (0.2068) (0.1067) (0.1063) (0.1048)    (0.1293) (0.2423) (0.1013) (0.1019) (0.1031)
SAR      0.0549   1.1829   0.9706                        0.2663   1.2297   0.989
        (0.102)  (0.1758) (0.103)                       (0.0976) (0.209)  (0.1004)
MLE      0.1792   1.0247   0.994    1.0009   0.7982      0.3791   1.0381   0.9981   1.0022   0.8
        (0.0939) (0.1547) (0.1028) (0.1063) (0.1048)    (0.0844) (0.1765) (0.1001) (0.1019) (0.1031)
δ = 0.8
IV      -0.0796   1.3667   0.9404                        0.1588   1.4063   0.991
        (0.1453) (0.2483) (0.1064)                      (0.1542) (0.2904) (0.1226)
2SIV     0.1982   0.9988   0.9968   1.0006   0.796       0.3953   1.0093   0.995    1.0016   0.7966
        (0.0866) (0.1581) (0.0978) (0.103)  (0.1002)    (0.0902) (0.1799) (0.1195) (0.0967) (0.1182)
SAR     -0.0145   1.2798   0.9562                        0.2044   1.3304   0.9966
        (0.1011) (0.1931) (0.0986)                      (0.102)  (0.2129) (0.1196)
MLE      0.1931   1.0062   0.9958   1.001    0.7951      0.3906   1.0209   0.9979   1.0037   0.7976
        (0.0634) (0.14)   (0.0971) (0.1057) (0.1008)    (0.0681) (0.1497) (0.1242) (0.1068) (0.124)
Note: Observations n = 98; β1 = β2 = γ1 = 1 and γ2 = 0.8.
Table 3: Estimates from the Iowa "Adjacency" spatial weight matrix
WA(361)                      λ = 0.2                                          λ = 0.4
          λ̂        β̂1       β̂2       γ̂1       γ̂2        λ̂        β̂1       β̂2       γ̂1       γ̂2
δ = 0.2
IV       0.1328   1.0825   0.9892                        0.3278   1.1225   0.9903
        (0.0823) (0.1195) (0.0563)                      (0.0766) (0.1441) (0.0519)
2SIV     0.2018   0.994    0.9999   1.0007   0.7993      0.3956   1.0064   0.9947   1.0024   0.7998
        (0.0792) (0.1136) (0.0558) (0.0552) (0.054)     (0.0743) (0.1375) (0.0519) (0.0509) (0.0516)
SAR      0.1548   1.0545   0.9937                        0.3535   1.0782   0.9931
        (0.0526) (0.0878) (0.0546)                      (0.0478) (0.0979) (0.0516)
MLE      0.1964   1.0012   1        1.0007   0.7993      0.3958   1.0058   0.9958   1.0024   0.7998
        (0.0517) (0.0851) (0.0548) (0.0552) (0.054)     (0.0470) (0.0944) (0.0518) (0.0509) (0.0516)
δ = 0.5
IV       0.0259   1.2331   0.9717                        0.2488   1.2546   0.9811
        (0.0743) (0.1182) (0.0547)                      (0.0755) (0.1439) (0.0561)
2SIV     0.1933   1.0103   0.9978   1        0.8021      0.4004   0.9955   1.0001   0.9994   0.7998
        (0.0645) (0.1004) (0.0532) (0.0519) (0.0504)    (0.0603) (0.1140) (0.0547) (0.055)  (0.0536)
SAR      0.0878   1.1505   0.9822                        0.3048   1.1592   0.989
        (0.0509) (0.0872) (0.0525)                      (0.05)   (0.1055) (0.0544)
MLE      0.1942   1.0088   0.9987   1        0.8021      0.3958   1.0037   1        0.9995   0.7998
        (0.0452) (0.0767) (0.0525) (0.0519) (0.0504)    (0.0409) (0.0875) (0.0543) (0.055)  (0.0546)
δ = 0.8
IV      -0.0814   1.3369   0.9509                        0.1342   1.4259   0.9746
        (0.0795) (0.1201) (0.0615)                      (0.0784) (0.148)  (0.0603)
2SIV     0.1989   1.001    0.9955   0.9989   0.7962      0.3987   1.0017   0.9955   0.9989   0.7962
        (0.046)  (0.076)  (0.058)  (0.0537) (0.0572)    (0.0421) (0.0852) (0.0578) (0.0537) (0.0572)
SAR     -0.0011   1.2409   0.9648                        0.2303   1.2721   0.9832
        (0.055)  (0.0963) (0.0586)                      (0.0537) (0.1124) (0.0584)
MLE      0.1968   1.0036   0.9955   0.9989   0.7963      0.3962   1.006    0.9945   0.9987   0.7956
        (0.0331) (0.0671) (0.0577) (0.0538) (0.0573)    (0.0359) (0.0856) (0.0802) (0.0631) (0.08)
Note: Observations n = 361; β1 = β2 = γ1 = 1 and γ2 = 0.8.
Table 4: Estimates from the Iowa "County" spatial weight matrix
WC(361)                      λ = 0.2                                          λ = 0.4
          λ̂        β̂1       β̂2       γ̂1       γ̂2        λ̂        β̂1       β̂2       γ̂1       γ̂2
δ = 0.2
IV       0.1578   1.0522   0.9927                        0.3627   1.0584   1.0004
        (0.0629) (0.098)  (0.0518)                      (0.0591) (0.1087) (0.0586)
2SIV     0.1967   1.003    0.9946   1.0024   0.7998      0.3955   1.0068   0.9947   0.9983   0.7978
        (0.0618) (0.0952) (0.0518) (0.0509) (0.0516)    (0.057)  (0.1049) (0.0588) (0.0509) (0.0579)
SAR      0.1778   1.0268   0.995                         0.3789   1.0332   0.9992
        (0.0389) (0.0723) (0.0517)                      (0.0353) (0.0794) (0.0582)
MLE      0.1977   1.0015   0.9959   1.0024   0.7998      0.3969   1.0048   0.996    0.9983   0.7978
        (0.0381) (0.0706) (0.0518) (0.0509) (0.0516)    (0.0336) (0.0765) (0.0582) (0.0509) (0.0579)
δ = 0.5
IV       0.1033   1.1129   0.994                         0.3121   1.1484   1.0019
        (0.0658) (0.0973) (0.0581)                      (0.06)   (0.1143) (0.0503)
2SIV     0.1967   1.0036   0.9946   0.9985   0.7969      0.3959   1.0088   0.9994   1.0026   0.8
        (0.0552) (0.0841) (0.0579) (0.0521) (0.0574)    (0.0469) (0.0909) (0.0503) (0.0533) (0.0522)
SAR      0.1478   1.0611   0.9955                        0.3601   1.0687   1.0016
        (0.0422) (0.077)  (0.0578)                      (0.0346) (0.0782) (0.0501)
MLE      0.1973   1.003    0.9956   0.9985   0.7969      0.397    1.0071   1.0003   1.0026   0.8
        (0.0356) (0.069)  (0.0578) (0.0521) (0.0574)    (0.0296) (0.0688) (0.0501) (0.0533) (0.0522)
δ = 0.8
IV       0.0511   1.1872   0.9783                        0.2731   1.1989   1.0093
        (0.0601) (0.0996) (0.0529)                      (0.0579) (0.1122) (0.0582)
2SIV     0.1975   1.0018   0.9958   1.0005   0.7977      0.3982   1.0025   0.9958   0.9989   0.7962
        (0.0357) (0.068)  (0.0519) (0.0508) (0.0516)    (0.0312) (0.0726) (0.058)  (0.0537) (0.0572)
SAR      0.1104   1.112    0.9861                        0.3351   1.1018   1.0034
        (0.0406) (0.0774) (0.0519)                      (0.0361) (0.0843) (0.058)
MLE      0.1981   1.0015   0.9967   1.0005   0.7986      0.3988   1.0024   0.9942   0.9994   0.7943
        (0.0272) (0.0765) (0.061)  (0.0659) (0.0597)    (0.0255) (0.0664) (0.0692) (0.058)  (0.0729)
Note: Observations n = 361; β1 = β2 = γ1 = 1 and γ2 = 0.8.
Tables 1-4 list the empirical means and standard errors (in parentheses) of the estimators from 1000 replications, using WS, WO, WA, and WC as the predetermined spatial weight matrices. In each table, the left panel shows the results for λ = 0.2 and the right panel for λ = 0.4. To see how the different estimation methods react to the magnitude of endogeneity, each table contains three sets of results corresponding to δ = 0.2, 0.5, and 0.8.
There are several patterns in the estimation results in all four tables:
1. For the accuracy of the estimates, we consider the empirical standard errors: IV is close to 2SIV, and SAR is close to MLE. In general, the likelihood-based methods perform better than the IV methods.
2. For the bias, 2SIV and MLE behave well in all cases. For the conventional IV and SAR methods, the more serious the endogeneity, i.e., the larger the correlation coefficient δ, the more bias they create. Most of the bias appears in the estimator of the spatial correlation, λ̂; the estimators of β are much less biased. The λ̂ from IV and SAR suffers severe downward bias once δ reaches 0.5, in some cases exceeding 100%. Although both create large biases, conventional SAR clearly creates less bias than conventional IV.
3. The patterns in the results are similar across the different values of the spatial correlation. When the true λ is relatively small (0.2 in our design), the estimation performs less well than with a larger λ (0.4 in our design): the bias is more serious and the standard errors are larger.
From Table 1 to Table 4, as the sample size increases, the bias and standard errors of the estimators decrease. When the sample size is 361, as with WA and WC, the estimates are quite precise and close to the true parameters.
8 Conclusion
In this paper, we consider the specification and estimation of a cross-sectional SAR model with an endogenous spatial weight matrix. First, we present the model specification with two sets of equations: one is the SAR outcome equation, and the other is the equation for the entries of the spatial weight matrix. The source of endogeneity is the correlation between the disturbances in the SAR outcome equation and the errors in the entry equation. Second, given the model specification and a joint normality assumption on the disturbances, we consider three estimation methods: the 2SIV approach, the MLE, and the GMM.
In deriving the consistency and asymptotic normality of the estimators from these methods, we need to consider statistics in linear-quadratic forms of the disturbances in which the quadratic matrix involves the spatial weight matrix, such as $\varepsilon_n'W_n\varepsilon_n$ and $\varepsilon_n'G_n\varepsilon_n$. These statistics differ from standard quadratic forms with nonstochastic quadratic matrices because the entries of the quadratic matrix are nonlinear functions of the disturbances. To prove these asymptotic properties, we impose some restrictions on the spatial weights: whether or not agents are neighbors depends only on the predetermined (physical) distance, but the magnitudes of neighbors' effects depend on the economic distance. Therefore, the spatial weight considered in this paper combines (physical) distance and economic distance, with the predetermined physical distances playing an important role in our analysis.
We then consider three extensions: having more than one spatial weight matrix in the SAR model, relaxing the joint normality of the disturbances, and introducing spatial correlation into the entries of the spatial weights. Without much modification, the estimation methods used for the basic model can be applied to these situations.
To examine the behavior of our proposed estimation methods in finite samples, we use Monte Carlo simulations. The simulation results indicate serious downward bias if we treat $W_n$ as exogenous when the true DGP is endogenous. As the sample size increases, the estimates approach the true parameters.
This paper focuses on estimating a cross-sectional SAR model with a specified source of endogeneity from the spatial weight matrix. In future research, we are also interested in estimating the SAR model in a spatial panel data setting where the spatial weight matrix varies over time for endogenous reasons. Another topic that needs further research is an endogenous spatial weight matrix constructed purely from economic distance. This is a challenging issue because, as the number of observations increases, there may be no strict rule that keeps agents apart from each other. In this setting, agents can be highly correlated with one another, so alternative large-sample theorems need to be developed.
References
[1] Anselin, L. (1980), Estimation methods for spatial autoregressive structures, Regional Science Dissertation and Monograph Series, Cornell University, Ithaca, NY.
[2] Anselin, L. and A. Bera (1997), Spatial dependence in linear regression models with an introduction to spatial econometrics, in Handbook of Economic Statistics, D. Giles and A. Ullah, Eds., Marcel Dekker, NY.
[3] Case, A., H. Rosen, and J. Hines (1993), Budget spillovers and fiscal policy interdependence: Evidence from the states, Journal of Public Economics 52, 285-307.
[4] Conway, K. and J. Rork (2004), Diagnosis murder: The death of state death taxes, Economic Inquiry 42, 537-559.
[5] Crabbé, K. and H. Vandenbussche (2008), Spatial tax competition in the EU15, working paper.
[6] Hsieh, C. and L. Lee (2011), A social interactions model with endogenous friendship, working paper.
[7] Kelejian, H. and I. Prucha (1998), A generalized spatial two stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances, Journal of Real Estate Finance and Economics 17, 99-121.
[8] Kelejian, H. and I. Prucha (1999), A generalized moments estimator for the autoregressive parameter in a spatial model, International Economic Review 40, 509-533.
[9] Jenish, N. and I. Prucha (2009), Central limit theorems and uniform laws of large numbers for arrays of random fields, Journal of Econometrics 150, 86-98.
[10] Lee, L. (2004), Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models, Econometrica 72, 1899-1925.
[11] Lee, L. (2007), GMM and 2SLS estimation of mixed regressive, spatial autoregressive models, Journal of Econometrics 137, 489-514.
[12] Lee, L. and X. Liu (2010), Efficient GMM estimation of high order spatial autoregressive models with autoregressive disturbances, Econometric Theory 26, 187-230.
[13] Lin, X. and L. Lee (2010), GMM estimation of spatial autoregressive models with unknown heteroskedasticity, Journal of Econometrics 157, 34-52.
[14] Liu, X., L. Lee, and C. Bollinger (2010), Improved efficient quasi maximum likelihood estimator of spatial autoregressive models, Journal of Econometrics 159, 303-319.
[15] Ord, J. (1975), Estimation methods for models of spatial interaction, Journal of the American Statistical Association 70, 120-126.
[16] Pinkse, J. and M. Slade (2010), The future of spatial econometrics, Journal of Regional Science 50, 103-117.
[17] Qu, X. and L. Lee (2012), LM tests for spatial correlation in spatial models with limited dependent variables, Regional Science and Urban Economics 42, 430-445.
[18] White, H. (1982), Maximum likelihood estimation of misspecified models, Econometrica 50, 1-26.
Appendix A: Log likelihood function and its first and second order derivatives
A.1 Log likelihood function and its expectation
As $\sigma_\xi^2 = \sigma_v^2 - \sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}$, the log likelihood function in (3.3) can be written as
$$\ln L_n(\theta) = -n\ln(2\pi) - \frac{n}{2}\ln\sigma_\xi^2 - \frac{n}{2}\ln|\Sigma_\varepsilon| + \ln|S_n(\lambda)| - \frac{1}{2}\operatorname{vec}(Z_n - X_n\Gamma)'(\Sigma_\varepsilon^{-1}\otimes I_n)\operatorname{vec}(Z_n - X_n\Gamma) - \frac{1}{2\sigma_\xi^2}\xi_n(\theta)'\xi_n(\theta),$$
where $\xi_n(\theta) = S_n(\lambda)Y_n - X_n\beta - (Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}$. The corresponding expectation is
$$\begin{aligned}
\frac{1}{n}E(\ln L_n(\theta)) ={}& -\ln(2\pi) - \frac{1}{2}\ln(\sigma_\xi^2) - \frac{1}{2}\ln|\Sigma_\varepsilon| - \frac{1}{2n}\sum_{i=1}^n x_i'(\Gamma_0 - \Gamma)\Sigma_\varepsilon^{-1}(\Gamma_0 - \Gamma)'x_i \\
&+ \frac{1}{n}E(\ln|S_n(\lambda)|) - \frac{1}{2}\operatorname{tr}(\Sigma_\varepsilon^{-1}\Sigma_{\varepsilon 0}) - \frac{\sigma_{\xi 0}^2}{2\sigma_\xi^2}\operatorname{tr}\Big[E\frac{1}{n}(S_n^{-1} - \lambda G_n)'(S_n^{-1} - \lambda G_n)\Big] \\
&- \frac{1}{2\sigma_\xi^2}E\frac{1}{n}(X_n\beta_0 + \varepsilon_n\delta_0)'(S_n^{-1} - \lambda G_n)'(S_n^{-1} - \lambda G_n)(X_n\beta_0 + \varepsilon_n\delta_0) \\
&- \frac{1}{2\sigma_\xi^2}[\beta + (\Gamma_0 - \Gamma)\delta]'\frac{X_n'X_n}{n}[\beta + (\Gamma_0 - \Gamma)\delta] - \frac{1}{2\sigma_\xi^2}\delta'\Sigma_{\varepsilon 0}\delta \\
&+ \frac{1}{\sigma_\xi^2}E\frac{1}{n}(X_n\beta_0 + \varepsilon_n\delta_0)'(S_n^{-1} - \lambda G_n)'\{X_n[\beta + (\Gamma_0 - \Gamma)\delta] + \varepsilon_n\delta\}, \qquad (A.1)
\end{aligned}$$
where $\delta = \Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}$.
A.2 First order derivatives and scores
The first order derivatives are:
$$\frac{\partial\ln L_n(\theta|Y_n,Z_n)}{\partial\beta} = \frac{1}{\sigma_\xi^2}X_n'\xi_n(\theta), \qquad
\frac{\partial\ln L_n(\theta|Y_n,Z_n)}{\partial\lambda} = \frac{1}{\sigma_\xi^2}(W_nY_n)'\xi_n(\theta) - \operatorname{tr}[W_nS_n^{-1}(\lambda)],$$
$$\frac{\partial\ln L_n(\theta|Y_n,Z_n)}{\partial\operatorname{vec}(\Gamma)} = (\Sigma_\varepsilon^{-1}\otimes X_n')\operatorname{vec}(Z_n - X_n\Gamma) - \frac{1}{\sigma_\xi^2}[(\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon})\otimes X_n']\xi_n(\theta),$$
$$\frac{\partial\ln L_n(\theta|Y_n,Z_n)}{\partial\sigma_v^2} = -\frac{n}{2\sigma_\xi^2} + \frac{1}{2\sigma_\xi^4}\xi_n(\theta)'\xi_n(\theta),$$
$$\begin{aligned}
\frac{\partial\ln L_n(\theta|Y_n,Z_n)}{\partial\alpha_j} ={}& -\frac{n}{2}\operatorname{tr}\Big(\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Big) + \frac{1}{2}\operatorname{vec}(Z_n - X_n\Gamma)'\Big(\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\otimes I_n\Big)\operatorname{vec}(Z_n - X_n\Gamma) \\
&- \frac{1}{\sigma_\xi^2}\xi_n(\theta)'(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon} - \frac{n}{2\sigma_\xi^2}\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon} \\
&+ \frac{1}{2\sigma_\xi^4}\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\,\xi_n(\theta)'\xi_n(\theta),
\end{aligned}$$
where $\alpha_j$ is the $j$th element of $\alpha$;
$$\frac{\partial\ln L_n(\theta|Y_n,Z_n)}{\partial\sigma_{v\varepsilon}} = \frac{n\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}}{\sigma_\xi^2} + \frac{1}{\sigma_\xi^2}[(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}]'\xi_n(\theta) - \frac{\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}}{\sigma_\xi^4}\xi_n(\theta)'\xi_n(\theta).$$
The corresponding scores are:
$$\frac{\partial\ln L_n(\theta_0)}{\partial\beta} = \frac{1}{\sigma_{\xi 0}^2}X_n'\xi_n, \qquad
\frac{\partial\ln L_n(\theta_0)}{\partial\lambda} = -\operatorname{tr}(W_nS_n^{-1}) + \frac{1}{\sigma_{\xi 0}^2}(W_nY_n)'\xi_n,$$
$$\frac{\partial\ln L_n(\theta_0)}{\partial\operatorname{vec}(\Gamma)} = (\Sigma_{\varepsilon 0}^{-1}\otimes X_n')\operatorname{vec}(\varepsilon_n) - \frac{1}{\sigma_{\xi 0}^2}[(\Sigma_{\varepsilon 0}^{-1}\sigma_{v\varepsilon 0})\otimes X_n']\xi_n,$$
$$\frac{\partial\ln L_n(\theta_0)}{\partial\sigma_v^2} = -\frac{n}{2\sigma_{\xi 0}^2} + \frac{1}{2\sigma_{\xi 0}^4}\xi_n'\xi_n,$$
$$\begin{aligned}
\frac{\partial\ln L_n(\theta_0)}{\partial\alpha_j} ={}& -\frac{n}{2}\operatorname{tr}\Big(\Sigma_{\varepsilon 0}^{-1}\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\Big) + \frac{1}{2}\operatorname{vec}(\varepsilon_n)'\Big(\Sigma_{\varepsilon 0}^{-1}\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\Sigma_{\varepsilon 0}^{-1}\otimes I_n\Big)\operatorname{vec}(\varepsilon_n) - \frac{\xi_n'\varepsilon_n\Sigma_{\varepsilon 0}^{-1}}{\sigma_{\xi 0}^2}\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\Sigma_{\varepsilon 0}^{-1}\sigma_{v\varepsilon 0} \\
&- \frac{n\sigma_{v\varepsilon 0}'\Sigma_{\varepsilon 0}^{-1}}{2\sigma_{\xi 0}^2}\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\Sigma_{\varepsilon 0}^{-1}\sigma_{v\varepsilon 0} + \frac{1}{2\sigma_{\xi 0}^4}\sigma_{v\varepsilon 0}'\Sigma_{\varepsilon 0}^{-1}\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\Sigma_{\varepsilon 0}^{-1}\sigma_{v\varepsilon 0}\,\xi_n'\xi_n,
\end{aligned}$$
$$\frac{\partial\ln L_n(\theta_0)}{\partial\sigma_{v\varepsilon}} = \frac{n\Sigma_{\varepsilon 0}^{-1}\sigma_{v\varepsilon 0}}{\sigma_{\xi 0}^2} + \frac{\Sigma_{\varepsilon 0}^{-1}}{\sigma_{\xi 0}^2}\varepsilon_n'\xi_n - \frac{\Sigma_{\varepsilon 0}^{-1}\sigma_{v\varepsilon 0}}{\sigma_{\xi 0}^4}\xi_n'\xi_n.$$
A.3 Second order derivatives and the expectation
Expressions of the second order derivatives are:
$$\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\beta\partial\beta'} = -\frac{1}{\sigma_\xi^2}X_n'X_n, \qquad
\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\beta\partial\lambda} = -\frac{1}{\sigma_\xi^2}X_n'W_nY_n,$$
$$\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\beta\partial\operatorname{vec}(\Gamma)'} = \frac{1}{\sigma_\xi^2}[(\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon})\otimes X_n']X_n, \qquad
\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\beta\partial\sigma_v^2} = -\frac{1}{\sigma_\xi^4}X_n'\xi_n(\theta),$$
$$\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\beta\partial\alpha_j} = \frac{1}{\sigma_\xi^2}X_n'(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon} - \frac{\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}}{\sigma_\xi^4}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\,X_n'\xi_n(\theta),$$
$$\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\beta\partial\sigma_{v\varepsilon}'} = \frac{2}{\sigma_\xi^4}X_n'\xi_n(\theta)\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1} - \frac{1}{\sigma_\xi^2}X_n'(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1},$$
$$\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\lambda\partial\lambda} = -\operatorname{tr}\{[W_nS_n^{-1}(\lambda)]^2\} - \frac{1}{\sigma_\xi^2}(W_nY_n)'W_nY_n,$$
$$\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\lambda\partial\operatorname{vec}(\Gamma)'} = \frac{1}{\sigma_\xi^2}[(\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon})\otimes X_n']W_nY_n, \qquad
\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\lambda\partial\sigma_v^2} = -\frac{1}{\sigma_\xi^4}(W_nY_n)'\xi_n(\theta),$$
$$\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\lambda\partial\alpha_j} = \frac{1}{\sigma_\xi^2}(W_nY_n)'(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon} - \frac{\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}}{\sigma_\xi^4}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\,\xi_n(\theta)'W_nY_n,$$
$$\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\lambda\partial\sigma_{v\varepsilon}'} = \frac{2\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}}{\sigma_\xi^4}\xi_n(\theta)'(W_nY_n) - \frac{1}{\sigma_\xi^2}[(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}]'(W_nY_n),$$
$$\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\operatorname{vec}(\Gamma)\partial\operatorname{vec}(\Gamma)'} = -\Sigma_\varepsilon^{-1}\otimes(X_n'X_n) - \frac{1}{\sigma_\xi^2}[(\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1})\otimes(X_n'X_n)],$$
$$\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\operatorname{vec}(\Gamma)\partial\sigma_v^2} = \frac{1}{\sigma_\xi^4}[(\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon})\otimes X_n']\xi_n(\theta), \qquad
\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\sigma_v^2\partial\sigma_v^2} = \frac{n}{2\sigma_\xi^4} - \frac{1}{\sigma_\xi^6}\xi_n(\theta)'\xi_n(\theta),$$
$$\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\sigma_v^2\partial\sigma_{v\varepsilon}} = -\frac{n\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}}{\sigma_\xi^4} - \frac{[(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}]'}{\sigma_\xi^4}\xi_n(\theta) + \frac{2\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}}{\sigma_\xi^6}\xi_n(\theta)'\xi_n(\theta),$$
$$\begin{aligned}
\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\operatorname{vec}(\Gamma)\partial\alpha_j} ={}& -\Big(\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\otimes X_n'\Big)\operatorname{vec}(Z_n - X_n\Gamma) + \frac{\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}}{\sigma_\xi^4}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\,[(\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon})\otimes X_n']\xi_n(\theta) \\
&+ \frac{1}{\sigma_\xi^2}\Big[\Big(\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\Big)\otimes X_n'\Big]\xi_n(\theta) - \frac{1}{\sigma_\xi^2}[(\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon})\otimes X_n']\Big[(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\Big],
\end{aligned}$$
$$\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\operatorname{vec}(\Gamma)\partial\sigma_{v\varepsilon}'} = \frac{1}{\sigma_\xi^2}(\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon})\otimes X_n'[(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}] - \frac{\Sigma_\varepsilon^{-1}}{\sigma_\xi^2}\otimes(X_n'\xi_n(\theta)) - \frac{2}{\sigma_\xi^4}[(\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon})\otimes X_n']\xi_n(\theta)\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1},$$
$$\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\sigma_v^2\partial\alpha_j} = \frac{\xi_n(\theta)'}{\sigma_\xi^4}(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon} + \frac{n\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}}{2\sigma_\xi^4}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon} - \frac{\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}}{\sigma_\xi^6}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\,\xi_n(\theta)'\xi_n(\theta),$$
$$\begin{aligned}
\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\alpha_j\partial\alpha_k} ={}& \frac{1}{2}\operatorname{vec}(Z_n - X_n\Gamma)'\Big\{\Big[\Sigma_\varepsilon^{-1}\Big(\frac{\partial^2\Sigma_\varepsilon}{\partial\alpha_j\partial\alpha_k} - 2\frac{\partial\Sigma_\varepsilon}{\partial\alpha_k}\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Big)\Sigma_\varepsilon^{-1}\Big]\otimes I_n\Big\}\operatorname{vec}(Z_n - X_n\Gamma) \\
&+ \frac{n}{2}\operatorname{tr}\Big[\Sigma_\varepsilon^{-1}\Big(\frac{\partial\Sigma_\varepsilon}{\partial\alpha_k}\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j} - \frac{\partial^2\Sigma_\varepsilon}{\partial\alpha_j\partial\alpha_k}\Big)\Big] \\
&+ \frac{n\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}}{2\sigma_\xi^4}\Big[\frac{\partial\Sigma_\varepsilon}{\partial\alpha_k}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j} - \sigma_\xi^2\Big(\frac{\partial^2\Sigma_\varepsilon}{\partial\alpha_j\partial\alpha_k} - 2\frac{\partial\Sigma_\varepsilon}{\partial\alpha_k}\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Big)\Big]\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon} \\
&+ \frac{2\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}}{\sigma_\xi^4}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\,\xi_n(\theta)'\Big[(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_k}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\Big] \\
&- \frac{\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}}{\sigma_\xi^2}\Big\{\Big[(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}\Big(\frac{\partial^2\Sigma_\varepsilon}{\partial\alpha_j\partial\alpha_k} - 2\frac{\partial\Sigma_\varepsilon}{\partial\alpha_k}\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Big)\Big]'\xi_n(\theta) \\
&\qquad + \Big[(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_k}\Big]'(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Big\}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon} \\
&+ \frac{\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}}{2\sigma_\xi^6}\Big[\sigma_\xi^2\Big(\frac{\partial^2\Sigma_\varepsilon}{\partial\alpha_j\partial\alpha_k} - 2\frac{\partial\Sigma_\varepsilon}{\partial\alpha_k}\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Big) - 2\frac{\partial\Sigma_\varepsilon}{\partial\alpha_k}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Big]\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\,\xi_n(\theta)'\xi_n(\theta),
\end{aligned}$$
$$\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\sigma_{v\varepsilon}\partial\sigma_{v\varepsilon}'} = \frac{2n\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}}{\sigma_\xi^4} + \frac{n\Sigma_\varepsilon^{-1}}{\sigma_\xi^2} - \frac{\Sigma_\varepsilon^{-1}(Z_n - X_n\Gamma)'}{\sigma_\xi^2}(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1} + \frac{4\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}}{\sigma_\xi^4}\xi_n(\theta)'(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1} - \frac{4\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1} + \sigma_\xi^2\Sigma_\varepsilon^{-1}}{\sigma_\xi^6}\xi_n(\theta)'\xi_n(\theta),$$
$$\begin{aligned}
\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\sigma_{v\varepsilon}\partial\alpha_j} ={}& -\frac{\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}}{\sigma_\xi^4}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\,[(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}]'\xi_n(\theta) - \frac{n\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1}}{\sigma_\xi^4}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon} \\
&+ \frac{\Sigma_\varepsilon^{-1}}{\sigma_\xi^2}\Big\{(Z_n - X_n\Gamma)'(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon} - \Big[(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Big]'\xi_n(\theta)\Big\} \\
&- \frac{2\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}}{\sigma_\xi^4}\xi_n(\theta)'\Big[(Z_n - X_n\Gamma)\Sigma_\varepsilon^{-1}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\Big] - \frac{n\Sigma_\varepsilon^{-1}}{\sigma_\xi^2}\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon} \\
&+ \frac{1}{\sigma_\xi^6}[2\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\sigma_{v\varepsilon}'\Sigma_\varepsilon^{-1} + \sigma_\xi^2\Sigma_\varepsilon^{-1}]\frac{\partial\Sigma_\varepsilon}{\partial\alpha_j}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}\,\xi_n(\theta)'\xi_n(\theta).
\end{aligned}$$
Expectation of the second order derivative matrix is
$$E\Big(\frac{\partial^2\ln L_n(\theta_0)}{\partial\theta\partial\theta'}\Big) = \frac{1}{\sigma_{\xi 0}^2}\begin{pmatrix}
I_{\lambda\lambda} & I_{\lambda\beta}' & I_{\lambda\Gamma}' & I_{\lambda v} & I_{\lambda\alpha}' & I_{\lambda\sigma} \\
* & -X_n'X_n & I_{\beta\Gamma} & 0 & 0 & 0 \\
* & * & I_{\Gamma\Gamma} & 0 & 0 & 0 \\
* & 0 & 0 & I_{vv} & I_{v\alpha} & I_{v\sigma} \\
* & 0 & 0 & * & I_{\alpha\alpha} & I_{\alpha\sigma} \\
* & 0 & 0 & * & * & I_{\sigma\sigma}
\end{pmatrix},$$
where
$$\begin{aligned}
I_{\lambda\lambda} &= -\sigma_{\xi 0}^2\operatorname{tr}[E(G_n^2 + G_nG_n')] - \beta_0'X_n'E(G_n'G_n)X_n\beta_0 - \delta_0'E(\varepsilon_n'G_n'G_n\varepsilon_n)\delta_0 - 2\delta_0'E(\varepsilon_n'G_n'G_n)X_n\beta_0, \\
I_{\lambda\beta} &= -X_n'E(G_nX_n\beta_0 + G_n\varepsilon_n\delta_0), \qquad
I_{\lambda\Gamma} = (\delta_0\otimes X_n')E(G_nX_n\beta_0 + G_n\varepsilon_n\delta_0), \qquad
I_{\lambda v} = -E[\operatorname{tr}(G_n)], \\
(I_{\lambda\alpha})_j &= \delta_0'\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\Sigma_{\varepsilon 0}^{-1}E[\varepsilon_n'G_n(X_n\beta_0 + \varepsilon_n\delta_0) - \Sigma_{\varepsilon 0}\delta_0\operatorname{tr}(G_n)], \\
I_{\lambda\sigma} &= 2\delta_0E[\operatorname{tr}(G_n)] - \Sigma_{\varepsilon 0}^{-1}E[\varepsilon_n'G_n(X_n\beta_0 + \varepsilon_n\delta_0)], \\
I_{\beta\Gamma} &= \delta_0'\otimes(X_n'X_n), \qquad
I_{\Gamma\Gamma} = -(\Sigma_{\varepsilon 0}^{-1}\sigma_{\xi 0}^2 + \delta_0\delta_0')\otimes(X_n'X_n) = -(\Sigma_{\varepsilon 0}^{-1}\sigma_{v0}^2)\otimes(X_n'X_n), \\
I_{vv} &= -\frac{n}{2\sigma_{\xi 0}^2}, \qquad
I_{v\alpha} = -\frac{n}{2\sigma_{\xi 0}^2}\delta_0'\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\delta_0, \qquad
I_{v\sigma} = \frac{n\delta_0}{\sigma_{\xi 0}^2}, \\
(I_{\alpha\alpha})_{jk} &= -\frac{n\sigma_{\xi 0}^2}{2}\operatorname{tr}\Big(\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\Sigma_{\varepsilon 0}^{-1}\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_k}\Sigma_{\varepsilon 0}^{-1}\Big) - n\delta_0'\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_k}\Sigma_{\varepsilon 0}^{-1}\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\delta_0 - \frac{n}{2\sigma_{\xi 0}^2}\delta_0'\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_k}\delta_0\,\delta_0'\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\delta_0, \quad j,k = 1,\dots,J, \\
I_{\alpha\sigma} &= \frac{n}{\sigma_{\xi 0}^2}(\Sigma_{\varepsilon 0}^{-1}\sigma_{\xi 0}^2 + \delta_0\delta_0')\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\delta_0 = \frac{n}{\sigma_{\xi 0}^2}\Sigma_{\varepsilon 0}^{-1}\sigma_{v0}^2\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\delta_0, \qquad
I_{\sigma\sigma} = -n\Big(\Sigma_{\varepsilon 0}^{-1} + \frac{2\delta_0\delta_0'}{\sigma_{\xi 0}^2}\Big).
\end{aligned}$$
Appendix B: Proofs
Proof of Claim 2. For any $b\ge 2$, the $(i,j)$th element of $W_n^b$ is $W_n^b(i,j)=\sum_{i_1}\cdots\sum_{i_{b-1}}w_{ii_1}w_{i_1i_2}\cdots w_{i_{b-1}j}$. From Lemma 2 in Qu and Lee (2012), locations $i$ and $i_2$ can have common neighbors only if $i_2\in B_i(2k_0)$; therefore, by induction, $w_{i_{b-1}j}\neq 0$ only if $j\in B_i(bk_0)$. Similarly, the $(i,j)$th element of $(W_n')^aW_n^b$ is $(W_n')^aW_n^b(i,j)=\sum_{k=1}^n g_{ki}(a)g_{kj}(b)$, where $g_{ki}(a)=\sum_{i_1}\cdots\sum_{i_{a-1}}w_{ki_1}w_{i_1i_2}\cdots w_{i_{a-1}i}$ with $a,b\ge 2$. So $(W_n')^aW_n^b(i,j)\neq 0$ only if $i\in B_k(ak_0)$ and $j\in B_k(bk_0)$. Therefore, $j\in B_i((a+b)k_0)$. Q.E.D.
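The neighborhood-expansion argument in Claim 2 can be checked numerically. The following is a minimal illustrative sketch (not from the paper): units sit on a line, $B_i(r)$ is the set of units within distance $r$ of $i$, and $W_n$ has $w_{ij}\neq 0$ only for $j\in B_i(k_0)$; the support of $W_n^b$ then stays within $B_i(bk_0)$, and that of $(W_n')^aW_n^b$ within $B_i((a+b)k_0)$. The size $n$, radius $k_0$, and random entries are assumptions for the demonstration.

```python
import numpy as np

n, k0 = 40, 2          # illustrative: n units on a line, neighborhood radius k0
rng = np.random.default_rng(0)

# Weight matrix with w_ij != 0 only if j is within distance k0 of i
W = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j and abs(i - j) <= k0:
            W[i, j] = rng.uniform(0.1, 1.0)

def support_radius(M):
    """Largest |i - j| over the nonzero entries of M."""
    idx = np.argwhere(np.abs(M) > 1e-12)
    return int(np.max(np.abs(idx[:, 0] - idx[:, 1]))) if idx.size else 0

# W^b has support within b*k0; (W')^a W^b within (a+b)*k0
for b in range(2, 6):
    assert support_radius(np.linalg.matrix_power(W, b)) <= b * k0
a, b = 2, 3
M = np.linalg.matrix_power(W.T, a) @ np.linalg.matrix_power(W, b)
assert support_radius(M) <= (a + b) * k0
print("support bounds hold")
```

Because all entries here are positive, no cancellation can shrink the support, so the bound is exercised directly.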
Proof of Proposition 1. For the mean,
\[
\varepsilon_n'A_n\varepsilon_n=\sum_{i=1}^n\sum_{j=1}^n a_{ij}\varepsilon_i\varepsilon_j=\sum_{i=1}^n T_i,
\]
where $T_i=\sum_{j=1}^n a_{ij}\varepsilon_i\varepsilon_j=\sum_{j\in B_i(h)}a_{ij}\varepsilon_i\varepsilon_j$, because $a_{ij}=0$ when $j\notin B_i(h)$. As the $a_{ij}$'s are uniformly bounded, there exists a finite constant $c_a$ such that $|a_{ij}|\le c_a$. It follows that $E|T_i|\le c_a(\mu_2+Ch^d\mu_{12})<\infty$, where $\mu_2=E(\varepsilon_i^2)$ and $\mu_{12}=E^2(|\varepsilon_i|)$. Hence, $E(\frac{1}{n}\varepsilon_n'A_n\varepsilon_n)=O(1)$.

For the variance, $\mathrm{var}(\varepsilon_n'A_n\varepsilon_n)=\sum_{i=1}^n\sum_{j=1}^n\mathrm{cov}(T_i,T_j)=\sum_{i=1}^n\sum_{j\in B_i(2h)}\mathrm{cov}(T_i,T_j)$, because $T_j=\sum_{k\in B_j(h)}a_{jk}\varepsilon_j\varepsilon_k$ and there are no common $\varepsilon$'s in $T_i$ and $T_j$ when $i$ and $j$ are more than $2h$ apart. As $|T_i|\le c_a|\varepsilon_i|\sum_{j\in B_i(h)}|\varepsilon_j|$, we have $|T_i|^2\le c_a^2|\varepsilon_i|^2(\sum_{j\in B_i(h)}|\varepsilon_j|)^2=c_a^2|\varepsilon_i|^2\sum_{j\in B_i(h)}\sum_{k\in B_i(h)}|\varepsilon_j||\varepsilon_k|$. It follows that $E|T_i|^2\le c_a^2(Ch^d)^2(E|\varepsilon_i|^4+2E(|\varepsilon_i^3|)E|\varepsilon_i|+E^2|\varepsilon_i^2|+E|\varepsilon_i^2|E^2|\varepsilon_i|)$, as $B_i(h)$ contains at most $Ch^d$ spatial units, a bound that is finite and does not depend on $n$. In consequence, $\sup_iE|T_i|^2<\infty$, which in turn implies $\sup_i\mathrm{var}(T_i)<\infty$. Hence, we have
\[
\mathrm{var}\Big(\frac{1}{n}\varepsilon_n'A_n\varepsilon_n\Big)=\frac{1}{n^2}\sum_{i=1}^n\sum_{j\in B_i(2h)}\mathrm{cov}(T_i,T_j)=O\Big(\frac{1}{n}\Big),
\]
which goes to zero as $n\to\infty$. Q.E.D.
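As a quick numerical illustration of Proposition 1 (a sketch under assumed ingredients, not the paper's code): take $A_n$ banded with bandwidth $h$, so $a_{ij}=0$ for $|i-j|>h$ in a one-dimensional layout, and i.i.d. disturbances. The Monte Carlo variance of $n^{-1}\varepsilon_n'A_n\varepsilon_n$ then shrinks as $n$ grows, consistent with the $O(1/n)$ rate.

```python
import numpy as np

rng = np.random.default_rng(1)
h = 3  # banded A_n: a_ij = 0 when |i - j| > h, entries bounded by 1

def quad_form_var(n, reps=2000):
    """Monte Carlo variance of (1/n) eps' A eps for a fixed banded A."""
    A = np.zeros((n, n))
    for i in range(n):
        lo, hi = max(0, i - h), min(n, i + h + 1)
        A[i, lo:hi] = rng.uniform(-1, 1, hi - lo)
    eps = rng.standard_normal((reps, n))
    q = ((eps @ A) * eps).sum(axis=1) / n   # (1/n) eps' A eps, per replication
    return q.var()

v_small, v_large = quad_form_var(50), quad_form_var(400)
assert v_large < v_small  # variance decays as n grows
print(v_small, v_large)
```

The eight-fold increase in $n$ produces a clearly smaller variance; the exact values depend on the assumed random $A_n$ and seed.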
Proof of Proposition 2. For any $0<c<1$ and a finite integer $m$, the series $\sum_{l=0}^\infty l^mc^l$ is convergent by the ratio test. The absolute mean property of $\frac{1}{n}E|\eta_n'B_n\eta_n|$ then follows by the proof of the first part of Proposition 1. Define $A_{nl}^*=A_{nl}/\|A_{nl}\|_1$ so that all its entries satisfy $|a_{ij,nl}^*|\le 1$. As $\frac{1}{n}|\eta_n'A_{nl}^*\eta_n|\le\frac{1}{n}\sum_{i=1}^n\sum_{j=1}^n|a_{ij}^*||\eta_i||\eta_j|\le\frac{1}{n}\sum_{i=1}^n\sum_{j\in B_i((l+s)k_0)}|\eta_i||\eta_j|$, we have $E\frac{1}{n}|\eta_n'A_{nl}^*\eta_n|\le c^*(l+s)^d\mu_{12}^+$, where $c^*=Ck_0^d$ and $\mu_{12}^+=\mu_2+\mu_{12}$ with $\mu_2=E(\eta_i^2)$ and $\mu_{12}=E^2|\eta_i|=E|\eta_i|E|\eta_j|$ when $i\neq j$. Therefore, $E\frac{1}{n}|\eta_n'B_n\eta_n|=E\frac{1}{n}|\sum_{l=0}^\infty a_l\eta_n'A_{nl}\eta_n|\le\sum_{l=0}^\infty\|a_lA_{nl}\|_1E\frac{1}{n}|\eta_n'A_{nl}^*\eta_n|\le\sum_{l=0}^\infty c_0(l+c_1)^{s_1}c^l\,c^*(l+s)^d\mu_{12}^+=O(1)$.

To show $\frac{1}{n}[\eta_n'B_n\eta_n-E(\eta_n'B_n\eta_n)]=o_p(1)$, decompose $\eta_n'B_n\eta_n=\sum_{l=0}^{L-1}a_l\eta_n'A_{nl}\eta_n+\sum_{l=L}^\infty a_l\eta_n'A_{nl}\eta_n$ for some large $L$. For the first, finite summation, by Proposition 1 its deviation from its mean converges to zero in probability for any finite $L$. So it remains to investigate the convergence of the second, infinite summation for large enough $L$. As $E\frac{1}{n}|\eta_n'A_{nl}^*\eta_n|\le c^*(l+s)^d\mu_{12}^+$, for any positive value $M$ and a specified integer $s$,
\[
P\Big(\frac{1}{n}|\eta_n'A_{nl}^*\eta_n|\ge M(l+s)^{d+2+s}\Big)\le\frac{1}{M(l+s)^{d+2+s}}E\Big|\frac{1}{n}\eta_n'A_{nl}^*\eta_n\Big|\le\frac{c^*(l+s)^d\mu_{12}^+}{M(l+s)^{d+2+s}}=\frac{c^*\mu_{12}^+}{M(l+s)^{2+s}}.
\]
For a given small value $\epsilon>0$, there exists an $M_\epsilon$ such that $\frac{c^*\mu_{12}^+}{M_\epsilon}\sum_{l=0}^\infty\frac{1}{(l+s)^{2+s}}<\epsilon$. Then, for the joint probability of the events below,
\[
P\Big(\frac{1}{n}|\eta_n'A_{nl}^*\eta_n|<M_\epsilon(l+s)^{d+2+s},\ \forall l=1,2,\cdots\Big)
=1-P\Big(\frac{1}{n}|\eta_n'A_{nl}^*\eta_n|\ge M_\epsilon(l+s)^{d+2+s}\text{ for some }l\Big)
\ge 1-\sum_{l=0}^\infty P\Big(\frac{1}{n}|\eta_n'A_{nl}^*\eta_n|\ge M_\epsilon(l+s)^{d+2+s}\Big)
\ge 1-\frac{c^*\mu_{12}^+}{M_\epsilon}\sum_{l=0}^\infty\frac{1}{(l+s)^{2+s}}>1-\epsilon.
\]
The event $\frac{1}{n}|\eta_n'A_{nl}^*\eta_n|<M_\epsilon(l+s)^{d+2+s},\ \forall l=1,2,\cdots$, implies that
\[
\Big|\frac{1}{n}\sum_{l=L}^\infty a_l\eta_n'A_{nl}\eta_n\Big|\le\sum_{l=L}^\infty|a_l|\,\|A_{nl}\|_1\Big|\frac{1}{n}\eta_n'A_{nl}^*\eta_n\Big|\le\sum_{l=L}^\infty c_0(l+c_1)^{s_1}c^l\Big|\frac{1}{n}\eta_n'A_{nl}^*\eta_n\Big|\le c_0M_\epsilon\sum_{l=L}^\infty[l+\max(s,c_1)]^{d+2+s+s_1}c^l\le\epsilon
\]
occurs with probability greater than $1-\epsilon$ for some large $L_\epsilon$. Furthermore, $E|\frac{1}{n}\sum_{l=L}^\infty a_l\eta_n'A_{nl}\eta_n|\le\sum_{l=L}^\infty c_0(l+c_1)^{s_1}c^lE|\frac{1}{n}\eta_n'A_{nl}^*\eta_n|\le c_0c^*\mu_{12}^+\sum_{l=L}^\infty[l+\max(s,c_1)]^{d+2+s+s_1}c^l<\epsilon$ for some large $L_\epsilon$.

From the above, we have, for any arbitrarily small $\epsilon>0$,
\[
P\Big(\Big|\frac{1}{n}\sum_{l=0}^\infty a_l[\eta_n'A_{nl}\eta_n-E(\eta_n'A_{nl}\eta_n)]\Big|>3\epsilon\Big)
\le P\Big(\Big|\frac{1}{n}\sum_{l=0}^{L_\epsilon-1}a_l[\eta_n'A_{nl}\eta_n-E(\eta_n'A_{nl}\eta_n)]\Big|>\epsilon\Big)+P\Big(\Big|\frac{1}{n}\sum_{l=L_\epsilon}^\infty a_l\eta_n'A_{nl}\eta_n\Big|>\epsilon\Big)\le 2\epsilon,\ \text{as}\ n\to\infty.
\]
Hence, $|\frac{1}{n}\sum_{l=0}^\infty a_l[\eta_n'A_{nl}\eta_n-E(\eta_n'A_{nl}\eta_n)]|=o_p(1)$. Q.E.D.
Proof of Lemma 1 (the unique maximizer). By the information inequality, $\frac{1}{n}E(\ln L_n(\theta))\le\frac{1}{n}E(\ln L_n(\theta_0))$. For uniqueness, we only need to show that if $\lim_{n\to\infty}[\frac{1}{n}E(\ln L_n(\theta))-\frac{1}{n}E(\ln L_n(\theta_0))]=0$, then it must be that $\theta=\theta_0$. From (A.1), the difference of the expected log likelihood functions is
\[
\frac{1}{n}E(\ln L_n(\theta))-\frac{1}{n}E(\ln L_n(\theta_0))
= -\frac{1}{2}\ln\frac{\sigma_\xi^2}{\sigma_{\xi 0}^2}-\frac{1}{2}\ln\frac{|\Sigma_\varepsilon|}{|\Sigma_{\varepsilon 0}|}-\frac{1}{2n}\sum_{i=1}^n x_i'(\Gamma_0-\Gamma)\Sigma_\varepsilon^{-1}(\Gamma_0-\Gamma)'x_i
+\frac{1}{n}E\Big(\ln\frac{|S_n(\lambda)|}{|S_n|}\Big)-\frac{1}{2}\mathrm{tr}(\Sigma_{\varepsilon 0}^{1/2}\Sigma_\varepsilon^{-1}\Sigma_{\varepsilon 0}^{1/2}-I_p)-\frac{1}{2\sigma_\xi^2}\delta'\Sigma_{\varepsilon 0}\delta+\frac{1}{2\sigma_{\xi 0}^2}\delta_0'\Sigma_{\varepsilon 0}\delta_0
\]
\[
-\frac{\sigma_{\xi 0}^2}{2\sigma_\xi^2}\mathrm{tr}\Big[E\frac{1}{n}(S_n^{-1}-\lambda G_n)'(S_n^{-1}-\lambda G_n)\Big]+\frac{1}{2}
-\frac{1}{2\sigma_\xi^2}E\frac{1}{n}(X_n\beta_0+\varepsilon_n\delta_0)'(S_n^{-1}-\lambda G_n)'(S_n^{-1}-\lambda G_n)(X_n\beta_0+\varepsilon_n\delta_0)+\frac{1}{2\sigma_{\xi 0}^2}E\frac{1}{n}(X_n\beta_0+\varepsilon_n\delta_0)'(X_n\beta_0+\varepsilon_n\delta_0)
\]
\[
-\frac{1}{2\sigma_\xi^2}[\beta+(\Gamma_0-\Gamma)\delta]'\frac{X_n'X_n}{n}[\beta+(\Gamma_0-\Gamma)\delta]+\frac{1}{2\sigma_{\xi 0}^2}\beta_0'\frac{X_n'X_n}{n}\beta_0
+\frac{1}{\sigma_\xi^2}E\frac{1}{n}(X_n\beta_0+\varepsilon_n\delta_0)'(S_n^{-1}-\lambda G_n)'\{X_n[\beta+(\Gamma_0-\Gamma)\delta]+\varepsilon_n\delta\}-\frac{1}{\sigma_{\xi 0}^2}E\frac{1}{n}(X_n\beta_0+\varepsilon_n\delta_0)'(X_n\beta_0+\varepsilon_n\delta_0).
\]
Since $S_n^{-1}-\lambda_0G_n=I_n$, we have $S_n(\lambda)S_n^{-1}=S_n^{-1}-\lambda G_n=I_n-(\lambda-\lambda_0)G_n$ and
\[
\frac{1}{n}[E(\ln L_n(\theta))-E(\ln L_n(\theta_0))]
= -\frac{1}{2}\ln\frac{\sigma_\xi^2}{\sigma_{\xi 0}^2}-\frac{1}{2}\ln\frac{|\Sigma_\varepsilon|}{|\Sigma_{\varepsilon 0}|}+\frac{1}{n}E\Big(\ln\frac{|S_n(\lambda)|}{|S_n|}\Big)-\frac{1}{2}\mathrm{tr}(\Sigma_{\varepsilon 0}^{1/2}\Sigma_\varepsilon^{-1}\Sigma_{\varepsilon 0}^{1/2}-I_p)
\]
\[
-\frac{1}{2n}\sum_{i=1}^n x_i'(\Gamma_0-\Gamma)\Sigma_\varepsilon^{-1}(\Gamma_0-\Gamma)'x_i-\frac{1}{2n\sigma_\xi^2}E\{[(\lambda-\lambda_0)H_n-J_n]'[(\lambda-\lambda_0)H_n-J_n]\}
-\frac{\sigma_{\xi 0}^2}{2n\sigma_\xi^2}\mathrm{tr}\,E\{[I_n-(\lambda-\lambda_0)G_n]'[I_n-(\lambda-\lambda_0)G_n]\}+\frac{1}{2}
\]
\[
= -\frac{1}{2}\big[\mathrm{tr}(\Sigma_{\varepsilon 0}^{1/2}\Sigma_\varepsilon^{-1}\Sigma_{\varepsilon 0}^{1/2})-\ln|\Sigma_{\varepsilon 0}^{1/2}\Sigma_\varepsilon^{-1}\Sigma_{\varepsilon 0}^{1/2}|-p\big]
-\frac{1}{2n}\sum_{i=1}^n x_i'(\Gamma_0-\Gamma)\Sigma_\varepsilon^{-1}(\Gamma_0-\Gamma)'x_i-\frac{1}{2n\sigma_\xi^2}E\{[(\lambda-\lambda_0)H_n-J_n]'[(\lambda-\lambda_0)H_n-J_n]\}
\]
\[
-\frac{1}{2n}E\Big[\mathrm{tr}\Big(\frac{\sigma_{\xi 0}^2}{\sigma_\xi^2}[I_n-(\lambda-\lambda_0)G_n]'[I_n-(\lambda-\lambda_0)G_n]\Big)-\ln\Big|\frac{\sigma_{\xi 0}^2}{\sigma_\xi^2}[I_n-(\lambda-\lambda_0)G_n]'[I_n-(\lambda-\lambda_0)G_n]\Big|-n\Big],
\]
where $H_n=G_n(X_n\beta_0+\varepsilon_n\delta_0)$ and $J_n=X_n[\beta_0-\beta-(\Gamma_0-\Gamma)\delta]-\varepsilon_n(\delta_0-\delta)$. It is easy to check that for any $x>0$, the function $f(x)=x-\ln x-1\ge 0$ and is minimized only at $x=1$. Also, for any symmetric, positive definite real matrix $M$, the function $f(M)=\mathrm{tr}(M)-\ln|M|-m=\sum_{i=1}^m(\varphi_i-\ln\varphi_i-1)\ge 0$ and is minimized only at $M=I_m$, where $m$ is the dimension of $M$ and the $\varphi_i$'s $(i=1,\ldots,m)$ are the eigenvalues of $M$.

If $\lim_{n\to\infty}[\frac{1}{n}E(\ln L_n(\theta))-\frac{1}{n}E(\ln L_n(\theta_0))]=0$, then it must be that $\Sigma_{\varepsilon 0}^{1/2}\Sigma_\varepsilon^{-1}\Sigma_{\varepsilon 0}^{1/2}=I_p$; $(\Gamma_0-\Gamma)\Sigma_\varepsilon^{-1}(\Gamma_0-\Gamma)'=0$ because $\lim_{n\to\infty}\frac{X_n'X_n}{n}$ is nonsingular; $\lim_{n\to\infty}\frac{1}{n}E\{[(\lambda-\lambda_0)H_n-J_n]'[(\lambda-\lambda_0)H_n-J_n]\}=0$; and $\frac{\sigma_{\xi 0}^2}{\sigma_\xi^2}[I_n-(\lambda-\lambda_0)G_n]'[I_n-(\lambda-\lambda_0)G_n]=I_n$ with probability close to 1 for large enough $n$. Apparently, $\Sigma_\varepsilon=\Sigma_{\varepsilon 0}$ and $\Gamma=\Gamma_0$, so $\lim_{n\to\infty}\frac{1}{n}E\{[(\lambda-\lambda_0)G_n(X_n\beta_0+\varepsilon_n\delta_0)-X_n(\beta_0-\beta)+\varepsilon_n(\delta_0-\delta)]'[(\lambda-\lambda_0)G_n(X_n\beta_0+\varepsilon_n\delta_0)-X_n(\beta_0-\beta)+\varepsilon_n(\delta_0-\delta)]\}=0$. As the outer product of the mean vector is bounded by the second moment matrix, this implies $\lim_{n\to\infty}\frac{1}{n}[(\lambda-\lambda_0)E(G_nX_n\beta_0+G_n\varepsilon_n\delta_0)-X_n(\beta_0-\beta)][(\lambda-\lambda_0)E(G_nX_n\beta_0+G_n\varepsilon_n\delta_0)-X_n(\beta_0-\beta)]'=0$.

Under Assumption 4, it must be that $\lambda=\lambda_0$ and $\beta=\beta_0$. Then $\frac{\sigma_{\xi 0}^2}{\sigma_\xi^2}=1$ and $(\delta_0-\delta)'\Sigma_{\varepsilon 0}(\delta_0-\delta)=0$, so it must be that $\delta=\delta_0$, i.e., $\sigma_{v\varepsilon}=\sigma_{v\varepsilon 0}$. These give us $\theta=\theta_0$ if $\lim_{n\to\infty}\frac{1}{n}[E(\ln L_n(\theta))-E(\ln L_n(\theta_0))]=0$. Therefore, we can conclude that $\theta_0$ is the unique maximizer of $\lim_{n\to\infty}\frac{1}{n}E(\ln L_n(\theta))$. Q.E.D.
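The proof of Lemma 1 leans on the scalar fact $f(x)=x-\ln x-1\ge 0$ and its matrix analogue $f(M)=\mathrm{tr}(M)-\ln|M|-m\ge 0$, with equality only at $M=I_m$. A small numerical sketch (with assumed random symmetric positive definite matrices, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
m = 4

def f(M):
    """f(M) = tr(M) - ln|M| - m for symmetric positive definite M."""
    sign, logdet = np.linalg.slogdet(M)
    assert sign > 0
    return np.trace(M) - logdet - M.shape[0]

# f(M) >= 0 for random SPD matrices, with equality at the identity
for _ in range(100):
    B = rng.standard_normal((m, m))
    M = B @ B.T + 0.1 * np.eye(m)  # SPD by construction
    assert f(M) >= -1e-10
assert abs(f(np.eye(m))) < 1e-12
print("trace-logdet inequality verified")
```

The inequality follows from the eigenvalue decomposition, since $f(M)=\sum_i(\varphi_i-\ln\varphi_i-1)$ and each summand is nonnegative.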
Proof of Theorem 1. With the compact parameter space, once we have that $\theta_0$ is the unique maximizer, we need to check two conditions for the consistency of the MLE.

Step 1: Uniform convergence of the log likelihood function. All terms in the log likelihood function can be expressed in our general terms with $M_n$, so the pointwise convergence is straightforward:
\[
\frac{1}{n}\ln L_n(\theta|Y_n,Z_n)-E\Big[\frac{1}{n}\ln L_n(\theta|Y_n,Z_n)\Big]\xrightarrow{p}0.
\]
All parameters are bounded in their parameter spaces and they enter the log likelihood function polynomially, except for the term $\ln|S_n(\lambda)|$, so we need to show the stochastic equicontinuity of $\frac{1}{n}\ln|S_n(\lambda)|$ to obtain the uniform convergence. Applying the mean value theorem,
\[
\Big|\frac{1}{n}(\ln|S_n(\lambda_1)|-\ln|S_n(\lambda_2)|)\Big|=\Big|(\lambda_2-\lambda_1)\frac{1}{n}\mathrm{tr}(G_n(\bar\lambda))\Big|\le|\lambda_2-\lambda_1|C_{se},\tag{A.2}
\]
where $\bar\lambda$ is a value between $\lambda_1$ and $\lambda_2$ and $C_{se}$ is a constant not depending on $n$. This inequality is implied by the fact that $G_n(\lambda)$ is bounded in both row and column sums uniformly in our compact parameter space under Assumption 2.3). From this, we can conclude the uniform convergence
\[
\sup_{\theta\in\Theta}\Big|\frac{1}{n}\ln L_n(\theta|Y_n,Z_n)-E\Big[\frac{1}{n}\ln L_n(\theta|Y_n,Z_n)\Big]\Big|\xrightarrow{p}0.
\]
Step 2: Uniform equicontinuity of the limiting sample average of the expected log likelihood function $\lim_{n\to\infty}E[\frac{1}{n}\ln L_n(\theta|Y_n,Z_n)]$. From (A.1), with the inequality in (A.2), the variance parameters being bounded away from zero in compact parameter spaces, and the earlier results that $E(\frac{1}{n}|\eta_n'B_n\eta_n|)=O(1)$ and $E(\frac{1}{n}|X_n'B_n\eta_n|)=O(1)$, we can conclude that $E[\frac{1}{n}\ln L_n(\theta|Y_n,Z_n)]$ is uniformly equicontinuous in $\theta\in\Theta$.

From Lemma 1, $\theta_0$ is the unique maximizer of $\lim_{n\to\infty}E[\frac{1}{n}\ln L_n(\theta|Y_n,Z_n)]$. These together imply $\hat\theta\xrightarrow{p}\theta_0$. Q.E.D.
Proof of Lemma 2 (the CLT). We prove the CLT for the form
\[
R_n=\frac{1}{\sqrt n}\big[A_n'\zeta_n+B_n'\eta_n+\eta_n'C_n\eta_n+\zeta_n'D_n\zeta_n+\eta_n'E_n\zeta_n-E(\eta_n'C_n\eta_n|\zeta_n)-E(\zeta_n'D_n\zeta_n)\big].
\]
The entries $\zeta_i$ of $\zeta_n$ are i.i.d. with mean zero and variance $\sigma_\zeta^2$, and $\eta_n$ is independent of $B_n$, $C_n$, and $E_n$,⁵ so
\[
R_n=\frac{1}{\sqrt n}\sum_{i=1}^n\Big(a_i\zeta_i+b_i\eta_i+\eta_i\sum_j^n c_{ij}\eta_j-E\Big(\eta_i\sum_j^n c_{ij}\eta_j\Big|\zeta_n\Big)+\zeta_i\sum_j^n d_{ij}\zeta_j-E\Big(\zeta_i\sum_j^n d_{ij}\zeta_j\Big)+\eta_i\sum_j^n e_{ij}\zeta_j\Big)
\]
\[
=\frac{1}{\sqrt n}\sum_{i=1}^n\Big(a_i\zeta_i+b_i\eta_i+2\eta_i\sum_{j=1}^{i-1}c_{ij}\eta_j+c_{ii}(\eta_i^2-\sigma_\eta^2)+2\zeta_i\sum_{j=1}^{i-1}d_{ij}\zeta_j+d_{ii}(\zeta_i^2-\sigma_\zeta^2)+\eta_i\sum_j^n e_{ij}\zeta_j\Big),
\]
⁵ Here we treat $C_n$ and $D_n$ as symmetric. If not, we can always replace $C_n$ with $(C_n+C_n')/2$ (and similarly for $D_n$).
Let $R_n=\sum_{l=1}^{2n}R_{n,l}$; then we can define $R_{n,l}$ and the corresponding $\sigma$-fields $\mathcal F_{n,l}$ with the following expressions:
\[
R_{n,1}=\frac{1}{\sqrt n}[a_1\zeta_1+d_{11}(\zeta_1^2-\sigma_\zeta^2)],\quad \mathcal F_{n,1}=\langle\zeta_1\rangle;
\]
\[
R_{n,2}=\frac{1}{\sqrt n}[a_2\zeta_2+2\zeta_2d_{21}\zeta_1+d_{22}(\zeta_2^2-\sigma_\zeta^2)],\quad \mathcal F_{n,2}=\langle\zeta_1,\zeta_2\rangle;
\]
\[
\vdots
\]
\[
R_{n,n}=\frac{1}{\sqrt n}\Big[a_n\zeta_n+2\zeta_n\sum_{j=1}^{n-1}d_{nj}\zeta_j+d_{nn}(\zeta_n^2-\sigma_\zeta^2)\Big],\quad \mathcal F_{n,n}=\langle\zeta_1,\cdots,\zeta_n\rangle;
\]
\[
R_{n,n+1}=\frac{1}{\sqrt n}\Big[b_1\eta_1+\eta_1\sum_j^n e_{1j}\zeta_j+c_{11}(\eta_1^2-\sigma_\eta^2)\Big],\quad \mathcal F_{n,n+1}=\langle\zeta_1,\cdots,\zeta_n,\eta_1\rangle;
\]
\[
R_{n,n+2}=\frac{1}{\sqrt n}\Big[b_2\eta_2+\eta_2\sum_j^n e_{2j}\zeta_j+2\eta_2c_{21}\eta_1+c_{22}(\eta_2^2-\sigma_\eta^2)\Big],\quad \mathcal F_{n,n+2}=\langle\zeta_1,\cdots,\zeta_n,\eta_1,\eta_2\rangle;
\]
\[
\vdots
\]
\[
R_{n,2n}=\frac{1}{\sqrt n}\Big[b_n\eta_n+\eta_n\sum_j^n e_{nj}\zeta_j+2\eta_n\sum_{j=1}^{n-1}c_{nj}\eta_j+c_{nn}(\eta_n^2-\sigma_\eta^2)\Big],\quad \mathcal F_{n,2n}=\langle\zeta_1,\cdots,\zeta_n,\eta_1,\cdots,\eta_n\rangle.
\]
Because the $\zeta$'s and $\eta$'s are mutually independent, $E(R_{n,l}|\mathcal F_{n,l-1})=0$ for $l=2,\ldots,2n$. Then $\{(R_{n,l},\mathcal F_{n,l}):1\le l\le 2n\}$ forms a martingale difference double array. We have $Var(\sum_{l=1}^{2n}R_{n,l})=\sum_{l=1}^{2n}ER_{n,l}^2$ as the $R_{n,l}$'s are martingale differences. Define this variance as $\sigma_{R_n}^2$ and the normalized variables $R_{n,l}^*=R_{n,l}/\sigma_{R_n}$. In order for the martingale central limit theorem to be applicable, we shall show that there exists a $\delta_R>0$ such that
\[
\sum_{l=1}^{2n}E|R_{n,l}^*|^{2+\delta_R}\to 0\ \text{as}\ n\to\infty,\qquad
\sum_{l=1}^{2n}E(R_{n,l}^{*2}|\mathcal F_{n,l-1})-1\xrightarrow{p}0.
\]
Denote by $\sigma_{R_1,n}^2$ and $\sigma_{R_2,n}^2$ the variances of $R_{1,n}^d$ and $R_{2,n}^d$, where
\[
R_{1,n}^d=\frac{1}{\sqrt n}\big[A_n'\zeta_n+\zeta_n'D_n\zeta_n-E(\zeta_n'D_n\zeta_n)\big]\quad\text{and}\quad
R_{2,n}^d=\frac{1}{\sqrt n}\big[B_n'\eta_n+\eta_n'C_n\eta_n+\eta_n'E_n\zeta_n-E(\eta_n'C_n\eta_n|\zeta_n)\big];
\]
then $R_n=R_{1,n}^d+R_{2,n}^d$ and $\sigma_{R_n}^2=\sigma_{R_1,n}^2+\sigma_{R_2,n}^2$. The expressions are
\[
\sigma_{R_1,n}^2=\frac{\sigma_\zeta^2}{n}A_n'A_n+\frac{\sigma_\zeta^4}{n}\mathrm{tr}(D_n^2+D_nD_n')+[E(\zeta_i^4)-3\sigma_\zeta^4]\frac{1}{n}\sum_{i=1}^n d_{ii}^2+\frac{2}{n}E(\zeta_i^3)\sum_{i=1}^n a_id_{ii}
\]
and
\[
\sigma_{R_2,n}^2=\frac{\sigma_\eta^2}{n}E(B_n'B_n)+\frac{\sigma_\eta^4}{n}\mathrm{tr}[E(C_n^2+C_nC_n')]+\frac{\sigma_\eta^2}{n}E(\zeta_n'E_n'E_n\zeta_n)+\frac{2\sigma_\eta^2}{n}E(B_n'E_n\zeta_n)
+[E(\eta_i^4)-3\sigma_\eta^4]\frac{1}{n}\sum_{i=1}^nE(c_{ii}^2)+\frac{2}{n}E(\eta_i^3)\sum_{i=1}^nE(b_ic_{ii})+\frac{2}{n}E(\eta_i^3)\sum_{i=1}^nE\Big(c_{ii}\sum_j^n e_{ij}\zeta_j\Big).
\]
As $\sigma_\zeta^2\sum_{i=1}^na_i^2+Var(\zeta_i^2)\sum_{i=1}^nd_{ii}^2\ge 2\sqrt{\sigma_\zeta^2Var(\zeta_i^2)}\sum_{i=1}^n|a_id_{ii}|\ge 2|E(\zeta_i^3)|\sum_{i=1}^n|a_id_{ii}|$, we have
\[
\sigma_{R_1,n}^2=\frac{\sigma_\zeta^2}{n}\sum_{i=1}^na_i^2+\frac{\sigma_\zeta^4}{n}\sum_i\sum_{j\neq i}(d_{ij}^2+d_{ji}^2)+\frac{Var(\zeta_i^2)}{n}\sum_{i=1}^nd_{ii}^2+\frac{2}{n}E(\zeta_i^3)\sum_{i=1}^na_id_{ii}\ge\frac{\sigma_\zeta^4}{n}\sum_i\sum_{j\neq i}(d_{ij}^2+d_{ji}^2).
\]
So a sufficient condition for $\sigma_{R_1,n}^2$ to be bounded away from zero is that $\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^n\sum_{j\neq i}^n(d_{ij}^2+d_{ji}^2)>0$. Similarly, a sufficient condition for $\sigma_{R_2,n}^2$ to be bounded away from zero is that $\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^n\sum_{j\neq i}^nE(c_{ij}^2+c_{ji}^2)>0$. Therefore, a sufficient condition for $\sigma_{R_n}^2>0$ is that $\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^n\sum_{j\neq i}^n(d_{ij}^2+d_{ji}^2)>0$ or $\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^n\sum_{j\neq i}^nE(c_{ij}^2+c_{ji}^2)>0$.

From Propositions 1–2 and the corollary, we know that $E(B_n'B_n)$, $\mathrm{tr}[E(C_n^2+C_nC_n')]$, $E(\zeta_n'E_n'E_n\zeta_n)$, and $E(B_n'E_n\zeta_n)$ are $O(n)$. Therefore, $\sigma_{R_1,n}^2$, $\sigma_{R_2,n}^2$, and $\sigma_{R_n}^2$ are $O(1)$. For $l=1,\cdots,n$ and $i=l$, $R_{n,l}=\frac{1}{\sqrt n}(a_i\zeta_i+2\zeta_i\sum_{j=1}^{i-1}d_{ij}\zeta_j+d_{ii}(\zeta_i^2-\sigma_\zeta^2))$. Since $R_{n,l}/\sigma_{R_1,n}$ is a conventional linear-quadratic form, the two conditions for the summation of the first $n$ terms are easy to check. We have $\sum_{l=1}^nE|R_{n,l}^*|^{2+\delta_R}\to 0$ as $n\to\infty$ and $\sum_{l=1}^nE(R_{n,l}^2|\mathcal F_{n,l-1})/\sigma_{R_1,n}^2-1\xrightarrow{p}0$.
For $l=n+1,\cdots,2n$, $R_{n,l}=\frac{1}{\sqrt n}[b_i\eta_i+2\eta_i\sum_{j=1}^{i-1}c_{ij}\eta_j+c_{ii}(\eta_i^2-\sigma_\eta^2)+\eta_i\sum_j^ne_{ij}\zeta_j]$, where $i=l-n$. Since $\eta_i$ is independent of $\mathcal F_{n,l-1}$, evaluating the conditional expectation over $\eta_i$ gives
\[
E(R_{n,l}^2|\mathcal F_{n,l-1})=\frac{1}{n}\Big[b_i^2\sigma_\eta^2+4\sigma_\eta^2\sum_{j=1}^{i-1}c_{ij}\eta_j\sum_{k=1}^{i-1}c_{ik}\eta_k+c_{ii}^2Var(\eta_i^2)+\sigma_\eta^2\sum_{j=1}^ne_{ij}\zeta_j\sum_{k=1}^ne_{ik}\zeta_k+4b_i\sigma_\eta^2\sum_{j=1}^{i-1}c_{ij}\eta_j
\]
\[
+2b_ic_{ii}E(\eta_i^3)+2b_i\sigma_\eta^2\sum_j^ne_{ij}\zeta_j+4c_{ii}E(\eta_i^3)\sum_{j=1}^{i-1}c_{ij}\eta_j+4\sigma_\eta^2\sum_{j=1}^{i-1}c_{ij}\eta_j\sum_{k=1}^ne_{ik}\zeta_k+2c_{ii}E(\eta_i^3)\sum_j^ne_{ij}\zeta_j\Big],
\]
\[
E(R_{n,l}^2)=\frac{1}{n}\Big[E(b_i^2)\sigma_\eta^2+4\sigma_\eta^4\sum_{j=1}^{i-1}E(c_{ij}^2)+E(c_{ii}^2)Var(\eta_i^2)+\sigma_\eta^2\sum_{j=1}^n\sum_{k=1}^nE(e_{ij}e_{ik}\zeta_j\zeta_k)
+2E(b_ic_{ii})E(\eta_i^3)+2\sigma_\eta^2\sum_j^nE(b_ie_{ij}\zeta_j)+2E(\eta_i^3)\sum_j^nE(c_{ii}e_{ij}\zeta_j)\Big].
\]
Therefore, for $l=n+1,\cdots,2n$ and $i=l-n$,
\[
E(R_{n,l}^2|\mathcal F_{n,l-1})-E(R_{n,l}^2)=\frac{1}{n}\big\{\sigma_\eta^2[b_i^2-E(b_i^2)]+Var(\eta_i^2)[c_{ii}^2-E(c_{ii}^2)]+2E(\eta_i^3)[b_ic_{ii}-E(b_ic_{ii})]\big\}
\]
\[
+\frac{4}{n}\sum_{j=1}^{i-1}\Big[\sigma_\eta^2\Big(c_{ij}\eta_j\sum_{k=1}^{i-1}c_{ik}\eta_k-\sigma_\eta^2E(c_{ij}^2)\Big)+b_i\sigma_\eta^2c_{ij}\eta_j+c_{ii}E(\eta_i^3)c_{ij}\eta_j+\sigma_\eta^2c_{ij}\eta_j\sum_{k=1}^ne_{ik}\zeta_k\Big]
\]
\[
+\frac{1}{n}\sum_{j=1}^n\Big[\sigma_\eta^2\sum_{k=1}^n\big(e_{ij}\zeta_je_{ik}\zeta_k-E(e_{ij}\zeta_je_{ik}\zeta_k)\big)+2\sigma_\eta^2\big(b_ie_{ij}\zeta_j-E(b_ie_{ij}\zeta_j)\big)+2E(\eta_i^3)\big(c_{ii}e_{ij}\zeta_j-E(c_{ii}e_{ij}\zeta_j)\big)\Big],
\]
and hence,
\[
\sum_{l=n+1}^{2n}\big[E(R_{n,l}^2|\mathcal F_{n,l-1})-E(R_{n,l}^2)\big]=H_{1n}+H_{2n}+H_{3n}+H_{4n}+H_{5n}+H_{6n}+H_{7n}+H_{8n}+H_{9n}+H_{10n},
\]
where
\[
H_{1n}=\frac{\sigma_\eta^2}{n}\sum_{i=1}^n[b_i^2-E(b_i^2)],\quad H_{2n}=\frac{Var(\eta_i^2)}{n}\sum_{i=1}^n[c_{ii}^2-E(c_{ii}^2)],\quad H_{3n}=\frac{2E(\eta_i^3)}{n}\sum_{i=1}^n[b_ic_{ii}-E(b_ic_{ii})],
\]
\[
H_{4n}=\frac{4\sigma_\eta^2}{n}\sum_{i=1}^n\sum_{j=1}^{i-1}\Big(c_{ij}\eta_j\sum_{k=1}^{i-1}c_{ik}\eta_k-\sigma_\eta^2E(c_{ij}^2)\Big),\quad H_{5n}=\frac{4\sigma_\eta^2}{n}\sum_{i=1}^n\sum_{j=1}^{i-1}b_ic_{ij}\eta_j,\quad H_{6n}=\frac{4E(\eta_i^3)}{n}\sum_{i=1}^n\sum_{j=1}^{i-1}c_{ii}c_{ij}\eta_j,
\]
\[
H_{7n}=\frac{4\sigma_\eta^2}{n}\sum_{i=1}^n\sum_{j=1}^{i-1}c_{ij}\eta_j\sum_{k=1}^ne_{ik}\zeta_k,\quad H_{8n}=\frac{2\sigma_\eta^2}{n}\sum_{i=1}^n\sum_{j=1}^n\big(b_ie_{ij}\zeta_j-E(b_ie_{ij}\zeta_j)\big),
\]
\[
H_{9n}=\frac{2E(\eta_i^3)}{n}\sum_{i=1}^n\sum_{j=1}^n\big(c_{ii}e_{ij}\zeta_j-E(c_{ii}e_{ij}\zeta_j)\big),\quad H_{10n}=\frac{\sigma_\eta^2}{n}\sum_{i=1}^n\sum_{j=1}^n\sum_{k=1}^n\big(e_{ij}\zeta_je_{ik}\zeta_k-E(e_{ij}\zeta_je_{ik}\zeta_k)\big).
\]
We need to show that $H_{jn}=o_p(1)$ for all $j=1,\ldots,10$. As $H_{1n}=\frac{\sigma_\eta^2}{n}\sum_{i=1}^n[b_i^2-E(b_i^2)]=\frac{\sigma_\eta^2}{n}[B_n'B_n-E(B_n'B_n)]$ and $H_{10n}=\frac{\sigma_\eta^2}{n}\sum_{i=1}^n\sum_{j=1}^n\sum_{k=1}^n(e_{ij}\zeta_je_{ik}\zeta_k-E(e_{ij}\zeta_je_{ik}\zeta_k))=\frac{\sigma_\eta^2}{n}[\zeta_n'E_n'E_n\zeta_n-E(\zeta_n'E_n'E_n\zeta_n)]$, if $B_n$, $C_n$, and $E_n$ satisfy the conditions in Assumption 5, then from Propositions 1–2 and Corollary 1, $H_{1n}=o_p(1)$ and $H_{10n}=o_p(1)$. Similar arguments can be applied to $H_{jn}$ for $j=2,\ldots,9$. These imply that $\sum_{l=n+1}^{2n}E(R_{n,l}^2|\mathcal F_{n,l-1})/\sigma_{R_2,n}^2-1\xrightarrow{p}0$. Together with $\sum_{l=1}^nE(R_{n,l}^2|\mathcal F_{n,l-1})/\sigma_{R_1,n}^2-1\xrightarrow{p}0$, we have
\[
\sum_{l=1}^{2n}E(R_{n,l}^2|\mathcal F_{n,l-1})-(\sigma_{R_1,n}^2+\sigma_{R_2,n}^2)\xrightarrow{p}0,\ \text{i.e.,}\ \sum_{l=1}^{2n}E(R_{n,l}^{*2}|\mathcal F_{n,l-1})-1\xrightarrow{p}0.
\]
The next step is to show $\sum_{l=n+1}^{2n}E|R_{n,l}|^{2+\delta_R}\to 0$ as $n\to\infty$. For $l=n+1,\cdots,2n$, let $i=l-n$; then for any positive constants $p$ and $q$ such that $1/p+1/q=1$, we have
\[
|R_{n,l}|=\frac{1}{\sqrt n}\Big|b_i\eta_i+2\eta_i\sum_{j=1}^{i-1}c_{ij}\eta_j+c_{ii}(\eta_i^2-\sigma_\eta^2)+\eta_i\sum_j^ne_{ij}\zeta_j\Big|
\le\frac{1}{\sqrt n}\Big(|b_i||\eta_i|+2|\eta_i|\sum_{j=1}^{i-1}|c_{ij}||\eta_j|+|c_{ii}||\eta_i^2-\sigma_\eta^2|+|\eta_i|\sum_j^n|e_{ij}||\zeta_j|\Big)
\]
\[
=\frac{1}{\sqrt n}\Big(|b_i|^{\frac1p}|b_i|^{\frac1q}|\eta_i|+2|\eta_i|\sum_{j=1}^{i-1}|c_{ij}|^{\frac1p}|c_{ij}|^{\frac1q}|\eta_j|+|c_{ii}|^{\frac1p}|c_{ii}|^{\frac1q}|\eta_i^2-\sigma_\eta^2|+|\eta_i|\sum_j^n|e_{ij}|^{\frac1p}|e_{ij}|^{\frac1q}|\zeta_j|\Big).
\]
By Hölder's inequality, we can get
\[
|R_{n,l}|^q\le n^{-q/2}\Big[\Big(|b_i|+\sum_{j=1}^i|c_{ij}|+\sum_j^n|e_{ij}|\Big)^{\frac1p}\Big((|b_i|^{\frac1q}|\eta_i|)^q+\sum_{j=1}^{i-1}(2|\eta_i||c_{ij}|^{\frac1q}|\eta_j|)^q+(|c_{ii}|^{\frac1q}|\eta_i^2-\sigma_\eta^2|)^q+\sum_j^n(|\eta_i||e_{ij}|^{\frac1q}|\zeta_j|)^q\Big)^{\frac1q}\Big]^q
\]
\[
=n^{-\frac q2}\Big(|b_i|+\sum_{j=1}^i|c_{ij}|+\sum_j^n|e_{ij}|\Big)^{\frac qp}\Big(|b_i||\eta_i^q|+2^q\sum_{j=1}^{i-1}|c_{ij}||\eta_i^q||\eta_j^q|+|c_{ii}||\eta_i^2-\sigma_\eta^2|^q+|\eta_i^q|\sum_j^n|e_{ij}||\zeta_j^q|\Big).
\]
As $C_n$ and $E_n$ are uniformly bounded in both row and column sum norms and the elements of $B_n$ are uniformly bounded, there exists a constant $c_R$ such that $|b_i|\le c_R/3$, $\sum_{j=1}^i|c_{ij}|\le c_R/3$, and $\sum_j^n|e_{ij}|\le c_R/3$ for all $i$ and $n$. Hence,
\[
|R_{n,l}|^q\le n^{-\frac q2}c_R^{q/p}\Big(|b_i||\eta_i^q|+2^q\sum_{j=1}^{i-1}|c_{ij}||\eta_i^q||\eta_j^q|+|c_{ii}||\eta_i^2-\sigma_\eta^2|^q+\sum_j^n|e_{ij}||\eta_i^q||\zeta_j^q|\Big).
\]
There exists a finite constant $c_q>1$ such that $E|\zeta_i^q|\le c_q$ and $E|\eta_i^2-\sigma_\eta^2|^q\le c_q$ for all $i$ and $n$. Taking $q=2+\delta_R$, as $E|\eta_i^{2(2+\delta_R)}|$ exists and $\sum_j^n(|e_{ij}||\eta_i^q|)\le c_G|\eta_i^q|$ where $c_G=\|E_n\|_\infty<\infty$, it follows that
\[
\sum_{l=n+1}^{2n}E|R_{n,l}|^{2+\delta_R}\le 2^{2+\delta_R}n^{-\frac{2+\delta_R}{2}}c_R^{(2+\delta_R)/p}c_q^2\sum_{i=1}^n\Big(E|b_i|+\sum_{j=1}^iE|c_{ij}|+E|c_{ii}|+\sum_j^nE(|e_{ij}||\zeta_j^{2+\delta_R}|)\Big)=O(n^{-\frac{\delta_R}{2}}).
\]
As $\sigma_{R_n}^2=O(1)$ and is bounded away from zero,
\[
\sum_{l=1}^{2n}E|R_{n,l}^*|^{2+\delta_R}=\frac{1}{\sigma_{R_n}^{2+\delta_R}}\sum_{l=1}^{2n}E|R_{n,l}|^{2+\delta_R}=O(n^{-\frac{\delta_R}{2}}),
\]
which goes to zero as $n$ tends to infinity. The conditions for the martingale CLT are thus satisfied, so $R_n/\sigma_{R_n}\xrightarrow{d}N(0,1)$. Q.E.D.
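The variance expression for $\sigma_{R_1,n}^2$ used in the proof can be checked by simulation. The following is an illustrative sketch with assumed simple ingredients (not the paper's setup): $R_{1,n}^d=n^{-1/2}[A_n'\zeta_n+\zeta_n'D_n\zeta_n-E(\zeta_n'D_n\zeta_n)]$ with i.i.d. $\zeta_i$ drawn as centered exponentials, so that the third- and fourth-moment corrections in the formula are nonzero.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 100, 20000

# Symmetric D_n and linear coefficients a, as in R^d_{1,n}
D = rng.uniform(-0.5, 0.5, (n, n))
D = (D + D.T) / 2
a = rng.uniform(-1, 1, n)

# zeta_i i.i.d. with mean 0, variance 1, E(zeta^3) = 2, E(zeta^4) = 9
zeta = rng.exponential(1.0, (reps, n)) - 1.0

q = zeta @ a + ((zeta @ D) * zeta).sum(axis=1)   # a'zeta + zeta'D zeta
R = (q - np.trace(D)) / np.sqrt(n)               # E(zeta'D zeta) = tr(D)

dii = np.diag(D)
# sigma^2_{R1,n} = sig^2/n a'a + sig^4/n tr(D^2+DD') + [E(z^4)-3 sig^4]/n sum d_ii^2
#                  + 2/n E(z^3) sum a_i d_ii, here with sig^2 = 1
var_formula = (a @ a + np.trace(D @ D + D @ D.T)
               + (9 - 3) * (dii @ dii) + 2 * 2 * (a @ dii)) / n
var_mc = (R ** 2).mean()
assert abs(var_mc - var_formula) / var_formula < 0.1
print(var_mc, var_formula)
```

The Monte Carlo variance matches the closed-form expression to within sampling error; the specific dimensions, distributions, and tolerance are assumptions of the sketch.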
Proof of Lemma 3 (the nonsingular information matrix). The information matrix is $I_\theta=-\lim_{n\to\infty}E\big(\frac{1}{n}\frac{\partial^2\ln L_n(\theta_0)}{\partial\theta\partial\theta'}\big)$, where the expression of $E\big(\frac{\partial^2\ln L_n(\theta_0)}{\partial\theta\partial\theta'}\big)$ can be found in Appendix A.3. Let $C_I=(c_1,c_2',\mathrm{vec}(c_3)',c_4,c_5',c_6')'$ be a $k_\theta$-dimensional column vector of constants such that $I_{\theta_0}C_I=0$, where $c_1$ and $c_4$ are scalars; $c_2$, $c_5$, and $c_6$ are column vectors of dimensions $k$, $J$, and $p$; and $c_3$ is a $k\times p$ matrix. To show that the information matrix is nonsingular, it is sufficient to show that $C_I=0$.

From the second row block of the linear equation system $I_{\theta_0}C_I=0$, we have
\[
\lim_{n\to\infty}\frac{1}{n}[-c_1X_n'E(G_nX_n\beta_0+G_n\varepsilon_n\delta_0)-X_n'X_nc_2+(\delta_0'\otimes(X_n'X_n))\mathrm{vec}(c_3)]=0.
\]
Therefore,
\[
\lim_{n\to\infty}\frac{1}{n}X_n'X_nc_2=\lim_{n\to\infty}\frac{1}{n}[(\delta_0'\otimes(X_n'X_n))\mathrm{vec}(c_3)-c_1X_n'E(G_nX_n\beta_0+G_n\varepsilon_n\delta_0)].
\]
From the third row block, we have
\[
\lim_{n\to\infty}\frac{1}{n}[c_1(\delta_0\otimes X_n')E(G_nX_n\beta_0+G_n\varepsilon_n\delta_0)+(\delta_0\otimes(X_n'X_n))c_2-((\Sigma_{\varepsilon 0}^{-1}\sigma_{v0}^2)\otimes(X_n'X_n))\mathrm{vec}(c_3)]=0.
\]
Since $(\delta_0\otimes(X_n'X_n))c_2=\delta_0\otimes(X_n'X_nc_2)$, by eliminating $\lim_{n\to\infty}\frac{1}{n}X_n'X_nc_2$, we have
\[
\mathrm{vec}\Big(\lim_{n\to\infty}\frac{X_n'X_n}{n}c_3(\delta_0\delta_0'-\sigma_{v0}^2\Sigma_{\varepsilon 0}^{-1})\Big)=\mathrm{vec}\Big(\lim_{n\to\infty}\frac{X_n'X_n}{n}c_3\Sigma_{\varepsilon 0}^{-1}(\sigma_{v\varepsilon}\sigma_{v\varepsilon}'-\sigma_{v0}^2\Sigma_{\varepsilon 0})\Sigma_{\varepsilon 0}^{-1}\Big)=0.
\]
Under Assumption 2.5), $\sigma_{v\varepsilon}\sigma_{v\varepsilon}'-\sigma_{v0}^2\Sigma_{\varepsilon 0}$ is negative definite by the Cauchy–Schwarz inequality, and Assumption 2.3) implies that $\lim_{n\to\infty}\frac{X_n'X_n}{n}$ is positive definite. Hence, it follows that $c_3=0$. Now, with $c_3=0$,
\[
c_2=-c_1\lim_{n\to\infty}(X_n'X_n)^{-1}X_n'E(G_nX_n\beta_0+G_n\varepsilon_n\delta_0).
\]
From the fourth row block, we have
\[
\lim_{n\to\infty}\frac{1}{n}E[-c_1\mathrm{tr}(G_n)]-\frac{c_4}{2\sigma_{\xi 0}^2}-\sum_{j=1}^J\frac{c_{5j}}{2\sigma_{\xi 0}^2}\delta_0'\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\delta_0+\frac{\delta_0'}{\sigma_{\xi 0}^2}c_6=0.\tag{B.1}
\]
From the sixth row block, we have
\[
\lim_{n\to\infty}\frac{1}{n}c_1\{2\delta_0E[\mathrm{tr}(G_n)]-\Sigma_{\varepsilon 0}^{-1}E[\varepsilon_n'G_n(X_n\beta_0+\varepsilon_n\delta_0)]\}+c_4\frac{\delta_0}{\sigma_{\xi 0}^2}+\sum_{j=1}^J\frac{c_{5j}}{\sigma_{\xi 0}^2}\Sigma_{\varepsilon 0}^{-1}\sigma_{v0}^2\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\delta_0-\Big(\Sigma_{\varepsilon 0}^{-1}+\frac{2\delta_0\delta_0'}{\sigma_{\xi 0}^2}\Big)c_6=0.\tag{B.2}
\]
Then (B.1)$\times 2\delta_0$+(B.2) gives
\[
\lim_{n\to\infty}\frac{c_1}{n}\Sigma_{\varepsilon 0}^{-1}E[\varepsilon_n'G_n(X_n\beta_0+\varepsilon_n\delta_0)]-\sum_{j=1}^Jc_{5j}\Sigma_{\varepsilon 0}^{-1}\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\delta_0+\Sigma_{\varepsilon 0}^{-1}c_6=0.
\]
Therefore,
\[
c_6=\sum_{j=1}^Jc_{5j}\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\delta_0-c_1\lim_{n\to\infty}\frac{1}{n}E[\varepsilon_n'G_n(X_n\beta_0+\varepsilon_n\delta_0)].
\]
From the fifth row block, for any $k=1,\ldots,J$, we have
\[
\lim_{n\to\infty}\frac{1}{n}c_1\Big\{\delta_0'\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_k}\Sigma_{\varepsilon 0}^{-1}E[\varepsilon_n'G_n(X_n\beta_0+\varepsilon_n\delta_0)-\Sigma_{\varepsilon 0}\delta_0\mathrm{tr}(G_n)]\Big\}-\frac{c_4}{2\sigma_{\xi 0}^2}\delta_0'\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_k}\delta_0+\frac{\sigma_{v0}^2}{\sigma_{\xi 0}^2}\delta_0'\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_k}\Sigma_{\varepsilon 0}^{-1}c_6\tag{B.3}
\]
\[
-\sum_{j=1}^Jc_{5j}\Big[\frac{\sigma_{\xi 0}^2}{2}\mathrm{tr}\Big(\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\Sigma_{\varepsilon 0}^{-1}\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_k}\Sigma_{\varepsilon 0}^{-1}\Big)+\delta_0'\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_k}\Sigma_{\varepsilon 0}^{-1}\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\delta_0+\frac{1}{2\sigma_{\xi 0}^2}\delta_0'\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_k}\delta_0\,\delta_0'\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\delta_0\Big]=0.
\]
Plugging $c_6$ into (B.1), (B.1) becomes
\[
-c_1\lim_{n\to\infty}\frac{E\mathrm{tr}(G_n)}{n}-\frac{c_4}{2\sigma_{\xi 0}^2}+\frac{\delta_0'}{2\sigma_{\xi 0}^2}\Big(\sum_{j=1}^Jc_{5j}\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\delta_0-2c_1\lim_{n\to\infty}\frac{1}{n}E[\varepsilon_n'G_n(X_n\beta_0+\varepsilon_n\delta_0)]\Big)=0.
\]
Plugging $c_6$ into (B.3), (B.3) becomes
\[
\delta_0'\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_k}\delta_0\Big[\frac{\delta_0'}{2\sigma_{\xi 0}^2}\Big(\sum_{j=1}^Jc_{5j}\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\delta_0-2c_1\lim_{n\to\infty}\frac{1}{n}E[\varepsilon_n'G_n(X_n\beta_0+\varepsilon_n\delta_0)]\Big)-c_1\lim_{n\to\infty}\frac{E\mathrm{tr}(G_n)}{n}-\frac{c_4}{2\sigma_{\xi 0}^2}\Big]
-\frac{\sigma_{\xi 0}^2}{2}\sum_{j=1}^Jc_{5j}\mathrm{tr}\Big(\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\Sigma_{\varepsilon 0}^{-1}\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_k}\Sigma_{\varepsilon 0}^{-1}\Big)=0.
\]
These two equations imply
\[
\sum_{j=1}^Jc_{5j}\mathrm{tr}\Big(\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_j}\Sigma_{\varepsilon 0}^{-1}\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_k}\Sigma_{\varepsilon 0}^{-1}\Big)=0\quad\text{for any }k=1,\ldots,J.
\]
Since the $\alpha_j$'s are distinct parameter elements in $\Sigma_\varepsilon$, all the entries in $\frac{\partial\Sigma_{\varepsilon 0}}{\partial\alpha_k}$ are zero except for the place where $\alpha_k$ is. Hence, for each $k=1,\ldots,J$, $c_{5k}=0$, i.e., $c_5=0$. With $c_3=0$ and $c_5=0$, we have
\[
c_6=-c_1\lim_{n\to\infty}\frac{1}{n}E[\varepsilon_n'G_n(X_n\beta_0+\varepsilon_n\delta_0)],\qquad
c_4=-2c_1\delta_0'\lim_{n\to\infty}\frac{1}{n}E[\varepsilon_n'G_n(X_n\beta_0+\varepsilon_n\delta_0)]-2c_1\sigma_{\xi 0}^2\lim_{n\to\infty}\frac{E\mathrm{tr}(G_n)}{n}.
\]
Turning to the first row block,
\[
0=-\lim_{n\to\infty}c_1\frac{1}{n}E[(X_n\beta_0+\varepsilon_n\delta_0)'G_n'G_n(X_n\beta_0+\varepsilon_n\delta_0)]-c_1\lim_{n\to\infty}\frac{1}{n}\sigma_{\xi 0}^2\mathrm{tr}[E(G_n^2+G_nG_n')]
\]
\[
+c_1\lim_{n\to\infty}\frac{1}{n}E[(X_n\beta_0+\varepsilon_n\delta_0)'G_n']X_n(X_n'X_n)^{-1}X_n'E[G_n(X_n\beta_0+\varepsilon_n\delta_0)]+2c_1\sigma_{\xi 0}^2\Big(\lim_{n\to\infty}\frac{E\mathrm{tr}(G_n)}{n}\Big)^2
\]
\[
+c_1\Big(\lim_{n\to\infty}\frac{E\varepsilon_n'G_n(X_n\beta_0+\varepsilon_n\delta_0)}{n}\Big)'\Sigma_{\varepsilon 0}^{-1}\Big(\lim_{n\to\infty}\frac{E\varepsilon_n'G_n(X_n\beta_0+\varepsilon_n\delta_0)}{n}\Big).
\]
Denote $H_n=G_n(X_n\beta_0+\varepsilon_n\delta_0)$; then
\[
0=c_1\lim_{n\to\infty}\frac{1}{n}\big[E(H_n'H_n)-E(H_n'\varepsilon_n)\Sigma_{\varepsilon 0}^{-1}E(\varepsilon_n'H_n)-E(H_n')X_n(X_n'X_n)^{-1}X_n'E(H_n)\big]
+c_1\lim_{n\to\infty}\sigma_{\xi 0}^2\bigg(\frac{E\mathrm{tr}(G_n^2+G_nG_n')}{n}-2\Big(\frac{E\mathrm{tr}(G_n)}{n}\Big)^2\bigg).
\]
By the Cauchy–Schwarz inequality, $E(H_n'H_n)-E(H_n')E(H_n)\ge E(H_n'\varepsilon_n)\Sigma_{\varepsilon 0}^{-1}E(\varepsilon_n'H_n)$. Hence,
\[
E(H_n'H_n)-E(H_n'\varepsilon_n)\Sigma_{\varepsilon 0}^{-1}E(\varepsilon_n'H_n)-E(H_n')X_n(X_n'X_n)^{-1}X_n'E(H_n)
\ge E(H_n')E(H_n)-E(H_n')X_n(X_n'X_n)^{-1}X_n'E(H_n)=E(H_n')[I_n-X_n(X_n'X_n)^{-1}X_n']E(H_n)\ge 0.
\]
Because Assumption 3.2 implies that $\lim_{n\to\infty}\frac{1}{n}E(H_n')[I_n-X_n(X_n'X_n)^{-1}X_n']E(H_n)$ is positive definite, and
\[
E[\mathrm{tr}(G_n^2+G_nG_n')]-\frac{2}{n}E^2[\mathrm{tr}(G_n)]\ge E\Big[\mathrm{tr}(G_n^2+G_nG_n')-\frac{2}{n}\mathrm{tr}^2(G_n)\Big]=\frac{1}{2}E\,\mathrm{tr}\Big[\Big(G_n+G_n'-\frac{2\mathrm{tr}(G_n)}{n}I_n\Big)^2\Big]\ge 0,
\]
it follows that $c_1=0$, and therefore $c_2$, $c_4$, and $c_6$ are all zeros. Q.E.D.
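The final step of the proof uses the trace identity $\mathrm{tr}(G^2+GG')-\frac{2}{n}\mathrm{tr}^2(G)=\frac{1}{2}\mathrm{tr}[(G+G'-\frac{2\mathrm{tr}(G)}{n}I_n)^2]$, whose right side is the trace of the square of a symmetric matrix and hence nonnegative. A short numerical sketch, with an arbitrary random matrix standing in for $G_n$:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
G = rng.standard_normal((n, n))  # arbitrary stand-in for G_n

# tr(G^2 + GG') - (2/n) tr(G)^2 equals (1/2) tr(S^2) with symmetric S
lhs = np.trace(G @ G + G @ G.T) - 2 * np.trace(G) ** 2 / n
S = G + G.T - 2 * np.trace(G) / n * np.eye(n)
rhs = 0.5 * np.trace(S @ S)

assert abs(lhs - rhs) < 1e-10
assert rhs >= 0  # trace of the square of a symmetric matrix
print(lhs, rhs)
```

Since $S$ is symmetric, $\mathrm{tr}(S^2)=\sum_{ij}s_{ij}^2\ge 0$, which delivers the nonnegativity claimed in the proof.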
Proof of Theorem 2. The stochastic terms in the second derivatives can be written in the general form associated with $A_n$ (the $h$-spatial dependence of $\varepsilon_n$) or $B_n$ (an increasing-dependence geometric series expansion), so from Propositions 1 and 2 we have the pointwise convergence
\[
\frac{1}{n}\Big[\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\theta\partial\theta'}-E\Big(\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\theta\partial\theta'}\Big)\Big]\xrightarrow{p}0.
\]
All parameters in $\theta$ enter the second derivatives polynomially except for $\mathrm{tr}[G_n^2(\lambda)]$. As
\[
\frac{1}{n}\big|\mathrm{tr}[G_n^2(\lambda_1)]-\mathrm{tr}[G_n^2(\lambda_2)]\big|=\frac{1}{n}\big|2\mathrm{tr}[G_n^3(\bar\lambda)](\lambda_1-\lambda_2)\big|=|\lambda_1-\lambda_2|O(1),
\]
the second derivatives are stochastically equicontinuous. Then the uniform convergence holds:
\[
\sup_{\theta\in\Theta}\frac{1}{n}\Big|\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\theta\partial\theta'}-E\Big(\frac{\partial^2\ln L_n(\theta|Y_n,Z_n)}{\partial\theta\partial\theta'}\Big)\Big|\xrightarrow{p}0.
\]
Hence,
\[
\sqrt n(\hat\theta-\theta_0)=-\Big(\frac{1}{n}\frac{\partial^2\ln L_n(\tilde\theta)}{\partial\theta\partial\theta'}\Big)^{-1}\frac{1}{\sqrt n}\frac{\partial\ln L_n(\theta_0)}{\partial\theta}
=-\Big[E\Big(\frac{1}{n}\frac{\partial^2\ln L_n(\theta_0)}{\partial\theta\partial\theta'}\Big)\Big]^{-1}\frac{1}{\sqrt n}\frac{\partial\ln L_n(\theta_0)}{\partial\theta}+o_p(1)
\xrightarrow{d}N\bigg(0,\Big[\lim_{n\to\infty}E\Big(\frac{1}{n}\frac{\partial\ln L_n(\theta_0)}{\partial\theta}\frac{\partial\ln L_n(\theta_0)}{\partial\theta'}\Big)\Big]^{-1}\bigg).
\]
The convergence in distribution holds because any linear combination of the elements of $\frac{\partial\ln L_n(\theta_0)}{\partial\theta}$ can be expressed in the form of $R_n$ in the CLT of Lemma 2. By the Cramér–Wold device, we can conclude that $\sqrt n(\hat\theta-\theta_0)\xrightarrow{d}N(0,I_\theta^{-1})$. Q.E.D.