nonlinear models with spatial data - new york...
TRANSCRIPT
[Part 14] 1/103
Nonlinear Models with Spatial Data
William Greene
Stern School of Business, New York University
Washington D.C.
July 12, 2013
[Part 14] 2/103
Y=1[New Plant Located in County]
Klier and McMillen: Clustering of Auto Supplier Plants in the United States. JBES, 2008
[Part 14] 3/103
Outcome Models for Spatial Data
Spatial Regression Models
Estimation and Analysis
Nonlinear Models and Spatial Regression
Nonlinear Models: Specification, Estimation
Discrete Choice: Binary, Ordered, Multinomial, Counts
Sample Selection
Stochastic Frontier
[Part 14] 4/103
Spatial Autocorrelation
ii
( ) ( ) , N observations on a spatially
arranged variable
contiguity matrix; 0
spatial autocorrelation parameter, -1 < < 1.
E[ ]= Var[ ]=
x i W x i ε
W W
ε 0, ε
2
1
2 -1
( ) [ ]
E[ ]= Var[ ]= [( ) ( )]
I
Spatial "moving average" form
x i I W ε
x i, x I W I W
[Part 14] 5/103
Spatial Autocorrelation in Regression
ii
2
2
11 1
12
( ) . w 0.
E[ | ]= Var[ | ]=
E[ | ]=
Var[ | ] ( )( )
ˆ ( )( ) ( )( )
1 ˆ ˆ( )( )ˆN
ˆ The subject
y Xβ I - W ε
ε X 0, ε X I
y X Xβ
y X = I - W I - W
A Generalized Regression Model
β X I - W I - W X X I - W I - W y
y - Xβ I - W I - W y - Xβ
of much research
[Part 14] 6/103
Bell and Bockstael (2000) Spatial
Autocorrelation in Real Estate Sales
[Part 14] 7/103
Agreed Upon Objective:
Practical Obstacles
• Problem: Maximize logL involving sparse(I-W)
• Inaccuracies in determinant and inverse
• Kelejian and Prucha (1999) moment basedestimator of
• Followed by FGLS
[Part 14] 8/103
Spatial Autoregression in a Linear Model
2
1
1 1
1
2 -1
+ .
E[ | ] Var[ | ]=
[ ] ( )
[ ] [ ]
E[ | ] [ ]
Var[ | ] [( ) ( )]
Estimators: Various f
y = Wy Xβ ε
ε X = 0, ε X I
y = I W Xβ ε
= I W Xβ I W ε
y X = I W Xβ
y X = I W I W
orms of generalized least squares.
Maximum likelihood | ~ Normal[ , ]ε 0
[Part 14] 9/103
Complications of the Generalized
Regression Model
Potentially very large N – GIS data on agriculture
plots
Estimation of . There is no natural residual based
estimator
Complicated covariance structures – no simple
transformations
[Part 14] 10/103
Panel Data Application
it it i it
t t t
E.g., N countries, T periods
y c
= N observations at time t.
Similar assumptions
Candidate for SUR or Spatial Autocorrelation model.
x β
ε Wε v
[Part 14] 11/103
Spatial Autocorrelation in a Panel
[Part 14] 12/103
Analytical Environment
Generalized linear regression
Complicated disturbance covariance matrix
Estimation platform: Generalized least
squares or maximum likelihood (normality)
Central problem, estimation of
[Part 14] 13/103
[Part 14] 14/103
Outcomes in Nonlinear Settings
Land use intensity in Austin, Texas – Discrete Ordered
Intensity = 1,2,3,4
Land Usage Types in France, 1,2,3 – Discrete Unordered
Oak Tree Regeneration in Pennsylvania – Count
Number = 0,1,2,… (Many zeros)
Teenagers in the Bay area:
physically active = 1 or physically inactive = 0 – Binary
Pedestrian Injury Counts in Manhattan – Count
Efficiency of Farms in West-Central Brazil – Nonlinear
Model (Stochastic frontier)
Catch by Alaska trawlers in a nonrandom sample
[Part 14] 15/103
Nonlinear Outcomes
Discrete revelation of choice indicates
latent underlying preferences
Binary choice between two alternatives
Unordered choice among multiple choices
Ordered choice revealing underlying
strength of preferences
Counts of events
Stochastic frontier and efficiency
Nonrandom sample selection
[Part 14] 16/103
Modeling Discrete Outcomes
“Dependent Variable” typically labels an outcome
No quantitative meaning
Conditional relationship to covariates
No “regression” relationship in most cases.
Models are often not conditional means.
The “model” is usually a probability
Nonlinear models – usually not estimated by any type
of linear least squares
[Part 14] 17/103
Nonlinear Spatial Modeling
Discrete outcome yit = 0, 1, …, J for
some finite or infinite (count case) J.
i = 1,…,n
t = 1,…,T
Covariates xit
Conditional Probability (yit = j)
= a function of xit.
[Part 14] 18/103
[Part 14] 19/103
Issues in Spatial Discrete Choice
• A series of Issues Spatial dependence between alternatives: Nested logit
Spatial dependence in the LPM: Solves some practical problems. A bad model
Spatial probit and logit: Probit is generally more amenable to modeling
Statistical mechanics: Social interactions – not practical
Autologistic model: Spatial dependency between outcomes or utillities.
See below
Variants of autologistic: The model based on observed outcomes is
incoherent (“selfcontradictory”)
Endogenous spatial weights
Spatial heterogeneity: Fixed and random effects. Not practical.
• The model discussed below
[Part 14] 20/103
Two Platforms
Random Utility for Preference ModelsOutcome reveals underlying utility Binary: u* = ’x y = 1 if u* > 0
Ordered: u* = ’x y = j if j-1 < u* < j
Unordered: u*(j) = ’xj , y = j if u*(j) > u*(k)
Nonlinear Regression for Count ModelsOutcome is governed by a nonlinearregression E[y|x] = g(,x)
[Part 14] 21/103
Maximum Likelihood Estimation
Cross Section Case: Binary Outcome
Random Utility: y* = +
Observed Outcome: y = 1 if y* > 0,
0 if y* 0.
Probabilities: P(y=1|x) = Prob(y* > 0| )
x
x
= Prob( > - )
P(y=0|x) = Prob(y* 0| )
= Prob( - )
Likelihood for the sample = joint probability
x
x
x
n
i ii=1
n
i ii=1
= Prob(y=y | )
Log Likelihood = logProb(y=y | )
x
x
[Part 14] 22/103
Cross Section Case: n observations
1 1 1 1 1 1
2 2 2 2 2 2
n n n n n n
y =j | or > Prob( or > )
y =j | or > Prob( or > )Prob Prob =
... ... ...
y =j | or > Prob( or > )
We operate on the mar
x x x
x x x
x x x
n
i ii=1
2
ginal probabilities of n observations
LogL( | )= logF 2y 1
1 Probit F(t) = (t) exp( t / 2)dt (t)dt
2
exp(t) Logit F(t) = (t) =
1 exp(t)
t t
X, y x
[Part 14] 23/103
How to Induce Correlation
Joint distribution of multiple observations
Correlation of unobserved heterogeneity
Correlation of latent utility
[Part 14] 24/103
Bivariate Counts
Intervening variable approach
Y1 = X1 + Z, Y2 = Y2 + Z; All 3 Poisson distributed
Only allows positive correlation.
Limited to two outcomes
Bivariate conditional means
1 = exp(x1 + 1), 2 = exp(x2 + 2), Cor(1,2)=
|Cor(y1,y2)| << || (Due to residual variation)
Copula functions – Useful for bivariate. Less so if > 2.
[Part 14] 25/103
Spatially Correlated Observations
Correlation Based on Unobservables
1 1 1 1 1
2 2 2 2 2
n n n n n
y u u 0
y u u 0 ~ f ,
... ... ... ...
y u u 0
In the cross section case, = .
= the usual spatial weight matrix .
x
xI I I
x
W W W
W
W 0 Now, it is a full matrix.
The joint probably is a single n fold integral.
[Part 14] 26/103
Spatially Correlated Observations
Correlated Utilities
* *1 1 1 11 1
* *12 2 2 22 2
* *
n n
y y
y y
... ...... ...
y y
In the cross section case,
= the usual spatial weight matrix .
n n n n
x x
x x
x x
W I W
W
W = . Now, it is a full
matrix. The joint probably is a single n fold integral.
0
[Part 14] 27/103
Log Likelihood
In the unrestricted spatial case, the log
likelihood is one term,
LogL = log Prob(y1|x1, y2|x2, … ,yn|xn)
In the discrete choice case, the
probability will be an n fold integral,usually for a normal distribution.
[Part 14] 28/103
*1 11 1
*2 22 2
*
n n
*
i i
1 12 2 13 3 1 1
2 21 1 23 3 2 2
y y
y y
...... ...
y y
y 1[y 0]
y 1[ (w y w y ...) 0]
y 1[ (w y w y ...) 0] etc.
The model based on observa
n n
x
x
x
x
x
W
bles is more reasonable.
There is no reduced form unless is lower triangular.
This model is not identified. (It is "incoherent.")
W
A Theoretical Behavioral Conflict
[Part 14] 29/103
See Maddala (1983)
From Klier and
McMillen (2012)
[Part 14] 30/103
LogL for an Unrestricted BC Model
n 1
1 1 1 2 12 1 n 1n 1
2 2 1 2 21 2 n 2n 2
n
n n n 1 n1 n 2 n2 n
i i
LogL( | )=
q 1 q q w ... q q w
q q q w 1 ... q q wlog ... d
... ... ... ... ... ...
q q q w q q w ... 1
q 1 if y = 0 and +1 if
x x
X, y
i i y = 1 = 2y 1
One huge observation - n dimensional normal integral.
Not feasible for any reasonable sample size.
Even if computable, provides no device for estimating
sampling standard errors.
[Part 14] 31/103
Solution Approaches for Binary Choice
Distinguish between private andsocial shocks and use pseudo-ML
Approximate the joint density anduse GMM with the EM algorithm
Parameterize the spatial correlationand use copula methods
Define neighborhoods – make W a sparse matrix and use pseudo-ML
Others …
[Part 14] 32/103
[Part 14] 33/103
*
i i i ij jj i
* ii i i
i ij jj i
*
i i i
* i ii i i
ii
2 2 2
i ijj i
Spatial autocorrelation in the heterogeneity
y w
y 1[y 0], Prob y 1
Var w
or
y u
y 1[y 0]Prob y 1Var u
1 w
x
x
x
x x
[Part 14] 34/103
ni i
i 1i
1
MLE
Heteroscedastic Probit
Estimation and Inference
(2y -1)MLE: logL = log
n ˆ ( ) ( ) =Score vector
implies the algorithm, Newton's Method.
EM alg
x
H S S
orithm essentially replaces with during iterations.
(Slightly more involved for the heteroscedasticity. LHS variable
in the EM iterations is the score vector.)
To compute the asymptotic covariance
H X X
, we need Var[ ( )]
Observations are (spatially) correlated! How to compute it?
S
[Part 14] 35/103
GMMPinske, J. and Slade, M., (1998) “Contracting in Space: An Application of Spatial Statistics to Discrete Choice Models,” Journal of Econometrics, 85, 1, 125-154.Pinkse, J. , Slade, M. and Shen, L (2006) “Dynamic Spatial Discrete Choice Using One Step GMM: An Application to Mine Operating Decisions”, Spatial Economic Analysis, 1: 1, 53 — 99.
1
i
*= + , = +
= [ - ]
= u
Cross section case: =0
Probit Model: FOC for estimation of is based on the
ˆ generalized residuals u
y W u
I W u
A
X ε
x x
x x
i i i
n i i iii=1
i i
= y E[ | y ]
(y ( )) ( ) =
( )[1 ( )]
Spatially autocorrelated case: Moment equations are still
valid. Complication is computing the variance of the
x 0
moment
equations, which requires some approximations.
[Part 14] 36/103
GMM Approach
Spatial autocorrelation induces
heteroscedasticity that is a function of
Moment equations include the
heteroscedasticity and an additionalinstrumental variable for identifying .
LM test of = 0 is carried out under the null
hypothesis that = 0.
Application: Contract type in pricing for 118
Vancouver service stations.
[Part 14] 37/103
GMM
1
*= + , = +
= [ - ]
= u
Autocorrelated Case: 0
Probit Model: FOC for estimation of is based on the
ˆ generalized residuals u
y W u
I W u
A
X ε
x x
x x
i i i
i ii
n ii ii
ii=1
i i
ii ii
= y E[ | y ]
ya ( ) a ( )
=
1a ( ) a ( )
z 0
Requires at least K +1 instrumental variables.
[Part 14] 38/103
Extension to Dynamic Choice Model
Pinske, Slade, Shen (2006)
[Part 14] 39/103
[Part 14] 40/103
[Part 14] 41/103
1
1* 2
ii i
*ii i i
i
i i i
* = + , = + = ( - )
[ ], Var[ ] = ( - ) ( - ) ,
Prob(y 1)
Generalized residual u d
Instruments
Cri
Spatial Logit Model
Iterated 2SLS (GMM)
y X e e We I W
d = 1y 0 e I W I W
xx
Z
1terion: q = ( , ) ( ) ( , ) u Z Z Z Z u
[Part 14] 42/103
i i i
*
i i i
i 1 1*i i
i i i ii2
i
1 2 n
0
k k k
u d
(1 )u /
, =( - ) ( - )u / (1 ) A
[ , ,... ]
1. Logit estimation of =0,
ˆˆ2. = ( - ), (
Algorithm
Iterated 2SLS (GMM)
x
g A I W W I Wx
G g g g
G
u d G Z Z Z
β |
1
k
1
k k k k k
k k
k 1 k
)
ˆ ˆ ˆ3.
ˆ ˆ4. until is sufficiently small.
ˆ ˆ
Z G
G G G u
[Part 14] 43/103
LM Test?
• If = 0, g = 0 because Aii = 0
• At the initial logit values, g = 0
• Thus, if = 0, g = 0
• How to test = 0 using an LM style test.
• Same problem shows up in RE models
• But, here, is in the interior of the parameter space!
[Part 14] 44/103
Pseudo Maximum Likelihood
Maximize a likelihood function that
approximates the true one
Produces consistent estimators of
parameters
How to obtain standard errors?
Asymptotic normality? Conditions for
CLT are more difficult to establish.
[Part 14] 45/103
Pseudo MLE
1
*
i i i ijj i
* * 2 2
i i i ij iij i
*= + , = +
= [ - ]
= u
Autocorrelated Case: 0
y W
y 1[y 0]. Var[y ] 1 W a ( )
Implies a heteroscedastic probit.
Pseud
y X W u
I W u
A
x
θ ε
o MLE is based on the marginal densities.
How to obtain the asymptotic covariance matrix?
[See Wang, Iglesias, Wooldridge (2013)]
[Part 14] 46/103
n i i
i 1i
1
MLE
Estimation and Inference
(2y -1)MLE: logL = log
ˆ n ( ) ( ) =Score vector
implies the algorithm, Newton's Meth
Heteroscedastic Probit Approac
S
h
x
H S
od.
EM algorithm essentially replaces with during iterations.
(Slightly more involved for the heteroscedasticity. LHS variable
in the EM iterations is the score vector.)
To compute the asymptotic c
H X X
ovariance, we need Var[ ( )]
Observations are (spatially) correlated! How to compute it?
S
[Part 14] 47/103
ˆ ˆ ˆ(data, ) (data, ) (data, )
ˆ(data, ) Negative inverse of Hessian
ˆ(data, ) Covariance matrix of scores.
ˆHow to compute (data, )
Terms are not independent in a spatial setting.
V A B A
A
B
B
Covariance Matrix for Pseudo MLE
[Part 14] 48/103
‘Pseudo’ Maximum Likelihood
Smirnov, A., “Modeling Spatial Discrete Choice,” Regional Science and Urban Economics, 40, 2010.
1 1
1 t
t 0
Spatial Autoregression in Utilities
* * , 1( * ) for all n individuals
* ( ) ( )
( ) ( ) assumed convergent
=
= + where
y Wy X y y 0
y I W X I W
I W W
A
D A - D
n
j 1 ij j
i i
i
= diagonal elements
*
Private Social
Then
aProb[y 1 or 0 | ] F (2y 1) , p
d
D
y AX D A - D
Suppose individuals ignore the social "shocks."
xX
robit or logit.
[Part 14] 49/103
Pseudo Maximum Likelihood
Bases correlation in underlying utilities
Assumes away the correlation in the reduced form
Makes a behavioral assumption
Requires inversion of (I-W)
Computation of (I-W) is part of the optimizationprocess - is estimated with .
Does not require multidimensional integration (for a logit
model, requires no integration)
[Part 14] 50/103
Copula Method and Parameterization
Bhat, C. and Sener, I., (2009) “A copula-based closed-form binary logit choice modelfor accommodating spatial correlation across observational units,” Journal of Geographical Systems, 11, 243–272
* *
i i i i i
1 2 n
Basic Logit Model
y , y 1[y 0] (as usual)
Rather than specify a spatial weight matrix, we assume
[ , ,..., ] have an n-variate distribution.
Sklar's Theorem represents the joint distribut
x
i
1 1 2 2 n n
ion in terms
of the continuous marginal distributions, ( ) and a copula
function C[u = ( ) ,u ( ) ,...,u ( ) | ]
[Part 14] 51/103
Copula Representation
[Part 14] 52/103
Model
[Part 14] 53/103
Likelihood
[Part 14] 54/103
Parameterization
[Part 14] 55/103
[Part 14] 56/103
[Part 14] 57/103
Other Approaches
Case (1992): Define “regions” or neighborhoods. No
correlation across regions. Produces essentially apanel data probit model. (Wang et al. (2013))
Beron and Vijverberg (2003): Brute force integration
using GHK simulator in a probit model.
Lesage: Bayesian - MCMC
Others. See Bhat and Sener (2009).
Case A (1992) Neighborhood influence and technological change. Economics 22:491–508Beron KJ, Vijverberg WPM (2004) Probit in a spatial context: a monte carlo analysis. In: Anselin L, Florax RJGM, Rey SJ (eds) Advances in spatial econometrics: methodology, tools and applications. Springer, Berlin
[Part 14] 58/103
[Part 14] 59/103
See also Arbia, G., “Pairwise Likelihood Inference for Spatial Regressions
Estimated on Very Large Data Sets” Manuscript, Catholic University del Sacro
Cuore, Rome, 2012.
[Part 14] 60/103
Partial MLE
*
1 1 1 1j jj 1
* * 2 2
1 1 1 1j 11j 1
*
2 2 2 2j jj 2
* * 2 2
2 2 2 2j 22j 2
* *
1 2
Observation 1
y W
y 1[y 0] Var[y ] 1 W a ( )
Observation 2
y W
y 1[y 0] Var[y ] 1 W a ( )
Covariance of y and y = a
x
x
12( )
[Part 14] 61/103
Bivariate Probit
Pseudo MLE
Consistent
Asymptotically normal?
Resembles time series case
Correlation need not fade with ‘distance’
Better than Pinske/Slade Univariate Probit?
How to choose the pairings?
[Part 14] 62/103
[Part 14] 63/103
Lesage methods - MCMC
• SEM Model…
• Bayesian MCMC
• Data augmentation for unobserved y
• Quirks about sampler for rho.
[Part 14] 64/103
Ordered Probability Model
1
1 2
2 3
J-1 J
j-1
y* , we assume contains a constant term
y 0 if y* 0
y = 1 if 0 < y*
y = 2 if < y*
y = 3 if < y*
...
y = J if < y*
In general : y = j if < y*
β x x
j
-1 o J j-1 j,
, j = 0,1,...,J
, 0, , j = 1,...,J
[Part 14] 65/103
A Spatial Ordered Choice ModelWang, C. and Kockelman, K., (2009) Bayesian Inference for Ordered Response Data with a Dynamic Spatial Ordered Probit Model, Working Paper, Department of Civil and Environmental Engineering, Bucknell University.
* *
1
* *
1
Core Model: Cross Section
y , y = j if y , Var[ ] 1
Spatial Formulation: There are R regions. Within a region
y u , y = j if y
Spatial he
i i i i j i j i
ir ir i ir ir j ir j
β x
β x
2
2
1 2 1
teroscedasticity: Var[ ]
Spatial Autocorrelation Across Regions
= + , ~ N[ , ]
= ( - ) ~ N[ , {( - ) ( - )} ]
The error distribution depends on 2 para
ir r
v
v
u Wu v v 0 I
u I W v 0 I W I W
2meters, and
Estimation Approach: Gibbs Sampling; Markov Chain Monte Carlo
Dynamics in latent utilities added as a final step: y*(t)=f[y*(t-1)].
v
[Part 14] 66/103
An Ordered Probability Model
1
1 2
2 3
J-1 J
j-1
y* , we assume contains a constant term
y 0 if y* 0
y = 1 if 0 < y*
y = 2 if < y*
y = 3 if < y*
...
y = J if < y*
In general : y = j if < y*
β x x
j
-1 o J j-1 j,
, j = 0,1,...,J
, 0, , j = 1,...,J
[Part 14] 67/103
OCM for Land Use Intensity
[Part 14] 68/103
OCM for Land Use Intensity
[Part 14] 69/103
Estimated Dynamic OCM
[Part 14] 70/103
[Part 14] 71/103
[Part 14] 72/103
Data Augmentation
[Part 14] 73/103
Unordered Multinomial Choice
j ij ij
Underlying Random Utility for Each Alternative
U(i,j) = , i = individual, j = alternative
Preference Revelation
Y(i) = j if and only if U(i,j) > U(i,k
Core Random Utility Model
x
1 J
1 J
) for all k j
Model Frameworks
Multinomial Probit: [ ,..., ] ~ N[0, ]
Multinomial Logit: [ ,..., ] ~ iid type 1 extreme value
[Part 14] 74/103
Spatial Multinomial Probit
Chakir, R. and Parent, O. (2009) “Determinants of land use changes: A spatial multinomial probit approach, Papers in Regional Science, 88, 2, 328-346.
jt ijt ik ijt
n
ij il lkl 1
Utility Functions, land parcel i, usage type j, date t
U(i,j,t)=
Spatial Correlation at Time t
w
Modeling Framework: Normal / Multinomial Probit
Estimation: MCMC
x
- Gibbs Sampling
[Part 14] 75/103
[Part 14] 76/103
[Part 14] 77/103
[Part 14] 78/103
Random Parameters Models
[Part 14] 79/103
[Part 14] 80/103
Canonical Model
j
Poisson Regression
y = 0,1,...
exp( ) Prob[y = j| ] =
j!
Conditional Mean = exp( )
Signature Feature: Equidispersion
Usual Alternative: Various forms of Negative Binomial
Spatial E
x
x
i i i
n
i im m im 1
ffect: Filtered through the mean
= exp( + )
= w
x
Rathbun, S and Fei, L (2006) “A Spatial Zero-Inflated Poisson Regression Model for Oak Regeneration,” Environmental Ecology Statistics, 13, 2006, 409-426
[Part 14] 81/103
Canonical Model for Counts
j
Poisson Regression
y = 0,1,...
exp( ) Prob[y = j| ] =
j!
Conditional Mean = exp( )
Signature Feature: Equidispersion
Usual Alternative: Negative Binomial
Spatial Effect: Filtered
x
x
i i i
n
i im m im 1
through the mean
= exp( + )
= w
x
Rathbun, S and Fei, L (2006) “A Spatial Zero-Inflated Poisson Regression Model for Oak Regeneration,” Environmental Ecology Statistics, 13, 2006, 409-426
[Part 14] 82/103
Zero Inflation
There are two states
Always zero
Zero is one possible value, or 1,2,…
Prob(0) = Prob(state 1) +
Prob(state 2) P(0|state 2)
[Part 14] 83/103
Numbers of firms locating in Texas
counties: Count data (Poisson)
Bicycle and pedestrian
injuries in census tracts in
Manhattan. (Count data and
ordered outcomes)
A Blend of Ordered
Choice and Count Data
Models
[Part 14] 84/103
Kriging
[Part 14] 85/103
[Part 14] 86/103
[Part 14] 87/103
Spatial Autocorrelation in a
Sample Selection Model
•Alaska Department of Fish and Game.•Pacific cod fishing eastern Bering Sea – grid of locations•Observation = ‘catch per unit effort’ in grid square•Data reported only if 4+ similar vessels fish in the region•1997 sample = 320 observations with 207 reported full data
Flores-Lagunes, A. and Schnier, K., “Sample selection and Spatial Dependence,” Journal of Applied Econometrics, 27, 2, 2012, pp. 173-204.
[Part 14] 88/103
Spatial Autocorrelation in a
Sample Selection Model
•LHS is catch per unit effort = CPUE
•Site characteristics: MaxDepth, MinDepth, Biomass
•Fleet characteristics:
Catcher vessel (CV = 0/1)
Hook and line (HAL = 0/1)
Nonpelagic trawl gear (NPT = 0/1)
Large (at least 125 feet) (Large = 0/1)
Flores-Lagunes, A. and Schnier, K., “Sample selection and Spatial Dependence,” Journal of Applied Econometrics, 27, 2, 2012, pp. 173-204.
[Part 14] 89/103
Spatial Autocorrelation in a
Sample Selection Model
*
1 0 1 1 1 1 1
*
2 0 2 2 2 2 2
21 1 12
122 12 2
*
1 1
2
0~ , , (?? 1??)
0
Observation Mechanism
1 > 0 Probit Model
i i i i ij j ij i
i i i i ij j ij i
i
i
i i
i
y u u c u
y u u c u
N
y y
y
x
x
*
2 1 if = 1, unobserved otherwise.i iy y
[Part 14] 90/103
Spatial Autocorrelation in a
Sample Selection Model
1 1 1
1 (1)
1 1 1 2
2* (1) 2 (1)
1 0 1 1 1 1 1 1
* (2) 2 (
2 0 2 2 2 1 1
= Spatial weight matrix, 0.
[ ] = , likewise for
( ) , Var[ ] ( )
( ) , Var[ ] ( )
ii
N N
i i ij i i ijj j
N
i i ij i i ijj
y u
y u
u Cu
C C
u I C u
x
x
2
2)
1
(1) (2)
1 2 12 1 Cov[ , ] ( ) ( )
N
j
N
i i ij ijju u
[Part 14] 91/103
Spatial Weights
2
1,
Euclidean distance
Band of 7 neighbors is used
Row standardized.
ij
ij
ij
cd
d
[Part 14] 92/103
Two Step Estimation
0 1
22 (1)(1) (2)1 11
2(1)
10 1
22 (1)
1 1
Probit estimated by Pinske/Slade GMM
( )( ) ( )
( )
( )
Spatial regression with included IMR i
i
NN
ijij ij jj
iN
ijji
N
ijj
x
x
n second step
(*) GMM procedure combines the two steps in one large estimation.
[Part 14] 93/103
[Part 14] 94/103
Spatial Stochastic Frontier
Production function model
y = + ε
y = + v - u
v unexplained noise = N[0,1]
u = inefficiency > 0; efficiency = exp(-u)
Object of estimation is u, not
Not a linear regression. Fit by MLE or MCMC.
β x
β x
[Part 14] 95/103
247 Spanish Dairy Farms, 6 Years
[Part 14] 96/103
A True Random Effects Model
ij
i ij ij ij
1 n
i k
y Output of farm j in municipality i in Center-West Brazil
y α + +v - u
(α ,...,α ) conditionally autoregressive based on neighbors
α -α is smaller when municipalities i and k are closer tog
ij
β x
ether
Spatial Stochastic Frontier Models; Accounting for Unobserved Local
Determinants of Inefficiency
Schmidt, Moriera, Helfand, Fonseca; Journal of Productivity Analysis,
2009.
[Part 14] 97/103
[Part 14] 98/103
Estimation by Maximum Likelihood
[Part 14] 99/103
LeSage (2000) on Timing
“The Bayesian probit and tobit spatial
autoregressive models described here have been
applied to samples of 506 and 3,107
observations. The time required to produce
estimates was around 350 seconds for the 506
observations sample and 900 seconds for the
case infolving 3,107 observations. …
(inexpensive Apple G3 computer running at 266
Mhz.)”
[Part 14] 100/103
Time and Space (In Your Computer)
[Part 14] 101/103
[Part 14] 102/103
Thank you
[Part 14] 103/103
•
•