
[Part 14] 1/103

Nonlinear Models with Spatial Data

William Greene

Stern School of Business, New York University

Washington D.C.

July 12, 2013

[Part 14] 2/103

Y=1[New Plant Located in County]

Klier and McMillen: Clustering of Auto Supplier Plants in the United States. JBES, 2008

[Part 14] 3/103

Outcome Models for Spatial Data

Spatial Regression Models

Estimation and Analysis

Nonlinear Models and Spatial Regression

Nonlinear Models: Specification, Estimation

Discrete Choice: Binary, Ordered, Multinomial, Counts

Sample Selection

Stochastic Frontier

[Part 14] 4/103

Spatial Autocorrelation

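N observations on a spatially arranged variable:
$$(\mathbf{x} - \mu\mathbf{i}) = \rho\mathbf{W}(\mathbf{x} - \mu\mathbf{i}) + \boldsymbol{\varepsilon}$$
$\mathbf{W}$ = contiguity matrix; $\mathbf{W}_{ii} = 0$
$\rho$ = spatial autocorrelation parameter, $-1 < \rho < 1$.
$$E[\boldsymbol{\varepsilon}] = \mathbf{0}, \quad \mathrm{Var}[\boldsymbol{\varepsilon}] = \sigma^2\mathbf{I}$$
Spatial "moving average" form:
$$(\mathbf{x} - \mu\mathbf{i}) = [\mathbf{I} - \rho\mathbf{W}]^{-1}\boldsymbol{\varepsilon}$$
$$E[\mathbf{x}] = \mu\mathbf{i}, \quad \mathrm{Var}[\mathbf{x}] = \sigma^2[(\mathbf{I} - \rho\mathbf{W})'(\mathbf{I} - \rho\mathbf{W})]^{-1}$$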
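The autoregressive and "moving average" forms can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the four-region layout, the disturbances, and the value of rho are hypothetical, and the function names are introduced here. It builds a row-standardized contiguity matrix with a zero diagonal and solves z = rho W z + eps (with z = x - mu i) by fixed-point iteration, which converges when |rho| < 1 and W is row standardized.

```python
def row_standardize(adjacency):
    """Scale each row of a 0/1 contiguity matrix so that it sums to one."""
    weights = []
    for row in adjacency:
        total = sum(row)
        weights.append([value / total if total else 0.0 for value in row])
    return weights


def spatial_lag(weights, values):
    """Return W z: for each unit, the weighted average of its neighbors."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def moving_average_form(weights, eps, rho, iterations=200):
    """Solve z = rho * W z + eps by fixed-point iteration (z = x - mu*i)."""
    z = list(eps)
    for _ in range(iterations):
        z = [rho * lag + e for lag, e in zip(spatial_lag(weights, z), eps)]
    return z


# Hypothetical example: four regions on a line, 1-2-3-4.
ADJACENCY = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
W = row_standardize(ADJACENCY)
EPS = [1.0, -0.5, 0.25, 0.0]  # illustrative disturbances
RHO = 0.4                      # illustrative value, |rho| < 1
Z = moving_average_form(W, EPS, RHO)
```

For large N, the iteration avoids forming the inverse of (I - rho W) explicitly, which is the practical concern raised later in the deck.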

[Part 14] 5/103

Spatial Autocorrelation in Regression

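$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + (\mathbf{I} - \rho\mathbf{W})\boldsymbol{\varepsilon}, \quad w_{ii} = 0$$
$$E[\boldsymbol{\varepsilon}|\mathbf{X}] = \mathbf{0}, \quad \mathrm{Var}[\boldsymbol{\varepsilon}|\mathbf{X}] = \sigma^2\mathbf{I}$$
$$E[\mathbf{y}|\mathbf{X}] = \mathbf{X}\boldsymbol{\beta}$$
$$\mathrm{Var}[\mathbf{y}|\mathbf{X}] = \sigma^2(\mathbf{I} - \rho\mathbf{W})(\mathbf{I} - \rho\mathbf{W})'$$
A Generalized Regression Model
$$\hat{\boldsymbol{\beta}} = \left[\mathbf{X}'\{(\mathbf{I} - \rho\mathbf{W})(\mathbf{I} - \rho\mathbf{W})'\}^{-1}\mathbf{X}\right]^{-1}\mathbf{X}'\{(\mathbf{I} - \rho\mathbf{W})(\mathbf{I} - \rho\mathbf{W})'\}^{-1}\mathbf{y}$$
$$\hat{\sigma}^2 = \frac{1}{N}(\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}})'\{(\mathbf{I} - \rho\mathbf{W})(\mathbf{I} - \rho\mathbf{W})'\}^{-1}(\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}})$$
$\hat{\rho}$ = the subject of much research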

[Part 14] 6/103

Bell and Bockstael (2000) Spatial

Autocorrelation in Real Estate Sales

[Part 14] 7/103

Agreed Upon Objective:

Practical Obstacles

• Problem: Maximize logL involving sparse $(\mathbf{I} - \rho\mathbf{W})$
• Inaccuracies in determinant and inverse
• Kelejian and Prucha (1999) moment-based estimator of $\rho$
• Followed by FGLS

[Part 14] 8/103

Spatial Autoregression in a Linear Model

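$$\mathbf{y} = \rho\mathbf{W}\mathbf{y} + \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$$
$$E[\boldsymbol{\varepsilon}|\mathbf{X}] = \mathbf{0}, \quad \mathrm{Var}[\boldsymbol{\varepsilon}|\mathbf{X}] = \sigma^2\mathbf{I}$$
$$\mathbf{y} = [\mathbf{I} - \rho\mathbf{W}]^{-1}(\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}) = [\mathbf{I} - \rho\mathbf{W}]^{-1}\mathbf{X}\boldsymbol{\beta} + [\mathbf{I} - \rho\mathbf{W}]^{-1}\boldsymbol{\varepsilon}$$
$$E[\mathbf{y}|\mathbf{X}] = [\mathbf{I} - \rho\mathbf{W}]^{-1}\mathbf{X}\boldsymbol{\beta}$$
$$\mathrm{Var}[\mathbf{y}|\mathbf{X}] = \sigma^2[(\mathbf{I} - \rho\mathbf{W})'(\mathbf{I} - \rho\mathbf{W})]^{-1}$$
Estimators: Various forms of generalized least squares.
Maximum likelihood | $\boldsymbol{\varepsilon} \sim \mathrm{Normal}[\mathbf{0}, \sigma^2\mathbf{I}]$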

[Part 14] 9/103

Complications of the Generalized

Regression Model

Potentially very large N – GIS data on agriculture plots
Estimation of $\rho$. There is no natural residual-based estimator
Complicated covariance structures – no simple transformations

[Part 14] 10/103

Panel Data Application

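E.g., N countries, T periods
$$y_{it} = c_i + \mathbf{x}_{it}'\boldsymbol{\beta} + \varepsilon_{it}$$
$$\boldsymbol{\varepsilon}_t = \rho\mathbf{W}\boldsymbol{\varepsilon}_t + \mathbf{v}_t, \quad \boldsymbol{\varepsilon}_t = N \text{ observations at time } t.$$
Similar assumptions
Candidate for SUR or Spatial Autocorrelation model.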

[Part 14] 11/103

Spatial Autocorrelation in a Panel

[Part 14] 12/103

Analytical Environment

Generalized linear regression

Complicated disturbance covariance matrix

Estimation platform: Generalized least

squares or maximum likelihood (normality)

Central problem: estimation of $\rho$

[Part 14] 13/103

[Part 14] 14/103

Outcomes in Nonlinear Settings

Land use intensity in Austin, Texas – Discrete Ordered

Intensity = 1,2,3,4

Land Usage Types in France, 1,2,3 – Discrete Unordered

Oak Tree Regeneration in Pennsylvania – Count

Number = 0,1,2,… (Many zeros)

Teenagers in the Bay area:

physically active = 1 or physically inactive = 0 – Binary

Pedestrian Injury Counts in Manhattan – Count

Efficiency of Farms in West-Central Brazil – Nonlinear

Model (Stochastic frontier)

Catch by Alaska trawlers in a nonrandom sample

[Part 14] 15/103

Nonlinear Outcomes

Discrete revelation of choice indicates

latent underlying preferences

Binary choice between two alternatives

Unordered choice among multiple choices

Ordered choice revealing underlying

strength of preferences

Counts of events

Stochastic frontier and efficiency

Nonrandom sample selection

[Part 14] 16/103

Modeling Discrete Outcomes

“Dependent Variable” typically labels an outcome

No quantitative meaning

Conditional relationship to covariates

No “regression” relationship in most cases.

Models are often not conditional means.

The “model” is usually a probability

Nonlinear models – usually not estimated by any type

of linear least squares

[Part 14] 17/103

Nonlinear Spatial Modeling

Discrete outcome yit = 0, 1, …, J for

some finite or infinite (count case) J.

i = 1,…,n

t = 1,…,T

Covariates xit

Conditional Probability (yit = j)

= a function of xit.

[Part 14] 18/103

[Part 14] 19/103

Issues in Spatial Discrete Choice

• A series of issues
  Spatial dependence between alternatives: Nested logit
  Spatial dependence in the LPM: Solves some practical problems. A bad model
  Spatial probit and logit: Probit is generally more amenable to modeling
  Statistical mechanics: Social interactions – not practical
  Autologistic model: Spatial dependency between outcomes or utilities. See below
  Variants of autologistic: The model based on observed outcomes is incoherent (“self-contradictory”)
  Endogenous spatial weights
  Spatial heterogeneity: Fixed and random effects. Not practical.
• The model discussed below

[Part 14] 20/103

Two Platforms

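Random Utility for Preference Models: Outcome reveals underlying utility
  Binary: $u^* = \boldsymbol{\beta}'\mathbf{x}$, $y = 1$ if $u^* > 0$
  Ordered: $u^* = \boldsymbol{\beta}'\mathbf{x}$, $y = j$ if $\mu_{j-1} < u^* < \mu_j$
  Unordered: $u^*(j) = \boldsymbol{\beta}'\mathbf{x}_j$, $y = j$ if $u^*(j) > u^*(k)$
Nonlinear Regression for Count Models: Outcome is governed by a nonlinear regression $E[y|\mathbf{x}] = g(\boldsymbol{\beta}, \mathbf{x})$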

[Part 14] 21/103

Maximum Likelihood Estimation

Cross Section Case: Binary Outcome

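Random Utility: $y^* = \boldsymbol{\beta}'\mathbf{x} + \varepsilon$
Observed Outcome: $y = 1$ if $y^* > 0$, $0$ if $y^* \le 0$.
Probabilities:
$$P(y=1|\mathbf{x}) = \mathrm{Prob}(y^* > 0|\mathbf{x}) = \mathrm{Prob}(\varepsilon > -\boldsymbol{\beta}'\mathbf{x})$$
$$P(y=0|\mathbf{x}) = \mathrm{Prob}(y^* \le 0|\mathbf{x}) = \mathrm{Prob}(\varepsilon \le -\boldsymbol{\beta}'\mathbf{x})$$
Likelihood for the sample = joint probability
$$L = \prod_{i=1}^{n} \mathrm{Prob}(y = y_i|\mathbf{x}_i)$$
$$\text{Log Likelihood} = \sum_{i=1}^{n} \log \mathrm{Prob}(y = y_i|\mathbf{x}_i)$$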

[Part 14] 22/103

Cross Section Case: n observations

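$$\mathrm{Prob}\begin{bmatrix} y_1 = j \,|\, \mathbf{x}_1 \\ y_2 = j \,|\, \mathbf{x}_2 \\ \vdots \\ y_n = j \,|\, \mathbf{x}_n \end{bmatrix} = \mathrm{Prob}\begin{bmatrix} \varepsilon_1 \le \text{or} > -\boldsymbol{\beta}'\mathbf{x}_1 \\ \varepsilon_2 \le \text{or} > -\boldsymbol{\beta}'\mathbf{x}_2 \\ \vdots \\ \varepsilon_n \le \text{or} > -\boldsymbol{\beta}'\mathbf{x}_n \end{bmatrix} = \prod_{i=1}^{n}\mathrm{Prob}(\varepsilon_i \le \text{or} > -\boldsymbol{\beta}'\mathbf{x}_i)$$
We operate on the marginal probabilities of n observations
$$\log L(\boldsymbol{\beta}|\mathbf{X}, \mathbf{y}) = \sum_{i=1}^{n} \log F[(2y_i - 1)\boldsymbol{\beta}'\mathbf{x}_i]$$
Probit: $F(t) = \Phi(t) = \int_{-\infty}^{t} \frac{1}{\sqrt{2\pi}}\exp(-s^2/2)\,ds = \int_{-\infty}^{t} \phi(s)\,ds$
Logit: $F(t) = \Lambda(t) = \dfrac{\exp(t)}{1 + \exp(t)}$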
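The cross-section log likelihood, log L = sum of log F[(2y_i - 1) beta'x_i], is short enough to write out directly. The following is a minimal sketch using only the Python standard library; the data and the function names are hypothetical and serve only to show the formula with either the probit or the logit F.

```python
import math


def probit_cdf(t):
    """Standard normal CDF, Phi(t), via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def logit_cdf(t):
    """Logistic CDF, Lambda(t) = exp(t) / (1 + exp(t))."""
    return 1.0 / (1.0 + math.exp(-t))


def log_likelihood(beta, X, y, cdf):
    """Sum over observations of log F[(2*y_i - 1) * beta'x_i]."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        index = sum(b * x for b, x in zip(beta, x_i))
        total += math.log(cdf((2 * y_i - 1) * index))
    return total


# Hypothetical data: a constant and one covariate, four observations.
X = [[1.0, 0.2], [1.0, -1.0], [1.0, 1.5], [1.0, 0.0]]
y = [1, 0, 1, 0]
```

Because each observation contributes its own term, this form relies on independence across observations, which is exactly what spatial correlation removes.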

[Part 14] 23/103

How to Induce Correlation

Joint distribution of multiple observations

Correlation of unobserved heterogeneity

Correlation of latent utility

[Part 14] 24/103

Bivariate Counts

Intervening variable approach

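$Y_1 = X_1 + Z$, $Y_2 = X_2 + Z$; all 3 Poisson distributed
Only allows positive correlation.
Limited to two outcomes
Bivariate conditional means
$\lambda_1 = \exp(\boldsymbol{\beta}'\mathbf{x}_1 + \varepsilon_1)$, $\lambda_2 = \exp(\boldsymbol{\beta}'\mathbf{x}_2 + \varepsilon_2)$, $\mathrm{Cor}(\varepsilon_1, \varepsilon_2) = \rho$
$|\mathrm{Cor}(y_1, y_2)| \ll |\rho|$ (due to residual variation)
Copula functions – Useful for bivariate. Less so if > 2.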
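A short derivation shows why the intervening variable approach only allows positive correlation. It assumes that $X_1$, $X_2$ and $Z$ are mutually independent Poisson variables with means $\theta_1$, $\theta_2$ and $\theta_Z$; these symbols are introduced here for illustration only.

$$\mathrm{Cov}(Y_1, Y_2) = \mathrm{Cov}(X_1 + Z,\; X_2 + Z) = \mathrm{Var}(Z) = \theta_Z > 0$$

$$\mathrm{Cor}(Y_1, Y_2) = \frac{\theta_Z}{\sqrt{(\theta_1 + \theta_Z)(\theta_2 + \theta_Z)}} \in (0, 1)$$

The shared component $Z$ is the only source of covariance, so the correlation cannot be negative under these assumptions.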

[Part 14] 25/103

Spatially Correlated Observations

Correlation Based on Unobservables

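$$y_i = \boldsymbol{\beta}'\mathbf{x}_i + \varepsilon_i + u_i, \quad i = 1, \dots, n$$
$(u_1, u_2, \dots, u_n)'$ has a joint distribution $f$ with mean $\mathbf{0}$ and a covariance matrix determined by $\mathbf{W}$.
In the cross section case, the covariance matrix is $\mathbf{I}$.
$\mathbf{W}$ = the usual spatial weight matrix. With $\mathbf{W} \ne \mathbf{0}$, it is a full matrix.
The joint probability is a single n-fold integral.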

[Part 14] 26/103

Spatially Correlated Observations

Correlated Utilities

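$$\begin{bmatrix} y_1^* \\ y_2^* \\ \vdots \\ y_n^* \end{bmatrix} = \rho\mathbf{W}\begin{bmatrix} y_1^* \\ y_2^* \\ \vdots \\ y_n^* \end{bmatrix} + \begin{bmatrix} \boldsymbol{\beta}'\mathbf{x}_1 + \varepsilon_1 \\ \boldsymbol{\beta}'\mathbf{x}_2 + \varepsilon_2 \\ \vdots \\ \boldsymbol{\beta}'\mathbf{x}_n + \varepsilon_n \end{bmatrix} = (\mathbf{I} - \rho\mathbf{W})^{-1}\begin{bmatrix} \boldsymbol{\beta}'\mathbf{x}_1 + \varepsilon_1 \\ \boldsymbol{\beta}'\mathbf{x}_2 + \varepsilon_2 \\ \vdots \\ \boldsymbol{\beta}'\mathbf{x}_n + \varepsilon_n \end{bmatrix}$$
$\mathbf{W}$ = the usual spatial weight matrix.
In the cross section case, $\mathbf{W} = \mathbf{0}$. Now, it is a full matrix. The joint probability is a single n-fold integral.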

[Part 14] 27/103

Log Likelihood

In the unrestricted spatial case, the log likelihood is one term,
LogL = log Prob(y1|x1, y2|x2, … ,yn|xn)
In the discrete choice case, the probability will be an n-fold integral, usually for a normal distribution.

[Part 14] 28/103

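A Theoretical Behavioral Conflict
$$\begin{bmatrix} y_1^* \\ y_2^* \\ \vdots \\ y_n^* \end{bmatrix} = \rho\mathbf{W}\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} + \begin{bmatrix} \boldsymbol{\beta}'\mathbf{x}_1 + \varepsilon_1 \\ \boldsymbol{\beta}'\mathbf{x}_2 + \varepsilon_2 \\ \vdots \\ \boldsymbol{\beta}'\mathbf{x}_n + \varepsilon_n \end{bmatrix}$$
$$y_i = 1[y_i^* > 0]$$
$$y_1 = 1[\boldsymbol{\beta}'\mathbf{x}_1 + \rho(w_{12}y_2 + w_{13}y_3 + \dots) + \varepsilon_1 > 0]$$
$$y_2 = 1[\boldsymbol{\beta}'\mathbf{x}_2 + \rho(w_{21}y_1 + w_{23}y_3 + \dots) + \varepsilon_2 > 0] \quad \text{etc.}$$
The model based on observables is more reasonable.
There is no reduced form unless $\mathbf{W}$ is lower triangular.
This model is not identified. (It is "incoherent.")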
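The incoherence can be made concrete with two observations. The sketch below is a hypothetical illustration, not taken from the slides: for one draw of the indices beta'x_i + eps_i, it enumerates every outcome vector y and keeps those that satisfy y_i = 1[beta'x_i + rho * sum_j w_ij y_j + eps_i > 0] for all i. The weight matrices, the value of rho, and the function name are assumptions made for the example.

```python
from itertools import product


def consistent_outcomes(base_index, weights, rho):
    """Return every y in {0,1}^n that satisfies all n outcome equations.

    base_index[i] stands for beta'x_i + eps_i for one draw of the shocks.
    """
    n = len(base_index)
    solutions = []
    for y in product((0, 1), repeat=n):
        coherent = True
        for i in range(n):
            index = base_index[i] + rho * sum(weights[i][j] * y[j] for j in range(n))
            if (1 if index > 0 else 0) != y[i]:
                coherent = False
                break
        if coherent:
            solutions.append(y)
    return solutions


BASE = [-0.5, -0.5]               # hypothetical values of beta'x_i + eps_i
FULL_W = [[0, 1], [1, 0]]         # each unit depends on the other
TRIANGULAR_W = [[0, 0], [1, 0]]   # lower triangular: unit 2 depends on unit 1 only

MULTIPLE = consistent_outcomes(BASE, FULL_W, 1.0)        # two solutions
UNIQUE = consistent_outcomes(BASE, TRIANGULAR_W, 1.0)    # one solution
```

With the full matrix, both (0, 0) and (1, 1) satisfy the equations for the same shocks, so the mapping from shocks to outcomes is not a function. With the lower triangular matrix the outcomes can be solved recursively and the solution is unique.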

[Part 14] 29/103

See Maddala (1983)

From Klier and

McMillen (2012)

[Part 14] 30/103

LogL for an Unrestricted BC Model

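$$\log L(\boldsymbol{\beta}, \rho \,|\, \mathbf{X}, \mathbf{y}) = \log \int_{-\infty}^{q_1\boldsymbol{\beta}'\mathbf{x}_1} \cdots \int_{-\infty}^{q_n\boldsymbol{\beta}'\mathbf{x}_n} \phi_n\!\left(\boldsymbol{\varepsilon} \,\Bigg|\, \begin{bmatrix} 1 & q_1 q_2 w_{12} & \cdots & q_1 q_n w_{1n} \\ q_2 q_1 w_{21} & 1 & \cdots & q_2 q_n w_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ q_n q_1 w_{n1} & q_n q_2 w_{n2} & \cdots & 1 \end{bmatrix}\right) d\boldsymbol{\varepsilon}$$
$q_i = -1$ if $y_i = 0$ and $+1$ if $y_i = 1$, that is, $q_i = 2y_i - 1$
One huge observation - n dimensional normal integral.
Not feasible for any reasonable sample size.
Even if computable, provides no device for estimating sampling standard errors.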

[Part 14] 31/103

Solution Approaches for Binary Choice

Distinguish between private and social shocks and use pseudo-ML
Approximate the joint density and use GMM with the EM algorithm
Parameterize the spatial correlation and use copula methods
Define neighborhoods – make W a sparse matrix and use pseudo-ML
Others …

[Part 14] 32/103

[Part 14] 33/103

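Spatial autocorrelation in the heterogeneity
$$y_i^* = \boldsymbol{\beta}'\mathbf{x}_i + \varepsilon_i + \rho\sum_{j \ne i} w_{ij}\varepsilon_j$$
$$y_i = 1[y_i^* > 0], \quad \mathrm{Prob}[y_i = 1] = \Phi\!\left(\frac{\boldsymbol{\beta}'\mathbf{x}_i}{\sqrt{\mathrm{Var}\left[\varepsilon_i + \rho\sum_{j \ne i} w_{ij}\varepsilon_j\right]}}\right)$$
or
$$y_i^* = \boldsymbol{\beta}'\mathbf{x}_i + u_i$$
$$y_i = 1[y_i^* > 0], \quad \mathrm{Prob}[y_i = 1] = \Phi\!\left(\frac{\boldsymbol{\beta}'\mathbf{x}_i}{\sqrt{\mathrm{Var}[u_i]}}\right) = \Phi\!\left(\frac{\boldsymbol{\beta}'\mathbf{x}_i}{\sqrt{1 + \rho^2\sum_{j \ne i} w_{ij}^2}}\right)$$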
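The marginal probability above is an ordinary probit probability with an observation-specific scale. A minimal sketch, assuming unit-variance independent shocks so that the variance is 1 + rho^2 * sum of squared weights; the coefficient values, the weight row, and the function names are hypothetical.

```python
import math


def normal_cdf(t):
    """Standard normal CDF, Phi(t)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def choice_probability(beta, x_i, w_row, i, rho):
    """Prob(y_i = 1) = Phi(beta'x_i / sqrt(1 + rho^2 * sum_{j != i} w_ij^2))."""
    index = sum(b * x for b, x in zip(beta, x_i))
    variance = 1.0 + rho ** 2 * sum(w ** 2 for j, w in enumerate(w_row) if j != i)
    return normal_cdf(index / math.sqrt(variance))


# Hypothetical inputs: unit 0 has two neighbors with equal weights.
BETA = [0.5, 1.0]
X_I = [1.0, 0.3]
W_ROW = [0.0, 0.5, 0.5]

P_INDEPENDENT = choice_probability(BETA, X_I, W_ROW, 0, rho=0.0)
P_SPATIAL = choice_probability(BETA, X_I, W_ROW, 0, rho=0.8)
```

With rho = 0 the expression collapses to the usual probit. A nonzero rho inflates the variance and pulls the probability toward 0.5, which is the heteroscedasticity exploited by the estimators that follow.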

[Part 14] 34/103

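Heteroscedastic Probit
Estimation and Inference
$$\text{MLE: } \log L = \sum_{i=1}^{n} \log \Phi\!\left[\frac{(2y_i - 1)\boldsymbol{\beta}'\mathbf{x}_i}{\sigma_i}\right]$$
$$\sqrt{n}(\hat{\boldsymbol{\theta}}_{MLE} - \boldsymbol{\theta}) \approx \left[-\mathbf{H}/n\right]^{-1}\mathbf{S}/\sqrt{n}, \quad \mathbf{H} = \text{Hessian}, \quad \mathbf{S} = \text{score vector}$$
implies the algorithm, Newton's Method.
EM algorithm essentially replaces $\mathbf{H}$ with $\mathbf{X}'\mathbf{X}$ during iterations.
(Slightly more involved for the heteroscedasticity. LHS variable in the EM iterations is the score vector.)
To compute the asymptotic covariance, we need $\mathrm{Var}[\mathbf{S}(\hat{\boldsymbol{\theta}}_{MLE})]$
Observations are (spatially) correlated! How to compute it?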

[Part 14] 35/103

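GMM
Pinkse, J. and Slade, M. (1998) “Contracting in Space: An Application of Spatial Statistics to Discrete Choice Models,” Journal of Econometrics, 85, 1, 125-154.
Pinkse, J., Slade, M. and Shen, L. (2006) “Dynamic Spatial Discrete Choice Using One Step GMM: An Application to Mine Operating Decisions,” Spatial Economic Analysis, 1:1, 53-99.
$$\mathbf{y}^* = \mathbf{X}\boldsymbol{\theta} + \boldsymbol{\varepsilon}, \quad \boldsymbol{\varepsilon} = \rho\mathbf{W}\boldsymbol{\varepsilon} + \mathbf{u}$$
$$\boldsymbol{\varepsilon} = [\mathbf{I} - \rho\mathbf{W}]^{-1}\mathbf{u} = \mathbf{A}\mathbf{u}$$
Cross section case: $\rho = 0$
Probit Model: FOC for estimation of $\boldsymbol{\theta}$ is based on the generalized residuals $\hat{u}_i$
$$\hat{u}_i = E[\varepsilon_i \,|\, y_i] = \frac{(y_i - \Phi(\mathbf{x}_i'\boldsymbol{\theta}))\,\phi(\mathbf{x}_i'\boldsymbol{\theta})}{\Phi(\mathbf{x}_i'\boldsymbol{\theta})[1 - \Phi(\mathbf{x}_i'\boldsymbol{\theta})]}$$
$$\sum_{i=1}^{n} \mathbf{x}_i\hat{u}_i = \mathbf{0}$$
Spatially autocorrelated case: Moment equations are still valid. Complication is computing the variance of the moment equations, which requires some approximations.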

[Part 14] 36/103

GMM Approach

Spatial autocorrelation induces heteroscedasticity that is a function of $\rho$
Moment equations include the heteroscedasticity and an additional instrumental variable for identifying $\rho$.
LM test of $\rho = 0$ is carried out under the null hypothesis that $\rho = 0$.
Application: Contract type in pricing for 118 Vancouver service stations.

[Part 14] 37/103

GMM

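$$\mathbf{y}^* = \mathbf{X}\boldsymbol{\theta} + \boldsymbol{\varepsilon}, \quad \boldsymbol{\varepsilon} = \rho\mathbf{W}\boldsymbol{\varepsilon} + \mathbf{u}$$
$$\boldsymbol{\varepsilon} = [\mathbf{I} - \rho\mathbf{W}]^{-1}\mathbf{u} = \mathbf{A}\mathbf{u}$$
Autocorrelated Case: $\rho \ne 0$
Probit Model: FOC for estimation of $\boldsymbol{\theta}$ is based on the generalized residuals $\hat{u}_i = E[\varepsilon_i \,|\, y_i]$. With $a_{ii}(\rho) = \mathrm{Var}[y_i^*]$,
$$\sum_{i=1}^{n} \mathbf{z}_i \, \frac{\left(y_i - \Phi\!\left(\dfrac{\mathbf{x}_i'\boldsymbol{\theta}}{\sqrt{a_{ii}(\rho)}}\right)\right)\phi\!\left(\dfrac{\mathbf{x}_i'\boldsymbol{\theta}}{\sqrt{a_{ii}(\rho)}}\right)}{\Phi\!\left(\dfrac{\mathbf{x}_i'\boldsymbol{\theta}}{\sqrt{a_{ii}(\rho)}}\right)\left[1 - \Phi\!\left(\dfrac{\mathbf{x}_i'\boldsymbol{\theta}}{\sqrt{a_{ii}(\rho)}}\right)\right]} = \mathbf{0}$$
Requires at least K + 1 instrumental variables.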
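The generalized residual and the sample moment conditions can be sketched in a few lines. This is only an illustration of the probit form (y - Phi) phi / (Phi (1 - Phi)) with an observation-specific scale; the function names, the scale argument (standing in for the square root of the latent variance), and the instrument matrix are assumptions, not the authors' implementation.

```python
import math


def normal_pdf(t):
    """Standard normal density, phi(t)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def normal_cdf(t):
    """Standard normal CDF, Phi(t)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def generalized_residual(y_i, index, scale=1.0):
    """(y - Phi(t)) * phi(t) / (Phi(t) * (1 - Phi(t))) with t = index / scale."""
    t = index / scale
    cdf = normal_cdf(t)
    return (y_i - cdf) * normal_pdf(t) / (cdf * (1.0 - cdf))


def sample_moments(y, indices, scales, instruments):
    """Average of z_i * u_i over observations, one entry per instrument."""
    residuals = [generalized_residual(a, b, c) for a, b, c in zip(y, indices, scales)]
    n = len(y)
    columns = len(instruments[0])
    return [sum(instruments[i][k] * residuals[i] for i in range(n)) / n
            for k in range(columns)]
```

A GMM estimator would choose the parameters to make these averages as close to zero as possible; the difficulty flagged on the slides is the variance of the moments when observations are spatially correlated.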

[Part 14] 38/103

Extension to Dynamic Choice Model

Pinkse, Slade, Shen (2006)

[Part 14] 39/103

[Part 14] 40/103

[Part 14] 41/103

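Spatial Logit Model
Iterated 2SLS (GMM)
$$\mathbf{y}^* = \mathbf{X}\boldsymbol{\beta} + \mathbf{e}, \quad \mathbf{e} = \rho\mathbf{W}\mathbf{e} + \boldsymbol{\varepsilon} = (\mathbf{I} - \rho\mathbf{W})^{-1}\boldsymbol{\varepsilon}$$
$$\mathbf{d} = 1[\mathbf{y}^* > \mathbf{0}], \quad \mathrm{Var}[\mathbf{e}] = \sigma^2[(\mathbf{I} - \rho\mathbf{W})'(\mathbf{I} - \rho\mathbf{W})]^{-1}, \text{ with diagonal elements } \sigma_i^2$$
$$\mathrm{Prob}(d_i = 1) = \Lambda_i = \Lambda(\boldsymbol{\beta}'\mathbf{x}_i^*), \quad \mathbf{x}_i^* = \mathbf{x}_i/\sigma_i$$
Generalized residual $u_i = d_i - \Lambda_i$
Instruments $\mathbf{Z}$
Criterion: $q = \mathbf{u}(\boldsymbol{\beta}, \rho)'\mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{u}(\boldsymbol{\beta}, \rho)$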

[Part 14] 42/103

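Algorithm
Iterated 2SLS (GMM)
$u_i = d_i - \Lambda_i$
$\mathbf{g}_i$ = derivatives of $\Lambda_i$ with respect to $\boldsymbol{\theta} = (\boldsymbol{\beta}, \rho)$:
$$\frac{\partial \Lambda_i}{\partial \boldsymbol{\beta}} = \Lambda_i(1 - \Lambda_i)\mathbf{x}_i^*$$
The $\rho$ derivative also carries the factor $\Lambda_i(1 - \Lambda_i)$ and depends on $A_{ii}/\sigma_i^2$, where $\mathbf{A} = (\mathbf{I} - \rho\mathbf{W})^{-1}\mathbf{W}(\mathbf{I} - \rho\mathbf{W})^{-1}$.
$\mathbf{G} = [\mathbf{g}_1, \mathbf{g}_2, \dots, \mathbf{g}_n]'$
1. Logit estimation of $\boldsymbol{\beta} \,|\, \rho = 0$ gives $\hat{\boldsymbol{\theta}}_0$.
2. $\hat{\mathbf{u}}_k = \mathbf{d} - \boldsymbol{\Lambda}(\hat{\boldsymbol{\theta}}_k)$, $\quad \hat{\mathbf{G}}_k = \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{G}_k$
3. $\hat{\boldsymbol{\theta}}_{k+1} = \hat{\boldsymbol{\theta}}_k + (\hat{\mathbf{G}}_k'\hat{\mathbf{G}}_k)^{-1}\hat{\mathbf{G}}_k'\hat{\mathbf{u}}_k$
4. Iterate until $\hat{\boldsymbol{\theta}}_{k+1} - \hat{\boldsymbol{\theta}}_k$ is sufficiently small.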

[Part 14] 43/103

LM Test?

• If $\rho$ = 0, $g_\rho$ = 0 because $A_{ii}$ = 0
• At the initial logit values, $g_\beta$ = 0
• Thus, if $\rho$ = 0, g = 0
• How to test $\rho$ = 0 using an LM style test?
• Same problem shows up in RE models
• But, here, $\rho$ is in the interior of the parameter space!

[Part 14] 44/103

Pseudo Maximum Likelihood

Maximize a likelihood function that

approximates the true one

Produces consistent estimators of

parameters

How to obtain standard errors?

Asymptotic normality? Conditions for

CLT are more difficult to establish.

[Part 14] 45/103

Pseudo MLE

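$$\mathbf{y}^* = \mathbf{X}\boldsymbol{\theta} + \boldsymbol{\varepsilon}, \quad \boldsymbol{\varepsilon} = \rho\mathbf{W}\boldsymbol{\varepsilon} + \mathbf{u}$$
$$\boldsymbol{\varepsilon} = [\mathbf{I} - \rho\mathbf{W}]^{-1}\mathbf{u} = \mathbf{A}\mathbf{u}$$
Autocorrelated Case: $\rho \ne 0$
$$y_i^* = \boldsymbol{\theta}'\mathbf{x}_i + \varepsilon_i + \rho\sum_{j \ne i} W_{ij}\varepsilon_j$$
$$y_i = 1[y_i^* > 0]. \quad \mathrm{Var}[y_i^*] = 1 + \rho^2\sum_{j \ne i} W_{ij}^2 = a_{ii}(\rho)$$
Implies a heteroscedastic probit.
Pseudo MLE is based on the marginal densities.
How to obtain the asymptotic covariance matrix?
[See Wang, Iglesias, Wooldridge (2013)]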

[Part 14] 46/103

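Heteroscedastic Probit Approach
Estimation and Inference
$$\text{MLE: } \log L = \sum_{i=1}^{n} \log \Phi\!\left[\frac{(2y_i - 1)\boldsymbol{\beta}'\mathbf{x}_i}{\sigma_i}\right]$$
$$\sqrt{n}(\hat{\boldsymbol{\theta}}_{MLE} - \boldsymbol{\theta}) \approx \left[-\mathbf{H}/n\right]^{-1}\mathbf{S}/\sqrt{n}, \quad \mathbf{H} = \text{Hessian}, \quad \mathbf{S} = \text{score vector}$$
implies the algorithm, Newton's Method.
EM algorithm essentially replaces $\mathbf{H}$ with $\mathbf{X}'\mathbf{X}$ during iterations.
(Slightly more involved for the heteroscedasticity. LHS variable in the EM iterations is the score vector.)
To compute the asymptotic covariance, we need $\mathrm{Var}[\mathbf{S}(\hat{\boldsymbol{\theta}}_{MLE})]$
Observations are (spatially) correlated! How to compute it?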

[Part 14] 47/103

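Covariance Matrix for Pseudo MLE
$$\mathbf{V} = \hat{\mathbf{A}}(\text{data}, \hat{\boldsymbol{\theta}})\,\hat{\mathbf{B}}(\text{data}, \hat{\boldsymbol{\theta}})\,\hat{\mathbf{A}}(\text{data}, \hat{\boldsymbol{\theta}})$$
$\hat{\mathbf{A}}(\text{data}, \hat{\boldsymbol{\theta}})$ = negative inverse of Hessian
$\hat{\mathbf{B}}(\text{data}, \hat{\boldsymbol{\theta}})$ = covariance matrix of scores.
How to compute $\hat{\mathbf{B}}(\text{data}, \hat{\boldsymbol{\theta}})$?
Terms are not independent in a spatial setting.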

[Part 14] 48/103

‘Pseudo’ Maximum Likelihood

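Smirnov, A., “Modeling Spatial Discrete Choice,” Regional Science and Urban Economics, 40, 2010.
Spatial Autoregression in Utilities
$$\mathbf{y}^* = \rho\mathbf{W}\mathbf{y}^* + \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}, \quad \mathbf{y} = 1(\mathbf{y}^* > \mathbf{0}) \text{ for all } n \text{ individuals}$$
$$\mathbf{y}^* = (\mathbf{I} - \rho\mathbf{W})^{-1}\mathbf{X}\boldsymbol{\beta} + (\mathbf{I} - \rho\mathbf{W})^{-1}\boldsymbol{\varepsilon}$$
$$(\mathbf{I} - \rho\mathbf{W})^{-1} = \sum_{t=0}^{\infty}(\rho\mathbf{W})^t \quad \text{(assumed convergent)}$$
$$= \mathbf{A} = \mathbf{D} + (\mathbf{A} - \mathbf{D}), \quad \text{where } \mathbf{D} = \text{diagonal elements}$$
$$\mathbf{y}^* = \mathbf{A}\mathbf{X}\boldsymbol{\beta} + \underbrace{\mathbf{D}\boldsymbol{\varepsilon}}_{\text{Private}} + \underbrace{(\mathbf{A} - \mathbf{D})\boldsymbol{\varepsilon}}_{\text{Social}}$$
Suppose individuals ignore the social "shocks." Then
$$\mathrm{Prob}[y_i = 1 \text{ or } 0 \,|\, \mathbf{X}] = F\!\left[(2y_i - 1)\frac{\sum_{j=1}^{n} a_{ij}\boldsymbol{\beta}'\mathbf{x}_j}{d_i}\right], \quad \text{probit or logit.}$$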
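The series expansion and the private/social split can be sketched numerically. The example below is a hypothetical illustration: it approximates A = (I - rho W)^{-1} by a truncated sum of powers of rho W, separates the diagonal D from the off-diagonal part, and forms the scaled index that enters F. The weight matrix, rho, the number of terms, and the function names are assumptions.

```python
def mat_mul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def neumann_inverse(weights, rho, terms=200):
    """Approximate (I - rho W)^{-1} by I + rho W + (rho W)^2 + ..."""
    n = len(weights)
    rho_w = [[rho * w for w in row] for row in weights]
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in total]
    for _ in range(terms):
        power = mat_mul(power, rho_w)
        total = [[t + p for t, p in zip(t_row, p_row)]
                 for t_row, p_row in zip(total, power)]
    return total


def split_private_social(a):
    """Return the diagonal of A (private) and A with a zeroed diagonal (social)."""
    n = len(a)
    private = [a[i][i] for i in range(n)]
    social = [[0.0 if i == j else a[i][j] for j in range(n)] for i in range(n)]
    return private, social


def scaled_index(a, private, xb):
    """sum_j a_ij * (beta'x_j) / d_i, the argument of F before the sign flip."""
    return [sum(a[i][j] * xb[j] for j in range(len(xb))) / private[i]
            for i in range(len(xb))]


# Hypothetical row-standardized weights for four regions on a line.
W = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]
RHO = 0.4
A = neumann_inverse(W, RHO)
D, SOCIAL = split_private_social(A)
```

Since A changes with rho, the series (or an inverse) has to be recomputed at every trial value of rho during optimization, which is the computational cost noted on the next slide.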

[Part 14] 49/103

Pseudo Maximum Likelihood

Bases correlation on underlying utilities
Assumes away the correlation in the reduced form
Makes a behavioral assumption
Requires inversion of $(\mathbf{I} - \rho\mathbf{W})$
Computation of $(\mathbf{I} - \rho\mathbf{W})^{-1}$ is part of the optimization process - $\rho$ is estimated with $\boldsymbol{\beta}$.
Does not require multidimensional integration (for a logit model, requires no integration)

[Part 14] 50/103

Copula Method and Parameterization

Bhat, C. and Sener, I. (2009) “A copula-based closed-form binary logit choice model for accommodating spatial correlation across observational units,” Journal of Geographical Systems, 11, 243–272
Basic Logit Model
$$y_i^* = \boldsymbol{\beta}'\mathbf{x}_i + \varepsilon_i, \quad y_i = 1[y_i^* > 0] \text{ (as usual)}$$
Rather than specify a spatial weight matrix, we assume $[\varepsilon_1, \varepsilon_2, \dots, \varepsilon_n]$ have an n-variate distribution.
Sklar's Theorem represents the joint distribution in terms of the continuous marginal distributions, $\Lambda(\varepsilon_i)$, and a copula function
$$C[u_1 = \Lambda(\varepsilon_1), u_2 = \Lambda(\varepsilon_2), \dots, u_n = \Lambda(\varepsilon_n) \,|\, \theta]$$

[Part 14] 51/103

Copula Representation

[Part 14] 52/103

Model

[Part 14] 53/103

Likelihood

[Part 14] 54/103

Parameterization

[Part 14] 55/103

[Part 14] 56/103

[Part 14] 57/103

Other Approaches

Case (1992): Define “regions” or neighborhoods. No correlation across regions. Produces essentially a panel data probit model. (Wang et al. (2013))
Beron and Vijverberg (2004): Brute force integration using GHK simulator in a probit model.
LeSage: Bayesian - MCMC
Others. See Bhat and Sener (2009).
Case A (1992) Neighborhood influence and technological change. Regional Science and Urban Economics 22:491–508
Beron KJ, Vijverberg WPM (2004) Probit in a spatial context: a Monte Carlo analysis. In: Anselin L, Florax RJGM, Rey SJ (eds) Advances in spatial econometrics: methodology, tools and applications. Springer, Berlin

[Part 14] 58/103

[Part 14] 59/103

See also Arbia, G., “Pairwise Likelihood Inference for Spatial Regressions

Estimated on Very Large Data Sets” Manuscript, Catholic University del Sacro

Cuore, Rome, 2012.

[Part 14] 60/103

Partial MLE

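Observation 1
$$y_1^* = \boldsymbol{\theta}'\mathbf{x}_1 + \varepsilon_1 + \rho\sum_{j \ne 1} W_{1j}\varepsilon_j$$
$$y_1 = 1[y_1^* > 0], \quad \mathrm{Var}[y_1^*] = 1 + \rho^2\sum_{j \ne 1} W_{1j}^2 = a_{11}(\rho)$$
Observation 2
$$y_2^* = \boldsymbol{\theta}'\mathbf{x}_2 + \varepsilon_2 + \rho\sum_{j \ne 2} W_{2j}\varepsilon_j$$
$$y_2 = 1[y_2^* > 0], \quad \mathrm{Var}[y_2^*] = 1 + \rho^2\sum_{j \ne 2} W_{2j}^2 = a_{22}(\rho)$$
Covariance of $y_1^*$ and $y_2^*$ = $a_{12}(\rho)$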

[Part 14] 61/103

Bivariate Probit

Pseudo MLE

Consistent

Asymptotically normal?

Resembles time series case

Correlation need not fade with ‘distance’

Better than Pinkse/Slade Univariate Probit?

How to choose the pairings?

[Part 14] 62/103

[Part 14] 63/103

LeSage methods - MCMC
• SEM Model…
• Bayesian MCMC
• Data augmentation for unobserved y*
• Quirks about sampler for rho.

[Part 14] 64/103

Ordered Probability Model

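$y^* = \boldsymbol{\beta}'\mathbf{x} + \varepsilon$, we assume $\mathbf{x}$ contains a constant term
$y = 0$ if $y^* \le 0$
$y = 1$ if $0 < y^* \le \mu_1$
$y = 2$ if $\mu_1 < y^* \le \mu_2$
$y = 3$ if $\mu_2 < y^* \le \mu_3$
...
$y = J$ if $\mu_{J-1} < y^* \le \mu_J$
In general: $y = j$ if $\mu_{j-1} < y^* \le \mu_j$, $j = 0, 1, \dots, J$
$\mu_{-1} = -\infty$, $\mu_0 = 0$, $\mu_J = +\infty$, $\mu_{j-1} < \mu_j$, $j = 1, \dots, J$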
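The cell probabilities implied by these thresholds are differences of CDF values. A minimal sketch for the ordered probit, assuming a standard normal disturbance and the normalization mu_0 = 0; the threshold values and the function names are hypothetical.

```python
import math


def normal_cdf(t):
    """Standard normal CDF, Phi(t)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def ordered_probit_probabilities(index, thresholds):
    """Return [Prob(y=0), ..., Prob(y=J)] for index = beta'x.

    thresholds holds mu_1 < ... < mu_{J-1}; mu_0 = 0 is imposed, and the
    outer limits are minus and plus infinity.
    """
    cuts = [0.0] + list(thresholds)
    cdf_values = [normal_cdf(c - index) for c in cuts] + [1.0]
    probabilities = [cdf_values[0]]
    for lower, upper in zip(cdf_values, cdf_values[1:]):
        probabilities.append(upper - lower)
    return probabilities


# Hypothetical case with four outcomes (J = 3): mu_1 = 1.0, mu_2 = 2.0.
PROBS = ordered_probit_probabilities(0.0, [1.0, 2.0])
```

The probabilities sum to one by construction, and the log likelihood is the sum of the logs of the probabilities of the observed cells.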

[Part 14] 65/103

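A Spatial Ordered Choice Model
Wang, C. and Kockelman, K. (2009) Bayesian Inference for Ordered Response Data with a Dynamic Spatial Ordered Probit Model, Working Paper, Department of Civil and Environmental Engineering, Bucknell University.
Core Model: Cross Section
$$y_i^* = \boldsymbol{\beta}'\mathbf{x}_i + \varepsilon_i, \quad y_i = j \text{ if } \mu_{j-1} < y_i^* \le \mu_j, \quad \mathrm{Var}[\varepsilon_i] = 1$$
Spatial Formulation: There are R regions. Within a region
$$y_{ir}^* = \boldsymbol{\beta}'\mathbf{x}_{ir} + u_i + \varepsilon_{ir}, \quad y_{ir} = j \text{ if } \mu_{j-1} < y_{ir}^* \le \mu_j$$
Spatial heteroscedasticity: $\mathrm{Var}[\varepsilon_{ir}] = \sigma_r^2$
Spatial Autocorrelation Across Regions
$$\mathbf{u} = \rho\mathbf{W}\mathbf{u} + \mathbf{v}, \quad \mathbf{v} \sim N[\mathbf{0}, \sigma_v^2\mathbf{I}]$$
$$\mathbf{u} = (\mathbf{I} - \rho\mathbf{W})^{-1}\mathbf{v} \sim N[\mathbf{0}, \sigma_v^2\{(\mathbf{I} - \rho\mathbf{W})'(\mathbf{I} - \rho\mathbf{W})\}^{-1}]$$
The error distribution depends on 2 parameters, $\rho$ and $\sigma_v^2$
Estimation Approach: Gibbs Sampling; Markov Chain Monte Carlo
Dynamics in latent utilities added as a final step: y*(t) = f[y*(t-1)].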

[Part 14] 66/103


[Part 14] 67/103

OCM for Land Use Intensity

[Part 14] 68/103

OCM for Land Use Intensity

[Part 14] 69/103

Estimated Dynamic OCM

[Part 14] 70/103

[Part 14] 71/103

[Part 14] 72/103

Data Augmentation

[Part 14] 73/103

Unordered Multinomial Choice

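Core Random Utility Model
Underlying Random Utility for Each Alternative
$$U(i,j) = \boldsymbol{\beta}_j'\mathbf{x}_{ij} + \varepsilon_{ij}, \quad i = \text{individual}, \; j = \text{alternative}$$
Preference Revelation
$$Y(i) = j \text{ if and only if } U(i,j) > U(i,k) \text{ for all } k \ne j$$
Model Frameworks
Multinomial Probit: $[\varepsilon_1, \dots, \varepsilon_J] \sim N[\mathbf{0}, \boldsymbol{\Sigma}]$
Multinomial Logit: $[\varepsilon_1, \dots, \varepsilon_J] \sim$ iid type 1 extreme value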

[Part 14] 74/103

Spatial Multinomial Probit

Chakir, R. and Parent, O. (2009) “Determinants of land use changes: A spatial multinomial probit approach,” Papers in Regional Science, 88, 2, 328-346.
Utility Functions, land parcel i, usage type j, date t
$$U(i,j,t) = \boldsymbol{\beta}_{jt}'\mathbf{x}_{ijt} + \theta_{ij} + \varepsilon_{ijt}$$
Spatial Correlation at Time t: the parcel effect $\theta_{ij}$ depends on the spatially weighted sum $\sum_{l=1}^{n} w_{il}\theta_{lj}$
Modeling Framework: Normal / Multinomial Probit
Estimation: MCMC - Gibbs Sampling

[Part 14] 75/103

[Part 14] 76/103

[Part 14] 77/103

[Part 14] 78/103

Random Parameters Models

[Part 14] 79/103

[Part 14] 80/103

Canonical Model

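Poisson Regression
$y = 0, 1, \dots$
$$\mathrm{Prob}[y = j \,|\, \mathbf{x}] = \frac{\exp(-\lambda)\lambda^j}{j!}$$
Conditional Mean = $\lambda = \exp(\boldsymbol{\beta}'\mathbf{x})$
Signature Feature: Equidispersion
Usual Alternative: Various forms of Negative Binomial
Spatial Effect: Filtered through the mean
$$\lambda_i = \exp(\boldsymbol{\beta}'\mathbf{x}_i + \varepsilon_i)$$
$\varepsilon_i$ depends on the spatially weighted sum $\rho\sum_{m=1}^{n} w_{im}\varepsilon_m$
Rathbun, S. and Fei, L. (2006) “A Spatial Zero-Inflated Poisson Regression Model for Oak Regeneration,” Environmental and Ecological Statistics, 13, 409-426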

[Part 14] 81/103


[Part 14] 82/103

Zero Inflation

There are two states

Always zero

Zero is one possible value, or 1,2,…

Prob(0) = Prob(state 1) +

Prob(state 2) P(0|state 2)
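The two-state probability can be written out as a short function. This is a minimal sketch of a zero-inflated Poisson probability, assuming state 2 is Poisson with mean lam and state 1 ("always zero") has probability p_always_zero; both names and the numeric values are hypothetical.

```python
import math


def poisson_pmf(j, lam):
    """Prob[y = j] = exp(-lam) * lam^j / j! for the Poisson state."""
    return math.exp(-lam) * lam ** j / math.factorial(j)


def zip_pmf(j, lam, p_always_zero):
    """Zero-inflated Poisson: mix an 'always zero' state with a Poisson state.

    Prob(0) = Prob(state 1) + Prob(state 2) * P(0 | state 2)
    Prob(j) = Prob(state 2) * P(j | state 2) for j = 1, 2, ...
    """
    from_poisson = (1.0 - p_always_zero) * poisson_pmf(j, lam)
    return p_always_zero + from_poisson if j == 0 else from_poisson


# Hypothetical values: Poisson mean 2.0, 30 percent of units always zero.
P_ZERO = zip_pmf(0, 2.0, 0.3)
```

In a spatial version, the spatial effect would enter through the mean lam, as on the canonical count model slide.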

[Part 14] 83/103

Numbers of firms locating in Texas

counties: Count data (Poisson)

Bicycle and pedestrian

injuries in census tracts in

Manhattan. (Count data and

ordered outcomes)

A Blend of Ordered

Choice and Count Data

Models

[Part 14] 84/103

Kriging

[Part 14] 85/103

[Part 14] 86/103

[Part 14] 87/103

Spatial Autocorrelation in a

Sample Selection Model

• Alaska Department of Fish and Game.
• Pacific cod fishing eastern Bering Sea – grid of locations
• Observation = ‘catch per unit effort’ in grid square
• Data reported only if 4+ similar vessels fish in the region
• 1997 sample = 320 observations with 207 reported full data

Flores-Lagunes, A. and Schnier, K., “Sample selection and Spatial Dependence,” Journal of Applied Econometrics, 27, 2, 2012, pp. 173-204.

[Part 14] 88/103



•LHS is catch per unit effort = CPUE

•Site characteristics: MaxDepth, MinDepth, Biomass

•Fleet characteristics:

Catcher vessel (CV = 0/1)

Hook and line (HAL = 0/1)

Nonpelagic trawl gear (NPT = 0/1)

Large (at least 125 feet) (Large = 0/1)


[Part 14] 89/103



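$$y_{i1}^* = \alpha_0 + \mathbf{x}_{i1}'\boldsymbol{\alpha}_1 + u_{i1}, \quad u_{i1} = \delta\sum_{j \ne i} c_{ij}u_{j1} + \varepsilon_{i1}$$
$$y_{i2}^* = \beta_0 + \mathbf{x}_{i2}'\boldsymbol{\beta}_1 + u_{i2}, \quad u_{i2} = \gamma\sum_{j \ne i} c_{ij}u_{j2} + \varepsilon_{i2}$$
$$\begin{pmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \end{pmatrix} \sim N\left[\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}\right], \quad (?? \; \sigma_1^2 = 1 \; ??)$$
Observation Mechanism
$y_{i1} = 1[y_{i1}^* > 0]$ Probit Model
$y_{i2} = y_{i2}^*$ if $y_{i1} = 1$, unobserved otherwise.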

[Part 14] 90/103



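$\mathbf{C}$ = Spatial weight matrix, $\mathbf{C}_{ii} = 0$.
$$\mathbf{u}_1 = \delta\mathbf{C}\mathbf{u}_1 + \boldsymbol{\varepsilon}_1 = [\mathbf{I} - \delta\mathbf{C}]^{-1}\boldsymbol{\varepsilon}_1, \quad [\mathbf{I} - \delta\mathbf{C}]^{-1} = [\omega_{ij}^{(1)}], \text{ likewise for } \mathbf{u}_2 \text{ with } \gamma \text{ and } \omega_{ij}^{(2)}$$
$$y_{i1}^* = \alpha_0 + \mathbf{x}_{i1}'\boldsymbol{\alpha}_1 + \sum_{j=1}^{N}\omega_{ij}^{(1)}\varepsilon_{j1}, \quad \mathrm{Var}[u_{i1}] = \sigma_1^2\sum_{j=1}^{N}(\omega_{ij}^{(1)})^2$$
$$y_{i2}^* = \beta_0 + \mathbf{x}_{i2}'\boldsymbol{\beta}_1 + \sum_{j=1}^{N}\omega_{ij}^{(2)}\varepsilon_{j2}, \quad \mathrm{Var}[u_{i2}] = \sigma_2^2\sum_{j=1}^{N}(\omega_{ij}^{(2)})^2$$
$$\mathrm{Cov}[u_{i1}, u_{i2}] = \sigma_{12}\sum_{j=1}^{N}\omega_{ij}^{(1)}\omega_{ij}^{(2)}$$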

[Part 14] 91/103

Spatial Weights

$$c_{ij} = \frac{1}{d_{ij}^2}, \quad d_{ij} = \text{Euclidean distance}$$
Band of 7 neighbors is used
Row standardized.
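A weight matrix of this kind can be built directly from coordinates. The sketch below is a hypothetical illustration, not the authors' code: it interprets the "band of 7 neighbors" as the 7 nearest neighbors (an assumption), uses inverse squared Euclidean distance, and row standardizes. It assumes all points are distinct, so that no distance is zero.

```python
import math


def inverse_distance_weights(points, k=7):
    """Row-standardized weights c_ij = 1 / d_ij^2 over the k nearest neighbors."""
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i, p in enumerate(points):
        # Distances from unit i to every other unit, nearest first.
        distances = sorted((math.dist(p, q), j)
                           for j, q in enumerate(points) if j != i)
        raw = {j: 1.0 / d ** 2 for d, j in distances[:k]}
        total = sum(raw.values())
        for j, c in raw.items():
            weights[i][j] = c / total
    return weights


# Hypothetical grid of ten locations.
POINTS = [(float(i % 5), float(i // 5)) for i in range(10)]
C = inverse_distance_weights(POINTS, k=7)
```

Row standardization makes each row sum to one, so the spatial lag of a variable is a weighted average over a unit's neighbors.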

[Part 14] 92/103

Two Step Estimation

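Probit estimated by Pinkse/Slade GMM
$$\lambda_i = \frac{\sigma_{12}\sum_{j=1}^{N}\omega_{ij}^{(1)}\omega_{ij}^{(2)}}{\sqrt{\sigma_1^2\sum_{j=1}^{N}(\omega_{ij}^{(1)})^2}} \cdot \frac{\phi(h_i)}{\Phi(h_i)}, \quad h_i = \frac{\alpha_0 + \mathbf{x}_{i1}'\boldsymbol{\alpha}_1}{\sqrt{\sigma_1^2\sum_{j=1}^{N}(\omega_{ij}^{(1)})^2}}$$
Spatial regression with included IMR in second step
(*) GMM procedure combines the two steps in one large estimation.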

[Part 14] 93/103

[Part 14] 94/103

Spatial Stochastic Frontier

Production function model
$$y = \boldsymbol{\beta}'\mathbf{x} + \varepsilon$$
$$y = \boldsymbol{\beta}'\mathbf{x} + v - u$$
$v$ = unexplained noise = N[0,1]
$u$ = inefficiency > 0; efficiency = exp(-u)
Object of estimation is $u$, not $\boldsymbol{\beta}$
Not a linear regression. Fit by MLE or MCMC.

[Part 14] 95/103

247 Spanish Dairy Farms, 6 Years

[Part 14] 96/103

A True Random Effects Model

$y_{ij}$ = Output of farm j in municipality i in Center-West Brazil
$$y_{ij} = \alpha_i + \boldsymbol{\beta}'\mathbf{x}_{ij} + v_{ij} - u_{ij}$$
$(\alpha_1, \dots, \alpha_n)$ conditionally autoregressive based on neighbors
$\alpha_i - \alpha_k$ is smaller when municipalities i and k are closer together
Spatial Stochastic Frontier Models; Accounting for Unobserved Local Determinants of Inefficiency
Schmidt, Moreira, Helfand, Fonseca; Journal of Productivity Analysis, 2009.

[Part 14] 97/103

[Part 14] 98/103

Estimation by Maximum Likelihood

[Part 14] 99/103

LeSage (2000) on Timing

“The Bayesian probit and tobit spatial autoregressive models described here have been applied to samples of 506 and 3,107 observations. The time required to produce estimates was around 350 seconds for the 506 observations sample and 900 seconds for the case involving 3,107 observations. … (inexpensive Apple G3 computer running at 266 Mhz.)”

[Part 14] 100/103

Time and Space (In Your Computer)

[Part 14] 101/103

[Part 14] 102/103

Thank you

[Part 14] 103/103
