spatial discrete choice models

80
Spatial Discrete Choice Models Professor William Greene Stern School of Business, New York University

Upload: homer

Post on 24-Mar-2016

54 views

Category:

Documents


2 download

DESCRIPTION

Spatial Discrete Choice Models. Professor William Greene Stern School of Business, New York University. Spatial Correlation. Spatially Autocorrelated Data. Per Capita Income in Monroe County, New York, USA. The Hypothesis of Spatial Autocorrelation. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Spatial Discrete Choice Models

Spatial Discrete Choice Models

Professor William GreeneStern School of Business, New York University

Page 2: Spatial Discrete Choice Models

Spatial Correlation

Page 3: Spatial Discrete Choice Models

Per Capita Income in Monroe County, New York, USA

Spatially Autocorrelated Data

Page 4: Spatial Discrete Choice Models

The Hypothesis of Spatial Autocorrelation

Page 5: Spatial Discrete Choice Models

Spatial Discrete Choice Modeling: Agenda

Linear Models with Spatial Correlation Discrete Choice Models Spatial Correlation in Nonlinear Models

· Basics of Discrete Choice Models· Maximum Likelihood Estimation

Spatial Correlation in Discrete Choice· Binary Choice· Ordered Choice· Unordered Multinomial Choice· Models for Counts

Page 6: Spatial Discrete Choice Models

Linear Spatial Autocorrelation

ii

( ) ( ) , N observations on a spatially arranged variable

'contiguity matrix;' 0

spatial a

x i W x i ε

W WW must be specified in advance. It is not estimated.

2

1

2 -1

utocorrelation parameter, -1 < < 1.E[ ]= Var[ ]=( ) [ ]E[ ]= Var[ ]= [( ) ( )]

ε 0, ε Ix i I W ε = Spatial "moving average" formx i, x I W I W

Page 7: Spatial Discrete Choice Models

Testing for Spatial Autocorrelation

Page 8: Spatial Discrete Choice Models
Page 9: Spatial Discrete Choice Models

Spatial Autocorrelation

2

2 2

.E[ ]= Var[ ]=E[ ]=Var[ ]

y Xβ Wεε| X 0, ε| X I y| X Xβ

y| X = WWA Generalized Regression Model

Page 10: Spatial Discrete Choice Models

Spatial Autoregression in a Linear Model

2

1

1 1

1

2 -1

+ .E[ ]= Var[ ]=

[ ] ( ) [ ] [ ]E[ ]=[ ]Var[ ] [( ) ( )]

y Wy Xβ εε| X 0, ε| X I

y I W Xβ εI W Xβ I W ε

y| X I W Xβy| X = I W I W

Page 11: Spatial Discrete Choice Models

Complications of the Generalized Regression Model

Potentially very large N – GPS data on agriculture plots

Estimation of . There is no natural residual based estimator

Complicated covariance structure – no simple transformations

Page 12: Spatial Discrete Choice Models

Panel Data Application

it i it

E.g., N countries, T periods (e.g., gasoline data)y c

= N observations at time t.Similar assumptions Candidate for SUR or Spatial Autocorrelation model.

it

t t t

x βε Wε v

Page 13: Spatial Discrete Choice Models

Spatial Autocorrelation in a Panel

Page 14: Spatial Discrete Choice Models

Alternative Panel Formulations

i,t t 1 i it

i

Pure space-recursive - dependence pertains to neighbors in period t-1 y [ ] regression + Time-space recursive - dependence is pure autoregressive and on neighborsin period t-1 y

Wy

,t i,t-1 t 1 i it

i,t i,t-1 t i it

y + [ ] regression + Time-space simultaneous - dependence is autoregressive and on neighborsin the current period y y + [ ] regression + Time-space dynamic -

Wy

Wy

i,t i,t-1 t i t 1 i it

dependence is autoregressive and on neighborsin both current and last period y y + [ ] + [ ] regression + Wy Wy

Page 15: Spatial Discrete Choice Models

Analytical Environment Generalized linear regression Complicated disturbance covariance

matrix Estimation platform

· Generalized least squares· Maximum likelihood estimation when

normally distributed disturbances (still GLS)

Page 16: Spatial Discrete Choice Models

Discrete Choices Land use intensity in Austin, Texas –

Intensity = 1,2,3,4 Land Usage Types in France, 1,2,3 Oak Tree Regeneration in Pennsylvania

Number = 0,1,2,… (Many zeros) Teenagers physically active = 1 or

physically inactive = 0, in Bay Area, CA.

Page 17: Spatial Discrete Choice Models

Discrete Choice Modeling

Discrete outcome reveals a specific choice

Underlying preferences are modeled

Models for observed data are usually not conditional means· Generally, probabilities of outcomes· Nonlinear models – cannot be estimated by any type

of linear least squares

Page 18: Spatial Discrete Choice Models

Discrete Outcomes Discrete Revelation of Underlying

Preferences· Binary choice between two alternatives· Unordered choice among multiple

alternatives· Ordered choice revealing underlying

strength of preferences Counts of Events

Page 19: Spatial Discrete Choice Models

Simple Binary Choice: Insurance

Page 20: Spatial Discrete Choice Models

Redefined Multinomial Choice

Fly Ground

Page 21: Spatial Discrete Choice Models

Multinomial Unordered Choice - Transport Mode

Page 22: Spatial Discrete Choice Models

Health Satisfaction (HSAT)Self administered survey: Health Care Satisfaction? (0 – 10)

Continuous Preference Scale

Page 23: Spatial Discrete Choice Models

Ordered Preferences at IMDB.com

Page 24: Spatial Discrete Choice Models

Counts of Events

Page 25: Spatial Discrete Choice Models

Modeling Discrete Outcomes “Dependent Variable” typically labels

an outcome· No quantitative meaning· Conditional relationship to covariates

No “regression” relationship in most cases

The “model” is usually a probability

Page 26: Spatial Discrete Choice Models

Simple Binary Choice: Insurance

Decision: Yes or No = 1 or 0Depends on Income, Health, Marital Status, Gender

Page 27: Spatial Discrete Choice Models

Multinomial Unordered Choice - Transport Mode

Decision: Which Type, A, T, B, C. Depends on Income, Price, Travel Time

Page 28: Spatial Discrete Choice Models

Health Satisfaction (HSAT)Self administered survey: Health Care Satisfaction? (0 – 10)

Outcome: Preference = 0,1,2,…,10Depends on Income, Marital Status, Children, Age, Gender

Page 29: Spatial Discrete Choice Models

Counts of Events

Outcome: How many events at each location = 0,1,…,10Depends on Season, Population, Economic Activity

Page 30: Spatial Discrete Choice Models

Nonlinear Spatial Modeling Discrete outcome yit = 0, 1, …, J for

some finite or infinite (count case) J.· i = 1,…,n· t = 1,…,T

Covariates xit . Conditional Probability (yit = j)

= a function of xit.

Page 31: Spatial Discrete Choice Models

Two Platforms

Random Utility for Preference Models Outcome reveals underlying utility· Binary: u* = ’x y = 1 if u* > 0· Ordered: u* = ’x y = j if j-1 < u* < j

· Unordered: u*(j) = ’xj , y = j if u*(j) > u*(k) Nonlinear Regression for Count Models

Outcome is governed by a nonlinear regression· E[y|x] = g(,x)

Page 32: Spatial Discrete Choice Models

Probit and Logit Models Prob(y 1 or 0| ) = F( ) or [1- F( )]x x xi i i iθ θ

Page 33: Spatial Discrete Choice Models

Implied Regression Function

Page 34: Spatial Discrete Choice Models

Estimated Binary Choice Models:The Results Depend on F(ε)

LOGIT PROBIT EXTREME VALUEVariable Estimate t-ratio Estimate t-ratio Estimate t-ratioConstant -0.42085 -2.662 -0.25179 -2.600 0.00960 0.078X1 0.02365 7.205 0.01445 7.257 0.01878 7.129X2 -0.44198 -2.610 -0.27128 -2.635 -0.32343 -2.536X3 0.63825 8.453 0.38685 8.472 0.52280 8.407Log-L -2097.48 -2097.35 -2098.17Log-L(0) -2169.27 -2169.27 -2169.27

Page 35: Spatial Discrete Choice Models

+ 1 (X1+1) + 2 (X2) + 3 X3 (1 is positive)

Effect on Predicted Probability of an Increase in X1

Page 36: Spatial Discrete Choice Models

Estimated Partial Effects vs. Coefficients

Page 37: Spatial Discrete Choice Models

Applications: Health Care UsageGerman Health Care Usage Data, 7,293 Individuals, Varying Numbers of PeriodsVariables in the file areData downloaded from Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293 individuals. They can be used for regression, count models, binary choice, ordered choice, and bivariate binary choice. This is a large data set. There are altogether 27,326 observations. The number of observations ranges from 1 to 7. (Frequencies are: 1=1525, 2=2158, 3=825, 4=926, 5=1051, 6=1000, 7=987). (Downloaded from the JAE Archive)

DOCTOR = 1(Number of doctor visits > 0) HOSPITAL = 1(Number of hospital visits > 0) HSAT = health satisfaction, coded 0 (low) - 10 (high) DOCVIS = number of doctor visits in last three months HOSPVIS = number of hospital visits in last calendar year PUBLIC = insured in public health insurance = 1; otherwise = 0 ADDON = insured by add-on insurance = 1; otherswise = 0 HHNINC = household nominal monthly net income in German marks / 10000. (4 observations with income=0 were dropped) HHKIDS = children under age 16 in the household = 1; otherwise = 0 EDUC = years of schooling AGE = age in years FEMALE = 1 for female headed household, 0 for male EDUC = years of education

Page 38: Spatial Discrete Choice Models

An Estimated Binary Choice Model

Page 39: Spatial Discrete Choice Models

An Estimated Ordered Choice Model

Page 40: Spatial Discrete Choice Models

An Estimated Count Data Model

Page 41: Spatial Discrete Choice Models

210 Observations on Travel Mode Choice

CHOICE ATTRIBUTES CHARACTERISTICMODE TRAVEL INVC INVT TTME GC HINCAIR .00000 59.000 100.00 69.000 70.000 35.000TRAIN .00000 31.000 372.00 34.000 71.000 35.000BUS .00000 25.000 417.00 35.000 70.000 35.000CAR 1.0000 10.000 180.00 .00000 30.000 35.000AIR .00000 58.000 68.000 64.000 68.000 30.000TRAIN .00000 31.000 354.00 44.000 84.000 30.000BUS .00000 25.000 399.00 53.000 85.000 30.000CAR 1.0000 11.000 255.00 .00000 50.000 30.000AIR .00000 127.00 193.00 69.000 148.00 60.000TRAIN .00000 109.00 888.00 34.000 205.00 60.000BUS 1.0000 52.000 1025.0 60.000 163.00 60.000CAR .00000 50.000 892.00 .00000 147.00 60.000AIR .00000 44.000 100.00 64.000 59.000 70.000TRAIN .00000 25.000 351.00 44.000 78.000 70.000BUS .00000 20.000 361.00 53.000 75.000 70.000CAR 1.0000 5.0000 180.00 .00000 32.000 70.000

Page 42: Spatial Discrete Choice Models

An Estimated Unordered Choice Model

Page 43: Spatial Discrete Choice Models

Maximum Likelihood EstimationCross Section Case

Binary Outcome·

·

·

Random Utility: y* = + Observed Outcome: y = 1 if y* > 0,

0 if y* 0. Probabilities: P(y=1|x) = Prob(y* > 0| )

x

x

·

= Prob( > - ) P(y=0|x) = Prob(y* 0| ) = Prob( - ) Likelihood for the sample = joint probability

xxx

·

i i1

i i1

= Prob(y=y| ) Log Likelihood = logProb(y=y| )

x

x

n

in

i

Page 44: Spatial Discrete Choice Models

Cross Section Case

1 1 1 1

2 2 2 2

1 1

2 2

| or > | or > Prob Prob... ...| or >

Prob( or > )Prob( or > ) = ...Prob( or >

x xx x

x xxx

x

n n n n

n

y jy j

y j )

We operate on the marginal probabilities of n observations

n

Page 45: Spatial Discrete Choice Models

Log Likelihoods for Binary Choice Models

·

·

1

2

Logl( | )= logF 2 1 Probit

1 F(t) = (t) exp( t / 2)dt2

(t)dt Logit

exp(t) F(t) = (t) = 1 exp(t)

X,y xni ii

t

t

y

Page 46: Spatial Discrete Choice Models

Spatially Correlated ObservationsCorrelation Based on Unobservables

1 1 1 1 1

2 2 2 2 2 2

u u 0u u 0 ~ f ,... ... ... ...u u 0

In the cross section case, = . = the usual spatial weight matrix .

xx

x

W WW

WW I

n n n n n

yy

y

Now, it is a full matrix. The joint probably is a single n fold integral.

Page 47: Spatial Discrete Choice Models

Spatially Correlated ObservationsCorrelated Utilities

* *1 1 1 11 1

* * 12 2 2 22 2

* *... ...... ...

In the cross section case= the usual spatial weight matrix .

x xx x

x x

W I W

Wn n n nn n

y yy y

y y

, = . Now, it is a full matrix. The joint probably is a single n fold integral.

W I

Page 48: Spatial Discrete Choice Models

Log Likelihood

In the unrestricted spatial case, the log likelihood is one term,

LogL = log Prob(y1|x1, y2|x2, … ,yn|xn) In the discrete choice case, the

probability will be an n fold integral, usually for a normal distribution.

Page 49: Spatial Discrete Choice Models

LogL for an Unrestricted BC Model

1

1 1 1 2 12 1 1 1

2 2 1 2 21 2 2 2

1 1 2 2

1 ...1 ...LogL( | )=log ... ... ... ... ... ... ...

... 1

1 if y = 0 and

x xX,y n

n n

n nn

n n n n n n n

i i

q q q w q q wq q q w q q w

d

q q qw q q w

q

+1 if y = 1.

One huge observation - n dimensional normal integral.

Not feasible for any reasonable sample size.

Even if computable, provides no device for estimating sampling standard errors.

i

Page 50: Spatial Discrete Choice Models

Solution Approaches for Binary Choice

Distinguish between private and social shocks and use pseudo-ML

Approximate the joint density and use GMM with the EM algorithm

Parameterize the spatial correlation and use copula methods

Define neighborhoods – make W a sparse matrix and use pseudo-ML

Others …

Page 51: Spatial Discrete Choice Models

Pseudo Maximum LikelihoodSmirnov, A., “Modeling Spatial Discrete Choice,” Regional Science and Urban Economics, 40, 2010.

1 1

10

Spatial Autoregression in Utilities* * , 1( * ) for all n individuals* ( ) ( )

( ) ( ) assumed convergent = = + where

tt

y Wy X y y 0y I W X I WI W W

AD A-D

1

= diagonal elements*

Private Social Then

aProb[y 1 or 0| ] F (2 1) , pnj ij j

i ii

yd

Dy AX D A-D

Suppose individuals ignore the social "shocks."xX

robit or logit.

Page 52: Spatial Discrete Choice Models

Pseudo Maximum Likelihood Assumes away the correlation in the

reduced form Makes a behavioral assumption Requires inversion of (I-W) Computation of (I-W) is part of the

optimization process - is estimated with .

Does not require multidimensional integration (for a logit model, requires no integration)

Page 53: Spatial Discrete Choice Models

GMMPinske, J. and Slade, M., (1998) “Contracting in Space: An Application of Spatial Statistics to Discrete Choice Models,” Journal of Econometrics, 85, 1, 125-154.Pinkse, J. , Slade, M. and Shen, L (2006) “Dynamic Spatial Discrete Choice Using One Step GMM: An Application to Mine Operating Decisions”, Spatial Economic Analysis, 1: 1, 53 — 99.

1*= + , = +

= [ - ] = uCross section case: =0Probit Model: FOC for estimation of is based on the

ˆ generalized residuals ui

y W uI W u

A

Xθ ε

1

= y [ | ]( ( )) ( ) = ( )[1 ( )]

Spatially autocorrelated case: Moment equations are stillvalid. Complication is computing the variance of the

i i

n i i iii

i i

E yy

x xx 0x x

momentequations, which requires some approximations.

Page 54: Spatial Discrete Choice Models

GMM

1*= + , = +

= [ - ] = uAutocorrelated Case: 0Probit Model: FOC for estimation of is based on the

ˆ generalized residuals u

y W uI W u

A

Xθ ε

1

= y [ | ]

( ) ( ) = 1( ) ( )

i i i

i ii

n ii iiii

i i

ii ii

E y

ya a

a a

x x

z 0x x

Requires at least K+1 instrumental variables.

Page 55: Spatial Discrete Choice Models

GMM Approach Spatial autocorrelation induces

heteroscedasticity that is a function of Moment equations include the

heteroscedasticity and an additional instrumental variable for identifying .

LM test of = 0 is carried out under the null hypothesis that = 0.

Application: Contract type in pricing for 118 Vancouver service stations.

Page 56: Spatial Discrete Choice Models

Copula Method and ParameterizationBhat, C. and Sener, I., (2009) “A copula-based closed-form binary logit choice modelfor accommodating spatial correlation across observational units,” Journal of Geographical Systems, 11, 243–272

* *

1 2

Basic Logit Modely , y 1[y 0] (as usual)Rather than specify a spatial weight matrix, we assume[ , ,..., ] have an n-variate distribution.

Sklar's Theorem represents the joint distribut

i i i i i

n

x

1 1 2 2

ion in termsof the continuous marginal distributions, ( ) and a copulafunction C[u = ( ) ,u ( ) ,...,u ( ) | ]

i

n n

Page 57: Spatial Discrete Choice Models

Copula Representation

Page 58: Spatial Discrete Choice Models

Model

Page 59: Spatial Discrete Choice Models

Likelihood

Page 60: Spatial Discrete Choice Models

Parameterization

Page 61: Spatial Discrete Choice Models
Page 62: Spatial Discrete Choice Models
Page 63: Spatial Discrete Choice Models

Other Approaches

Case (1992): Define “regions” or neighborhoods. No correlation across regions. Produces essentially a panel data probit model.

Beron and Vijverberg (2003): Brute force integration using GHK simulator in a probit model.

Others. See Bhat and Sener (2009).

Case A (1992) Neighborhood influence and technological change. Economics 22:491–508Beron KJ, Vijverberg WPM (2004) Probit in a spatial context: a monte carlo analysis. In: Anselin L, Florax RJGM, Rey SJ (eds) Advances in spatial econometrics: methodology, tools and applications. Springer, Berlin

Page 64: Spatial Discrete Choice Models
Page 65: Spatial Discrete Choice Models

Ordered Probability Model

1

1 2

2 3

J -1 J

j-1

y* , we assume contains a constant termy 0 if y* 0y = 1 if 0 < y* y = 2 if < y* y = 3 if < y* ...y = J if < y* In general: y = j if < y*

βx x

j

-1 o J j-1 j,

, j = 0,1,...,J, 0, , j = 1,...,J

Page 66: Spatial Discrete Choice Models

Outcomes for Health Satisfaction

Page 67: Spatial Discrete Choice Models

A Spatial Ordered Choice ModelWang, C. and Kockelman, K., (2009) Bayesian Inference for Ordered Response Data with a Dynamic Spatial Ordered Probit Model, Working Paper, Department of Civil and Environmental Engineering, Bucknell University.

* *1

* *1

Core Model: Cross Section y , y = j if y , Var[ ] 1Spatial Formulation: There are R regions. Within a region y u , y = j if y Spatial he

i i i i j i j i

ir ir i ir ir j ir j

βx

βx2

2

1 2 1

teroscedasticity: Var[ ]Spatial Autocorrelation Across Regions = + , ~ N[ , ] = ( - ) ~ N[ , {( - ) ( - )} ] The error distribution depends on 2 para

ir r

v

v

u Wu v v 0 Iu I W v 0 I W I W

2meters, and Estimation Approach: Gibbs Sampling; Markov Chain Monte CarloDynamics in latent utilities added as a final step: y*(t)=f[y*(t-1)].

v

Page 68: Spatial Discrete Choice Models

OCM for Land Use Intensity

Page 69: Spatial Discrete Choice Models

OCM for Land Use Intensity

Page 70: Spatial Discrete Choice Models

Estimated Dynamic OCM

Page 71: Spatial Discrete Choice Models
Page 72: Spatial Discrete Choice Models

Unordered Multinomial Choice

Underlying Random Utility for Each Alternative U(i,j) = , i = individual, j = alternative Preference Revelation

Y(i) = j if and only if U(i,j) > U(i,k

j ij ij

·

·

Core Random Utility Model

x

1

1

) for all k j Model Frameworks

Multinomial Probit: [ ,..., ] ~N[0, ] Multinomial Logit: [ ,..., ] ~ type I extreme value

J

J iid

·

Page 73: Spatial Discrete Choice Models

Multinomial Unordered Choice - Transport Mode

Decision: Which Type, A, T, B, C. Depends on Income, Price, Travel Time

Page 74: Spatial Discrete Choice Models

Spatial Multinomial ProbitChakir, R. and Parent, O. (2009) “Determinants of land use changes: A spatial multinomial probit approach, Papers in Regional Science, 88, 2, 328-346.

1

Utility Functions, land parcel i, usage type j, date t U(i,j,t)=Spatial Correlation at Time t Modeling Framework: Normal / Multinomial ProbitEstimation: MCMC

jt ijt ik ijt

nij il lkl

x

w

- Gibbs Sampling

Page 75: Spatial Discrete Choice Models
Page 76: Spatial Discrete Choice Models
Page 77: Spatial Discrete Choice Models
Page 78: Spatial Discrete Choice Models

Modeling Counts

Page 79: Spatial Discrete Choice Models

Canonical Model

Poisson Regression y = 0,1,...

exp( ) Prob[y = j|x] = ! Conditional Mean = exp( x)Signature Feature: EquidispersionUsual Alternative: Various forms of Negative BinomialSpatial E

j

j

1

ffect: Filtered through the mean = exp( x + ) =

i i in

i im m imw

Rathbun, S and Fei, L (2006) “A Spatial Zero-Inflated Poisson Regression Model for Oak Regeneration,” Environmental Ecology Statistics, 13, 2006, 409-426

Page 80: Spatial Discrete Choice Models