Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Page 1: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Discrete Choice Modeling

William Greene, Stern School of Business
IFS at UCL, February 11-13, 2004

Page 2: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Discrete Choice Modeling

• Econometric Methodology
  - Binary Choice Models
  - Multinomial Choice
• Model Building
  - Specification
  - Estimation
  - Analysis
  - Applications
• NLOGIT Software

Page 3: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Our Agenda

1. Methodology
2. Discrete Choice Models
3. Binary Choice Models
4. Panel Data Models for Binary Choice
5. Introduction to NLOGIT
6. Discrete Choice Settings
7. The Multinomial Logit Model
8. Heteroscedasticity in Utility Functions
9. Nested Logit Modeling
10. Latent Class Models
11. Mixed Logit Models and Simulation Based Estimation
12. Revealed and Stated Preference Data Sets

Page 4: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Part 1

Methodology

Page 5: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Measurement as Observation

Population Measurement Theory

Characteristics, Behavior Patterns

Page 6: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Individual Behavioral Modeling

• Assumptions about behavior
  - Common elements across individuals
  - Unique elements
• Prediction
  - Population aggregates
  - Individual behavior

Page 7: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Modeling Choice

• Activity as choices
• Preferences
• Behavioral axioms
• Choice as utility maximization

Page 8: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Inference

Population Measurement Econometrics

Characteristics, Behavior Patterns, Choices

Page 9: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Econometric Frameworks

• Nonparametric
• Parametric
  - Classical (Sampling Theory)
  - Bayesian

Page 10: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Likelihood Based Inference Methods

Behavioral Theory, Statistical Theory

Observed Measurement

Likelihood Function

The likelihood function embodies the theoretical description of the population. Characteristics of the population are inferred from the characteristics of the likelihood function. (Bayesian and Classical)

Page 11: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Modeling Discrete Choice

• Theoretical foundations
• Econometric methodology
  - Models
  - Statistical bases
  - Econometric methods
• Estimation with econometric software
• Applications

Page 12: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Part 2

Basics of Discrete Choice Modeling

Page 13: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Modeling Consumer Choice:Continuous Measurement

• What do we measure?

• What is revealed by the data?

• What is the underlying model?

• What are the empirical tools?

Example: Travel expenditure based on price and income

Expenditure

Income

Low price

High price

Page 14: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Discrete Choice

• Observed outcomes
  - Inherently discrete: number of occurrences (e.g., family size; considered separately)
  - Implicitly continuous: the observed data are discrete by construction (e.g., revealed preferences; our main subject)
• Implications
  - For model building
  - For analysis and prediction of behavior

Page 15: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Two Fundamental Building Blocks

• Underlying Behavioral Theory: Random utility model
  - The link between underlying behavior and observed data
• Empirical Tool: Stochastic, parametric model for binary choice
  - A platform for models of discrete choice

Page 16: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Random Utility

• A Theoretical Proposition About Behavior
  - Consumer making a choice among several alternatives
  - Example: brand choice (car, food)
• Choice setting for a consumer: Notation
  - Consumer i, i = 1, …, N
  - Choice setting t, t = 1, …, Ti (may be one)
  - Choice set j, j = 1, …, Ji (may be fixed)

Page 17: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Behavioral Assumptions

• Preferences are transitive and complete with respect to choice situations
• Utility is defined over alternatives: Uijt
• Utility maximization assumption: if Ui1t > Ui2t, the consumer chooses alternative 1, not alternative 2.
• Revealed preference (duality): if the consumer chooses alternative 1 and not alternative 2, then Ui1t > Ui2t.

Page 18: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Random Utility Functions

$U_{itj} = \alpha_j + \beta_i' x_{itj} + \gamma_i' z_{it} + \varepsilon_{ijt}$

$\alpha_j$ = Choice specific constant

$x_{itj}$ = Attributes of choice presented to person

$\beta_i$ = Person specific taste weights

$z_{it}$ = Characteristics of the person

$\gamma_i$ = Weights on person specific characteristics

$\varepsilon_{ijt}$ = Unobserved random component of utility

Mean: $E[\varepsilon_{ijt}] = 0$, Variance: $\mathrm{Var}[\varepsilon_{ijt}] = 1$

Page 19: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Part 3

Modeling Binary Choice

Page 20: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

A Model for Binary Choice

• Yes or No decision (Buy/Not buy)
• Example: choose to fly or not to fly to a destination when there are alternatives.
• Model: net utility of flying, $U_{fly} = \alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income} + \varepsilon$
• Choose to fly if net utility is positive
• Data: X = [1, cost, terminal time], Z = [income], y = 1 if choose fly ($U_{fly} > 0$), 0 if not.

Page 21: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

What Can Be Learned from the Data? (A Sample of Consumers, i = 1,…,N)

• Are the attributes “relevant?”

• Predicting behavior

- Individual

- Aggregate

• Analyze changes in behavior when attributes change

Page 22: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Application

• 210 Commuters Between Sydney and Melbourne
• Available modes = Air, Train, Bus, Car
• Observed:
  - Choice
  - Attributes: Cost, terminal time, other
  - Characteristics: Household income
• First application: Fly or other

Page 23: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Binary Choice Data

Choose Air   Gen.Cost   Term Time   Income
1.0000       86.000     25.000      70.000
.00000       67.000     69.000      60.000
.00000       77.000     64.000      20.000
.00000       69.000     69.000      15.000
.00000       77.000     64.000      30.000
.00000       71.000     64.000      26.000
.00000       58.000     64.000      35.000
.00000       71.000     69.000      12.000
.00000       100.00     64.000      70.000
1.0000       158.00     30.000      50.000
1.0000       136.00     45.000      40.000
1.0000       103.00     30.000      70.000
.00000       77.000     69.000      10.000
1.0000       197.00     45.000      26.000
.00000       129.00     64.000      50.000
.00000       123.00     64.000      70.000

Page 24: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

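An Econometric Model

• Choose to fly iff $U_{fly} > 0$

$U_{fly} = \alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income} + \varepsilon$

$U_{fly} > 0 \iff \varepsilon > -(\alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income})$

• Probability model: for any person observed by the analyst,

$\text{Prob(fly)} = \text{Prob}[\varepsilon > -(\alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income})]$

• Note the relationship between the unobserved $\varepsilon$ and the outcome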

Page 25: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

A Regression-Like Model

[Figure: Pr[Fly] (vertical axis, 0 to 1.0) plotted against the index $\alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income}$ (horizontal axis, -3.0 to 3.0).]

Page 26: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

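Econometrics

• How to estimate $\alpha$, $\beta_1$, $\beta_2$, $\gamma$?
  - It's not regression
  - The technique of maximum likelihood
• Prob[y=1] = $\text{Prob}[\varepsilon > -(\alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income})]$
• Prob[y=0] = 1 - Prob[y=1]
• Requires a model for the probability

$$L = \prod_{y=0} \text{Prob}[y=0] \times \prod_{y=1} \text{Prob}[y=1]$$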

Page 27: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Completing the Model: $F(\varepsilon)$

• The distribution
  - Normal: PROBIT, natural for behavior
  - Logistic: LOGIT, allows "thicker tails"
  - Gompertz: EXTREME VALUE, asymmetric, underlies the basic logit model for multiple choice
• Does it matter?
  - Yes, large difference in estimates
  - Not much, quantities of interest are more stable.

Page 28: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004
Page 29: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Estimated Binary Choice Models

            LOGIT                   PROBIT                  EXTREME VALUE
Variable    Estimate     t-ratio    Estimate     t-ratio    Estimate     t-ratio
Constant    1.78458      1.40591    0.438772     0.702406   1.45189      1.34775
GC          0.0214688    3.15342    0.012563     3.41314    0.0177719    3.14153
TTME       -0.098467    -5.9612    -0.0477826   -6.65089   -0.0868632   -5.91658
HINC        0.0223234    2.16781    0.0144224    2.51264    0.0176815    2.02876
Log-L      -80.9658                -84.0917                -76.5422
Log-L(0)   -123.757                -123.757                -123.757
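As an illustration of how the estimated index turns into a probability, the sketch below plugs the first row of the Binary Choice Data slide (GC = 86, TTME = 25, HINC = 70) into the logit and probit estimates from the table. The function names are hypothetical, and the normal CDF is built from `math.erf`; this is a minimal sketch, not NLOGIT's own computation.

```python
import math

# Estimates copied from the table above: Constant, GC, TTME, HINC.
LOGIT = (1.78458, 0.0214688, -0.098467, 0.0223234)
PROBIT = (0.438772, 0.012563, -0.0477826, 0.0144224)

def index(coefs, gc, ttme, hinc):
    """Linear index: constant + b_GC*GC + b_TTME*TTME + b_HINC*HINC."""
    const, b_gc, b_ttme, b_hinc = coefs
    return const + b_gc * gc + b_ttme * ttme + b_hinc * hinc

def logistic_cdf(a):
    return 1.0 / (1.0 + math.exp(-a))

def normal_cdf(a):
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

# First traveller in the data listing: chose air, GC=86, TTME=25, HINC=70.
p_logit = logistic_cdf(index(LOGIT, 86.0, 25.0, 70.0))
p_probit = normal_cdf(index(PROBIT, 86.0, 25.0, 70.0))
print(round(p_logit, 3), round(p_probit, 3))
```

The coefficients differ by a factor of roughly two across the two models, yet the two fitted probabilities for this traveller are close, which is the point of the "Does it matter?" remark.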

Page 30: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

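A Regression-Like Model

[Figure: Pr[Fly] (vertical axis, 0 to 1.0) plotted against the index $\alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma(\text{Income}+1)$ (horizontal axis, -3.0 to 3.0), showing the effect on the predicted probability of an increase in income ($\gamma$ is positive).]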

Page 31: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

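Marginal Effects in Probability Models

• Prob[Outcome] = some $F(\alpha + \beta_1\,\text{Cost} + \dots)$
• "Partial effect" = $\partial F(\alpha + \beta_1\,\text{Cost} + \dots)/\partial x$ (derivative)
• Partial effects are derivatives; the result varies with the model
  - Logit: $\partial F(\alpha + \beta_1\,\text{Cost} + \dots)/\partial x = \text{Prob} \times (1-\text{Prob}) \times \beta$
  - Probit: $\partial F(\alpha + \beta_1\,\text{Cost} + \dots)/\partial x = \text{Normal density} \times \beta$
• Scaling usually erases model differences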
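The two scale factors can be sketched in a few lines. The helper names are hypothetical. The worked number uses values from the fixed effects output later in the deck (estimated E[y] = .501 and a GC coefficient of .006708935970) purely as an arithmetic check of "coefficient times scale factor".

```python
import math

def logit_scale(prob):
    """Logit scale factor: Prob * (1 - Prob)."""
    return prob * (1.0 - prob)

def probit_scale(index):
    """Probit scale factor: standard normal density at the index."""
    return math.exp(-0.5 * index * index) / math.sqrt(2.0 * math.pi)

def partial_effect(scale, beta):
    """Partial effect of a continuous regressor: scale factor times coefficient."""
    return scale * beta

# With Prob = .501 the logit scale factor is about .250.
scale = logit_scale(0.501)
me_gc = partial_effect(scale, 0.006708935970)
print(round(scale, 3), me_gc)
```

At an index of zero the logit factor is 0.25 and the probit factor is about 0.399, so logit coefficients tend to be larger than probit coefficients by roughly that ratio while the partial effects stay close.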

Page 32: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

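The Delta Method

$$\hat{f} = f(\hat{\beta}, x)$$

$$\text{Est.Asy.Var}[\hat{f}] = G(\hat{\beta}, x)\,\hat{V}\,G(\hat{\beta}, x)'$$

$$\hat{V} = \text{Est.Asy.Var}[\hat{\beta}]$$

$$G(\hat{\beta}, x) = \frac{\partial f(\hat{\beta}, x)}{\partial \hat{\beta}'}$$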

Page 33: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

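Marginal Effects for Binary Choice

Logit

$$\hat{E}[y \mid x] = \exp(\hat{\beta}'x) / [1 + \exp(\hat{\beta}'x)] = \Lambda(\hat{\beta}'x)$$

$$\hat{\gamma} = \frac{\partial \hat{E}[y \mid x]}{\partial x} = \Lambda(\hat{\beta}'x)\,[1 - \Lambda(\hat{\beta}'x)]\,\hat{\beta}$$

$$G = \Lambda(\hat{\beta}'x)\,[1 - \Lambda(\hat{\beta}'x)]\,\{ I + [1 - 2\Lambda(\hat{\beta}'x)]\,\hat{\beta}x' \}$$

Probit

$$\hat{E}[y \mid x] = \Phi(\hat{\beta}'x)$$

$$\hat{\gamma} = \frac{\partial \hat{E}[y \mid x]}{\partial x} = \phi(\hat{\beta}'x)\,\hat{\beta}$$

$$G = \phi(\hat{\beta}'x)\,[ I - (\hat{\beta}'x)\,\hat{\beta}x' ]$$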

Page 34: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Estimated Marginal Effects

            Logit                 Probit                Extreme Value
Variable    Estimate    t-ratio   Estimate    t-ratio   Estimate    t-ratio
GC          .003721     3.267     .003954     3.466     .003393     3.354
TTME       -.017065    -5.042    -.015039    -5.754    -.016582    -4.871
HINC        .003869     2.193     .004539     2.532     .033753     2.064

Page 35: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

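Marginal Effect for a Dummy Variable

$\text{Prob}[y_i = 1 \mid x_i, d_i] = F(\beta'x_i + \gamma d_i)$ = conditional mean

Marginal effect of d: $\text{Prob}[y_i = 1 \mid x_i, d_i = 1] - \text{Prob}[y_i = 1 \mid x_i, d_i = 0]$

Logit:

$$\hat{g}(x) = \Lambda(\hat{\beta}'x + \hat{\gamma}) - \Lambda(\hat{\beta}'x)$$

$$\frac{\partial \hat{g}}{\partial (\hat{\beta}', \hat{\gamma})'} = \hat{\Lambda}_1 (1 - \hat{\Lambda}_1) \begin{pmatrix} x \\ 1 \end{pmatrix} - \hat{\Lambda}_0 (1 - \hat{\Lambda}_0) \begin{pmatrix} x \\ 0 \end{pmatrix}$$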

Page 36: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

(Marginal) Effect – Dummy Variable

HighIncm = 1(Income > 50)

+-------------------------------------------+
| Partial derivatives of probabilities with |
| respect to the vector of characteristics. |
| They are computed at the means of the Xs. |
| Observations used are All Obs.            |
+-------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 Characteristics in numerator of Prob[Y = 1]
 Constant   .4750039483      .23727762       2.002    .0453
 GC         .3598131572E-02  .11354298E-02   3.169    .0015   102.64762
 TTME      -.1759234212E-01  .34866343E-02  -5.046    .0000   61.009524
 Marginal effect for dummy variable is P|1 - P|0.
 HIGHINCM   .8565367181E-01  .99346656E-01    .862    .3886   .18571429

Page 37: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Computing Effects

• Compute at the data means?
  - Simple
  - Inference is well defined
• Average individual effects
  - More appropriate?
  - Asymptotic standard errors. (Not done correctly in the literature – terms are correlated!)

Page 38: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Elasticities

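$$\text{Elasticity} = \frac{\partial \log \hat{P}[y = 1 \mid x]}{\partial \log x_k} = \frac{x_k}{\hat{P}[y = 1 \mid x]} \cdot \frac{\partial \hat{P}[y = 1 \mid x]}{\partial x_k}$$

• How to compute standard errors?
  - Delta method
  - Bootstrap
    · Bootstrap the individual elasticities? (Will neglect variation in parameter estimates.)
    · Bootstrap model estimation?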

Page 39: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Estimated Income Elasticity for Air Choice Model

+------------------------------------------+
| Results of bootstrap estimation of model.|
| Model has been reestimated 25 times.     |
| Statistics shown below are centered      |
| around the original estimate based on    |
| the original full sample of observations.|
| Result is ETA = .71183                   |
| bootstrap samples have 840 observations. |
| Estimate  RtMnSqDev  Skewness  Kurtosis  |
|   .712      .266      -.779     2.258    |
| Minimum = .125     Maximum = 1.135       |
+------------------------------------------+

Mean Income = 34.55, Mean P = .2716, Estimated ME = .004539, Estimated Elasticity=0.5774.
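The arithmetic behind the last line can be checked directly: the elasticity at the means is the marginal effect multiplied by mean income and divided by the mean probability. The function name below is hypothetical; the three inputs are the values quoted on the slide.

```python
def elasticity_at_means(x_mean, p_mean, marginal_effect):
    """Elasticity = (x / P) * dP/dx, evaluated at the sample means."""
    return marginal_effect * x_mean / p_mean

# Values quoted on the slide: mean income, mean probability, estimated marginal effect.
eta = elasticity_at_means(x_mean=34.55, p_mean=0.2716, marginal_effect=0.004539)
print(round(eta, 4))
```

This reproduces the elasticity at the means only; the bootstrap figure ETA = .71183 in the box above is a different calculation and is not reproduced here.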

Page 40: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

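Odds Ratio – Logit Model Only

Effect measure? "Effect of a unit change in z on the odds ratio."

$$\frac{\text{Prob}[y = 1 \mid x, z]}{\text{Prob}[y = 0 \mid x, z]} = \exp[\beta'x + \gamma z]$$

$$\frac{\text{Prob}[y = 1 \mid x, z+1] \,/\, \text{Prob}[y = 0 \mid x, z+1]}{\text{Prob}[y = 1 \mid x, z] \,/\, \text{Prob}[y = 0 \mid x, z]} = \exp(\gamma)$$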

Page 41: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Inference for Odds Ratios

• Logit coefficient = $\beta$, estimate = b
• Coefficient = $\exp(\beta)$, estimate = exp(b)
• Standard error = exp(b) times se(b)
• t ratio is the same
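The standard error rule is the delta method applied to exp(b). A minimal sketch follows, using the unweighted logit HINC coefficient and standard error that appear in the Effect of Choice Based Sampling output later in the deck; the function name is hypothetical.

```python
import math

def odds_ratio_with_se(b, se_b):
    """Delta method for exp(b): the derivative of exp(b) is exp(b),
    so se(exp(b)) = exp(b) * se(b)."""
    ratio = math.exp(b)
    return ratio, ratio * se_b

# HINC in the unweighted logit model: b = .02232338915, se(b) = .010297671
ratio, se_ratio = odds_ratio_with_se(0.02232338915, 0.010297671)
print(round(ratio, 5), round(se_ratio, 6))
```

Read this way, one more unit of household income multiplies the odds of flying by about 1.0226.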

Page 42: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

How Well Does the Model Fit?

• There is no R squared
• "Fit measures" computed from log L
  - "Pseudo R squared" = 1 – logL/logL0
  - Others… - these do not measure fit.
• Direct assessment of the effectiveness of the model at predicting the outcome

Page 43: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Fit Measures for Binary Choice

• Likelihood Ratio Index
  - Bounded by 0 and 1
  - Rises when the model is expanded
• Cramer (and others)

$$\hat{\lambda} = (\text{Mean } \hat{F} \mid y = 1) - (\text{Mean } \hat{F} \mid y = 0)$$

= reward for correct predictions minus penalty for incorrect predictions

Page 44: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Fit Measures for the Logit Model

+----------------------------------------+
| Fit Measures for Binomial Choice Model |
| Probit model for variable MODE         |
+----------------------------------------+
| Proportions P0= .723810   P1= .276190  |
| N =     210 N0=     152   N1=      58  |
| LogL = -84.09172 LogL0 = -123.7570     |
| Estrella = 1-(L/L0)^(-2L0/n) = .36583  |
+----------------------------------------+
| Efron     | McFadden   | Ben./Lerman   |
| .45620    | .32051     | .75897        |
| Cramer    | Veall/Zim. | Rsqrd_ML      |
| .40834    | .50682     | .31461        |
+----------------------------------------+
| Information  Akaike I.C. Schwarz I.C.  |
| Criteria       .83897     189.57187    |
+----------------------------------------+
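Several entries in this box follow from N0, N1 and the two log likelihoods alone. The sketch below recomputes LogL0 (the log likelihood of a constant-only model), the McFadden likelihood ratio index, and the Estrella measure using the formula printed in the box. Variable names are illustrative.

```python
import math

n0, n1 = 152, 58          # counts of y = 0 and y = 1 from the box
n = n0 + n1
logl = -84.09172          # maximized log likelihood from the box

# Constant-only model: every observation gets the sample proportion.
logl0 = n0 * math.log(n0 / n) + n1 * math.log(n1 / n)

# McFadden likelihood ratio index: 1 - logL / logL0.
mcfadden = 1.0 - logl / logl0

# Estrella measure, as printed in the box: 1 - (L/L0)^(-2*L0/n).
estrella = 1.0 - (logl / logl0) ** (-2.0 * logl0 / n)

print(round(logl0, 4), round(mcfadden, 5), round(estrella, 5))
```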

Page 45: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Predicting the Outcome

• Predicted probabilities: P = F(a + b1Cost + b2Time + cIncome)
• Predicting outcomes
  - Predict y=1 if P is large
  - Use 0.5 for "large" (more likely than not)
• Count successes and failures

Page 46: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Individual Predictions from a Logit Model

Observation   Observed Y   Predicted Y   Residual    x(i)b     Pr[Y=1]
     81         .00000       .00000        .0000    -3.3944     .0325
     85         .00000       .00000        .0000    -2.1901     .1006
     89         1.0000       .00000       1.0000    -2.6766     .0644
     93         1.0000       1.0000        .0000      .8113     .6924
     97         1.0000       1.0000        .0000     2.6845     .9361
    101         1.0000       1.0000        .0000     2.4457     .9202
    105         1.0000       .00000       1.0000    -3.2204     .0384
    109         1.0000       1.0000        .0000      .0311     .5078
    113         .00000       .00000        .0000    -2.1704     .1024
    117         .00000       .00000        .0000    -3.3729     .0332
    445         .00000       1.0000      -1.0000      .0295     .5074

Note two types of errors and two types of successes.

Page 47: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Predictions in Binary Choice

• Predict y = 1 if P > P*
• Success depends on the assumed P*

Page 48: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

ROC Curve

• Plot %Y=1 correctly predicted vs. %Y=1 incorrectly predicted
• The 45° line is no fit. Curvature implies fit.
• Area under the curve compares models
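To make the ROC idea concrete, the sketch below uses only the eleven observations listed on the Individual Predictions slide (not the full sample, so the numbers are illustrative). It computes the two plotted rates at a chosen threshold P* and the area under the curve by comparing every actual-1 probability with every actual-0 probability. Function names are hypothetical.

```python
# Observed Y and Pr[Y=1] for the eleven listed observations.
observed = [0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0]
prob = [.0325, .1006, .0644, .6924, .9361, .9202, .0384, .5078, .1024, .0332, .5074]

def rates(threshold):
    """Return (share of actual 1s predicted 1, share of actual 0s predicted 1)."""
    ones = [p for y, p in zip(observed, prob) if y == 1]
    zeros = [p for y, p in zip(observed, prob) if y == 0]
    true_pos = sum(1 for p in ones if p > threshold) / len(ones)
    false_pos = sum(1 for p in zeros if p > threshold) / len(zeros)
    return true_pos, false_pos

def area_under_curve():
    """Share of (actual 1, actual 0) pairs ranked correctly; ties count one half."""
    ones = [p for y, p in zip(observed, prob) if y == 1]
    zeros = [p for y, p in zip(observed, prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in ones for q in zeros)
    return wins / (len(ones) * len(zeros))

tpr, fpr = rates(0.5)
auc = area_under_curve()
print(tpr, fpr, auc)
```

Sweeping the threshold from 1 down to 0 and plotting the pairs from `rates` traces the curve; an area of 0.5 corresponds to the 45° line.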

Page 49: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004
Page 50: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Aggregate Predictions

Frequencies of actual & predicted outcomes

Predicted outcome has maximum probability.

Threshold value for predicting Y=1 = .5000

          Predicted
------  ----------  +  -----
Actual     0     1  |  Total
------  ----------  +  -----
  0      151     1  |    152
  1       20    38  |     58
------  ----------  +  -----
Total    171    39  |    210

Page 51: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Analyzing Predictions

Frequencies of actual & predicted outcomes

Predicted outcome has maximum probability.

Threshold value for predicting Y=1 is P* = .5000.

(This table can be computed with any P*.)

               Predicted
------  ----------------------  +  -----
Actual      0           1       |  Total
------  ----------------------  +  -----
  0     N(a0,p0)    N(a0,p1)    |  N(a0)
  1     N(a1,p0)    N(a1,p1)    |  N(a1)
------  ----------------------  +  -----
Total   N(p0)       N(p1)       |  N

Page 52: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Analyzing Predictions - Success

Sensitivity = % actual 1s correctly predicted = 100N(a1,p1)/N(a1) % [100(38/58)=65.5%]

Specificity = % actual 0s correctly predicted = 100N(a0,p0)/N(a0) % [100(151/152)=99.3%]

Positive predictive value = % predicted 1s that were actual 1s = 100N(a1,p1)/N(p1) % [100(38/39)=97.4%]

Negative predictive value = % predicted 0s that were actual 0s = 100N(a0,p0)/N(p0) % [100(151/171)=88.3%]

Correct prediction = %actual 1s and 0s correctly predicted = 100[N(a1,p1)+N(a0,p0)]/N [100(151+38)/210=90.0%]

Page 53: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Analyzing Predictions - Failures

False positive for true negative = % actual 0s predicted as 1s = 100N(a0,p1)/N(a0) % [100(1/152)=0.658%]

False negative for true positive = % actual 1s predicted as 0s = 100N(a1,p0)/N(a1) % [100(20/58)=34.5%]

False positive for predicted positive = % predicted 1s that were actual 0s = 100N(a0,p1)/N(p1) % [100(1/39)=2.56%]

False negative for predicted negative = % predicted 0s that were actual 1s = 100N(a1,p0)/N(p0) % [100(20/171)=11.7%]

False predictions = % actual 1s and 0s incorrectly predicted = 100[N(a0,p1)+N(a1,p0)]/N [100(1+20)/210=10.0%]
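The success and failure measures are all ratios of cells in the 2x2 table of actual against predicted outcomes. The sketch below computes them from the four counts on the Aggregate Predictions slide; the function and key names are hypothetical labels for the quantities defined above.

```python
def prediction_measures(n_a0p0, n_a0p1, n_a1p0, n_a1p1):
    """Percentages derived from the table of actual (a) vs. predicted (p) outcomes."""
    n_a0, n_a1 = n_a0p0 + n_a0p1, n_a1p0 + n_a1p1   # row totals
    n_p0, n_p1 = n_a0p0 + n_a1p0, n_a0p1 + n_a1p1   # column totals
    n = n_a0 + n_a1
    return {
        "sensitivity": 100.0 * n_a1p1 / n_a1,
        "specificity": 100.0 * n_a0p0 / n_a0,
        "positive_predictive_value": 100.0 * n_a1p1 / n_p1,
        "negative_predictive_value": 100.0 * n_a0p0 / n_p0,
        "correct": 100.0 * (n_a1p1 + n_a0p0) / n,
        "false_pos_for_true_neg": 100.0 * n_a0p1 / n_a0,
        "false_neg_for_true_pos": 100.0 * n_a1p0 / n_a1,
        "false_pos_for_pred_pos": 100.0 * n_a0p1 / n_p1,
        "false_neg_for_pred_neg": 100.0 * n_a1p0 / n_p0,
        "false": 100.0 * (n_a0p1 + n_a1p0) / n,
    }

# Counts from the table: actual 0 -> (151, 1), actual 1 -> (20, 38).
m = prediction_measures(151, 1, 20, 38)
for name, value in m.items():
    print(f"{name:28s} {value:6.2f}%")
```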

Page 54: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Aggregate Prediction is a Useful Way to Assess the Importance of a Variable

Frequencies of actual & predicted outcomes. Predicted outcome has maximum probability. Threshold value for predicting Y=1 = .5000

Model fit without TTME

          Predicted
------  ----------  +  -----
Actual     0     1  |  Total
------  ----------  +  -----
  0      145     7  |    152
  1       48    10  |     58
------  ----------  +  -----
Total    193    17  |    210

Model fit with TTME

          Predicted
------  ----------  +  -----
Actual     0     1  |  Total
------  ----------  +  -----
  0      151     1  |    152
  1       20    38  |     58
------  ----------  +  -----
Total    171    39  |    210

Page 55: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Simulating the Model to Examine Changes in Market Shares

Suppose TTME increased by 25% for everyone.

Before increase

          Predicted
------  ----------  +  -----
Actual     0     1  |  Total
------  ----------  +  -----
  0      151     1  |    152
  1       20    38  |     58
------  ----------  +  -----
Total    171    39  |    210

After increase

          Predicted
------  ----------  +  -----
Actual     0     1  |  Total
------  ----------  +  -----
  0      152     0  |    152
  1       29    29  |     58
------  ----------  +  -----
Total    181    29  |    210

• The model predicts 10 fewer people would fly

• NOTE: The same model is used for both sets of predictions.

Page 56: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Scaling

$U_{itj} = \alpha_j + \beta_i' x_{itj} + \gamma_i' z_{it} + \varepsilon_{ijt}$

$\varepsilon_{ijt}$ = Unobserved random component of utility

Mean: $E[\varepsilon_{ijt}] = 0$, $\mathrm{Var}[\varepsilon_{ijt}] = 1$

• Why assume variance = 1?
• What if there are subgroups with different variances?
  - Cost of ignoring the between group variation?
  - Specifically modeling
• More general heterogeneity across people
  - Cost of the homogeneity assumption
  - Modeling issues

Page 57: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Choice Between Two Alternatives

• By way of example: Automobile type
  - Choices: (1) SUV or (2) Sedan, $J_i = 2$
  - One choice situation: $T_i = 1$
  - Attribute: $x_{ij}$ = price, perhaps others
  - Characteristic: $z_i$ = income
  - No variation in taste parameters, $\beta_i = \beta$
• What do revealed choices tell us?

Page 58: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

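Modeling the Binary Choice

$U_{i,suv} = \alpha_{suv} + \beta P_{suv} + \gamma_{suv}\,\text{Income} + \varepsilon_{i,suv}$

$U_{i,sed} = \alpha_{sed} + \beta P_{sed} + \gamma_{sed}\,\text{Income} + \varepsilon_{i,sed}$

Chooses SUV: $U_{i,suv} > U_{i,sed}$, i.e. $U_{i,suv} - U_{i,sed} > 0$

$(\alpha_{suv} - \alpha_{sed}) + \beta(P_{suv} - P_{sed}) + (\gamma_{suv} - \gamma_{sed})\,\text{Income} + \varepsilon_{i,suv} - \varepsilon_{i,sed} > 0$

$\varepsilon_i > -[\alpha + \beta(P_{suv} - P_{sed}) + \gamma\,\text{Income}]$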

Page 59: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Probability Model for Choice Between Two Alternatives

$\varepsilon_i > -[\alpha + \beta(P_{suv} - P_{sed}) + \gamma\,\text{Income}]$

Page 60: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Individual vs. Grouped Data

• Proportions and Frequencies
• Likelihood is the same

$$L = \prod_{i=1}^{N} \text{Prob}[y_i = 0]^{\,y_{0i}}\; \text{Prob}[y_i = 1]^{\,y_{1i}}$$

• $y_{ji}$ may be 1s and 0s, proportions, or frequencies for the two outcomes.

Page 61: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Weighting and Choice Based Sampling

• Weighted log likelihood for all data types

$$\log L = \sum_{i=1}^{N} w_i \left\{ y_{0i} \log \text{Prob}[y_i = 0 \mid x_i] + y_{1i} \log \text{Prob}[y_i = 1 \mid x_i] \right\}$$

• Endogenous weights for individual data
• "Biased" sampling – "Choice Based"

$$w_i(y_i) = \Pi(y_i) / P(y_i) = \frac{\text{True proportion of } y_i\text{s}}{\text{Sample proportion of } y_i\text{s}}$$

= a function of $y_i$ (two values)

Page 62: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Choice Based Sample

Sample Population Weight

Air 27.62% 14% 0.5068

Ground 72.38% 86% 1.1882
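The weights in the table are the population share divided by the sample share for each outcome. The sketch below reproduces them and shows how a weight enters one observation's log likelihood contribution; the function names and the example probability are hypothetical.

```python
import math

def choice_based_weights(population_share, sample_share):
    """w(y) = true proportion of outcome y / sample proportion of outcome y."""
    return {k: population_share[k] / sample_share[k] for k in population_share}

def weighted_contribution(chose_air, prob_air, weights):
    """One observation's term in the weighted log likelihood."""
    if chose_air:
        return weights["air"] * math.log(prob_air)
    return weights["ground"] * math.log(1.0 - prob_air)

weights = choice_based_weights(
    population_share={"air": 0.14, "ground": 0.86},
    sample_share={"air": 0.2762, "ground": 0.7238},
)
print(round(weights["air"], 4), round(weights["ground"], 4))

# Hypothetical fitted probability of 0.30 for a traveller who chose air.
term = weighted_contribution(True, 0.30, weights)
```

Because air travellers are over-represented in the sample, each one counts for about half an observation, while each ground traveller counts for somewhat more than one.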

Page 63: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Choice Based Sampling Correction

• Maximize Weighted Log Likelihood
• Covariance Matrix Adjustment

$V = H^{-1} G H^{-1}$ (all three weighted)

H = Hessian

G = Outer products of gradients

• "Robust" covariance matrix (?) (Above without weights. What is it robust to?)

Page 64: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Effect of Choice Based Sampling

Unweighted

+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
+---------+--------------+----------------+--------+---------+
 Constant   1.784582594      1.2693459       1.406    .1598
 GC         .2146879786E-01  .68080941E-02   3.153    .0016
 TTME      -.9846704221E-01  .16518003E-01  -5.961    .0000
 HINC       .2232338915E-01  .10297671E-01   2.168    .0302

+---------------------------------------------+
| Weighting variable                 CBWT     |
| Corrected for Choice Based Sampling         |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
+---------+--------------+----------------+--------+---------+
 Constant   1.014022236      1.1786164        .860    .3896
 GC         .2177810754E-01  .63743831E-02   3.417    .0006
 TTME      -.7434280587E-01  .17721665E-01  -4.195    .0000
 HINC       .2471679844E-01  .95483369E-02   2.589    .0096

Page 65: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Hypothesis Testing – Neyman/Pearson Comparisons of Likelihood Functions

Likelihood Ratio Tests Lagrange Multiplier Tests

Distance Measures: Wald Statistics

(All to be demonstrated in the lab)

Page 66: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Heteroscedasticity in Binary Choice Models

• Random utility: $Y_i = 1$ iff $\beta'x_i + \varepsilon_i > 0$
• Resemblance to regression: How to accommodate heterogeneity in the random unobserved effects across individuals?
• Heteroscedasticity – different scaling
  - Parameterize: $\mathrm{Var}[\varepsilon_i] = [\exp(\gamma'z_i)]^2$
  - Reformulate probabilities
  - Probit:

$$\text{Prob}[Y_i = 1] = \Phi\!\left( \frac{\beta'x_i}{\exp(\gamma'z_i)} \right)$$

• Partial effects are now very complicated

Page 67: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Application: Credit Data

• MAJORDRG = counts of major derogatory reports
• "Deadbeat" = 1 if MAJORDRG > 0
• Mean depends on AGE, INCOME, OWNRENT, SELFEMPLOYED
• Variance depends on AVGEXP, DEPENDT (average monthly expenditure, number of dependents)
• Probit model with heteroscedasticity

Page 68: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Probit with Heteroscedasticity

+---------------------------------------------+
| Binomial Probit Model                       |
| Dependent variable             DEADBEAT     |
| Number of observations             1319     |
| Log likelihood function       -639.3388     |
| Restricted log likelihood     -653.3217     |
| Chi-squared                    27.96596     |
| Degrees of freedom                    6     |
| Significance level         .9535906E-04     |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 Index function for probability
 Constant  -1.272312665      .13598690      -9.356    .0000
 AGE        .1126209389E-01  .40404726E-02   2.787    .0053   33.213103
 INCOME     .5286782288E-01  .20239074E-01   2.612    .0090   3.3653760
 OWNRENT   -.2049230056      .88518106E-01  -2.315    .0206   .44048522
 SELFEMPL   .1143040149      .13825044        .827    .4084   .68991660E-01
 Variance function
 AVGEXP    -.4768665802E-03  .12613317E-03  -3.781    .0002   185.05707
 DEPNDT     .6880605703E-02  .42546206E-01    .162    .8715   .99393480

+-------------------------------------------+
| Partial derivatives of E[y] = F[*] with   |
| respect to the vector of characteristics. |
| They are computed at the means of the Xs. |
| Observations used for means are All Obs.  |
+-------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 Index function for probability
 Constant  -.3768739381      .54283831E-01  -6.943    .0000
 AGE        .3335964337E-02  .12357954E-02   2.699    .0069   33.213103
 INCOME     .1566006938E-01  .65292318E-02   2.398    .0165   3.3653760
 OWNRENT   -.6070059841E-01  .24667682E-01  -2.461    .0139   .44048522
 SELFEMPL   .3385819023E-01  .41052591E-01    .825    .4095   .68991660E-01
 Variance function
 AVGEXP    -.1133874143E-03  .31868469E-04  -3.558    .0004   185.05707
 DEPNDT     .1636042704E-02  .10080807E-01    .162    .8711   .99393480
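As a rough check on how the two parts of this output fit together, the sketch below evaluates the heteroscedastic probit probability at the reported means and the partial effect of AGE, under the assumption that the probability is the standard normal CDF of the index divided by exp of the variance function, and that the partial effect of an index variable is the normal density at that ratio, divided by the scale, times its coefficient. Names are illustrative.

```python
import math

def normal_cdf(a):
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

def normal_pdf(a):
    return math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)

# (coefficient, mean of X) pairs copied from the output; the constant has mean 1.
index_terms = [
    (-1.272312665, 1.0),               # Constant
    (0.01126209389, 33.213103),        # AGE
    (0.05286782288, 3.3653760),        # INCOME
    (-0.2049230056, 0.44048522),       # OWNRENT
    (0.1143040149, 0.068991660),       # SELFEMPL
]
variance_terms = [
    (-0.0004768665802, 185.05707),     # AVGEXP
    (0.006880605703, 0.99393480),      # DEPNDT
]

index = sum(b * x for b, x in index_terms)
scale = math.exp(sum(g * z for g, z in variance_terms))   # assumed std. deviation
ratio = index / scale

prob_at_means = normal_cdf(ratio)
effect_age = normal_pdf(ratio) / scale * 0.01126209389
print(round(prob_at_means, 4), round(effect_age, 6))
```

Under these assumptions the AGE effect comes out close to the .003336 reported in the second panel, which suggests the scale in the denominator is exp of the variance function rather than its square root.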

Page 69: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Part 4

Panel Data Models for Binary Choice

Page 70: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Panel Data and Binary Choice Models

$U_{it} = \alpha + \beta'x_{it} + \varepsilon_{it}$ + Person i specific effect

Fixed effects using "dummy" variables

$U_{it} = \alpha_i + \beta'x_{it} + \varepsilon_{it}$

Random effects using omitted heterogeneity

$U_{it} = \alpha + \beta'x_{it} + (\varepsilon_{it} + v_i)$

Same outcome mechanism: $Y_{it} = \mathbf{1}[U_{it} > 0]$

Page 71: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Fixed and Random Effects Models

• Fixed Effects
  - Robust to both specifications
  - Inconvenient to compute (many parameters)
  - Incidental parameters problem
• Random Effects
  - Inconsistent if correlated with X
  - Small number of parameters
  - Easier to compute
• Computation – available estimators

Page 72: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Fixed Effects

• Dummy variable coefficients: $U_{it} = \alpha_i + \beta'x_{it} + \varepsilon_{it}$
• Can be done by "brute force" for 10,000s of individuals

$$\log L = \sum_{i=1}^{N} \sum_{t=1}^{T_i} \log F(\alpha_i + \beta'x_{it})$$

• F(.) = appropriate probability for the observed outcome
• Compute $\beta$ and $\alpha_i$ for i=1,…,N (may be large)
• See "Estimating Econometric Models with Fixed Effects" at www.stern.nyu.edu/~wgreene

Page 73: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

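Random Effects

• $U_{it} = \alpha + \beta'x_{it} + (\varepsilon_{it} + \sigma_v v_i)$
• Logit model (can be generalized)
• Joint probability for individual i | $v_i$ =

$$\prod_{t=1}^{T_i} F(\alpha + \beta'x_{it} + \sigma_v v_i)$$

• Unobserved component $v_i$ must be eliminated

$$\log L = \sum_{i=1}^{N} \log \int_{v_i} \prod_{t=1}^{T_i} F(\alpha + \beta'x_{it} + \sigma_v v_i)\, f(v_i)\, dv_i$$

• Maximize wrt $\alpha$, $\beta$ and $\sigma_v$
• How to do the integration?
  - Analytic integration – quadrature; most familiar software
  - Simulation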

Page 74: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

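Estimation by Simulation

$$\log L = \sum_{i=1}^{N} \log \int_{v_i} \prod_{t=1}^{T_i} F(\alpha + \beta'x_{it} + \sigma_v v_i)\, f(v_i)\, dv_i$$

is the sum of the logs of E[Pr(y1,y2,…|vi)]. Can be estimated by sampling vi and averaging. (Use random numbers.)

$$\log L_S = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T_i} F(\alpha + \beta'x_{it} + \sigma_v v_{ir})$$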
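The simulated log likelihood can be sketched as follows for a logit model with one regressor. This is a minimal illustration, not NLOGIT's implementation: it assumes the draws are standard normal, uses a fixed seed so the draws are reproducible, and takes F to be the logistic probability of the observed outcome. The data layout and function names are hypothetical.

```python
import math
import random

def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))

def simulated_loglik(groups, alpha, beta, sigma_v, n_draws=100, seed=0):
    """groups: one list per individual, each a list of (y, x) pairs with scalar x.
    For each individual, average the joint probability over n_draws values of v."""
    rng = random.Random(seed)
    total = 0.0
    for group in groups:
        average = 0.0
        for _ in range(n_draws):
            v = rng.gauss(0.0, 1.0)          # assumed: v ~ N(0, 1)
            joint = 1.0
            for y, x in group:
                p1 = logistic(alpha + beta * x + sigma_v * v)
                joint *= p1 if y == 1 else 1.0 - p1
            average += joint
        total += math.log(average / n_draws)
    return total

def pooled_loglik(groups, alpha, beta):
    """Ordinary logit log likelihood with no individual effect."""
    total = 0.0
    for group in groups:
        for y, x in group:
            p1 = logistic(alpha + beta * x)
            total += math.log(p1 if y == 1 else 1.0 - p1)
    return total

# Tiny made-up panel: 3 individuals, 2 periods each.
panel = [[(1, 0.5), (0, -0.2)], [(0, 1.0), (0, 0.3)], [(1, -1.0), (1, 0.8)]]
ll_no_effect = simulated_loglik(panel, alpha=0.1, beta=0.4, sigma_v=0.0)
ll_with_effect = simulated_loglik(panel, alpha=0.1, beta=0.4, sigma_v=1.0)
```

With sigma_v set to zero every draw gives the same joint probability, so the simulated value collapses to the pooled logit log likelihood, which is a convenient sanity check. In estimation the same draws would be reused at every iteration of the optimizer.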

Page 75: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

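Random Effects is Equivalent to a Random Constant Term

$U_{it} = \alpha + \beta'x_{it} + (\varepsilon_{it} + \sigma_v v_i) = (\alpha + \sigma_v v_i) + \beta'x_{it} + \varepsilon_{it} = \alpha_i + \beta'x_{it} + \varepsilon_{it}$

$\alpha_i$ is random with mean $\alpha$ and variance $\sigma_v^2$

View the simulation as sampling over $\alpha_i$

$$\log L_S = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T_i} F(\alpha_{ir} + \beta'x_{it})$$

• Why not make all the coefficients random?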

Page 76: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

A Sampling Experiment

• CLOGIT data using GC, TTME, INVT and HINC
• Standardized data: each Xit* is (Xit – Mean(X))/Sx
• Constructed utilities:

$U_{it} = 0 + 1\,GC_{it}^* + 1\,TTME_{it}^* + 1\,INVT_{it}^* + (\text{Random number}_{it} + HINC_i^*)$

• Treat 4 observations in each group as a panel with T = 4. (We will examine a "live" panel data set in the lab.)

Page 77: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Estimated Fixed Effects Model

+---------------------------------------------+
| FIXED EFFECTS Logit Model                   |
| Maximum Likelihood Estimates                |
| Dependent variable                    Z     |
| Weighting variable                 None     |
| Number of observations              840     |
| Iterations completed                  5     |
| Log likelihood function       -342.1919     |
| Sample is 4 pds and 210 individuals.        |
| Bypassed 51 groups with inestimable a(i).   |
| LOGIT (Logistic) probability model          |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 Index function for probability
 GC         .6708935970E-02  .18621919E-01    .360    .7186   112.29560
 TTME       .3648053834E-01  .57989428E-02   6.291    .0000   34.779874
 INVT       .3338438006E-02  .25104319E-02   1.330    .1836   492.25314
 INVC       .6795479927E-02  .19477804E-01    .349    .7272   48.448113
 Partial derivatives of E[y] = F[*] with respect to the characteristics.
 Computed at the means of the Xs.
 Estimated E[y|means,mean alphai]=  .501
 Estimated scale factor for dE/dx=  .250
 GC         .1677222976E-02  .46555287E-02    .360    .7186   112.29560
 TTME       .9120074679E-02  .14482840E-02   6.297    .0000   34.779874
 INVT       .8346040194E-03  .62727700E-03   1.331    .1833   492.25314
 INVC       .1698858823E-02  .48687627E-02    .349    .7271   48.448113

WHY?

Page 78: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Estimated Random Effects Model (1)

+---------------------------------------------+
| Logit Model for Panel Data                  |
| Maximum Likelihood Estimates                |
| Dependent variable                    Z     |
| Weighting variable                 None     |
| Number of observations              840     |
| Iterations completed                 15     |
| Log likelihood function       -494.6084     |
| Hosmer-Lemeshow chi-squared =  15.81181     |
| P-value=  .04515 with deg.fr. =       8     |
| Random Effects Logit Model for Panel Data   |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 Characteristics in numerator of Prob[Y = 1]
 Constant  -2.074416165      .20930847      -9.911    .0000
 GC         .9739427161E-02  .53423005E-02   1.823    .0683
 TTME       .8353847679E-02  .30194645E-02   2.767    .0057
 INVT       .1252315669E-03  .69864222E-03    .179    .8577
 INVC      -.1215241461E-02  .55156025E-02   -.220    .8256
 RndmEfct   .9492940742E-01  .18841088        .504    .6144  -.58755677E-07

Page 79: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Estimated Random Effects Model (2)

+---------------------------------------------+
| Random Coefficients Logit Model             |
| Maximum Likelihood Estimates                |
| Dependent variable                    Z     |
| Weighting variable                 None     |
| Number of observations              840     |
| Iterations completed                 14     |
| Log likelihood function       -494.5136     |
| Restricted log likelihood     -496.1793     |
| Chi-squared                    3.331300     |
| Degrees of freedom                    1     |
| Significance level         .6797315E-01     |
| Sample is 4 pds and 210 individuals.        |
| LOGIT (Logistic) probability model          |
| Simulation based on 100 random draws        |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 Nonrandom parameters
 GC         .1928882840E-01  .40879229E-02   4.718    .0000   110.87976
 TTME       .2364065236E-01  .24280249E-02   9.737    .0000   34.589286
 INVT       .5332059842E-03  .54092102E-03    .986    .3243   486.16548
 INVC      -.6668386903E-02  .41649216E-02  -1.601    .1094   47.760714
 Means for random parameters
 Constant  -2.942970074      .15967241     -18.431    .0000
 Scale parameters for dists. of random parameters
 Constant   .5338591567      .56357583E-01   9.473    .0000
 Conditional Mean at Sample Point    .4886
 Scale Factor for Marginal Effects   .2499
 GC         .4819681744E-02  .10205421E-02   4.723    .0000   110.87976
 TTME       .5907067980E-02  .59571899E-03   9.916    .0000   34.589286
 INVT       .1332316870E-03  .13504534E-03    .987    .3239   486.16548
 INVC      -.1666223679E-02  .10411841E-02  -1.600    .1095   47.760714

Page 80: Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Commands for Panel Data Models

• Model: LOGIT ; Lhs = … ; Rhs = … ; Pds = number of periods
• Common effect
  - Fixed effects: ; FEM $ or ; Fixed $
  - Random: ; Random Effects $
  - Simulation: ; RPM ; Fcn=One(N) $
• Use with Probit, Logit (and many others)