Limited Dependent Variables: Event Counts. Adapted primarily from John McIver’s notes, Hoffman’s “Generalized Linear Models,” and Long’s “Regression Models for Categorical and Limited Dependent Variables”

Upload: imani-radford

Post on 31-Mar-2015



Event Counts

• The DV is…
  – Event count models are models where the dependent variable is a count of events, i.e., the number of occurrences in a fixed domain.
  – The domain may be a unit of time (minute, day, year) or units in fixed time (an individual or geographic unit).

• The DV is not…
  – Grouped binary data
    • Data which are the number of “successes” (or “failures”) out of some known number of binary trials (# of failed coups, # successful veto overrides)
    • Political Knowledge measures?
  – Ordinal data
    • Use ordered logit or ordered probit


Counts as DVs

• Political protests in a nation in a year (Kasler 1996)

• Number of lynchings per county per year in the South (Tolnay, Deane, and Beck 1996)

• Number of retirements per year on the Supreme Court (Hagle 1993)


Characteristics of Event Data

• 1) Event counts are non-negative (lower bound is zero)

• 2) Counts are integers (discrete, rather than continuous variables): 2.7 children??

• 3) A histogram will indicate a rapidly decreasing tail, esp. w/ rare phenomena

• 4) Distribution is not normal (in most cases)
  – Poisson or negative binomial

[Figure: histogram of polpart (0 to 8) with density (0 to .5) on the vertical axis. Source: 1996 National Black Election Study]


How do we estimate these regression models?

• Maximum Likelihood Estimation
  – Find the parameter of interest (lambda, Beta, p) given a set of data.
  – MLE finds the value of the parameter that makes the observed data most likely.
  – Liabilities (or assets…) of MLE; all three properties hold only in large samples:
    • Consistency: as sample size increases, the estimate converges on the true parameter value (bias shrinks)
    • Asymptotic efficiency: smallest variance among consistent estimators
    • Asymptotic normality: estimates are approximately normally distributed, which permits hypothesis testing
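A minimal sketch of the idea that MLE picks the parameter value making the observed data most likely, using the simplest count model (a single Poisson rate, no covariates). The counts below are hypothetical and not taken from the NBES data; the grid search stands in for the numerical optimizer a statistics package would use.

```python
import math

def poisson_log_likelihood(lam, counts):
    """Log-likelihood of independent Poisson counts at rate lam."""
    return sum(-lam + y * math.log(lam) - math.lgamma(y + 1) for y in counts)

# Hypothetical counts, for illustration only (not from the NBES data).
counts = [0, 1, 1, 2, 0, 3, 1, 2, 4, 1]

# Evaluate the log-likelihood on a grid of candidate rates from 0.01 to 5.00.
grid = [k / 100 for k in range(1, 501)]
lam_hat = max(grid, key=lambda lam: poisson_log_likelihood(lam, counts))

sample_mean = sum(counts) / len(counts)
print(f"grid-search MLE: {lam_hat:.2f}, sample mean: {sample_mean:.2f}")
```

For a Poisson rate with no covariates, the likelihood peaks at the sample mean, which is why the two printed values agree.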


Why not OLS?

• OLS assumes a linear relationship
  – This assumption will often produce predicted event counts less than zero (a logical impossibility).
  – This assumption also means that the difference between 0 and 1 event in a given unit is treated the same as the difference between 10 and 11 events or between 100 and 101 events.

• Heteroskedasticity is likely (and a certainty if the events are Poisson distributed, as counts commonly are, because the variance then rises with the mean).

• So OLS is…inaccurate, inconsistent, biased and inefficient. Yuck.


But not always…

• When OLS is okay…
  – As lambda (rate of the event) increases, the DV will increasingly appear to follow a normal distribution


The Poisson Distribution

• Count variables, especially when measuring a rare phenomenon, often follow a Poisson distribution.

• Lambda (λ) is known as the rate in the context of the Poisson distribution.
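For reference, the Poisson probability mass function in standard notation. The symbol y for the observed count is a notational choice made here, not one taken from the slides.

```latex
\[
  \Pr(Y = y \mid \lambda) = \frac{e^{-\lambda}\,\lambda^{y}}{y!},
  \qquad y = 0, 1, 2, \ldots
\]
```

Both the mean and the variance of this distribution equal λ.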


Probability of Number of Events in a Poisson Distribution

• If the average number of political acts per year, based on past data, is 2, then we expect the probability of one political act in the next year would be…?

Lambda = 2

Number of Events    P(i)
0                   0.135335
1                   0.270671
2                   0.270671
3                   0.180447
4                   0.090224
5                   0.036089
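The probabilities for lambda = 2 can be reproduced with a few lines of Python, as a sketch using the standard Poisson formula P(i) = exp(−λ) λ^i / i!. The function name `poisson_pmf` is a label chosen here for illustration.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the rate is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 2
for k in range(6):
    print(k, round(poisson_pmf(k, lam), 6))
```

The answer to the question on the slide is the k = 1 row: with an average of 2 acts per year, the probability of exactly one act next year is about 0.27.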


Assumptions of Poisson

1) The mean of the distribution equals its variance (a.k.a. equidispersion)

2) Events that make up the Poisson distribution are assumed to be independent
  – A lack of independence can lead to a violation of Assumption 1, known as overdispersion.
    • A different distribution is used for such data: the overdispersed Poisson or the negative binomial.


Negative Binomial (overdispersed data) v. Poisson Distribution

• Non-electoral PTP
  – Mean = 1.59
  – Var = 2.08

• Electoral PTP
  – Mean = 1.37
  – Var = 1.33

[Figures: density histograms of polpart (0 to 8) and nonepolpart (0 to 6)]
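A quick, informal screen for overdispersion is the variance-to-mean ratio, which is 1 under equidispersion. The sketch below applies it to the two summary statistics reported above. It is only a rough check on the raw DV; it does not condition on any covariates and is not a formal test.

```python
# Means and variances as reported on the slide.
measures = {
    "Non-electoral PTP": {"mean": 1.59, "var": 2.08},
    "Electoral PTP": {"mean": 1.37, "var": 1.33},
}

def dispersion_ratio(mean, var):
    """Variance-to-mean ratio; equals 1.0 under equidispersion."""
    return var / mean

ratios = {name: dispersion_ratio(**m) for name, m in measures.items()}
for name, r in ratios.items():
    print(f"{name}: variance/mean = {r:.2f}")
```

The non-electoral measure has a ratio of about 1.31 (variance above the mean), while the electoral measure sits close to 1.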


Poisson Regression Model

• Goal
  – Estimate the increase in the DV for a unit change in the IV
  – Predict expected counts for various groups

• Intuition
  – We use the regression equation to come up with the expected “log-number” of events and then exponentiate this quantity to obtain a predicted count
  – Interpretation of coefficients is done in a similar way (by exponentiating them)
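The “log-number” intuition corresponds to the usual log-link form of the Poisson regression model. The notation below (λ_i for the expected count of observation i, x_{ik} for its covariates) is the standard textbook form and is supplied here as an assumption, since the slide gives the idea only in words.

```latex
\[
  \ln \lambda_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}
  \quad\Longleftrightarrow\quad
  \lambda_i = E(y_i \mid \mathbf{x}_i)
            = \exp(\beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik})
\]
```

Because the right-hand side is an exponential, predicted counts can never be negative.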


Poisson Regression: Electoral Participation

• What causes African Americans to participate in more political acts?
• Does education affect the number of political acts?

by educdum: sum polpart

-> educdum = High School or Less

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
     polpart |       335    1.080597    .9922211          0          5

-------------------------------------------------------------------------
-> educdum = More than HS

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
     polpart |       517    1.560928    1.198619          0          6


Poisson regression in Stata

• Generic code:
  – poisson dv iv (e.g., poisson polpart educdum)


Interpretation

• The sign of a coefficient indicates the direction of its effect on the expected count.

• Incidence Rate Ratios
  – In the Poisson case, the quantity of interest is known as the incidence rate, that is, λ. The natural way to compare two observations, then, is the “incidence rate ratio” (or IRR).


Incidence Rate Ratios

• For a binary covariate X_D, we can think of the IRR as the ratio…

That is, we can tell the relative change in the incidence rate for a one-unit change in any given variable X_k by simply exponentiating its coefficient estimate β_k.
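The ratio itself did not survive in this transcript. In standard notation it is usually written as follows, shown here for a model with an intercept β_0 and the binary covariate only; this reconstruction is an assumption, not a copy of the original slide.

```latex
\[
  \mathrm{IRR}
  = \frac{E(Y \mid X_D = 1)}{E(Y \mid X_D = 0)}
  = \frac{\exp(\beta_0 + \beta_D)}{\exp(\beta_0)}
  = \exp(\beta_D)
\]
```

With additional covariates held at the same values in the numerator and denominator, their terms cancel in the same way, leaving exp(β_D).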


Interpretation: Expected Counts and Incidence Rate Ratios

• In our case, then:
  – Expected number of acts among those w/ HS educ or less (x=0):
    • exp(0.0775137) = 1.08
  – Expected number of acts among those w/ more than HS educ (x=1):
    • exp(0.0775137 + 0.3677671) = 1.56

• This means that the incidence rate for those with more than a HS education is 1.56 / 1.08 = 1.44 times that for those with a HS education or less

• We can also calculate percent differences between these groups:
  – Percent difference = (1.56 – 1.08) / 1.08 = 44% increase in political acts

Formula for Expected Counts: E(Y | x) = exp(β0 + β1x)
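The arithmetic on this slide can be checked with a short script. This is a sketch that plugs the two reported coefficients into E(Y | x) = exp(b0 + b1·x); the variable names `b0` and `b1` are labels chosen here.

```python
import math

b0 = 0.0775137  # intercept reported on the slide
b1 = 0.3677671  # coefficient on educdum reported on the slide

def expected_count(x):
    """Expected number of political acts for educdum = x (0 or 1)."""
    return math.exp(b0 + b1 * x)

low = expected_count(0)
high = expected_count(1)
irr = high / low

print(f"HS or less:   {low:.2f}")
print(f"More than HS: {high:.2f}")
print(f"IRR: {irr:.2f} (exp(b1) = {math.exp(b1):.2f})")
print(f"Percent difference: {100 * (irr - 1):.0f}%")
```

The two expected counts match the group means in the summary statistics (1.08 and 1.56), as they should for a model with a single binary covariate.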


An extended model


Quantities of Interest

In the example, this means that the estimated IRR for the education variable is equal to

exp(0.10274) = 1.11.

• This means that a one-unit change in the level of education variable corresponds to an estimated IRR of 1.11.
  – i.e., increasing a respondent’s level of education by one unit increases the estimated incidence rate by a factor of 1.11, or about 11% more political acts, ceteris paribus.


Stata reports IRRs as well (add the irr option, e.g., poisson dv ivs, irr)


Percent Change in Expected Count

• For an 8 unit increase in education (min to max), this means we will see (all else equal):
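The result that followed on the original slide is missing from this transcript. One standard way to compute it is percent change = 100 × [exp(β × δ) − 1], where δ is the size of the change in the covariate. The sketch below applies that formula to the education coefficient of 0.10274 reported for the extended model; the variable names are chosen here for illustration.

```python
import math

b_educ = 0.10274  # education coefficient from the extended model
delta = 8         # min-to-max change in the education variable

factor = math.exp(b_educ * delta)
percent_change = 100 * (factor - 1)

print(f"factor change:  {factor:.2f}")
print(f"percent change: {percent_change:.1f}%")
```

Under these inputs the expected count rises by a factor of roughly 2.27, i.e., an increase of about 127%, all else equal.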


Calculating Expected Counts

• For a typical case (education = 4.08 [some college], contacted = 0, efficacy = 0.49, female = 1), the predicted count would be:

E(Y|mean of Xi) = exp[−0.434 + (0.103 × 4.08) + (0.462 × 0) + (0.365 × 0.49) + (−0.051 × 1)] = exp(0.11409) = 1.12


Expected Counts

• You can accordingly calculate the change in expected counts by calculating the predicted count for different values of Xi, and taking the difference.
  – The expected count for the same person (on the previous slide), but who was contacted, would be exp(0.57609) = 1.78
  – So, being contacted results in a (1.78 − 1.12) ≈ 0.66 increase in political acts.
  – Note that 1.78/1.12 = 1.59, which is the same as the IRR for a one-unit change in contacted.

• Stata way:
  – “predict polpart1, n” where ‘n’ requests predicted counts rather than probabilities
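The hand calculation for the typical case, and for the same person once contacted, can be sketched as follows. The coefficients are the rounded values quoted on the slides; the dictionary layout and names are choices made here, not part of the original analysis.

```python
import math

# Rounded coefficients and typical-case values as given on the slides.
coefs = {"const": -0.434, "education": 0.103, "contacted": 0.462,
         "efficacy": 0.365, "female": -0.051}
typical = {"const": 1, "education": 4.08, "contacted": 0,
           "efficacy": 0.49, "female": 1}

def expected_count(values):
    """exp(XB) for one profile of covariate values."""
    xb = sum(coefs[name] * values[name] for name in coefs)
    return math.exp(xb)

not_contacted = expected_count(typical)
contacted = expected_count({**typical, "contacted": 1})

print(f"not contacted: {not_contacted:.2f}")
print(f"contacted:     {contacted:.2f}")
print(f"difference:    {contacted - not_contacted:.2f}")
print(f"ratio:         {contacted / not_contacted:.2f}")
```

The ratio equals exp(0.462) regardless of the other covariate values, whereas the difference in counts depends on where the other covariates are held.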


Expected Political Acts as Education Increases (other IVs at mean or mode)

[Figure: Number of Electoral Political Acts (vertical axis: Number of Acts, 0 to 2) plotted against R's Level of Education (1 to 9). Source: 1996 NBES]

Education    XB           Number of Political Acts
1            -0.20352     0.815857365
2            -0.10078     0.904135774
3             0.001964    1.001966193
4             0.104704    1.110382181
5             0.207444    1.230529129
6             0.310184    1.363676366
7             0.412924    1.511230566
8             0.515664    1.674750607
9             0.618404    1.855964047
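The last column of the table is simply exp(XB). A short sketch that exponentiates the XB values as printed above and shows that each step up in education multiplies the expected count by the same factor; small differences in the final digits relative to the table come from the XB values being rounded.

```python
import math

# XB values copied from the table (education levels 1 through 9).
xb = [-0.20352, -0.10078, 0.001964, 0.104704, 0.207444,
      0.310184, 0.412924, 0.515664, 0.618404]

counts = [math.exp(v) for v in xb]
# Ratio of each expected count to the one at the previous education level.
ratios = [b / a for a, b in zip(counts, counts[1:])]

for level, (v, c) in enumerate(zip(xb, counts), start=1):
    print(f"{level}  {v:9.6f}  {c:.6f}")
print("step-to-step ratios:", [round(r, 3) for r in ratios])
```

Every ratio is about 1.108, the IRR for education: the effect is constant in multiplicative terms, so the absolute increase in acts grows at higher levels of education.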


Alternatives to Poisson

• The assumption that the mean equals the variance is often unrealistic
  – Overdispersed data: Variance exceeds the mean
  – Problems:
    • Poisson is consistent, but inefficient
    • SEs are biased downward using Poisson, resulting in larger z-values (incorrect inferences)

• Solutions:
  a) Extradispersed Poisson Regression
  b) Negative binomial regression model


Extradispersed Poisson Regression Model

• Accounts for the fact that the variance of the DV differs from the mean
  – Affects only the standard errors of the model
    • SE_Extradispersed = SE_Unadjusted * sqrt(dispersion)
  – Point estimates are the same (rates, IRRs, predicted counts)

• In Stata:
  – glm dv ivs, family(poisson) link(log) scale(dev) irls
  – predict dv1, mu
    Note that we use ‘mu’ instead of ‘n’; ‘mu’ is the general option for requesting predicted values when using glm.
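The standard-error adjustment can be illustrated with a few lines of arithmetic. All numbers below are hypothetical and chosen only to show the direction of the effect; the dispersion value stands for the deviance divided by the residual degrees of freedom, which is what scale(dev) uses.

```python
import math

def extradispersed_se(se_unadjusted, dispersion):
    """Scale a Poisson standard error by the square root of the dispersion."""
    return se_unadjusted * math.sqrt(dispersion)

# Hypothetical values, for illustration only.
coef = 0.103
se_poisson = 0.050
dispersion = 1.44  # e.g. deviance divided by residual degrees of freedom

se_adj = extradispersed_se(se_poisson, dispersion)
print(f"adjusted SE: {se_adj:.3f}")
print(f"z before: {coef / se_poisson:.2f}, z after: {coef / se_adj:.2f}")
```

With a dispersion above 1 the standard error grows and the z-value shrinks, while the coefficient itself is untouched.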



Non-Electoral Participation via Poisson


Non-Electoral Participation via Extradispersed Poisson


Negative Binomial

• Assumes that the variance is larger than the mean
  – More appropriate than Poisson in the common situation where the events of interest are not independent
  – Follows a different probability mass function

• Stata
  – nbreg dv ivs
  – nbreg dv ivs, irr
  – predict dv1, n
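The slide does not give the variance function. In the common parameterization of the negative binomial regression model (the one in Long's text and, to the best of my knowledge, the default for nbreg), the conditional variance adds a quadratic term governed by the dispersion parameter alpha:

```latex
\[
  \operatorname{Var}(y_i \mid \mathbf{x}_i) = \mu_i + \alpha\,\mu_i^{2},
  \qquad \mu_i = \exp(\mathbf{x}_i \boldsymbol{\beta}),
  \qquad \alpha \ge 0
\]
```

When alpha = 0 the variance collapses to the mean and the model reduces to Poisson, which is why tests for overdispersion are framed as tests of alpha = 0.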


Non-electoral PTP by Negative Binomial


Testing for Overdispersion

• In addition to examining whether or not we can reject the null that alpha = 0, we can also test for overdispersion using the log likelihoods from both the Poisson and the NBRM models:

G² = 2(ln L_NBRM − ln L_PRM)

tests the null hypothesis that alpha = 0.

• G² is distributed as χ², and the two values in the parentheses are the log likelihoods from the NBRM and Poisson regressions
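A minimal sketch of the likelihood-ratio calculation, assuming a chi-squared reference distribution with one degree of freedom (alpha is the single extra parameter). The log-likelihood values are hypothetical. Because alpha cannot be negative, Stata reports a boundary-adjusted version of this test (labelled chibar2) whose p-value is half the one computed here.

```python
import math

def overdispersion_lr_test(ll_prm, ll_nbrm):
    """Return (G2, p) for H0: alpha = 0, using a chi-squared(1) reference."""
    g2 = 2 * (ll_nbrm - ll_prm)
    # Upper tail of chi-squared with 1 df: P(Z^2 > g2) = erfc(sqrt(g2 / 2)).
    p = math.erfc(math.sqrt(max(g2, 0.0) / 2))
    return g2, p

# Hypothetical log-likelihoods, for illustration only.
g2, p = overdispersion_lr_test(ll_prm=-1100.0, ll_nbrm=-1090.0)
print(f"G2 = {g2:.2f}, p = {p:.2g}")
```

A large G² (small p) is evidence against alpha = 0 and therefore in favor of the negative binomial over the Poisson.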


Which regression model to use?

• No generally accepted rule of thumb regarding how much extradispersion is allowable before switching from Poisson to Negative Binomial (Hoffman 2004; Cameron and Trivedi 1998)
  – Estimate both Poisson and negative binomial
  – Compare results
  – If alpha is greater than zero and results differ, use negative binomial.
  – If variance is smaller than the mean (rare), negative binomial is not appropriate. Extradispersed Poisson will probably be the best route.

• Differences tend to affect SEs rather than coefficients (significance of variables rather than estimated coefficients).


Diagnostic Tests for Poisson

Residual analysis

• Compute deviance residuals and predicted counts
  – Plot against one another looking for poor fit and influential observations
    • Stata
      – predict count, mu
      – predict dev1, deviance
  – Plot deviance residuals against each IV (if IVs are continuous random variables)
    • A pattern suggests that a different functional form is needed
  – Plot deviance residuals in a normal probability (Q-Q) plot to examine distribution
    • Residuals should fall along diagonal
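For readers who want to see what a deviance residual is, the sketch below uses the standard Poisson formula d = sign(y − μ) · sqrt(2[y·ln(y/μ) − (y − μ)]), with the y·ln(y/μ) term taken as 0 when y = 0. The observed counts and fitted means are hypothetical, and the function name is chosen here for illustration.

```python
import math

def poisson_deviance_residual(y, mu):
    """Signed square root of one observation's contribution to the Poisson deviance."""
    # y * ln(y / mu) is taken as 0 when y == 0.
    term = y * math.log(y / mu) if y > 0 else 0.0
    d2 = 2 * (term - (y - mu))
    return math.copysign(math.sqrt(max(d2, 0.0)), y - mu)

# Hypothetical observed counts and fitted means, for illustration only.
pairs = [(0, 1.0), (2, 2.0), (3, 1.0), (1, 2.5)]
for y, mu in pairs:
    print(y, mu, round(poisson_deviance_residual(y, mu), 3))
```

Observations with more events than predicted get positive residuals, those with fewer get negative ones, and a perfect prediction gives zero.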


Residuals Plotted against Predicted Counts of Political Acts

• twoway (scatter dev1 count)

[Figure, Graph 1: deviance residual (−2 to 3) plotted against predicted mean polpart (0 to 4)]

QQ Plot of Residuals Against Normal Probability

• qnorm dev1

[Figure, Graph 2: deviance residual (−4 to 4) plotted against inverse normal (−4 to 4)]

• Graph 1 indicates that there may be some observations at the top of the plot that may be influential or indicate that the model is misspecified.

• Graph 2 indicates that the residuals generally follow a normal distribution, indicating our estimator choice is likely appropriate


Extensions

• Zero-inflated or zero-modified count models
  – Number of 0s in a sample exceeds number predicted under Poisson or negative binomial

• Truncated count model
  – Count variables observed only after the first count occurs (“hurdle” models)
    • Number of alcoholic beverages in a day (Hoffman 2004)


Empirical Examples of Event Counts (Poisson Regression)

• D. Canon (1993) “Sacrificial Lambs or Strategic Politicians? Political Amateurs in US House Elections.” AJPS 37: 1119-1141.

• J. Robertson (1983) “Inflation, Unemployment and Government Collapse.” Comparative Political Studies 15: 425-444.

• T. Shields & C. Huang (1995) “Presidential Vetoes: An Event Count Model.” PRQ 48: 559-572

• J. Spriggs II & P. Wahlbeck (1995) “Calling It Quits: Strategic Retirement on the Federal Courts of Appeals, 1893-1991.” PRQ 48: 573-597.

• T. Volgy & L. Imwalle (1995) “Hegemonic and Bipolar Perspectives on the New World Order.” AJPS 39: 819-834.

• M. Koch & S. Cranmer (2009) “Testing the “Dick Cheney” Hypothesis: Do Governments of the Left Attract more Terrorism than Governments of the Right?”


References

• Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage Publications.

• Gujarati, Damodar N. 2003. Basic Econometrics. 4th Edition. Singapore: McGraw-Hill.

• Hoffman, John P. 2003. Generalized Linear Models. Boston: Pearson Education Inc.

• King, Gary. 1988. “Statistical Models for Political Science Event Counts: Bias in Conventional Procedures and Evidence for the Exponential Poisson Regression Model.” American Journal of Political Science 32: 838-863.

• King, Gary. 1989. “Variance Specification in Event Count Models: From Restrictive Assumptions to a Generalized Estimator.” American Journal of Political Science 33: 762-784.

• King, Gary. 1989. “Event Count Models for International Relations: Generalizations and Applications.” International Studies Quarterly 33: 123-147.