Latent Class Analysis of Rotation Group Bias: The Case of Unemployment



Paul Biemer, UNC and RTI

Bac Tran, US Census Bureau

Jane Zavisca, University of Arizona

SAMSI Conference, 11/10/2005


Overview

Motivation: To understand measurement error in the official unemployment rate

Method: Latent Class Analysis: measurement error as classification error

Distinction from previous research: Focus on measurement error mechanisms, as opposed to correcting marginal estimates.

Ultimate goal: To improve survey design.

The Official Unemployment Rate

Employment Status (share of population): Employed 62.4%, Unemployed 3.6%, Not in Labor Force 34.0%

Unemployment Rate (share of those in the labor force): Employed 94.5%, Unemployed 5.5%

Source: The Current Population Survey, 2004
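The two panels are linked by a simple ratio: the unemployment rate is the unemployed share divided by the labor force share (employed plus unemployed). A minimal Python sketch using the 2004 shares quoted on this slide; the function name is illustrative and not part of the original talk.

```python
def unemployment_rate(employed_pct, unemployed_pct):
    """Unemployed as a percentage of the labor force (employed + unemployed)."""
    labor_force_pct = employed_pct + unemployed_pct
    return 100.0 * unemployed_pct / labor_force_pct

# Population shares from the slide (CPS, 2004).
# Persons not in the labor force are excluded from the denominator.
rate = unemployment_rate(employed_pct=62.4, unemployed_pct=3.6)
print(f"{rate:.1f}%")  # prints 5.5%
```

Because the denominator is only about two thirds of the population, a small shift of cases between Unemployed and Not in Labor Force moves the published rate noticeably.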

The Official Unemployment Rate

Categories

Employed: worked at least one hour in previous week, or temporarily absent from job.

Unemployed: not employed and “actively” looking for work (unprompted categories), or temporarily laid off.

Not in Labor Force (NILF): All others.

Evidence for Measurement Error in Labor Force Status (LFS) in the CPS

1. Re-interview inconsistency

2. Rotation group bias

Re-interview Inconsistency

1% random sample of original sample of ≈ 50,000 households is re-interviewed monthly (without replacement).

Re-interview occurs in same week as the original interview.

Inconsistent responses suggest measurement error.

Re-interview Inconsistency (2001-2003)

8.9% of cases are inconsistently classified.

Percent of all cases. Rows: first interview. Columns: reinterview.

First Interview   Empl.   Unempl.   NILF     All
Empl.              58.2     0.4      4.2     62.7
Unempl.             0.5     1.9      1.0      3.4
NILF                2.0     0.8     31.0     33.9
All                60.7     3.1     36.2    100.0
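The 8.9% figure is the sum of the off-diagonal cells of the interview-by-reinterview table. A short Python sketch of that arithmetic, using the cell percentages from the slide; the function name is illustrative, and the result does not depend on which interview is taken as the rows.

```python
# Cell percentages from the slide (Empl., Unempl., NILF in both dimensions).
table = [
    [58.2, 0.4, 4.2],
    [0.5, 1.9, 1.0],
    [2.0, 0.8, 31.0],
]

def gross_inconsistency(t):
    """Sum of off-diagonal cells: cases classified differently in the two interviews."""
    size = len(t)
    return sum(t[i][j] for i in range(size) for j in range(size) if i != j)

inconsistent = gross_inconsistency(table)
print(f"{inconsistent:.1f}% inconsistently classified")
```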

Unemployment Inconsistency (2001-2003)

Row percentages. Rows: first interview. Columns: reinterview.

First Interview   Empl.   Unempl.   NILF    All
Empl.              92.6     0.6      6.7    100
Unempl.            13.8    56.4     29.8    100
NILF                6.0     2.4     91.6    100

Rotation Group Design

Month-in-sample of each panel, by panel begin date, for sample months January 2003 through July 2004 (J F M A M J J A S O N D J F M A M J J):

BEGIN DATE   MONTH-IN-SAMPLE VALUES SHOWN
Oct-01       8
Nov-01       7 8
Dec-01       6 7 8
Jan-02       5 6 7 8
Feb-02       5 6 7 8
Mar-02       5 6 7 8
Apr-02       5 6 7 8
May-02       5 6 7 8
Jun-02       5 6 7 8
Jul-02       5 6 7 8
Aug-02       5 6 7 8
Sep-02       5 6 7 8
Oct-02       4 5 6 7 8
Nov-02       3 4 5 6 7 8
Dec-02       2 3 4 5 6 7 8
Jan-03       1 2 3 4 5 6 7 8
Feb-03       1 2 3 4 5 6 7 8
Mar-03       1 2 3 4 5 6 7 8
Apr-03       1 2 3 4 5 6 7 8
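The chart follows the CPS rotation pattern in which a panel is interviewed for four consecutive months, rests for eight, and returns for four more. The sketch below computes month-in-sample from a begin date and a sample month under that 4-8-4 assumption; the function names and the (year, month) tuple convention are illustrative choices, not taken from the slides.

```python
def months_between(begin, sample):
    """Whole months from begin (year, month) to sample (year, month)."""
    return (sample[0] - begin[0]) * 12 + (sample[1] - begin[1])

def month_in_sample(begin, sample):
    """Month-in-sample (1-8) under an assumed 4-8-4 rotation, or None if out of sample."""
    k = months_between(begin, sample)
    if 0 <= k <= 3:        # first four consecutive months
        return k + 1
    if 12 <= k <= 15:      # returns after an eight-month rest
        return k - 7
    return None

# The panel that began in October 2001 is in its last month in January 2003.
print(month_in_sample((2001, 10), (2003, 1)))  # prints 8
```

Under this pattern, each sample month contains eight panels, one at each month-in-sample, which is what makes rotation group comparisons within a month possible.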

Rotation Group Bias (2002 Full CPS)

[Figure: unemployment rate (vertical axis, 5 to 7) by month-in-sample (horizontal axis, 1 to 8).]

What Could Cause Rotation Group Bias?

Non-response bias: rotation groups may represent different populations.

Differences in interview setting
  telephone vs. face-to-face
  proxy vs. self

Time in sample effect
  Improved understanding of questionnaire
  Embarrassment at admitting prolonged unemployment
  Interview changes behavior

Latent Class Analysis to Test Hypotheses: Sources of Rotation Group Bias

Non-response bias (different populations): Does latent employment status vary by rotation group?

Measurement error: Does rotation group influence error rates?

Differences in setting:
  Does interview mode (telephone vs. face-to-face) at the initial interview influence error rates?
  Does interview mode account for apparent rotation group effects on error rates?

Social pressure:
  Gender influences latent employment status. Does gender also influence error rates?
  Does the effect of rotation group vary by gender?

Correlation between Month-in-Sample and Interview Mode

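[Figure: bar chart of interview mode (In Person vs. By Phone), in percent, by month in sample (1, 2-4, 5, 6-8). Bar values as extracted, one series each: 20, 88, 40, 88 and 80, 12, 60, 12.]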

Re-interview Data Set

N = 24,297 (un-weighted data)

X = True Labor Force Status (Latent Variable)

A = Observed Labor Force Status at Initial Interview

B = Observed Labor Force Status at Time 2 (Reinterview)

Basic Latent Class Model

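[Path diagram: X -> A, X -> B]

$$\pi^{ABX}_{ijt} = \pi^{X}_{t} \, \pi^{A|X}_{i|t} \, \pi^{B|X}_{j|t}$$

$$\ln\left(f^{ABX}_{ijt}\right) = \lambda + \lambda^{X}_{t} + \lambda^{A}_{i} + \lambda^{B}_{j} + \lambda^{AX}_{it} + \lambda^{BX}_{jt}$$

Shorthand: X, A|X, B|X

(with usual constraints for identifiability)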
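The basic model says the observed interview-by-reinterview table is a mixture over the latent status X, with A and B independent given X. A minimal Python sketch of that local-independence structure follows. All parameter values are hypothetical, chosen only to illustrate the calculation; they are not estimates from the talk.

```python
def joint_table(prior, p_a, p_b):
    """P(A=i, B=j) = sum over t of P(X=t) * P(A=i | X=t) * P(B=j | X=t)."""
    n = len(prior)
    return [
        [sum(prior[t] * p_a[t][i] * p_b[t][j] for t in range(n)) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical parameters (order of states: E, U, N).
prior = [0.62, 0.04, 0.34]        # P(X = t)
p_a = [[0.97, 0.01, 0.02],        # P(A | X = E)
       [0.12, 0.78, 0.10],        # P(A | X = U)
       [0.04, 0.02, 0.94]]        # P(A | X = N)
p_b = p_a                         # assume the same error rates at reinterview

table = joint_table(prior, p_a, p_b)
total = sum(sum(row) for row in table)
agreement = sum(table[i][i] for i in range(3))
print(f"total={total:.3f}, agreement={agreement:.3f}")
```

With only A and B observed, the 3x3 table has 8 free cells, fewer than the number of free parameters in this model, which is one reason grouping variables and constraints are introduced on the following slides.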

Grouping Variable

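[Path diagram: S -> X, X -> A, X -> B]

$$\pi^{ABXS}_{ijts} = \pi^{S}_{s} \, \pi^{X|S}_{t|s} \, \pi^{A|X}_{i|t} \, \pi^{B|X}_{j|t}$$

$$\ln\left(f^{ABXS}_{ijts}\right) = \lambda + \lambda^{S}_{s} + \lambda^{X}_{t} + \lambda^{A}_{i} + \lambda^{B}_{j} + \lambda^{XS}_{ts} + \lambda^{AX}_{it} + \lambda^{BX}_{jt}$$

Shorthand: S, X|S, A|X, B|X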

External Variable influencing Classification Error

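[Path diagram: S -> X; X and M -> A; X and M -> B]

$$\pi^{ABXSM}_{ijtsm} = \pi^{SM}_{sm} \, \pi^{X|S}_{t|s} \, \pi^{A|XM}_{i|tm} \, \pi^{B|XM}_{j|tm}$$

$$\ln\left(f^{ABXSM}_{ijtsm}\right) = \lambda + \lambda^{S}_{s} + \lambda^{M}_{m} + \lambda^{X}_{t} + \lambda^{A}_{i} + \lambda^{B}_{j} + \lambda^{SM}_{sm} + \lambda^{XS}_{ts} + \lambda^{AX}_{it} + \lambda^{BX}_{jt} + \lambda^{AM}_{im} + \lambda^{BM}_{jm} + \lambda^{AXM}_{itm} + \lambda^{BXM}_{jtm}$$

Shorthand: SM, X|S, A|XM, B|XM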

Grouping versus External Variables

[Path diagram with nodes S, M, X, A, B]

Shorthand: SM, X|S, A|XMS {AXM AXS}, B|XMS {BXM BXS}

Covariates

S = Gender: Men 47%, Women 52%

M = Month in Sample: 1 or 5, 28%; 2-4 or 6-8, 72%

T = Interview Mode (Initial Interview): Telephone 72%, In Person 18%

Statistical Power & Identifiability Issues

Unweighted counts. Rows: first interview. Columns: reinterview.

First Interview   Empl.   Unempl.   NILF     All
Empl.             13939      87     1007   15033
Unempl.             113     459      249     821
NILF                500     204     7739    8443
All               14552     750     8995   24297

• Large total N, but relatively small N for unemployed.
• More variables means more identifiable models, but also diminishing cell counts and boundary solutions.
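The first bullet can be read directly off the counts. The sketch below computes each row's share of the sample and its agreement rate, assuming the rows index the first interview; variable names are illustrative.

```python
LABELS = ["Empl.", "Unempl.", "NILF"]

# Unweighted counts from the slide: rows = first interview (assumed),
# columns = reinterview.
counts = [
    [13939, 87, 1007],
    [113, 459, 249],
    [500, 204, 7739],
]

row_totals = [sum(row) for row in counts]
n_total = sum(row_totals)
# Share of each row that received the same classification at reinterview.
agreement = [counts[i][i] / row_totals[i] for i in range(3)]

for label, n_row, agree in zip(LABELS, row_totals, agreement):
    print(f"{label:8s} n={n_row:6d} share={n_row / n_total:6.1%} consistent={agree:6.1%}")
```

Only 821 of the 24,297 cases are classified as unemployed at the first interview, so once the table is further split by M, T and S, the unemployed cells become small quickly.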

Principles of Model Construction

Always include X|S, A|X, B|X
  Assume 3 latent classes & S as grouping variable
  Fit classification table of A*B*M*T*S.

Vary following effects
  M as grouping variable
  M &/or T affecting classification error for A & B
  T affecting A but not B
  S affecting A & B when identifiable based on other restrictions (including interaction of M & S)

Principles of Model Construction

Try equality constraints
  Equal influence of M &/or S on error rate for A & B.
  Error rate for T at time A = error rate at time B (when T does not affect B).

Principles of Model Selection

Limit search to theoretically plausible models.

Limit search to identifiable models.

Overall model fit
  P-value of likelihood ratio test vs. saturated model > .01
  Dissimilarity index < .05

Model selection among those meeting above criteria:
  Bayesian information criterion (BIC)
  Likelihood ratio test for nested models

Check substantive interpretation within set of possible best models.
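Two of these fit measures are simple to compute by hand. The sketch below assumes the common loglinear definitions, BIC = L2 - df * ln(N) and dissimilarity index = sum of |observed - expected| / (2N); the slides do not state the formulas, so treat both as assumptions. With N = 24,297 the BIC formula gives about -204 for a model with L2 = 38.0 on 24 df.

```python
import math

def bic(l2, df, n):
    """BIC relative to the saturated model (assumed definition): L2 - df * ln(N)."""
    return l2 - df * math.log(n)

def dissimilarity(observed, expected):
    """Share of cases that would have to change cells for the model to fit exactly."""
    n = sum(observed)
    return sum(abs(o - e) for o, e in zip(observed, expected)) / (2 * n)

print(round(bic(38.0, 24, 24297)))            # prints -204
print(dissimilarity([50, 50], [40, 60]))      # toy cell counts, not from the talk
```

More negative BIC values favor the more parsimonious model, which is why a model can be preferred by BIC even when its likelihood ratio p-value is lower.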

Best-Fitting Models

Model   Group      Effects on Classification              df    L2     pval   BIC    dissim
1       X|S        A|XM; A|XT; B|XM; B|XT                 24    38.0   0.04   -204   0.007
2       X|S X|M    A|XM; A|XT; B|XM; B|XT                 22    37.0   0.02   -184   0.007
3       X|S        A|XMS=B|XMS; A|XT; B|XT                20    25.4   0.20   -176   0.006
4       X|S X|M    A|XMS=B|XMS; A|XT; B|XT                18    20.2   0.32   -161   0.005
5       X|S        A|XM, A|XT, A|XS; B|XM, B|XT, B|XS     12    12.5   0.41   -108   0.005
6       X|S X|M    A|XM, A|XT, A|XS; B|XM, B|XT, B|XS     10     7.6   0.67    -93   0.004
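The df column can be checked by counting parameters. The sketch below is one possible accounting, not the authors' stated method: it assumes 3 latent classes, binary S, M and T, a saturated S*M*T margin, hierarchical loglinear terms for each effect, and df = (observed cells - 1) - (free parameters). Under those assumptions it gives 24, 22 and 12 for Models 1, 2 and 5.

```python
from math import prod

LEVELS = {"A": 3, "B": 3, "X": 3, "S": 2, "M": 2, "T": 2}
EXOGENOUS = ["S", "M", "T", "SM", "ST", "MT", "SMT"]  # assumed saturated covariate margin

def n_params(term):
    """Free parameters of one loglinear term under standard identifying constraints."""
    return prod(LEVELS[v] - 1 for v in term)

def indicator_terms(indicator, effects):
    """Hierarchical terms for an indicator given X and each listed covariate."""
    terms = [indicator, indicator + "X"]
    for e in effects:
        terms += [indicator + e, indicator + "X" + e]
    return terms

def degrees_of_freedom(terms, observed="ABSMT"):
    cells = prod(LEVELS[v] for v in observed)   # 72 cells in A*B*S*M*T
    return (cells - 1) - sum(n_params(t) for t in terms)

structural = EXOGENOUS + ["X", "XS"]
model1 = structural + indicator_terms("A", "MT") + indicator_terms("B", "MT")
model2 = model1 + ["XM"]
model5 = structural + indicator_terms("A", "MTS") + indicator_terms("B", "MTS")
print(degrees_of_freedom(model1), degrees_of_freedom(model2), degrees_of_freedom(model5))
```

This count ignores boundary solutions and equality constraints, so it is not meant to cover Models 3 and 4.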

Estimated Unemployment Rate

Model 1 (similar to other top models): UE = 4.9%

Observed, M.I.S. 1 & 5: UE = 6.0%

Observed, M.I.S. 2-4, 6-8: UE = 4.7%

Conditional Probabilities for Employment Status

Conditional probabilities of observed classification given latent state (percent)

                   Model 1 Current Estimates   Previous Estimates
Latent   Observed                              Biemer    Tran
State    State       A        B                1997      1999
E        E          96.8     95.5              98.7      98.7
E        U           0.3      0.0               0.4       0.4
E        N           2.9      4.5               0.8       0.9
U        E          13.7     11.1               8.6       9.8
U        U          77.3     74.1              74.4      72.3
U        N           9.0     14.8              17.0      17.9
N        E           4.2      1.7               1.1       2.3
N        U           2.0      1.9               0.9       1.5
N        N          93.8     97.0              98.0      96.2

Conditional Probabilities for A|TX & B|TX

Conditional probabilities of observed classification given latent state, by interview mode

Latent   Observed   Interview           Reinterview
State    State      Phone     Visit     Phone     Visit
E        E          97.3%     96.7%     96.9%     95.9%
E        U           0.3%      0.4%      0.0%      0.0%
E        N           2.5%      2.9%      3.1%      4.1%
U        E          12.5%      9.5%      0.0%      0.0%
U        U          61.0%     75.6%     83.6%     82.1%
U        N          26.4%     14.8%     16.4%     17.9%
N        E           4.1%      6.3%      5.3%      5.3%
N        U           0.1%      2.0%      1.3%      2.4%
N        N          95.8%     91.7%     93.4%     92.3%

Conditional Probabilities for A|MX & B|MX

Conditional probabilities of observed classification given latent state, by month-in-sample (MIS)

Latent   Observed   Interview                 Reinterview
State    State      MIS 1,5    MIS 2-4,6-8    MIS 1,5    MIS 2-4,6-8
E        E          97.7%      98.8%          96.0%      96.7%
E        U           0.8%       0.6%           0.1%       0.0%
E        N           1.5%       0.6%           3.8%       3.3%
U        E           8.2%      14.6%           3.0%       1.6%
U        U          83.3%      71.8%          87.3%      79.6%
U        N           8.5%      13.6%           9.7%      18.8%
N        E           7.7%       5.4%           3.1%       5.2%
N        U           2.8%       1.5%           2.3%       1.2%
N        N          89.5%      93.1%          94.6%      93.6%

Summary Findings

Change in structural model (treating month-in-sample as grouping variable) does not change the preferred measurement model.

Models fit nearly as well without M as grouping variable; this casts doubt on the non-response bias hypothesis.

M-I-S bias is not just a function of interview mode.

Covariate effects (esp. S) on response error should be examined further in a model with more df; need another grouping variable.

Unresolved Issues

Ambiguous results for model selection
  Most interested in fit of unemployment classification, but this is overwhelmed in measures of overall fit

Software limitations: clustering, local & boundary solutions, standard errors not consistently output

Future Research Agenda

Try finer coding of month-in-sample

Develop models for other variables: age, race, proxy vs. self

Pool more years of data

Develop hypotheses & interpretation based on review of:
  experimental work
  analyses of non-response
  related models including Markov latent class models of employment status transitions

Rotation Group Bias (2001-2003, reinterview data)

[Figure: unemployment rate (vertical axis, 0% to 7%) by month-in-sample (horizontal axis, 1 to 8).]
