
Post on 30-Jan-2016


Page 1: Laura T. Ozcoskun and  Katherine Jenny Thompson Presented By Samson Adeshiyan

Redefining the Unit Nonresponse Adjustment Cells for the Survey of Residential Alterations and Repairs (SORAR)

Laura T. Ozcoskun and Katherine Jenny Thompson

Presented by Samson Adeshiyan

Page 2

Outline

• Background
• The Problem
• The Authors’ Recipe for a Solution
• Some Empirical Results Interspersed

Page 3

Survey of Residential Alterations and Repairs (SORAR) Background

• Monthly data collection
• Low unit response rates
• Key item: Total Expenditures
  • Maintenance and Repairs
  • Improvements
• Multi-stage sample of Housing Units (HUs)
  • Privately-owned vacant HUs (Vacant)
  • Rental and 5+ unit properties (Rental)
• Modified Half-Sample Variance Estimator

Page 4

The Problem (Motivation)

• SORAR’s three-stage weighting procedure
  • Duplication control (field subsampling)
  • Unit nonresponse adjustment
  • Post-stratification adjustment
• Suspected that the variables used to define the unit nonresponse weighting cells are not highly related to
  • Response propensity, or
  • Cell means

Page 5

Response Model

• “Quasi-Randomization” (Oh & Scheuren 1983)
  • Covariate-dependent, missing-at-random (MAR) response mechanism
  • Response propensity (p) is a random variable
• Minimum requirements for weighting cells:
  1. Heterogeneous response propensities, or
  2. Heterogeneous cell means
• Optimal adjustment cells satisfy both conditions.
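A weighting-cell adjustment under this kind of response model is often written as below. This is a generic sketch rather than the SORAR specification: the symbols $w_i$ (the weight entering the adjustment), $s_c$ (sampled eligible units in cell $c$) and $r_c$ (responding units in cell $c$) are assumed here for illustration and do not appear in the slides.

```latex
\hat{p}_c = \frac{\sum_{i \in r_c} w_i}{\sum_{i \in s_c} w_i},
\qquad
f_c = \frac{1}{\hat{p}_c},
\qquad
w_i^{\mathrm{adj}} = w_i \, f_c \quad (i \in r_c).
```

Here $\hat{p}_c$ is the estimated response propensity of the cell and $f_c$ is its adjustment factor, so cells with lower response rates receive larger factors.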

Page 6

The Authors’ Recipe

• Determine Eligible Sets of Classification Variables
• Determine Uncollapsed Cells (Full Model)
  • Logistic Regression Analysis
• Determine Collapsed Cells (Reduced Model)
  • General Linear Hypothesis Tests
  • Relative Efficiency Diagnostic (MSE Ratios)
  • Time Series Plots of Adjustment Factors

Page 7

Step 1: Find Sets of Classification Variables for Cells

• Respondent requirements per cell:
  • Actual Cell Size ≥ 5
    • Needed for logistic regression
  • Effective “Sample” (Cell) Size ≥ 5
• Categorical variables

Page 8

Cell Sizes

• Effective “Sample” (Cell) Size: $\tilde{r}_p = r_p / \mathrm{DEFF}_p$
  • $r_p$ is the actual cell size of cell p
  • $\mathrm{DEFF}_p$ is the design effect for item Y in cell p
  • $\tilde{r}_p > r_p$ indicates efficient design for item Y
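The cell-size screen can be sketched in a few lines of Python. This is an illustration only: the `Cell` class, the cell labels, and the numbers are hypothetical, and the minimum of 5 is taken from the respondent requirements stated on the Step 1 slide.

```python
from dataclasses import dataclass

MIN_CELL_SIZE = 5  # minimum from the Step 1 respondent requirements


@dataclass
class Cell:
    name: str         # hypothetical cell label
    respondents: int  # actual cell size
    deff: float       # design effect for item Y in the cell


def effective_size(cell: Cell) -> float:
    """Effective cell size: actual respondent count divided by the design effect."""
    if cell.deff <= 0:
        raise ValueError("design effect must be positive")
    return cell.respondents / cell.deff


def is_eligible(cell: Cell, minimum: float = MIN_CELL_SIZE) -> bool:
    """A cell qualifies when both its actual and effective sizes reach the minimum."""
    return cell.respondents >= minimum and effective_size(cell) >= minimum


# Illustrative values only, not SORAR data.
cells = [
    Cell("Northeast/Vacant/Single", respondents=12, deff=1.5),
    Cell("Northeast/Vacant/Multi", respondents=6, deff=2.0),
    Cell("South/Rental/Multi", respondents=8, deff=0.8),
]
for c in cells:
    print(c.name, round(effective_size(c), 2), is_eligible(c))
```

A design effect below 1 raises the effective size above the actual size, while a design effect above 1 can push a cell with enough respondents below the threshold.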

Page 9

Candidate Cells (SORAR)

• Candidate cell variables (categorical)
  • Region (currently used)
  • Metropolitan Statistical Area (MSA) status (currently used)
  • Tenure (Vacant/Rental)
  • Single-unit vs. Multi-unit
• Candidate cross classifications
  • Region/MSA Status/Single or Multi-Unit
  • Region/Tenure/Single or Multi-Unit

Page 10

Step 2: Uncollapsed Cells (Full Model)

• Response Propensity Modeling
  • Logistic Regression
  • Complex survey adaptations of Roberts, Rao, and Kumar (1987) to test statistics
• Full and reduced (nested) models
  • Want all effects to be significant in full model
  • Would like to reject majority of nested models

Page 11

Logistic Regression (SORAR)

• 18 months
• Separate full and reduced models for each month
• Between-cell covariance approximations
  • ρ = 0 (anti-conservative)
  • ρ = -0.25
  • ρ = -0.50 (conservative)
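To see why the assumed between-cell correlation matters, consider a simplified two-cell Wald test of equal response propensities. This is a sketch under stated assumptions, not the authors' adjusted logistic-regression tests: the function names, the symbol `rho` for the assumed correlation, and the input values are all hypothetical.

```python
import math


def wald_chi2(p1, v1, p2, v2, rho):
    """Wald statistic for H0: p1 == p2, given an assumed correlation rho
    between the two cell estimates (v1, v2 are their variances)."""
    cov = rho * math.sqrt(v1 * v2)  # assumed covariance between cell estimates
    var_diff = v1 + v2 - 2.0 * cov
    if var_diff <= 0:
        raise ValueError("variance of the difference must be positive")
    return (p1 - p2) ** 2 / var_diff


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical cell-level propensity estimates and variances (not SORAR data).
p1, v1 = 0.60, 0.0025
p2, v2 = 0.45, 0.0025
for rho in (0.0, -0.25, -0.50):
    stat = wald_chi2(p1, v1, p2, v2, rho)
    print(f"rho={rho:+.2f}  chi2={stat:.2f}  p={p_value_1df(stat):.3f}")
```

A negative assumed correlation inflates the variance of the difference, so the statistic shrinks as the correlation moves from 0 to -0.50. In that sense a zero correlation is the anti-conservative choice and -0.50 the conservative one.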

Page 12

Model 1: Region/MSA/Single or Multi-Unit

Number of months (of 18) in which each hypothesis was rejected (Rej.) or not rejected (Not rej.):

                                ρ = 0            ρ = -0.25        ρ = -0.50
Hypothesis                      Rej.  Not rej.   Rej.  Not rej.   Rej.  Not rej.
REGION = MSA = HU = 0 (Full)    18    0          18    0          18    0
REGION = MSA = 0 | HU ≠ 0       14    4          13    5          10    8
REGION = HU = 0 | MSA ≠ 0       18    0          18    0          18    0
MSA = HU = 0 | REGION ≠ 0       18    0          18    0          18    0
REGION = 0 | MSA ≠ 0, HU ≠ 0    12    6          12    6          9     9
MSA = 0 | REGION ≠ 0, HU ≠ 0    8     10         8     10         8     10
HU = 0 | REGION ≠ 0, MSA ≠ 0    18    0          18    0          18    0

• Very sensitive to correlation assumptions
• Indicates necessity of including Single/Multi-Unit in weighting cells
• Region and MSA less necessary given Single/Multi-Unit

Page 13

Model 2: Region/Tenure/Single or Multi-Unit

• Insensitive to correlation assumptions (change)
• Indicates necessity of including Single/Multi-Unit in weighting cells (unchanged)
• Region and Tenure often necessary (change)

Number of months (of 18) in which each hypothesis was rejected (Rej.) or not rejected (Not rej.):

                                ρ = 0            ρ = -0.25        ρ = -0.50
Hypothesis                      Rej.  Not rej.   Rej.  Not rej.   Rej.  Not rej.
REGION = TEN = HU = 0 (Full)    18    0          18    0          18    0
REGION = TEN = 0 | HU ≠ 0       18    0          18    0          17    1
REGION = HU = 0 | TEN ≠ 0       18    0          18    0          18    0
TEN = HU = 0 | REGION ≠ 0       18    0          18    0          18    0
REGION = 0 | TEN ≠ 0, HU ≠ 0    14    4          14    4          11    7
TEN = 0 | REGION ≠ 0, HU ≠ 0    13    5          13    5          13    5
HU = 0 | REGION ≠ 0, TEN ≠ 0    18    0          18    0          18    0

Page 14

Step 3: Collapsed Cells (Reduced Model)

• General Linear Hypothesis Tests
• Relative Efficiency Diagnostic
• Time Series Plots of Estimated Nonresponse Adjustment Factors

Page 15

General Linear Hypothesis Test

                              Classification variable k
Classification variable k’    $y_{11}$ (cell 1)    $y_{12}$ (cell 2)
                              $y_{21}$ (cell 3)    $y_{22}$ (cell 4)

Hypothesis Tests
• H0: $y_{11} = y_{21}$ and $y_{12} = y_{22}$ (collapse rows)
• H0: $y_{11} = y_{12}$ and $y_{21} = y_{22}$ (collapse columns)

Not done with SORAR (cell estimates too variable)

Page 16

Relative Efficiency Diagnostic: MSE Ratios

• Modified from Eltinge and Yansaneh (1997)
• Definitions
  • $\hat{Y}_F$: approximately model-unbiased estimate under full model
  • $\hat{Y}_C$: model-biased estimate under a collapsed weighting procedure
  • $\widehat{MSE}(\hat{Y}_F) = \hat{V}(\hat{Y}_F)$ (under model assumption)
  • $\widehat{MSE}(\hat{Y}_C) = \hat{V}(\hat{Y}_C) + \hat{B}^2(\hat{Y}_C)$
• Mean squared error ratio:

  $$\frac{\hat{V}(\hat{Y}_C) + \hat{B}^2(\hat{Y}_C)}{\hat{V}(\hat{Y}_F)}$$
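The MSE ratio on this slide can be computed month by month and then summarized, for example by its median. The sketch below assumes the bias of the collapsed estimate is estimated as the difference between the collapsed and full-model estimates, an assumption that the slides do not spell out. The function name and the input values are hypothetical.

```python
from statistics import median


def mse_ratio(y_full, v_full, y_collapsed, v_collapsed):
    """Estimated MSE of the collapsed estimate over estimated MSE of the full estimate.

    The full-model estimate is treated as approximately unbiased, so its
    MSE is its variance. The bias of the collapsed estimate is taken to be
    y_collapsed - y_full (an assumption made for this sketch).
    """
    if v_full <= 0:
        raise ValueError("full-model variance must be positive")
    bias = y_collapsed - y_full
    return (v_collapsed + bias ** 2) / v_full


# Hypothetical monthly inputs: (y_full, v_full, y_collapsed, v_collapsed).
months = [
    (100.0, 25.0, 102.0, 20.0),
    (90.0, 16.0, 91.0, 18.0),
    (110.0, 36.0, 104.0, 30.0),
]
ratios = [mse_ratio(*m) for m in months]
print([round(r, 3) for r in ratios], "median:", round(median(ratios), 3))
```

A ratio below 1 arises when the collapsed variance is smaller than the full-model variance and the estimated bias is small, as in the first hypothetical month.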

Page 17

SORAR MSE Ratios: Total Expenditures

• Tenure dropped: Median $R_H$ = 1.02
• HU Category dropped: Median $R_T$ = 0.93
• On average, $R_H$ is both greater than one and closer to one than $R_T$
• Not terrifically compelling evidence for either collapsing
• How can values be less than 1?
  • Function of using empirical data
  • Collapsed variances smaller or equivalent to uncollapsed variances
  • Estimated bias often “negligible”

Page 18

Time Series Plots of Adjustment Factors

• Visual, less statistical
  • Fewer assumptions
• Full procedure and collapsed procedure adjustment factors
  • Within region (SORAR)
  • Inverse of response propensities (SORAR)
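The series behind such plots can be sketched as follows, taking each adjustment factor as the inverse of a weighted response rate. The function names, cell labels, and weight totals are hypothetical, and plotting is left out to keep the example free of dependencies.

```python
def adjustment_factor(sample_weight_total, respondent_weight_total):
    """Inverse of the weighted response rate in a cell."""
    if respondent_weight_total <= 0:
        raise ValueError("respondent weight total must be positive")
    return sample_weight_total / respondent_weight_total


def collapsed_factor(cells):
    """Factor for the cell formed by pooling (sample total, respondent total) pairs."""
    total_sample = sum(s for s, _ in cells)
    total_respondents = sum(r for _, r in cells)
    return adjustment_factor(total_sample, total_respondents)


# Hypothetical weight totals for one region: month -> cell -> (sample, respondents).
monthly = {
    "month 1": {"single": (400.0, 200.0), "multi": (600.0, 100.0)},
    "month 2": {"single": (420.0, 210.0), "multi": (580.0, 145.0)},
}
for month, cells in monthly.items():
    full = {name: adjustment_factor(s, r) for name, (s, r) in cells.items()}
    pooled = collapsed_factor(list(cells.values()))
    print(month, full, "collapsed:", round(pooled, 2))
```

Plotting the full-cell and collapsed factors against month, one panel per region, gives the kind of visual comparison described on this slide. When the full-cell factors differ greatly in scale, the collapsed factor falls between them and can sit far from both.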

Page 19

Candidate Cells: Region by Single/Multi for Vacant Properties

• Original adjustment factors very different in scale
• Collapsed factors are far from both original factors

[Figure: time series plot of adjustment factors, vertical axis from 0 to 16. Series: Vacant Single-Unit Property Factors, Vacant Multi-Unit Property Factors, Collapsed Vacant Units]

Page 20

Candidate Cells: Region by Single/Multi for Rental Properties

• Original adjustment factors very different in scale
• Collapsed factors are far from both original factors (cf. multi-unit factors)

[Figure: time series plot of adjustment factors, vertical axis from 0 to 16. Series: Rental Single-Unit Property Factors, Rental Multi-Unit Property Factors, Collapsed Rental Units]

Page 21

Candidate Cells: Region by Tenure for Single-Unit Properties

• Scale of original factors “similar” (compared to earlier slide)
• Collapsed factors different for single units

[Figure: time series plot of adjustment factors, vertical axis from 0 to 16. Series: Vacant Single-Unit Property Factors, Rental Single-Unit Property Factors, Collapsed Single Unit]

Page 22

Candidate Cells: Region by Tenure for Multi-Unit Properties

• Scale of original factors similar
• Collapsed factors similar to original factors

[Figure: time series plot of adjustment factors, vertical axis from 0 to 16. Series: Vacant Multi-Unit Property Factors, Rental Multi-Unit Property Factors, Collapsed Multi Unit]

Page 23

Final Recommendation (SORAR)

• Full weighting cells
  • Region/Tenure/Single or Multi-Unit
• Collapsed weighting cells
  • Region/Single or Multi-Unit
  • Region

Page 24

Conclusion

• Started with a recipe
  • Model-development tools
  • Diagnostic tools
• Modified the recipe for our survey
  • Considered and dropped diagnostics (data-based)
• Ended up with a new main course
  • More statistically defensible unit nonresponse adjustment cells

Page 25

Any Questions?

• Laura Ozcoskun [email protected]

• Katherine Jenny Thompson [email protected]