laura t. ozcoskun and katherine jenny thompson presented by samson adeshiyan

Post on 30-Jan-2016


1

Redefining the Unit Nonresponse Adjustment Cells for the Survey of Residential Alterations and Repairs (SORAR)

Laura T. Ozcoskun and Katherine Jenny Thompson

Presented By Samson Adeshiyan

2

Outline

• Background

• The Problem

• The Authors’ Recipe for a Solution

• Some Empirical Results Interspersed

3

Survey of Residential Alterations and Repairs (SORAR) Background

• Monthly data collection

• Low unit response rates

• Key item: Total Expenditures
  • Maintenance and Repairs
  • Improvements

• Multi-stage sample of Housing Units (HUs)
  • Privately-owned vacant HUs (Vacant)
  • Rental and 5+ unit properties (Rental)

• Modified Half-Sample Variance Estimator

4

The Problem (Motivation)

• SORAR’s three-stage weighting procedure
  • Duplication control (field subsampling)
  • Unit nonresponse adjustment
  • Post-stratification adjustment

• Suspected that the variables used to define the unit nonresponse weighting cells were not highly related to
  • Response propensity, or
  • Cell means

5

Response Model

• “Quasi-Randomization” (Oh & Scheuren 1983)
  • Covariate-dependent, missing-at-random (MAR) response mechanism
  • Response propensity (p) is a random variable.

• Minimum requirements for weighting cells:
  1. Heterogeneous response propensities, or
  2. Heterogeneous cell means

• Optimal adjustment cells satisfy both conditions.
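A standard approximation, not stated on the slides and included here only as a sketch, may help explain why either condition is enough. Under a quasi-randomization model, the bias of the respondent mean within a cell $c$ is roughly the within-cell covariance between response propensity and the survey item, divided by the mean propensity. The symbols $N_c$, $\bar{p}_c$, $\bar{Y}_c$ and $\bar{y}_{r,c}$ are notation assumed for this illustration.

```latex
\operatorname{Bias}\left(\bar{y}_{r,c}\right)
  \approx \frac{1}{\bar{p}_c} \cdot \frac{1}{N_c}
          \sum_{i \in c} \left(p_i - \bar{p}_c\right)\left(y_i - \bar{Y}_c\right)
```

If the cells separate units so that either $p_i$ or $y_i$ is nearly constant within each cell, the sum is close to zero and the cell-level adjustment removes most of the nonresponse bias. Cells that differ from one another in propensity or in mean are the ones that achieve this.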

6

The Authors’ Recipe

• Determine Eligible Sets of Classification Variables

• Determine Uncollapsed Cells (Full Model)
  • Logistic Regression Analysis

• Determine Collapsed Cells (Reduced Model)
  • General Linear Hypothesis Tests
  • Relative Efficiency Diagnostic (MSE Ratios)
  • Time Series Plots of Adjustment Factors

7

Step 1: Find Sets of Classification Variables for Cells

• Respondent requirements per cell:
  • Actual cell size ≥ 5 (needed for logistic regression)
  • Effective “sample” (cell) size ≥ 5

• Categorical variables

8

Cell Sizes

• Effective “Sample” (Cell) Size: $\tilde{r}_p = r_p / \mathrm{DEFF}_p$
  • $r_p$ is the actual cell size of cell $p$
  • $\mathrm{DEFF}_p$ is the design effect for item $Y$ in cell $p$
  • $\tilde{r}_p \ge r_p$ indicates an efficient design for item $Y$
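The cell-size screen can be sketched in a few lines of Python. This is an illustration only: it assumes the effective size is the respondent count divided by the design effect and that both sizes must reach a minimum of 5. The cell labels, counts and design effects are made up.

```python
from dataclasses import dataclass

MIN_CELL_SIZE = 5  # assumed minimum, read from the Step 1 slide


@dataclass
class Cell:
    name: str          # hypothetical cell label
    respondents: int   # actual cell size r_p
    deff: float        # design effect DEFF_p for item Y


def effective_size(cell: Cell) -> float:
    """Effective cell size: actual respondent count divided by the design effect."""
    if cell.deff <= 0:
        raise ValueError("design effect must be positive")
    return cell.respondents / cell.deff


def is_eligible(cell: Cell, minimum: float = MIN_CELL_SIZE) -> bool:
    """A cell passes when both its actual and effective sizes reach the minimum."""
    return cell.respondents >= minimum and effective_size(cell) >= minimum


# Made-up cells for illustration only.
cells = [
    Cell("Northeast/Vacant/Single", respondents=12, deff=1.5),
    Cell("Northeast/Vacant/Multi", respondents=6, deff=2.0),
    Cell("South/Rental/Single", respondents=8, deff=0.8),
]

for c in cells:
    print(c.name, round(effective_size(c), 2), is_eligible(c))
```

The second cell has enough respondents but fails on effective size, which is the situation in which a classification variable would be ruled out or its cells combined.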

9

Candidate Cells (SORAR)

• Candidate cell variables (categorical)
  • Region (currently used)
  • Metropolitan Statistical Area (MSA) status (currently used)
  • Tenure (Vacant/Rental)
  • Single-unit vs. Multi-unit

• Candidate cross classifications
  • Region/MSA Status/Single or Multi-Unit
  • Region/Tenure/Single or Multi-Unit

10

Step 2: Uncollapsed Cells (Full Model)

• Response Propensity Modeling

• Logistic Regression
  • Complex survey adaptations of Roberts, Rao, and Kumar (1987) to test statistics

• Full and reduced (nested) models
  • Want all effects to be significant in the full model
  • Would like to reject the majority of nested models
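The idea of comparing full and reduced (nested) response models can be illustrated with a simplified sketch. It is not the authors' procedure. It ignores survey weights and the complex design, so it does not use the Roberts, Rao, and Kumar adjustments. It also treats each model as saturated in its cells, so the fitted propensity is just the cell response rate, which is not the same as testing a main effect in a main-effects logistic model. All counts are invented.

```python
import math
from collections import defaultdict

# Made-up counts for illustration: (region, hu_type) -> (respondents, sampled units)
counts = {
    ("NE", "single"): (40, 50),
    ("NE", "multi"): (15, 50),
    ("S", "single"): (45, 60),
    ("S", "multi"): (20, 60),
}


def log_lik(cells):
    """Binomial log-likelihood with each cell's propensity set to its response rate."""
    total = 0.0
    for r, n in cells.values():
        p = r / n
        if 0.0 < p < 1.0:  # cells with p of 0 or 1 contribute zero
            total += r * math.log(p) + (n - r) * math.log(1.0 - p)
    return total


def collapse(cells, keep):
    """Pool cells over every classification variable except the one at index `keep`."""
    pooled = defaultdict(lambda: [0, 0])
    for key, (r, n) in cells.items():
        pooled[key[keep]][0] += r
        pooled[key[keep]][1] += n
    return {k: (v[0], v[1]) for k, v in pooled.items()}


def lr_statistic(full, reduced):
    """Likelihood-ratio statistic for a reduced cell model nested in the full one."""
    return 2.0 * (log_lik(full) - log_lik(reduced))


region_only = collapse(counts, keep=0)  # drops single/multi-unit
hu_only = collapse(counts, keep=1)      # drops region

g2_drop_hu = lr_statistic(counts, region_only)
g2_drop_region = lr_statistic(counts, hu_only)

CHI2_95_DF2 = 5.991  # 95th percentile of chi-square with 2 degrees of freedom
print("drop HU:", round(g2_drop_hu, 2), g2_drop_hu > CHI2_95_DF2)
print("drop region:", round(g2_drop_region, 2), g2_drop_region > CHI2_95_DF2)
```

With these invented counts the reduced model without single/multi-unit is rejected while the one without region is not, mirroring the kind of pattern the nested tests are meant to reveal.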

11

Logistic Regression (SORAR)

• 18 months

• Separate full and reduced models for each month

• Between-cell covariance approximations: 0 (anti-conservative), -0.25, and -0.50 (conservative)

12

Model 1: Region/MSA/Single or Multi-Unit

Number of months (out of 18) in which each hypothesis was rejected or not rejected, by between-cell covariance approximation:

Covariance approximation:       0                       -0.25                   -0.50
Hypothesis                      Rejected  Not Rejected  Rejected  Not Rejected  Rejected  Not Rejected
REGION = MSA = HU = 0 (Full)    18        0             18        0             18        0
REGION = MSA = 0 | HU ≠ 0       14        4             13        5             10        8
REGION = HU = 0 | MSA ≠ 0       18        0             18        0             18        0
MSA = HU = 0 | REGION ≠ 0       18        0             18        0             18        0
REGION = 0 | MSA ≠ 0, HU ≠ 0    12        6             12        6             9         9
MSA = 0 | REGION ≠ 0, HU ≠ 0    8         10            8         10            8         10
HU = 0 | REGION ≠ 0, MSA ≠ 0    18        0             18        0             18        0

• Very sensitive to correlation assumptions

• Indicates necessity of including Single/Multi-Unit in weighting cells

• Region and MSA less necessary given Single/Multi-Unit

13

Model 2: Region/Tenure/Single or Multi-Unit

• Insensitive to correlation assumptions (change)

• Indicates necessity of including Single/Multi-Unit in weighting cells (unchanged)

• Region and Tenure often necessary (change)

Number of months (out of 18) in which each hypothesis was rejected or not rejected, by between-cell covariance approximation:

Covariance approximation:       0                       -0.25                   -0.50
Hypothesis                      Rejected  Not Rejected  Rejected  Not Rejected  Rejected  Not Rejected
REGION = TEN = HU = 0 (Full)    18        0             18        0             18        0
REGION = TEN = 0 | HU ≠ 0       18        0             18        0             17        1
REGION = HU = 0 | TEN ≠ 0       18        0             18        0             18        0
TEN = HU = 0 | REGION ≠ 0       18        0             18        0             18        0
REGION = 0 | TEN ≠ 0, HU ≠ 0    14        4             14        4             11        7
TEN = 0 | REGION ≠ 0, HU ≠ 0    13        5             13        5             13        5
HU = 0 | REGION ≠ 0, TEN ≠ 0    18        0             18        0             18        0

14

Step 3: Collapsed Cells (Reduced Model)

• General Linear Hypothesis Tests

• Relative Efficiency Diagnostic

• Time Series Plots of Estimated Nonresponse Adjustment Factors

15

General Linear Hypothesis Test

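Hypothesis Tests

• H0: $\bar{y}_{11} = \bar{y}_{21}$ and $\bar{y}_{12} = \bar{y}_{22}$ (collapse rows)

• H0: $\bar{y}_{11} = \bar{y}_{12}$ and $\bar{y}_{21} = \bar{y}_{22}$ (collapse columns)

Not done with SORAR (cell estimates too variable)

Cell layout (columns: classification variable k; rows: classification variable k’):

                  k, level 1               k, level 2
k’, level 1       $\bar{y}_{11}$ (cell 1)  $\bar{y}_{12}$ (cell 2)
k’, level 2       $\bar{y}_{21}$ (cell 3)  $\bar{y}_{22}$ (cell 4)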

16

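Relative Efficiency Diagnostic: MSE Ratios

• Modified from Eltinge and Yansaneh (1997)

• Definitions
  • $\hat{Y}_F$: approximately model-unbiased estimate under the full model
  • $\hat{Y}_C$: model-biased estimate under a collapsed weighting procedure
  • $\widehat{MSE}(\hat{Y}_F) = \hat{V}(\hat{Y}_F)$ (under model assumption)
  • $\widehat{MSE}(\hat{Y}_C) = \hat{V}(\hat{Y}_C) + \hat{B}^2(\hat{Y}_C)$

• Mean squared error ratio: $\dfrac{\widehat{MSE}(\hat{Y}_C)}{\widehat{MSE}(\hat{Y}_F)} = \dfrac{\hat{V}(\hat{Y}_C) + \hat{B}^2(\hat{Y}_C)}{\hat{V}(\hat{Y}_F)}$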
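As a small worked sketch, the ratio compares the collapsed estimator's estimated MSE (variance plus squared bias) with the full estimator's estimated variance. The function name and the input values below are hypothetical. How the bias itself is estimated is not shown on the slides, so it is taken here as a given input.

```python
def mse_ratio(var_full: float, var_collapsed: float, bias_collapsed: float) -> float:
    """Estimated MSE of the collapsed estimator over estimated variance of the full one."""
    if var_full <= 0:
        raise ValueError("full-model variance estimate must be positive")
    return (var_collapsed + bias_collapsed ** 2) / var_full


# Made-up inputs: collapsing lowers the variance and adds only a small bias.
small_bias = mse_ratio(var_full=100.0, var_collapsed=90.0, bias_collapsed=2.0)

# Made-up inputs: same variances, but a larger bias from collapsing.
large_bias = mse_ratio(var_full=100.0, var_collapsed=90.0, bias_collapsed=5.0)

print(small_bias, large_bias)
```

A ratio below 1 arises whenever the variance reduction from collapsing outweighs the squared bias estimate, as in the first call. A ratio above 1 means the bias introduced by collapsing dominates.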

17

SORAR MSE Ratios: Total Expenditures

• Tenure dropped: Median $R_H$ = 1.02

• HU Category dropped: Median $R_T$ = 0.93

• On average, $R_H$ is both greater than one and closer to one than $R_T$

• Not terrifically compelling evidence for either collapsing

• How can values be less than 1?
  • Function of using empirical data
  • Collapsed variances smaller than or equivalent to uncollapsed variances
  • Estimated bias often “negligible”

18

Time Series Plots of Adjustment Factors

• Visual, less statistical
  • Fewer assumptions

• Full procedure and collapsed procedure adjustment factors
  • Within region (SORAR)
  • Inverse of response propensities (SORAR)
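To make the comparison concrete, the sketch below computes adjustment factors for full and collapsed cells within one region. It assumes the common weighting-class form, in which the factor is the weighted count of all sampled units divided by the weighted count of respondents, that is, the inverse of the weighted response rate. The records, weights and function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample records: (region, hu_type, design_weight, responded)
sample = [
    ("NE", "single", 10.0, True),
    ("NE", "single", 10.0, True),
    ("NE", "single", 10.0, True),
    ("NE", "single", 10.0, False),
    ("NE", "multi", 20.0, True),
    ("NE", "multi", 20.0, False),
    ("NE", "multi", 20.0, False),
    ("NE", "multi", 20.0, False),
]


def adjustment_factors(records, cell_of):
    """Per cell: weighted total of sampled units over weighted total of respondents."""
    sampled = defaultdict(float)
    responded = defaultdict(float)
    for region, hu_type, weight, is_respondent in records:
        cell = cell_of(region, hu_type)
        sampled[cell] += weight
        if is_respondent:
            responded[cell] += weight
    # Cells with no respondents have no defined factor and are left out here.
    return {c: sampled[c] / responded[c] for c in sampled if responded[c] > 0}


# Full cells: region by single/multi-unit. Collapsed cells: region only.
full = adjustment_factors(sample, lambda region, hu_type: (region, hu_type))
collapsed = adjustment_factors(sample, lambda region, hu_type: region)

print(full)
print(collapsed)
```

In this invented example the single-unit and multi-unit factors differ widely and the collapsed factor sits far from both, the same visual signal the plots are used to check month by month.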

19

Candidate Cells: Region by Single/Multi for Vacant Properties

• Original adjustment factors very different in scale

• Collapsed factors are far from both original factors

[Figure: time series plot of adjustment factors, vertical axis from 0 to 16. Series: Vacant Single-Unit Property Factors, Vacant Multi-Unit Property Factors, Collapsed Vacant Units]

20

Candidate Cells: Region by Single/Multi for Rental Properties

• Original adjustment factors very different in scale

• Collapsed factors are far from both original factors (cf. multi-unit factors)

[Figure: time series plot of adjustment factors, vertical axis from 0 to 16. Series: Rental Single-Unit Property Factors, Rental Multi-Unit Property Factors, Collapsed Rental Units]

21

Candidate Cells: Region by Tenure for Single-Unit Properties

• Scale of original factors “similar” (compared to earlier slide)

• Collapsed factors different for single units

[Figure: time series plot of adjustment factors, vertical axis from 0 to 16. Series: Vacant Single-Unit Property Factors, Rental Single-Unit Property Factors, Collapsed Single Unit]

22

Candidate Cells: Region by Tenure for Multi-Unit Properties

• Scale of original factors similar

• Collapsed factors similar to original factors

[Figure: time series plot of adjustment factors, vertical axis from 0 to 16. Series: Vacant Multi-Unit Property Factors, Rental Multi-Unit Property Factors, Collapsed Multi Unit]

23

Final Recommendation (SORAR)

• Full weighting cells
  • Region/Tenure/Single or Multi-Unit

• Collapsed weighting cells
  • Region/Single or Multi-Unit
  • Region

24

Conclusion

• Started with a recipe
  • Model-development tools
  • Diagnostic tools

• Modified the recipe for our survey
  • Considered and dropped diagnostics (data-based)

• Ended up with a new main course
  • More statistically defensible unit nonresponse adjustment cells.

25

Any Questions?

• Laura Ozcoskun Laura.T.Ozcoskun@census.gov

• Katherine Jenny Thompson Katherine.J.Thompson@census.gov
