
The estimation strategy of the National Household Survey (NHS)

François Verret, Mike Bankier, Wesley Benjamin & Lisa Hayden

Statistics Canada

Presentation at the ITSEW 2011

June 21, 2011


Outline of the presentation

1. Introduction
2. Handling non-response error
3. Simulation set-up
4. Results
5. Limits of the study
6. Conclusion
7. Future work


1. Introduction

2006 Census: 20% long form, 80% short form

2011:

• 100% Census mandatory short form
• 30% sampled to voluntarily complete the NHS long form

Objectives of the long form: get data to plan, deliver and support government programs directed at target populations

Topics common to both forms in 2011: demography, family structure, language

Additional 2011 long form topics: education, ethnicity, income, immigration, mobility…

NHS sample size is 4.5 million dwellings (f = 30%)


1. Introduction

Non-response error in the NHS:

• Survey now voluntary => expect significant non-response
• To minimize the impact, after a fixed date restrict the collection efforts to a Non-Response Follow-Up (NRFU) random sub-sample

Set-up developed by Hansen & Hurwitz (1946):

1. Select 1st phase sample s from population U
2. Non-response snr observed in s
3. NRFU selected from snr
4. Response NRFUr and non-response NRFUnr observed in the NRFU (Hansen & Hurwitz assumed a 100% response rate)

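[Diagram: population U containing the sample s, split into respondents sr and non-respondents snr; the NRFU is sub-sampled from snr and splits into NRFUr and NRFUnr]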


1. Introduction

When 100% of the NRFU responds (as in Hansen and Hurwitz's original setting), the NRFU can be used to estimate the total in snr without non-response bias

This is not the case in the NHS. However, focusing the collection efforts on the NRFU converts part of the non-response bias (that would be observed in the full snr) into sub-sampling error



2. Handling non-response error

The estimation method chosen to minimize the remaining non-response bias should have the following properties:

• As few bias assumptions as possible should be made
• The method should be simple to explain and to implement in production

Available micro-level auxiliary data to adjust for non-response:

• 2011 Census short form
• Tax data

Calibration: Agreement with Census totals is desirable from a user’s perspective



First class of contenders: Reweighting

• Usual method used to compensate for total non-response in social surveys
• The Hansen & Hurwitz estimator of a total

$$\hat{t}_{HH} = \sum_{k \in s_r} \frac{y_k}{\pi_{ak}} + \sum_{k \in \mathrm{NRFU}} \frac{y_k}{\pi_{ak}\,\pi_{k|s_{nr}}}$$

is unbiased if 100% of the NRFU answers

When the assumption does not hold, we must model the last non-response mechanism/phase and reweight accordingly…
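As a rough illustration of the Hansen & Hurwitz estimator, the Python sketch below adds up the two weighted sums. It assumes that the first weight factor is the first-phase inclusion probability and the second is the NRFU sub-sampling probability given snr. The function name, argument layout and toy values are hypothetical and are not taken from the presentation.

```python
def hansen_hurwitz_total(first_phase_respondents, nrfu_units):
    """Estimate a total from first-phase respondents plus the NRFU sub-sample.

    first_phase_respondents: list of (y, pi_a) pairs for units in s_r
    nrfu_units: list of (y, pi_a, pi_sub) triples for units in the NRFU,
                all assumed to respond (the Hansen & Hurwitz setting)
    """
    # Respondents to the first phase are weighted by 1 / pi_a.
    respondent_part = sum(y / pi_a for y, pi_a in first_phase_respondents)
    # Followed-up units also carry the inverse sub-sampling probability.
    follow_up_part = sum(y / (pi_a * pi_sub) for y, pi_a, pi_sub in nrfu_units)
    return respondent_part + follow_up_part


# Hypothetical toy data: pi_a = 0.5 for every unit, sub-sampling rate 0.5.
s_r = [(10.0, 0.5), (20.0, 0.5)]
nrfu = [(30.0, 0.5, 0.5)]
hh_estimate = hansen_hurwitz_total(s_r, nrfu)
```

With incomplete response in the NRFU, the second sum would run over responding units only, which is where the additional modelling discussed in this section comes in.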



Scores method:

• Model the probability of response with a logistic regression
• Form Response Homogeneity Groups (RHG) of respondents and non-respondents with similar predicted response probabilities
• Calculate the response rate in each RHG and assign these new predicted response probabilities to respondents
• Divide the NRFUr weights by this probability:

$$\hat{t}_{scores} = \sum_{k \in s_r} \frac{y_k}{\pi_{ak}} + \sum_{k \in \mathrm{NRFU}_r} \frac{y_k}{\pi_{ak}\,\pi_{k|s_{nr}}\,\hat{p}_k^{RHG}}$$
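The last three steps of the scores method can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the predicted probabilities are taken as given (from any response model), the RHGs are formed as equal-sized groups of units ranked by predicted probability, and the response rate is an unweighted count. The function and field names are hypothetical.

```python
def rhg_adjusted_weights(units, n_groups):
    """Return NRFU weights divided by the response rate of each unit's RHG.

    units: list of dicts with hypothetical keys
        'p_hat'     predicted response probability (from any model)
        'responded' True if the unit answered the follow-up
        'weight'    weight before adjustment, 1 / (pi_a * pi_sub)
    Non-respondents get an adjusted weight of 0.
    """
    # Rank units by predicted probability and cut the ranking into groups.
    order = sorted(range(len(units)), key=lambda i: units[i]["p_hat"])
    group = [0] * len(units)
    for rank, i in enumerate(order):
        group[i] = rank * n_groups // len(units)

    # Unweighted response rate within each group (an assumption).
    size = [0] * n_groups
    responded = [0] * n_groups
    for i, unit in enumerate(units):
        size[group[i]] += 1
        responded[group[i]] += 1 if unit["responded"] else 0

    adjusted = []
    for i, unit in enumerate(units):
        if unit["responded"]:
            rate = responded[group[i]] / size[group[i]]
            adjusted.append(unit["weight"] / rate)
        else:
            adjusted.append(0.0)
    return adjusted


# Hypothetical toy data: four followed-up households, two RHGs.
toy_units = [
    {"p_hat": 0.2, "responded": False, "weight": 2.0},
    {"p_hat": 0.3, "responded": True, "weight": 2.0},
    {"p_hat": 0.8, "responded": True, "weight": 2.0},
    {"p_hat": 0.9, "responded": True, "weight": 2.0},
]
toy_adjusted = rhg_adjusted_weights(toy_units, n_groups=2)
```

In this toy case the adjusted weights of the respondents add up to the total weight of all four units, which is the intent of the adjustment.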



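Second class of contenders: Imputation

• Usual method to compensate for item non-response
• We will consider only nearest-neighbour imputation using the CANadian Census Edit & Imputation System (CANCEIS)

1. Partial imputation: Impute only non-respondents to the subsample (NRFUnr) and use reweighting to take sampling into account

$$\hat{t}_{partial} = \sum_{k \in s_r} \frac{y_k}{\pi_{ak}} + \sum_{k \in \mathrm{NRFU}_r} \frac{y_k}{\pi_{ak}\,\pi_{k|s_{nr}}} + \sum_{k \in \mathrm{NRFU}_{nr}} \frac{\hat{y}_k}{\pi_{ak}\,\pi_{k|s_{nr}}}$$

2. Mass imputation: Impute all non-respondents (snr \ NRFUr)

$$\hat{t}_{mass} = \sum_{k \in s_r \cup \mathrm{NRFU}_r} \frac{y_k}{\pi_{ak}} + \sum_{k \in s_{nr} \setminus \mathrm{NRFU}_r} \frac{\hat{y}_k}{\pi_{ak}}$$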
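To make the difference between the two imputation estimators concrete, here is a small Python sketch. It assumes every unit already carries a value y, observed for respondents and imputed otherwise, and a status label; the labels, field names and toy values are hypothetical.

```python
def partial_imputation_total(units):
    """Impute only NRFU non-respondents and keep the sub-sampling weights."""
    total = 0.0
    for u in units:
        if u["status"] == "s_r":
            total += u["y"] / u["pi_a"]
        elif u["status"] in ("nrfu_r", "nrfu_nr"):
            # Observed (nrfu_r) or imputed (nrfu_nr) value, two-phase weight.
            total += u["y"] / (u["pi_a"] * u["pi_sub"])
        # Non-respondents not selected for follow-up do not contribute.
    return total


def mass_imputation_total(units):
    """Impute every remaining non-respondent and use first-phase weights only."""
    return sum(u["y"] / u["pi_a"] for u in units)


# Hypothetical toy data; y is imputed for 'nrfu_nr' and 'not_followed'.
toy_sample = [
    {"status": "s_r", "y": 10.0, "pi_a": 0.5, "pi_sub": None},
    {"status": "nrfu_r", "y": 20.0, "pi_a": 0.5, "pi_sub": 0.5},
    {"status": "nrfu_nr", "y": 30.0, "pi_a": 0.5, "pi_sub": 0.5},
    {"status": "not_followed", "y": 40.0, "pi_a": 0.5, "pi_sub": None},
]
partial_total = partial_imputation_total(toy_sample)
mass_total = mass_imputation_total(toy_sample)
```

The partial version never needs a value for households outside the NRFU, whereas the mass version relies on an imputed value for each of them.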



Some pros & cons

Method: Scores | Partial imputation | Mass imputation

• Preserves micro-level information of non-respondents: √ √√
• Does not create synthetic information: √√ √
• Uses less heavy non-response hypotheses: √√ √√
• Fully takes sub-sampling design into account: √√ √√
• Census systems available: √√ √√
• More calibration to known Census totals can be done: √ √√


3. Simulation set-up

Use 2006 Census 20% long form sample data

Restricted to Census Metropolitan Area (CMA) of Toronto

Simulation aimed at preserving the properties of the NHS (except for the f = 30%):

• Non-response to the 1st phase was simulated by deterministically blanking out the data of the 63% of respondents who answered last in 2006
• Of these non-respondents, the 78% who answered first will have their response restored if they are selected in the NRFU sub-sample
• NRFU sub-sampling was simulated by selecting a stratified random sample of 41% of snr
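The three simulation steps can be sketched as below. This is only an illustration: the field names (`id`, `return_day`, `stratum`), the rounding of group sizes and the proportional allocation within strata are assumptions, since the presentation does not describe the stratification in detail.

```python
import random


def simulate_nonresponse(households, seed=0):
    """Split households into s_r, s_nr, NRFU respondents and NRFU non-respondents."""
    rng = random.Random(seed)
    ordered = sorted(households, key=lambda h: h["return_day"])

    # The 63% who answered last become first-phase non-respondents.
    n_nonresp = round(0.63 * len(ordered))
    s_r = ordered[: len(ordered) - n_nonresp]
    s_nr = ordered[len(ordered) - n_nonresp:]

    # Among them, the 78% who answered first respond if followed up.
    n_restorable = round(0.78 * len(s_nr))
    restorable_ids = {h["id"] for h in s_nr[:n_restorable]}

    # Stratified random sub-sample of about 41% of s_nr.
    strata = {}
    for h in s_nr:
        strata.setdefault(h["stratum"], []).append(h)
    nrfu = []
    for members in strata.values():
        nrfu.extend(rng.sample(members, round(0.41 * len(members))))

    nrfu_r = [h for h in nrfu if h["id"] in restorable_ids]
    nrfu_nr = [h for h in nrfu if h["id"] not in restorable_ids]
    return s_r, s_nr, nrfu_r, nrfu_nr


# Hypothetical toy frame of 100 households in two strata.
toy_frame = [{"id": i, "return_day": i, "stratum": i % 2} for i in range(100)]
sim_s_r, sim_s_nr, sim_nrfu_r, sim_nrfu_nr = simulate_nonresponse(toy_frame)
```

Only the sub-sampling step is random here; the two non-response steps are deterministic, as in the set-up described above.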



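Estimators calculated

• As points of reference, unbiased estimators:

$$\hat{t}_{2006} = \sum_{k \in s} \frac{y_k}{\pi_{ak}}$$

$$\hat{t}_{HH} = \sum_{k \in s_r} \frac{y_k}{\pi_{ak}} + \sum_{k \in \mathrm{NRFU}} \frac{y_k}{\pi_{ak}\,\pi_{k|s_{nr}}}$$

• As contenders:

$$\hat{t}_{mass} = \sum_{k \in s_r \cup \mathrm{NRFU}_r} \frac{y_k}{\pi_{ak}} + \sum_{k \in s_{nr} \setminus \mathrm{NRFU}_r} \frac{\hat{y}_k}{\pi_{ak}}$$

$$\hat{t}_{partial} = \sum_{k \in s_r} \frac{y_k}{\pi_{ak}} + \sum_{k \in \mathrm{NRFU}_r} \frac{y_k}{\pi_{ak}\,\pi_{k|s_{nr}}} + \sum_{k \in \mathrm{NRFU}_{nr}} \frac{\hat{y}_k}{\pi_{ak}\,\pi_{k|s_{nr}}}$$

$$\hat{t}_{scores} = \sum_{k \in s_r} \frac{y_k}{\pi_{ak}} + \sum_{k \in \mathrm{NRFU}_r} \frac{y_k}{\pi_{ak}\,\pi_{k|s_{nr}}\,\hat{p}_k^{RHG}}$$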



The scores method

• A single logistic regression was done for the whole CMA of Toronto
• Household response probability was predicted
• Considered for stepwise selection: household-level variables, our best attempt at summarizing the person-level information and one paradata variable
• R-square of 26%
• 13 RHG formed with predicted probabilities ranging from 29% to 95%



Imputation methods

• Nearest-neighbour imputation done with CANCEIS
• RHG is defined by household size
• The distance between non-respondents and donors (respondents) is defined by weighting each household-level, person-level and paradata characteristic in the distance function
• Preference is given to donors who are geographically close
• For each non-respondent, a list of donors is made and one is randomly selected with probability proportional to a measure of size (1st phase weight for mass imputation, scores method weights for partial imputation)
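The donor selection described above can be illustrated with a short Python sketch. It is not CANCEIS: the mismatch-count distance, the fixed length of the donor list and all field names (`hh_size`, `size`, `tenure`, `language`) are assumptions made for the example.

```python
import random


def pick_donor(recipient, donors, char_weights, n_candidates=3, rng=None):
    """Pick one donor household for a non-responding household.

    char_weights maps each matching characteristic to its weight in the
    distance; 'size' is the measure of size used for the random draw.
    """
    if rng is None:
        rng = random.Random(0)

    # Only donors of the same household size are eligible.
    eligible = [d for d in donors if d["hh_size"] == recipient["hh_size"]]
    if not eligible:
        return None

    def distance(donor):
        # Weighted count of mismatching characteristics.
        return sum(w for c, w in char_weights.items() if donor[c] != recipient[c])

    # Keep the closest donors, then draw one proportionally to its size.
    candidates = sorted(eligible, key=distance)[:n_candidates]
    sizes = [d["size"] for d in candidates]
    return rng.choices(candidates, weights=sizes, k=1)[0]


# Hypothetical toy data.
toy_recipient = {"hh_size": 2, "tenure": "own", "language": "en"}
toy_donors = [
    {"id": "A", "hh_size": 2, "tenure": "own", "language": "fr", "size": 1.0},
    {"id": "B", "hh_size": 2, "tenure": "own", "language": "en", "size": 2.0},
    {"id": "C", "hh_size": 3, "tenure": "own", "language": "en", "size": 5.0},
]
toy_weights = {"tenure": 1.0, "language": 0.5}
nearest = pick_donor(toy_recipient, toy_donors, toy_weights, n_candidates=1)
```

A preference for geographically close donors could be expressed in the same way, by adding a location characteristic with its own weight in `char_weights`.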



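M = 84 non-short-form characteristics over the various topics

Average relative difference:

• Calculated at the CMA level:

$$\frac{100}{M}\sum_{j=1}^{M}\frac{|\hat{t}_{j}-\hat{t}_{2006,j}|}{\hat{t}_{2006,j}} \qquad \frac{100}{M}\sum_{j=1}^{M}\frac{|\hat{t}_{j}-\hat{t}_{HH,j}|}{\hat{t}_{HH,j}}$$

• At the Weighting area (953 WA in total) level within the CMA:

$$\frac{100}{953\,M}\sum_{i=1}^{953}\sum_{j=1}^{M}\frac{|\hat{t}_{ij}-\hat{t}_{2006,ij}|}{\hat{t}_{2006,ij}} \qquad \frac{100}{953\,M}\sum_{i=1}^{953}\sum_{j=1}^{M}\frac{|\hat{t}_{ij}-\hat{t}_{HH,ij}|}{\hat{t}_{HH,ij}}$$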
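A small Python sketch of the average relative difference may help in reading these measures. It assumes the differences are taken in absolute value and that every weighting area has the same number of characteristics; the function names and toy figures are hypothetical.

```python
def average_relative_difference(estimates, references):
    """Average relative difference, in percent, over M characteristics."""
    m = len(estimates)
    return 100.0 / m * sum(
        abs(t - ref) / ref for t, ref in zip(estimates, references)
    )


def area_level_difference(estimates_by_area, references_by_area):
    """Average of the per-area measure over all weighting areas."""
    per_area = [
        average_relative_difference(est, ref)
        for est, ref in zip(estimates_by_area, references_by_area)
    ]
    return sum(per_area) / len(per_area)


# Hypothetical toy figures: two characteristics, two weighting areas.
cma_error = average_relative_difference([110.0, 95.0], [100.0, 100.0])
wa_error = area_level_difference(
    [[110.0, 95.0], [100.0, 100.0]],
    [[100.0, 100.0], [100.0, 100.0]],
)
```

Because each area contributes the same number of characteristics, averaging the per-area values is equivalent to the double sum divided by 953 M.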


4. Results

Errors at the CMA and WA levels for Toronto (average relative difference, %)

                              CMA, point of comparison:             WA, point of comparison:
Estimator                     Full first-phase   Hansen & Hurwitz   Full first-phase   Hansen & Hurwitz
Hansen & Hurwitz estimator    0.94               0.00               22.98              0.00
Mass imputation               2.97               N/A                24.56              N/A
Partial imputation            2.25               1.52               26.69              13.22
Scores method                 2.03               1.45               26.77              18.67


5. Limits of the study

Results:

• The simulation only includes one replication of the sub-sampling and non-response mechanisms
• Non-response bias is the measure of interest, but errors were presented
• Non-response mechanisms were generated deterministically. Should they be generated probabilistically?
• The 2011 sampling, non-response and available data (e.g. paradata) cannot be replicated exactly
• Only totals were studied. What about other parameters such as correlations?


5. Limits of the study

Possible confounding effects:

• Logistic regression was done at the aggregated level of the CMA and no WA effects or interactions were considered
• Paradata for imputation is more closely related to the non-response mechanism (give preference to late respondents in the distance)
• Weighting of donors in imputation has an impact
• Calibration done from sample to U; calibration at inner levels/phases could help scores and partial imputation


6. Conclusion

With these preliminary results, it seems the scores method is doing well at aggregate levels, while partial imputation is doing better than scores at finer levels

• Mass imputation: Can you override the known sub-sample design with an imputation model?
• Partial imputation: Can include more information (person-level, paradata) than scores, but weighting of each component in the distance is partially data driven and not straightforward
• Scores method: More difficult to include the information, but variable selection to explain non-response is direct


7. Future Work

Possible:

• Replicate sub-sampling and imputation more than once to isolate bias components
• Consider other levels of calibration in the comparisons
• Hybrid of scores and partial imputation

Definite:

• Implement a method into NHS production
• Estimate the errors and variances (multi-phase, large sampling fractions, errors due to modeling,…) and educate data users

Important to get a good model for the last non-response mechanism. Whatever the method, quality of the results is a function of the auxiliary data available.


For more information, please contact:

François Verret - SSMD/DMES

(613) 951-7318
