Copyright 2010, The World Bank Group. All Rights Reserved. Estimation and Weighting, Part I

Upload: ann-robbins

Post on 27-Dec-2015

TRANSCRIPT

Estimation and Weighting, Part I

Goal of Estimation

Minimize a survey’s total error

• Sampling error is error arising solely from the sampling process (measure: variance)
  – Mainly a function of sample size
• Surveys are also subject to biases from nonsampling errors such as:
  – Coverage errors and non-probability sampling
  – Response errors
  – Nonresponse

Typical Estimation Steps

The estimation steps for a typical household survey avoid or help control some nonsampling errors

• Editing and Imputation are aimed at controlling response errors
• Basic Weighting based on probabilities of selection produces essentially unbiased estimates when there is 100% response and no response error
• Nonresponse Adjustment helps avoid some obvious biases that arise when nonrespondents are ignored
• Population Controls help minimize some coverage problems

Editing and Imputation

Editing
– deleting or correcting unacceptable data values
– coding/combining data to classify respondents

Imputation – insert values for missing data
– for missing items (imputation is common)
– for missing HH or persons (not used as often)
– modeling methods
– hot deck methods

Item Nonresponse Imputation

When a household is interviewed and a small amount of data is not obtained for a person, imputing for the missing data creates a complete data set.

Hot Deck Method: Use answers from another similar unit to impute answers for an item nonresponse – “nearest neighbor”

Modeling Method: Mathematically impute an answer for an item nonresponse
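As a rough illustration of the hot deck idea, the sketch below borrows a missing value from the most similar respondent. Everything in it is an assumption made for illustration: the `Person` record, the `hours_worked` item, and the choice of "same sex, closest age" as the similarity rule do not come from the slides, and a production hot deck would normally use more matching variables.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    """Hypothetical survey record; field names are illustrative only."""
    age: int
    sex: str
    hours_worked: Optional[float]  # None marks an item nonresponse


def hot_deck_impute(recipient: Person, donors: List[Person]) -> Optional[float]:
    """Return the reported value of the nearest-neighbor donor.

    Assumed similarity rule: same sex, smallest absolute age difference.
    Returns None when no eligible donor exists.
    """
    candidates = [
        d for d in donors
        if d.sex == recipient.sex and d.hours_worked is not None
    ]
    if not candidates:
        return None
    nearest = min(candidates, key=lambda d: abs(d.age - recipient.age))
    return nearest.hours_worked


# Made-up records: the recipient is missing hours_worked.
donors = [
    Person(25, "F", 40.0),
    Person(31, "F", 35.0),
    Person(29, "M", 45.0),
    Person(30, "F", None),  # cannot donate: its own item is missing
]
recipient = Person(29, "F", None)
imputed = hot_deck_impute(recipient, donors)
```

Here the donor aged 31 is the closest eligible female respondent, so her reported value is copied to the recipient.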

Example of Imputation

Suppose a woman aged 29 was employed last month. This month, we were not able to obtain her labor force status. Construct a “transition matrix” using records of “similar” persons with labor force status coded in both months – use females aged 24-45.

Columns give last month's status and rows give this month's status (EMP = employed, UE = unemployed, NILF = not in labor force).

                     Last Month
This Month     EMP      UE    NILF   Total
EMP            120      10       7     137
UE               2      20       5      27
NILF             5       2      50      57
Total          127      32      62     221

Example of Imputation

Based on Frequencies, Compute Probabilities

Employed Last Month (127 persons)

This Month               EMP            UE                  NILF
Sample Frequency         120            2                   5
Estimated Probability    0.9449         0.0157              0.0394
Range for 0 ≤ rn ≤ 1     [0, 0.9449]    [0.9449, 0.9606]    [0.9606, 1]

Example of Imputation

• Generate a random number rn between 0 and 1
• If rn = .7221, for example, then rn falls in the range [0, .9449] and “employed” is imputed for this month
  – Will happen 94.49% of the time
• No guarantee that this is right for the particular data item that is imputed
• Imputed data set is complete and preserves known relationships

Example of Imputation

Would you impute a labor force status?

Maybe not:
• Usually a determination will be made concerning how much data is required for a response to be accepted by a survey
• For a labor force survey, enough information to determine LF status will probably be required

Purpose of Weighting

Estimate the number of persons each person in a sample household represents

Each person interviewed helps represent
– not-in-sample population of the area (geographic stratum) where the person lives
– sample persons not interviewed
– generally, persons of the same age, race, gender, and ethnic origin as the person interviewed

Basic Weights

Applied at the household level (all persons in HH have the same basic weight)

Inverse of probability of selection

In a typical HH sample there are two stages of sampling and two probabilities
– 1st stage probability for an EA: EAprob
– 2nd stage probability for HH in that EA: HHprob
– TOTprob = EAprob * HHprob
– Baseweight = 1/TOTprob
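A minimal sketch of the two-stage base weight calculation follows. The function name and the two probabilities (0.02 and 0.25) are made-up values for illustration; they do not come from the slides.

```python
def base_weight(ea_prob: float, hh_prob: float) -> float:
    """Base weight = 1 / (EAprob * HHprob) for a two-stage household sample."""
    if not (0 < ea_prob <= 1 and 0 < hh_prob <= 1):
        raise ValueError("selection probabilities must be in (0, 1]")
    tot_prob = ea_prob * hh_prob  # overall probability of selecting the HH
    return 1.0 / tot_prob


# Hypothetical design: the EA is selected with probability 0.02, and
# a household within a selected EA with probability 0.25.
w = base_weight(ea_prob=0.02, hh_prob=0.25)  # about 200
```

Under these assumed probabilities, each sample household (and every person in it) stands for roughly 200 households on the frame.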

Base Weights

• Self-weighting samples are not common
• Primary stratifier for HH surveys is geography, such as state
  – often the base weights in a state are all equal
  – OR nearly the same
• For a self-weighting stratum use N/n:

baseweight = N / n = (Number N of HHs on the Frame) / (Number n of HHs in the Sample)

Example of Basic Weighting

           Sample Count      Sample    HHs on     Base      Estimates After Basic Weighting
           EMP      UE       HHs       Frame      Weight    EMP        UE
State A    3,000    400      2,000     500,000    250       750,000    100,000
State B    2,750    250      1,750     175,000    100       275,000    25,000

Example of Basic Weighting

• Self-weighting within state
• State A has N = 500,000 and sample n = 2,000
  – baseweight = N/n = 500,000/2,000 = 250
  – An estimate of employment is obtained by multiplying the sample count (EMP = 3,000) by the baseweight
    • 3,000 x 250 = 750,000
• State B has N = 175,000 and sample n = 1,750
  – baseweight = N/n = 175,000/1,750 = 100
  – An estimate of unemployment is obtained by multiplying the sample count (UE = 250) by the baseweight
    • 250 x 100 = 25,000
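The arithmetic for both states can be checked with a short script. The figures are those given in the basic weighting example; the dictionary layout and the function name `weighted_estimates` are illustrative choices, not part of the original material.

```python
# Frame and sample figures from the basic weighting example.
states = {
    "State A": {"frame_hhs": 500_000, "sample_hhs": 2_000, "emp": 3_000, "ue": 400},
    "State B": {"frame_hhs": 175_000, "sample_hhs": 1_750, "emp": 2_750, "ue": 250},
}


def weighted_estimates(state: dict) -> dict:
    """Self-weighting stratum: base weight N/n applied to each sample count."""
    weight = state["frame_hhs"] / state["sample_hhs"]
    return {
        "base_weight": weight,
        "emp": weight * state["emp"],
        "ue": weight * state["ue"],
    }


results = {name: weighted_estimates(s) for name, s in states.items()}

for name, r in results.items():
    print(f"{name}: weight={r['base_weight']:.0f}, "
          f"EMP={r['emp']:,.0f}, UE={r['ue']:,.0f}")
```

Because the base weight is constant within a state, multiplying the sample count by N/n gives the same result as summing the weights of the individual sample persons.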

Simple Weighted Estimates

Estimate x of a Total X

• A Simple Weighted Estimate adds persons using their weights (w_i is the weight for the ith person)
• Sum across all m persons in the sample
• x_i is a data value for person i
  – for example x_i = 1 for employed, 0 otherwise

\[ x = \sum_{i=1}^{m} w_i x_i \]

Simple Weighted Estimates Example

Continue the previous example for State A

• Simple Weighted Estimate of employment
  – x_i = 1 for employed, 0 otherwise
• Can restrict sum to the 3,000 employed
  – since x_i = 0 for the other responding persons

\[ x = \sum_{i=1}^{m} w_i x_i = \sum_{i=1}^{4{,}000} w_i x_i = \sum_{i=1}^{3000} 250\,(1) = 3000 \times 250 = 750{,}000 \]
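The same estimate can be reproduced person by person. This sketch assumes m = 4,000 responding persons in State A, which is a reading of the upper limit on the slide's sum, with 3,000 of them employed and a common weight of 250; the function name is illustrative.

```python
from typing import Sequence


def simple_weighted_estimate(weights: Sequence[float], values: Sequence[float]) -> float:
    """Return the sum over sample persons of w_i * x_i."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    return sum(w * x for w, x in zip(weights, values))


m = 4_000           # assumed number of responding persons in State A
n_employed = 3_000  # sample count of employed persons

weights = [250.0] * m                                    # self-weighting: w_i = 250
employed = [1] * n_employed + [0] * (m - n_employed)     # x_i = 1 if employed, else 0

x_hat = simple_weighted_estimate(weights, employed)      # 3,000 * 250 = 750,000
```

Persons with x_i = 0 contribute nothing to the sum, which is why it can be restricted to the 3,000 employed persons.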
