

Copyright 2010, The World Bank Group. All Rights Reserved.

PROCESSING, Part 1

Data capture, editing, imputation and tabulation

Quality assurance for census


COMPONENTS OF PROCESSING

• Data Capture
• Editing
• Imputation
• Tabulation


DATA CAPTURE

• Key Entry
• Scanning
• Direct Entry


KEY ENTRY

Key entry as a data capture technique has advantages and disadvantages:

Pro
• Relatively inexpensive
• Skills readily available
• Generates employment


KEY ENTRY (cont’d)

Con
• Time consuming
• Requires many workstations
• Error prone


SCANNING

Scanning is a process similar to photocopying, but the current technology has advanced far beyond simple photocopying.

There are three levels of scanning:
1. OMR: Optical Mark Recognition
2. OCR: Optical Character Recognition
3. ICR: Intelligent Character Recognition


SCANNING (cont’d)

Pro
• Fast processing
• Reliable


SCANNING (cont’d)

Con
• Expensive upfront costs for equipment, software, and training
• Very precise requirements for paper, printing and processing


DIRECT ENTRY

• Most recent innovations
• Enumerators use hand-held computers
• Where internet access is common, self-enumeration through the internet


DIRECT ENTRY (cont’d)

Pro
• More efficient
– Saves a step (immediate data capture)
• Improves data quality
– Editing at respondent level
– Timeliness
• Reduces some costs
– Printing questionnaires


DIRECT ENTRY (cont’d)

Con
• Requires better trained enumerators
• Riskier
– Hardware failure
– PDA loss
– Requires electricity
• Expensive


EDITING

The Editing Process
• Identifies errors
• Identifies non-response
• Identifies logical inconsistencies
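The three roles of editing listed above can be illustrated with a minimal sketch. The field names, the age range of 0 to 120 and the minimum marriage age of 12 are assumptions chosen for illustration; a real census would take them from its own edit specifications.

```python
# Hypothetical person record: None marks a missing answer.
REQUIRED_FIELDS = ("age", "sex", "marital_status")  # assumed field names
MAX_AGE = 120           # assumed upper bound for a valid age
MIN_MARRIAGE_AGE = 12   # assumed threshold for the consistency rule


def edit_record(rec):
    """Return a list of (kind, field) flags raised for one person record."""
    flags = []

    # Non-response: a required field was left blank.
    for field in REQUIRED_FIELDS:
        if rec.get(field) is None:
            flags.append(("non-response", field))

    # Error: a reported value falls outside the permitted range.
    age = rec.get("age")
    if age is not None and not 0 <= age <= MAX_AGE:
        flags.append(("error", "age"))

    # Logical inconsistency: each value is valid alone, but not together.
    if (age is not None and age < MIN_MARRIAGE_AGE
            and rec.get("marital_status") == "married"):
        flags.append(("inconsistency", "age/marital_status"))

    return flags


print(edit_record({"age": 10, "sex": "F", "marital_status": "married"}))
print(edit_record({"age": 130, "sex": None, "marital_status": "single"}))
```

The function only flags problems. Deciding how to resolve them is left to the imputation step.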


DEALING WITH ERRORS

• Replacement (Imputation)

• Weighting


IMPUTATION

Two classes of imputation

• Deterministic

• Stochastic


DETERMINISTIC IMPUTATION

• Will yield the same answer each time
– Missing data may be calculable from other values
  - e.g. Citizenship can be calculated from Place of Birth
– Missing data is imputed by a sequential donor technique or any other method that will yield identical results
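Both ideas on this slide can be sketched in a few lines. The rule "citizenship equals place of birth" is a simplifying assumption for illustration, as are the field names and the `initial` starting value. The sequential donor function is one possible reading of the technique, in which a missing value is copied from the most recent preceding record that reported one.

```python
def deductive_citizenship(records):
    """Deductive rule (assumed): a missing citizenship is set to place of birth."""
    out = []
    for rec in records:
        rec = dict(rec)  # copy so the input is left unchanged
        if rec.get("citizenship") is None and rec.get("place_of_birth") is not None:
            rec["citizenship"] = rec["place_of_birth"]
        out.append(rec)
    return out


def sequential_hot_deck(records, field, initial):
    """Fill a missing `field` from the last preceding record that reported it.

    `initial` is a hypothetical starting value used until a donor appears.
    Records are processed in file order, so a rerun gives identical results.
    """
    donor = initial
    out = []
    for rec in records:
        rec = dict(rec)
        if rec.get(field) is None:
            rec[field] = donor   # impute from the current donor
        else:
            donor = rec[field]   # a reported value becomes the new donor
        out.append(rec)
    return out


people = [{"religion": "A"}, {"religion": None}, {"religion": "B"}, {"religion": None}]
print(sequential_hot_deck(people, "religion", initial="A"))
```

Because neither function uses random numbers, running them twice on the same file in the same order produces the same imputed values.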


DETERMINISTIC IMPUTATION (cont’d)

• Six main types
– Deductive or Logical
– Mean Value
– Ratio/Regression
– Sequential Hot Deck
– Sequential Cold Deck
– Nearest-Neighbor
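Two of the types listed above, mean value and nearest-neighbor, are sketched here as a rough illustration. The field names and the use of absolute difference on a single matching variable are assumptions, and both functions assume at least one record has a reported value.

```python
def mean_value_impute(values):
    """Replace each missing value (None) with the mean of the reported values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def nearest_neighbor_impute(records, target, match_on):
    """Fill a missing `target` from the donor closest on the `match_on` variable."""
    donors = [r for r in records if r[target] is not None]
    out = []
    for rec in records:
        rec = dict(rec)  # copy so the input is left unchanged
        if rec[target] is None:
            # min() keeps the first donor among ties, so results are repeatable.
            nearest = min(donors, key=lambda d: abs(d[match_on] - rec[match_on]))
            rec[target] = nearest[target]
        out.append(rec)
    return out


print(mean_value_impute([2, None, 4]))
households = [
    {"age": 30, "income": 100},
    {"age": 50, "income": 200},
    {"age": 33, "income": None},
]
print(nearest_neighbor_impute(households, "income", "age"))
```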


STOCHASTIC IMPUTATION

• Can yield different results if the process is rerun
– Use of a random donor or other randomized approach
– Use of randomized residuals to create realistic data
• Most deterministic methods have a stochastic counterpart
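A random donor approach might look like the following sketch, which is a stochastic counterpart of a hot deck. The function name, the field name and the optional `seed` parameter are assumptions for illustration. Without a seed, two runs can impute different values; fixing the seed makes a run reproducible for auditing.

```python
import random


def random_donor_impute(records, field, seed=None):
    """Fill a missing `field` with the value of a randomly chosen donor record."""
    rng = random.Random(seed)  # private generator; seed=None gives a fresh stream
    donors = [r[field] for r in records if r.get(field) is not None]
    out = []
    for rec in records:
        rec = dict(rec)  # copy so the input is left unchanged
        if rec.get(field) is None:
            rec[field] = rng.choice(donors)
        out.append(rec)
    return out


people = [
    {"occupation": "farmer"},
    {"occupation": None},
    {"occupation": "teacher"},
    {"occupation": None},
]
print(random_donor_impute(people, "occupation"))           # may differ between runs
print(random_donor_impute(people, "occupation", seed=1))   # repeatable for a fixed seed
```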


IMPUTATION

Which inconsistency to change?
• The general rule is to change as few values as possible
• Which values to impute?
– For a 10 year old married university graduate, do we change the age or the education and marital status?
– Changing only the age can make the record consistent, therefore change age
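The "change as few values as possible" rule can be sketched as a brute-force search over sets of fields, smallest first. The edit rules (minimum ages of 15 for marriage and 18 for a university degree) and the tiny value domains are assumptions chosen to mirror the slide's example, not rules taken from any actual census.

```python
from itertools import combinations, product

# Assumed, deliberately tiny value domains for the illustration.
DOMAINS = {
    "age": [10, 35],
    "marital_status": ["never married", "married"],
    "education": ["primary", "university"],
}


def is_consistent(rec):
    """Assumed edit rules: no marriage under 15, no university degree under 18."""
    if rec["age"] < 15 and rec["marital_status"] == "married":
        return False
    if rec["age"] < 18 and rec["education"] == "university":
        return False
    return True


def smallest_fix(rec):
    """Return the smallest tuple of fields whose change can make rec consistent."""
    fields = list(DOMAINS)
    for k in range(len(fields) + 1):             # try 0 fields, then 1, then 2, ...
        for subset in combinations(fields, k):
            for values in product(*(DOMAINS[f] for f in subset)):
                candidate = {**rec, **dict(zip(subset, values))}
                if is_consistent(candidate):
                    return subset
    return None                                  # no assignment satisfies the rules


record = {"age": 10, "marital_status": "married", "education": "university"}
print(smallest_fix(record))  # ('age',)
```

For the slide's example the search stops at a single field, age, because changing it alone satisfies both rules, whereas keeping the age would require changing both marital status and education.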


VALIDATION

• After imputation, data are consistent
• They may or may not be correct
• Validation is the final step before certification and release


VALIDATION (cont’d)

Data may be subject to bias from:
• Design
• Training
• Enumerator
• Respondent
• Processing


CERTIFICATION

Final step before data release

NSO’s expression of confidence in the data
