Analysis of the data preparation process of the structural survey of the Swiss population census

Federal Department of Home Affairs FDHA, Federal Statistical Office FSO

UNECE Work Session on Statistical Data Editing, WP. 27, Budapest, 15 September 2015

Daniel Kilchmann, Statistical Methods Unit, Swiss Federal Statistical Office
Beat Hulliger, University of Northwestern Switzerland




Contents

Introduction

Data preparation process of the CSS

Analysis project of the CSS-SDPP

Outlook


Introduction


Swiss population census, 2010 onwards

- Register-based census combined with sample surveys:
  - municipalities’ registers
  - federal housing and dwelling register
  - the census structural survey (CSS)
  - two annual sample surveys on specific themes.

- CSS, sample size 250’000 persons, covering
  - labour market, language, religion, education, migration and commuting of persons;
  - household composition, household member characteristics and dwelling variables.
  - Response mode: paper (75%), internet (25%).


Purpose of analysis project

- SFSO launched a project to analyse the statistical data preparation process (SDPP) of the CSS in order to
  - gain deeper knowledge about the impact of the SDPP on results,
  - monitor the impact during the SDPP.

- Better understanding of whether the conceptual framework (EDIMBUS, [Luzi, O. et al.(2007)]) and the process design (SFSO-SDPP) are appropriate.

- Selection of useful indicators, calculated during the SDPP.


Data preparation process of the CSS


[Figure: diagram of the data preparation process of the CSS, with data states A0, A1, A2 and A3]


E&I methods

- Automated edit rules for missingness and inconsistencies.
- Call-backs, A0 → A1.
- Deterministic imputation rules, A1 → A2 and A2 → A3.
- Outlier detection for rent, A2 → A3.
- Nearest neighbour imputation based on NIM, [Bankier, M., Lachance, M. and Poirier, P.(2000)], A2 → A3.
- Outlier detection and comparisons, A2 → A3.
- Ad-hoc analysis scripts based on indicators in [Luzi, O. et al.(2007)].

Loops occur only during the implementation phase ⇒ not included in the analysis.
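To illustrate the general idea of nearest neighbour donor imputation for categorical items, here is a minimal Python sketch. It is not the NIM procedure of [Bankier, M., Lachance, M. and Poirier, P.(2000)], which additionally searches for minimal changes to records that fail edit rules. The distance function, the toy records and all names are assumptions made for illustration only.

```python
from typing import Optional, Sequence

Item = Optional[int]  # None marks a missing item (assumed coding)


def distance(recipient: Sequence[Item], donor: Sequence[Item]) -> int:
    """Count disagreements on the items the recipient actually reported."""
    return sum(1 for r, d in zip(recipient, donor) if r is not None and r != d)


def impute_from_nearest_donor(recipient: Sequence[Item],
                              donors: Sequence[Sequence[Item]]) -> list:
    """Fill the recipient's missing items from the closest complete donor."""
    complete = [d for d in donors if None not in d]
    if not complete:
        raise ValueError("no complete donor record available")
    # min() keeps the first donor among ties; a real system would need a rule here.
    best = min(complete, key=lambda d: distance(recipient, d))
    return [d if r is None else r for r, d in zip(recipient, best)]


# Toy data: four categorical items per record (hypothetical values).
donors = [
    [1, 0, 2, 1],
    [1, 1, 0, 3],
    [0, 1, None, 3],  # incomplete, so never used as a donor
]
recipient = [1, 1, None, 3]
imputed = impute_from_nearest_donor(recipient, donors)
print(imputed)
```

Reported values are never overwritten in this sketch; only missing items are taken from the donor.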


Analysis project of the CSS-SDPP


Aims

1. Evaluation of the impact on results of individual treatments or whole phases.

2. Potential improvements to the process design.

3. Highlighting of possible questionnaire design problems.


Challenges

- Mostly categorical variables, each response category coded by a binary variable (multiple responses) → single question = group of binary variables (response group).

- Categorical variables are not prominent on indicator lists, e.g. [Ehling, M. et al.(2007)], [Luzi, O. et al.(2007)]; coverage is better in [Quality team of Eurostat(2014)] ⇒ enhancement of use/interpretation of indicators for categorical variables.
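The response-group coding described above can be sketched in Python as follows. The variable names and the rule that a question counts as answered once at least one box is ticked are assumptions for illustration, not the SFSO coding scheme.

```python
from typing import Optional

# Hypothetical response group for one multiple-response question.
LANGUAGE_GROUP = ["lang_german", "lang_french", "lang_italian", "lang_other"]


def encode_response_group(ticked: Optional[set[str]],
                          categories: list[str]) -> dict[str, Optional[int]]:
    """One binary variable per category; None everywhere if the question was skipped."""
    if ticked is None:
        return {c: None for c in categories}
    unknown = ticked - set(categories)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return {c: int(c in ticked) for c in categories}


def group_is_answered(record: dict[str, Optional[int]],
                      categories: list[str]) -> bool:
    """Assumed rule: the question is answered if at least one box is ticked."""
    return any(record.get(c) == 1 for c in categories)


answered = encode_response_group({"lang_german", "lang_french"}, LANGUAGE_GROUP)
skipped = encode_response_group(None, LANGUAGE_GROUP)
print(answered, group_is_answered(answered, LANGUAGE_GROUP))
print(skipped, group_is_answered(skipped, LANGUAGE_GROUP))
```

Under this coding, missingness is a property of the whole group rather than of a single binary variable, which is one reason why variable-level indicators need care for categorical questions.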


- No baseline SDPP for comparison ⇒ outcome of the study: a baseline.

- No ’truth’ available ⇒ no indicators requiring the ’truth’.

- Development of an R package.


Levels of indicators

1. Global indicators: for the whole data set (all observations, all variables)

2. Subset indicators: for subsets of the data (all variables)

3. Group indicators: for all observations (groups of variables)

4. Observation indicators: for single observations (all variables)

5. Variable indicators: for single variables (all observations)

6. Subset-group indicators: for subsets of the data and groups of variables

Edit rule indicators are under discussion.
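As a rough illustration of these levels, the Python sketch below evaluates one indicator on different selections of observations and variables. The indicator used here, an unweighted share of non-missing cells, is an assumed stand-in for an item response rate, and the toy data and names are hypothetical.

```python
from typing import Optional

Record = dict[str, Optional[int]]


def item_response_rate(records: list[Record], variables: list[str]) -> float:
    """Share of non-missing cells among the selected records and variables."""
    cells = [rec[v] for rec in records for v in variables]
    if not cells:
        raise ValueError("empty selection")
    return sum(c is not None for c in cells) / len(cells)


data = [
    {"canton": 1, "edu": 3, "rel_a": 1, "rel_b": 0},
    {"canton": 1, "edu": None, "rel_a": None, "rel_b": None},
    {"canton": 2, "edu": 2, "rel_a": 0, "rel_b": 1},
    {"canton": 2, "edu": None, "rel_a": 1, "rel_b": 0},
]
survey_vars = ["edu", "rel_a", "rel_b"]   # all survey variables
religion_group = ["rel_a", "rel_b"]       # one response group
canton_1 = [r for r in data if r["canton"] == 1]  # one subset

levels = {
    "global": item_response_rate(data, survey_vars),
    "subset": item_response_rate(canton_1, survey_vars),
    "group": item_response_rate(data, religion_group),
    "observation": item_response_rate([data[1]], survey_vars),
    "variable": item_response_rate(data, ["edu"]),
    "subset-group": item_response_rate(canton_1, religion_group),
}
for name, value in levels.items():
    print(f"{name}: {value:.3f}")
```

The same function serves every level; only the selection of rows and columns changes.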


Core set of indicators

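Indicator levels (table columns): Global, Subset, Observation, Group, Variable

Description                      Levels marked (x)
unit response rate               4 of 5
item response rate               5 of 5
item response ratio              3 of 5
imputation rate (responded∗)     5 of 5
imputation ratio (responded∗)    2 of 5

[Table: the column positions of the marks in partially filled rows are not recoverable from the transcript; only the number of marked levels per indicator is shown.]

∗ Indicators for respondents only might be seen as a proxy for the impact of edit rules.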
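One way to read the footnote: if an item was reported and its value still differs after a treatment step, the change was presumably triggered by an edit rule. The Python sketch below compares two data states on that basis. The formulas, the positional alignment of records and the toy values are assumptions, not the definitions used by SFSO or in [Luzi, O. et al.(2007)].

```python
from typing import Optional

Record = dict[str, Optional[int]]


def imputation_rate(before: list[Record], after: list[Record],
                    variables: list[str], responded_only: bool = False) -> float:
    """Share of cells whose value differs between two data states.

    With responded_only=True the base is restricted to cells that were
    reported (non-missing) in the earlier state.
    """
    base = changed = 0
    for rec_before, rec_after in zip(before, after):  # records aligned by position
        for v in variables:
            if responded_only and rec_before[v] is None:
                continue  # item nonresponse is excluded from the base
            base += 1
            if rec_before[v] != rec_after[v]:
                changed += 1
    if base == 0:
        raise ValueError("no cells in the base")
    return changed / base


# Hypothetical states before (A2) and after (A3) imputation.
a2 = [{"edu": 3, "rent": 1200}, {"edu": None, "rent": 90000}, {"edu": 2, "rent": None}]
a3 = [{"edu": 3, "rent": 1200}, {"edu": 2, "rent": 1500}, {"edu": 2, "rent": 1100}]

overall = imputation_rate(a2, a3, ["edu", "rent"])
responded = imputation_rate(a2, a3, ["edu", "rent"], responded_only=True)
print(overall, responded)
```

In the toy data, three of six cells change overall, but only one of the four reported cells changes (the implausible rent), so the respondent-only rate isolates the edit-driven part.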


Outlook


- Formulas and meaning will be checked by implementation.

- Indicators under discussion:
  - edit rules
  - changes in structurally missing values
  - distributional indicators.

- SDPP ’optimization’ criteria under investigation.

- Thresholds indicating anomalies?


References

Bankier, M., Lachance, M. and Poirier, P.
2001 Canadian Census minimum change donor imputation methodology. Working paper, UNECE Work Session on Statistical Data Editing, Cardiff, 2000. URL http://www.unece.org/stats/documents/2000.10.sde.htm

Ehling, M. et al.
Handbook on Data Quality Assessment Methods and Tools. European Commission, Eurostat, 2007.

Luzi, O. et al.
EDIMBUS-RPM. Eurostat, August 2007. URL http://ec.europa.eu/eurostat/documents/64157/4374310/30-Recommended+Practices-for-editing-and-imputation-in-cross-sectional-business-surveys-2008.pdf

Quality team of Eurostat.
ESS Guidelines for the Implementation of the ESS Quality and Performance Indicators (QPI). European Commission, Eurostat, 2014. URL http://ec.europa.eu/eurostat/documents/64157/4373903/02-ESS-Quality-and-performance-Indicators-2014.pdf
