Pilot Census in Poland Some Quality Aspects Geneva, 7-9 July 2010 Janusz Dygaszewicz Central Statistical Office POLAND


Pilot Census in Poland
Some Quality Aspects

Geneva, 7-9 July 2010

Janusz Dygaszewicz
Central Statistical Office

POLAND

Data processing infrastructure

[Figure: infrastructure diagram. Labels: XML, TXT, Registry 1, Registry 2, Registry n, ETL Tools, Portal, CAxI, Questionnaires, Operational Microdata Base, Golden Record, Statistical Files, Analytical Microdata Base, Metadata server, Metadata, SDMX.]

Census Quality

Key elements of census process in terms of census quality:
• Census planning - scope of census,
• Data sources,
• Data collecting,
• Data storing,
• Data processing,
• Development of census results,
• Dissemination of census results,
• Census Metadata System.

CENSUS PLANNING

Census Quality

Census planning
Quality aspects: relevance, accuracy, costs (including the burden on respondents), information security

• Determining the data scope defined in the Act, including:
  • compliance with the needs of domestic and EU users,
  • quality of data sources,
  • coherence and comparability of results from the 2011 and 2002 censuses.

DATA ACQUISITION

Data acquisition

[Figure: the infrastructure diagram (registries, ETL Tools, Portal, CAxI, Operational Microdata Base, Golden Record, Analytical Microdata Base, Metadata server).]

Data acquisition

File formats:
• Flat files,
• XML files,
• Local Databases XML files integration.

Data acquisition - Portal

Census Quality – data acquisition

Data sources
Quality aspects: accuracy, timeliness and punctuality, comparability and coherence, costs (including the burden on respondents), information security

• Assessment of data source quality for the census:
  • analysis of the methodological compliance of concept definitions from registers with those adopted in statistics and in the UNECE and Eurostat Recommendations for the 2010 Censuses of Population and Housing,
  • developing a methodology for compliance analyses,
  • constructing the PiK IT system for describing, comparing and assessing the level of coherence.

Census Quality – data acquisition

Registers
• developing a methodology for assessing quality: dimensions, quality indicators,
• evaluation and description of source quality,
• MATRIX that represents the possibility of obtaining the values for the census from registers:
  • census variable compliance indicators (methodology compliance indicator),
  • register suitability indicators (population coverage indicator for data from the register).
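As a rough illustration of the matrix idea, the Python sketch below computes a population coverage indicator for each census variable and register. It assumes that coverage means the share of frame persons for whom a register holds a non-missing value; the slides do not give the actual indicator definitions. The register names, variable names and the `coverage_matrix` helper are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical register extract: person id -> {variable: value or None}.
Register = Dict[str, Dict[str, Optional[str]]]


def coverage_matrix(
    frame_ids: List[str],
    registers: Dict[str, Register],
    variables: List[str],
) -> Dict[str, Dict[str, float]]:
    """Return matrix[variable][register] = share of frame persons with a non-missing value."""
    matrix: Dict[str, Dict[str, float]] = {}
    for var in variables:
        matrix[var] = {}
        for name, records in registers.items():
            covered = sum(
                1
                for pid in frame_ids
                if records.get(pid, {}).get(var) not in (None, "")
            )
            matrix[var][name] = covered / len(frame_ids) if frame_ids else 0.0
    return matrix


# Illustrative data only (assumed, not taken from the pilot census).
frame = ["p1", "p2", "p3", "p4"]
registers = {
    "population_register": {
        "p1": {"sex": "F", "education": None},
        "p2": {"sex": "M", "education": None},
        "p3": {"sex": "F", "education": None},
    },
    "education_register": {
        "p1": {"education": "tertiary"},
        "p4": {"education": "secondary"},
    },
}
matrix = coverage_matrix(frame, registers, ["sex", "education"])
```

A row of the resulting matrix shows which register is the most promising source for a given census variable; a methodology compliance indicator would have to be assessed separately, since it depends on definitions rather than on counts.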

Census Quality – data acquisition

Data sets
• developing a methodology for assessing quality,
• evaluation and description of data set quality,
• developing a methodology for improving the quality of source data sets, with rules for: standardization, normalization, de-duplication, editing, imputation, calibration.
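Two of the listed rule types, standardization and de-duplication, can be sketched in a few lines of Python. This is a minimal example under assumed rules (upper-casing, whitespace collapsing, diacritic stripping, exact match on a standardized key); the rules actually applied in the pilot census are not described in the slides, and the field names are hypothetical.

```python
import unicodedata
from typing import Dict, List, Tuple


def standardize(value: str) -> str:
    """Trim, collapse whitespace, upper-case, and strip combining diacritics.

    Letters without a Unicode decomposition (for example 'Ł') are left unchanged.
    """
    collapsed = " ".join(value.split()).upper()
    decomposed = unicodedata.normalize("NFKD", collapsed)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def deduplicate(
    records: List[Dict[str, str]], key_fields: Tuple[str, ...]
) -> List[Dict[str, str]]:
    """Keep the first record seen for each standardized key."""
    seen = set()
    unique: List[Dict[str, str]] = []
    for rec in records:
        key = tuple(standardize(rec.get(field, "")) for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Illustrative records only; field names are assumptions.
records = [
    {"surname": "Kowalski", "city": "Kraków"},
    {"surname": " kowalski ", "city": "KRAKOW"},
    {"surname": "Nowak", "city": "Gdańsk"},
]
unique = deduplicate(records, ("surname", "city"))
```

Exact matching on a standardized key is the simplest possible rule; real register data would usually also need probabilistic or fuzzy matching for the remaining near-duplicates.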


CENSUS FRAME PREPARATION

Census Frame preparation

• Preparing the list of citizens, buildings and dwellings,
• Integrating the list of citizens, buildings and dwellings with statistical data,
• Preparing the Census Frame,
• Preparing the Goal Frame,
• Preparing the Random Sample.

Quality of Census Frame

• Census frame pre-census revision: checking in the field by enumerators,
• Census frame preparation: validation and updating in counties.

Enumerator tracking


Census Completeness Monitoring


TRANSFORMATION TO STATISTICAL REGISTER

Source data collection and preparation

[Figure: the infrastructure diagram (registries, ETL Tools, Portal, CAxI, Operational Microdata Base, Golden Record, Analytical Microdata Base, Metadata server).]

Source data collection and preparation

• Loading registers into the data laboratory environment,
• Denormalization,
• Standardization,
• Deduplication,
• Validation,
• Data completion,
• Vocabulary validation and automatic correction,
• Generation of statistical files (register).
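The "vocabulary validation and automatic correction" step could look roughly like the sketch below. It assumes a controlled vocabulary per variable and uses Python's standard `difflib.get_close_matches` to correct near-misses. The vocabulary, the 0.8 similarity cutoff and the `correct_value` helper are illustrative assumptions, not the tooling used in the pilot census.

```python
import difflib
from typing import List, Optional, Tuple

# Hypothetical controlled vocabulary for a "marital status" variable.
VOCABULARY = ["single", "married", "widowed", "divorced"]


def correct_value(
    raw: str, vocabulary: List[str], cutoff: float = 0.8
) -> Tuple[Optional[str], str]:
    """Return (value, status), where status is 'valid', 'corrected' or 'rejected'."""
    value = raw.strip().lower()
    if value in vocabulary:
        return value, "valid"
    # Automatic correction: accept the single closest term above the cutoff.
    matches = difflib.get_close_matches(value, vocabulary, n=1, cutoff=cutoff)
    if matches:
        return matches[0], "corrected"
    # No sufficiently close term: leave the value for manual correction.
    return None, "rejected"


results = [correct_value(v, VOCABULARY) for v in [" Married ", "maried", "xyz"]]
```

Keeping the status alongside the value makes it possible to count how many values were corrected automatically, which is itself a simple quality indicator for the source.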

Census Quality – collection and preparation

Collecting data
Quality aspects: accuracy, costs (including the burden on respondents), information security

• Collecting data from information systems:
  • Central registers,
  • Distributed registers,
  • format / file structure (XSD schemas),
  • data transfer platform,
  • application for encrypted data transfer,
  • application for validation and data set control.

Source data loading and correction

• Data loading to the Operational Microdata Base,
• Validation,
• Manual and automatic correction (cleaning),
• Deduplication,
• Calculating variables.

CAxI

[Figure: the infrastructure diagram (registries, ETL Tools, Portal, CAxI, Operational Microdata Base, Golden Record, Analytical Microdata Base, Metadata server).]

CAxI

• CAII - Computer Assisted Internet Interview,
• CAPI - Computer Assisted Personal Interview,
• CATI - Computer Assisted Telephone Interviewing.

Census Quality – data collection by electronic questionnaire

• Collecting data from respondents: CAII, CAPI, CATI;
• CAxI input validation:
  • Numerical data validation (answers within boundaries)
  • Cross-question arithmetical validation
  • Hints and automatic answer completion
  • Dictionaries and drop-down menus
• CAxI logical validation:
  • Answers determined by questions
  • Cross-question logical validation
  • Data collection logical paths
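The validation types listed for the electronic questionnaire can be illustrated with a small Python sketch. All question names, the age bounds and the "under 15 cannot be employed" rule are assumptions made for the example; the actual CAxI rules are not given in the slides.

```python
from typing import Any, Dict, List, Optional


def validate_answers(answers: Dict[str, Any]) -> List[str]:
    """Return a list of validation error messages (empty if all checks pass)."""
    errors: List[str] = []

    # Numerical validation: the answer must lie within fixed boundaries.
    age = answers.get("age")
    if not isinstance(age, int) or not 0 <= age <= 120:
        errors.append("age: expected an integer between 0 and 120")

    # Cross-question arithmetical validation: the parts must add up to the total.
    total = answers.get("household_members")
    adults = answers.get("adults")
    children = answers.get("children")
    if None not in (total, adults, children) and adults + children != total:
        errors.append("household_members: adults + children must equal the total")

    # Cross-question logical validation: answers must not contradict each other.
    # The age threshold of 15 is a hypothetical rule.
    if isinstance(age, int) and age < 15 and answers.get("employed") is True:
        errors.append("employed: respondents under 15 cannot be recorded as employed")

    return errors


def next_question(current: str, answers: Dict[str, Any]) -> Optional[str]:
    """Logical path: the answer to 'employed' decides whether 'occupation' is asked."""
    if current == "employed":
        return "occupation" if answers.get("employed") else "education"
    if current == "occupation":
        return "education"
    return None


ok = validate_answers(
    {"age": 34, "household_members": 3, "adults": 2, "children": 1, "employed": True}
)
bad = validate_answers(
    {"age": 130, "household_members": 3, "adults": 2, "children": 2, "employed": False}
)
```

Running such checks at entry time, rather than after loading, is what allows errors to be resolved while the respondent or enumerator is still available.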

Census Quality

Data storing
Quality aspects: information security

• Storing data in the Operational Microdata Base,
• Notifying the Operational Microdata Base for registration with the General Inspector for Protection of Personal Data.

GOLDEN RECORD

Golden Record generation

[Figure: the infrastructure diagram (registries, ETL Tools, Portal, CAxI, Operational Microdata Base, Golden Record, Analytical Microdata Base, Metadata server).]

Export to Analytical Microdata Base

[Figure: the infrastructure diagram (registries, ETL Tools, Portal, CAxI, Operational Microdata Base, Golden Record, Analytical Microdata Base, Metadata server).]

Golden Record generation

• Integration with Census Frame and CAxI data,
• Validation,
• Correction,
• Operational imputation,
• Transfer of proper values to the Golden Record.

[Figure: OMB Layers: Registers 1..n, CAxI, Golden Record.]
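One way to picture the "transfer of proper values" step is a per-variable source priority: for each census variable, the first source with a non-missing value supplies the Golden Record value. The Python sketch below shows that idea only; the priority table, source names and variables are hypothetical, and the slides do not say how the proper value was actually chosen.

```python
from typing import Dict, List, Optional

# Hypothetical source priority per variable; the first non-missing value wins.
SOURCE_PRIORITY: Dict[str, List[str]] = {
    "date_of_birth": ["population_register", "caxi"],
    "address": ["caxi", "population_register"],
}

# source name -> person id -> {variable: value}
Sources = Dict[str, Dict[str, Dict[str, Optional[str]]]]


def build_golden_record(
    person_id: str, sources: Sources
) -> Dict[str, Dict[str, Optional[str]]]:
    """Pick one value per variable and remember which source it came from."""
    golden: Dict[str, Dict[str, Optional[str]]] = {}
    for variable, priority in SOURCE_PRIORITY.items():
        golden[variable] = {"value": None, "source": None}
        for source in priority:
            value = sources.get(source, {}).get(person_id, {}).get(variable)
            if value not in (None, ""):
                golden[variable] = {"value": value, "source": source}
                break
    return golden


# Illustrative data only.
sources: Sources = {
    "population_register": {
        "p1": {"date_of_birth": "1980-05-17", "address": "Old Street 1"},
    },
    "caxi": {
        "p1": {"date_of_birth": None, "address": "New Street 2"},
    },
}
golden = build_golden_record("p1", sources)
```

Recording the source next to each value keeps the Golden Record auditable: a variable left without any source is a candidate for operational imputation.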

Export to Analytical Microdata Base

• Preparing transition tables,
• Anonymisation of Golden Records,
• Transfer to the Analytical Microdata Base.
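The slides do not define the transition tables. The Python sketch below assumes they map original identifiers to surrogate keys, so that direct identifiers can be dropped before export while the mapping stays in the operational environment. Strictly speaking this is pseudonymisation rather than full anonymisation, and the field names and the `anonymise` helper are hypothetical.

```python
import itertools
from typing import Dict, List, Tuple

# Hypothetical list of direct identifiers to drop before export.
DIRECT_IDENTIFIERS = ("person_id", "name", "surname")


def anonymise(
    records: List[Dict[str, object]],
) -> Tuple[List[Dict[str, object]], Dict[str, int]]:
    """Return (exported records without direct identifiers, transition table)."""
    counter = itertools.count(1)
    transition: Dict[str, int] = {}
    exported: List[Dict[str, object]] = []
    for rec in records:
        original_id = str(rec["person_id"])
        # The same person always receives the same surrogate key.
        if original_id not in transition:
            transition[original_id] = next(counter)
        row = {k: v for k, v in rec.items() if k not in DIRECT_IDENTIFIERS}
        row["surrogate_id"] = transition[original_id]
        exported.append(row)
    return exported, transition


# Illustrative records only.
golden_records = [
    {"person_id": "A123", "name": "Jan", "surname": "Kowalski", "age": 41},
    {"person_id": "B456", "name": "Anna", "surname": "Nowak", "age": 37},
]
exported, transition = anonymise(golden_records)
```

Removing direct identifiers does not by itself prevent re-identification through combinations of remaining variables, so further disclosure control would still be needed on the exported data.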

Census Quality

Data processing
Quality aspects: accuracy

• Developing quality indicators for data sets at each stage of data processing and the procedures for calculating their values,
• Developing procedures for bringing data from administrative sources to full compliance or minimum discrepancy with the appropriate methodology adopted in statistics,
• Developing procedures for normalization and editing of data sets from administrative systems, including the imputation of data (administrative data sets),
• Developing procedures for synchronization of data from administrative systems,
• Developing rules for linking data from different administrative systems,
• Developing rules for linking data from administrative systems with data from CAII, CAPI, CATI,
• Developing rules for calculation of Golden Record census variables,
• Developing rules for anonymisation of Golden Record census data.
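A linking rule and a stage-level quality indicator can be sketched together. The Python example below assumes deterministic linkage on a shared identifier and reports the match rate relative to the register. The key name `person_id` and the `link_records` helper are hypothetical, and real linkage rules may well be more elaborate.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def link_records(
    register: List[Record], caxi: List[Record], key: str = "person_id"
) -> Tuple[List[Tuple[Record, Record]], List[str], List[str], float]:
    """Exact-key linkage: return matched pairs, unmatched keys on each side, match rate."""
    reg_by_key = {rec[key]: rec for rec in register}
    caxi_by_key = {rec[key]: rec for rec in caxi}
    common = sorted(reg_by_key.keys() & caxi_by_key.keys())
    matched = [(reg_by_key[k], caxi_by_key[k]) for k in common]
    register_only = sorted(reg_by_key.keys() - caxi_by_key.keys())
    caxi_only = sorted(caxi_by_key.keys() - reg_by_key.keys())
    # Quality indicator: share of register records that found a CAxI partner.
    match_rate = len(matched) / len(reg_by_key) if reg_by_key else 0.0
    return matched, register_only, caxi_only, match_rate


# Illustrative data only.
register = [{"person_id": "p1"}, {"person_id": "p2"}, {"person_id": "p3"}]
caxi = [{"person_id": "p2"}, {"person_id": "p3"}, {"person_id": "p4"}]
matched, register_only, caxi_only, match_rate = link_records(register, caxi)
```

The unmatched lists are as informative as the matches: register-only records point to possible non-response, while CAxI-only records point to possible under-coverage of the register.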

ANALYTICAL MICRODATABASE

Analytical Microdata Base

[Figure: the infrastructure diagram (registries, ETL Tools, Portal, CAxI, Operational Microdata Base, Golden Record, Analytical Microdata Base, Metadata server).]

Analytical Microdata Base - process

• Process data:
  • Load data and metadata
  • Integrate data
  • Classify and code data
  • Edit and validate data
  • Impute
  • Derive new variables
  • Weight
  • Aggregate
  • Create files
• Analyse
• Disseminate
• Archive
• Manage metainformation
• Manage quality

Functionality

Analytical Microdatabase:
• Administration
• Information Security Management
• Data Processing
• Information Analysis
• Requirement and Product Management
• Dissemination
• Metadata
• Quality Management

Census Quality

Development of census results
Quality aspects: relevance, accuracy, comparability and coherence

• Developing rules for missing data completion: imputation and calibration,
• Developing rules for creating derived objects: creation of new objects (households, families),
• Developing a model / method of data estimation using data from administrative systems and sample surveys,
• Developing rules for calculating data outputs.
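As one possible example of a missing-data completion rule, the Python sketch below fills a missing categorical value with the most frequent observed value in the same group and flags the record as imputed. The choice of within-group mode imputation, the field names and the `impute_mode_by_group` helper are assumptions; the slides do not state which imputation methods were used.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def impute_mode_by_group(records: List[Record], target: str, group: str) -> List[Dict]:
    """Fill missing `target` values with the most frequent value in the same `group`."""
    counts: Dict[Optional[str], Counter] = defaultdict(Counter)
    for rec in records:
        if rec.get(target) is not None:
            counts[rec[group]][rec[target]] += 1

    result: List[Dict] = []
    for rec in records:
        new = dict(rec)
        new["imputed"] = False
        donors = counts[new[group]]
        # Impute only when the group has at least one observed value.
        if new.get(target) is None and donors:
            new[target] = donors.most_common(1)[0][0]
            new["imputed"] = True
        result.append(new)
    return result


# Illustrative data only.
records = [
    {"region": "A", "status": "employed"},
    {"region": "A", "status": "employed"},
    {"region": "A", "status": "unemployed"},
    {"region": "A", "status": None},
    {"region": "B", "status": None},
]
completed = impute_mode_by_group(records, target="status", group="region")
```

The `imputed` flag keeps the share of imputed values measurable, and records in groups with no observed donors are deliberately left missing rather than filled from unrelated groups.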

DISSEMINATION

Census Quality - dissemination

Dissemination of census results
Quality aspects: relevance, timeliness and punctuality, accessibility and clarity, comparability and coherence, information security

• Designing Analytical Microdata Base features, including compliance with users' needs, accessibility and clarity of census data.


METAINFORMATION MANAGEMENT

Metadata server

[Figure: the infrastructure diagram (registries, ETL Tools, Portal, CAxI, Operational Microdata Base, Golden Record, Analytical Microdata Base, Metadata server).]

Metainformation management

[Figure: classification of metainformation. Labels: Definition, Business, Referential, Conceptual, Methodical, Quality, Structural, Technical, System, Postprocessing.]

Census Quality – metainformation

Census Metadata System
Quality aspects: accessibility and clarity

• Developing quality indicators at each stage of the census and the procedures for calculating their values.
