Upload: gabriel-reeve

Post on 15-Dec-2015

Page 1: Eurostat Secondary data: collection and use Presented by Arnout van Delden Methodologist Statistics Netherlands

Secondary data

Secondary sources

• Statistical sources
– Survey data from other organizations
• Administrative sources
– Public administrative sources: trade register, tax data, medical register, base register
– Private administrative sources: product price data, call detail records, electricity data
• Organic sources
– Dwelling prices on internet, GPS data, social media messages

Secondary sources

Registers

Base registers

Statistical registers

specific

PAST PRESENT FUTURE

Official Statistics

• Post-World War II
• Identifiers
• Concepts: variable, units, time
• Population registers
• Administrative census
– Denmark (1981), Finland (1991), Netherlands (2001)

Use (EU/EFTA Survey 2010)

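• Frame
• Observations
• Auxiliary data
• Model parameters
• Data quality

 | admin data only | admin and survey data | survey data only | not specified | non response | Total
SBS | 10.5 | 11.5 | 4.7 | 0.7 | 2.7 | 30
STS | 4.0 | 11.0 | 14.0 | 0.0 | 1.0 | 30
Prodcom | 0.0 | 10.0 | 13.0 | 1.0 | 2.0 | 26

BR: 12.0, 16.0, 2.0; Total 30 (only three values are given for this row; their columns cannot be recovered from the transcript)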

In sum

• Many types of data sources
• Long history
• Potentially very useful

Collection

• Existence
• Access

Existence

• Data protection act (DPA)
• Organisation registers data under DPA

Access

Element | Explanation
Legislation | National Statistics Act
Public approval | Informed consent
Identification codes | Base registers (business, dwellings, …)
Reliable data | Obliged to report errors; multiple users
Cooperation | Contacts with administration authorities

In Sum

• Explore potential data sources
• Access: legal uses and public consent

Proper use

Exploration phase

Source

Meta

Processing phase: data useful?

[Figure: two scatter plots of turnover from VAT data (omzet BTW, x 1000 EUR) against turnover from the sample survey (omzet KS, x 1000 EUR), one for March '04 and one for Dec '04. Fitted lines through the origin, in order of appearance: y = 1.1625x and y = 0.9743x.]
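The lines reported in this figure have no intercept, so each slope summarises how turnover in one source relates to turnover in the other. As a minimal sketch, assuming the lines were fitted by ordinary least squares through the origin (the slide does not state the method), the slope could be computed as below. The function name and all numbers are hypothetical illustrations, not the data behind the figure.

```python
def slope_through_origin(x, y):
    """Least-squares slope b of the line y = b * x (no intercept)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least one non-zero value")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


# Hypothetical turnover per enterprise, x 1000 EUR (not the slide's data).
survey_turnover = [100.0, 250.0, 400.0, 800.0]
vat_turnover = [110.0, 240.0, 420.0, 790.0]

slope = slope_through_origin(survey_turnover, vat_turnover)
print(f"y = {slope:.4f}x")
```

A slope close to 1 would suggest that the two sources measure turnover at a similar level; a slope well away from 1 is a signal to investigate differences in definition, timing or units before using the source.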

Data patterns

Unit | Period | Value
2022253 | Q1 | 3000
2022253 | Q2 | 3000
2022253 | Q3 | 3000
2022253 | Q4 | 4561

Unit | Period | Value
222201 | Q1 | 2000
222201 | Q2 | 2500
222201 | Q3 | 0
333301 | Q4 | 2200
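The records on this slide show patterns that are easy to miss when values are simply added up: identical values in three quarters followed by a different fourth, a zero value, and a unit identifier that changes within the year. A minimal sketch of how such patterns could be flagged automatically is given below. The three rules and the function name are assumptions made for illustration; the slide does not say how the patterns were detected or what they mean.

```python
from collections import defaultdict

PERIODS = ("Q1", "Q2", "Q3", "Q4")

# Records (unit, period, value) as listed on the slide.
records = [
    ("2022253", "Q1", 3000), ("2022253", "Q2", 3000),
    ("2022253", "Q3", 3000), ("2022253", "Q4", 4561),
    ("222201", "Q1", 2000), ("222201", "Q2", 2500),
    ("222201", "Q3", 0), ("333301", "Q4", 2200),
]


def flag_patterns(records, periods=PERIODS):
    """Return {unit: [flags]} using three simple, assumed rules."""
    by_unit = defaultdict(dict)
    for unit, period, value in records:
        by_unit[unit][period] = value

    flags = {}
    for unit, values in by_unit.items():
        unit_flags = []
        # Rule 1: periods without a record (may also reveal a changed unit id).
        missing = [p for p in periods if p not in values]
        if missing:
            unit_flags.append("missing: " + ",".join(missing))
        # Rule 2: a reported value of exactly zero.
        if 0 in values.values():
            unit_flags.append("zero value")
        # Rule 3: first three periods identical, last period different.
        if not missing:
            first_three = {values[p] for p in periods[:3]}
            if len(first_three) == 1 and values[periods[3]] not in first_three:
                unit_flags.append("Q1-Q3 identical, Q4 differs")
        if unit_flags:
            flags[unit] = unit_flags
    return flags


flags = flag_patterns(records)
for unit, unit_flags in flags.items():
    print(unit, unit_flags)
```

Flagged units are candidates for manual inspection or model-based correction; the flags themselves do not say which value is wrong.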

Issues to consider

Dimension | Issues | Methods
Time | Reporting delays | Nowcasting, imputation
Time | Reporting period differs from statistical period | Harmonisation (time series)
Representation | Administrative units | Linkage
Representation | Coverage errors | Business register
Measurement | Data patterns | Model/time series
Measurement | Corrections | Updates
Measurement | Different meaning | Analyse

Administrative data:

Many merits

Explore

More than adding up

Access

Set of base registers
• data re-used
• report errors
• 1 contact person in NSI
• large dependency of users

Properties of Administrative data

1 Collected externally

2 Administrative goal

3 Different objectives

4 Subject to changes

2 Can I make use of a specific data source?

What ‘steps’ are needed?

• Existence
• Access
• Fitness for use
• Fall back scenarios
• Processing

Processing: data integration

[Diagram labels: Register; FRAME; Tax unit, Tax unit, Legal unit, Statistical unit (numbered 3, 2, 1); Survey (4), Statistical unit.]

• Linkage
• Micro-integration
• Imputation/weighting
• Macro-integration
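The linkage step connects administrative units to the statistical units of the frame. As a minimal sketch, assuming each tax unit maps to one legal unit and each legal unit to one statistical unit, tax data could be aggregated to statistical units as below. The identifiers, the mapping tables and the function name are hypothetical; real unit relations are often many-to-many and need more care.

```python
from collections import defaultdict

# Hypothetical mappings and values (not from the slides).
tax_to_legal = {"T1": "L1", "T2": "L1", "T3": "L2"}
legal_to_statistical = {"L1": "S1", "L2": "S2"}
tax_turnover = {"T1": 100.0, "T2": 50.0, "T3": 70.0, "T4": 30.0}


def aggregate_to_statistical_units(tax_turnover, tax_to_legal, legal_to_statistical):
    """Sum tax-unit values per statistical unit; collect units that cannot be linked."""
    totals = defaultdict(float)
    unlinked = []
    for tax_unit, value in tax_turnover.items():
        legal_unit = tax_to_legal.get(tax_unit)
        statistical_unit = legal_to_statistical.get(legal_unit)
        if statistical_unit is None:
            # No path from this tax unit to the frame: report it, do not drop it silently.
            unlinked.append(tax_unit)
        else:
            totals[statistical_unit] += value
    return dict(totals), unlinked


totals, unlinked = aggregate_to_statistical_units(
    tax_turnover, tax_to_legal, legal_to_statistical
)
print(totals)
print(unlinked)
```

Keeping the list of unlinked units makes coverage problems visible, so they can be handled in the later imputation or weighting step.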

Fall back scenarios

Quarterly turnover from survey and admin data
– Risk: only data from months 1 and 2
– Model: missing units predicted from respondents
– Indicator: how many and which units to call
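One simple way to predict missing units from respondents is a ratio model: apply the growth observed among responding units to the previous value of each missing unit. The sketch below assumes that model, which is only one possible choice and is not stated on the slide. The unit names, the numbers and the function name are hypothetical.

```python
def predict_missing(previous, current):
    """Predict current values of missing units from the growth of respondents.

    previous: {unit: value in the previous quarter} for all units
    current:  {unit: value in the current quarter} for responding units only
    """
    respondents = [u for u in current if u in previous]
    previous_total = sum(previous[u] for u in respondents)
    if previous_total == 0:
        raise ValueError("respondents have no previous turnover to base a ratio on")
    growth = sum(current[u] for u in respondents) / previous_total
    predicted = {u: previous[u] * growth for u in previous if u not in current}
    return growth, predicted


# Hypothetical quarterly turnover, x 1000 EUR.
previous = {"A": 100.0, "B": 200.0, "C": 300.0}
current = {"A": 110.0, "B": 220.0}  # unit C has not reported yet

growth, predicted = predict_missing(previous, current)
observed_total = sum(current.values())
predicted_total = sum(predicted.values())
estimate = observed_total + predicted_total

# Indicator: share of the estimate that rests on predictions rather than data.
imputed_share = predicted_total / estimate
print(growth, predicted, estimate, imputed_share)
```

An indicator such as the imputed share can help decide how many and which units to call: the units contributing most to the predicted total are the first candidates.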

Fall back scenarios

• Risk analyses
• Strategy fall back scenario
– Obtain missing data elsewhere?
– Model-based approach
– Inform users
– Postpone publication

Processing: robust estimation

• Medical expenses (volume, prices)
• Coding system for medical treatments
• First coding in 2008
• Coding slightly revised 2009
• New coding system 2010

Fitness for use

Dimension | Description
Technical checks | Technical usability of file and data
Accuracy | 1) Closeness to true values, 2) Correctness, reliability
Completeness | Describes the corresponding set of real-world objects and variables
Time-related dimension | Time and/or stability related
Integrability | Capable of undergoing integration or of being integrated

Data
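The completeness dimension can be made concrete by comparing the units in a source with the units in the population frame. The sketch below shows one such check under simple assumptions: units are matched on a shared identifier, and the measures and names used (undercoverage, overcoverage, coverage rate, `coverage`) are illustrative choices, not definitions taken from the slides.

```python
def coverage(frame_units, source_units):
    """Compare unit identifiers in an administrative source with the frame."""
    frame, source = set(frame_units), set(source_units)
    return {
        # Frame units that the source does not describe.
        "undercoverage": sorted(frame - source),
        # Source units that do not belong to the frame.
        "overcoverage": sorted(source - frame),
        # Share of frame units found in the source.
        "coverage_rate": len(frame & source) / len(frame) if frame else 0.0,
    }


# Hypothetical identifiers.
frame_units = ["S1", "S2", "S3", "S4"]
source_units = ["S2", "S3", "S5"]

report = coverage(frame_units, source_units)
print(report)
```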

Use

Type of use | Example | Source type
Population frame | Chamber of Commerce data for Business Register | Base register
Source for observations | VAT data for quarterly turnover estimates | Public admin source
Auxiliary data | Internet data to verify the NACE code of enterprises | Organic source
Estimation of model parameters | Energy supplier data for average energy consumption for CPI | Private admin source
Audit quality of statistical data | Social security data to assess quality of employment position based on sampling data | Public admin source

Concluding remarks

• Merits
– Reduction of response burden
– Detailed
– Longitudinal data

• Consequences
– Relations with administrative data holder
– Prone to changes