Combining administrative and survey data: potential benefits and impact on editing and imputation for a structural business survey

UNECE Work Session on Statistical Data Editing, Ljubljana, 9-11 May 2011

De Giorgi V., Luzi O., Oropallo F., Seri G., Siesto G.

ISTAT, Italy


Outline

• Background

• The Istat re-design project for SBS

• Paper contents

• Some results / conclusions


Background

The European Regulation on Structural Business Statistics (SBS) (March 2008) states that “MSs can integrate data available in different information sources, including administrative data, to reduce statistical burden on enterprises and production costs”

The MEETS program (Modernisation of European Enterprises and Trade Statistics) was approved by the European Council and Parliament (Dec 2008) in order to “... promote efficiency within the ESS moving towards a more integrated and efficient statistical system, reviewing the production methods and foster a dynamic approach to achieving cost efficiency”


The Istat re-design project for SBS

Completely revise the SBS production system, moving from the current production model, where admin data are mainly used to supplement statistical survey data, to a new integrated system, where admin data are the core source of SBS information and direct surveys supply complementary information on specific sub-populations and/or economic variables. The aim is to save costs and reduce the statistical burden on enterprises


The Italian economic system is characterized by a large number of small and medium enterprises (SMEs), defined here as enterprises with fewer than 100 employees. SMEs:

• are ~95% of Italian enterprises

• account for ~47% of total employees, ~27% of total turnover and ~33% of total value added


The Istat re-design project for SBS

Administrative sources explored for the economic variables of SMEs

Italian Statistical Business Register (BR): results from the integration of data from statistical and administrative sources (Tax Register, Social Security Register, Register of the Electric Power Board, etc.)

Financial Statements (FS): cover corporate enterprises and are collected by the Chambers of Commerce (~650,000 companies, covering less than 20% of the BR in terms of enterprises and ~57% in terms of persons employed)

Sector Studies (SS) Survey: carried out by the Italian Fiscal Authority to collect detailed cost and income items from enterprises (~4 million enterprises with a turnover between 30,000 and 7.5 million euro)


Potential data/process flow in the new integrated SME production process


[Flow diagram. Boxes, in reading order: input sources FS (Chambers of Commerce), SS, other fiscal sources and BR; Electronic data acquisition (Portal + XBRL); Data validation and imputation / Data integration; Data frame; Sample survey; Data validation and imputation; Data integration; Data validation and imputation; Data warehouse]


Paper contents

Illustrate preliminary exploratory analyses and experimental results on FS and SS aiming to:


• Assess the usability and quality of available administrative sources for estimating key variables for the SME population

• Obtain information on potential E&I costs/benefits deriving from integrating administrative and survey data on SMEs


Analysed quality dimensions of admin sources

• Coverage
    ◦ Covered SME population
    ◦ Potential impact on estimates due to the direct replacement of statistical data with admin data

• Accuracy (at elementary and distributional level)

• Completeness
    ◦ Available consistent variables (in terms of definitions) in the source
    ◦ Potential impact on estimates due to imputation of item non-responses

Priorities can be assigned to sources based on these criteria (e.g. in case of overlapping units/variables in different sources)
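The idea of resolving overlaps by source priority can be illustrated with a minimal sketch. The priority order below (FS, then SS, then the survey), the function name and the example figures are assumptions made for illustration only; the slides do not state which source should rank first.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical priority order: earlier sources are preferred when several
# sources report the same variable for the same enterprise.
SOURCE_PRIORITY = ("FS", "SS", "SURVEY")


def pick_value(
    values_by_source: Dict[str, Optional[float]],
    priority: Sequence[str] = SOURCE_PRIORITY,
) -> Tuple[Optional[str], Optional[float]]:
    """Return (source, value) from the highest-priority source that
    reports a value, or (None, None) if no source reports one."""
    for source in priority:
        value = values_by_source.get(source)
        if value is not None:
            return source, value
    return None, None


# Made-up example: turnover is missing in FS but present in SS and the survey.
turnover = {"FS": None, "SS": 120_000.0, "SURVEY": 118_500.0}
print(pick_value(turnover))  # ('SS', 120000.0)
```

In practice the ranking would be derived from the coverage, accuracy and completeness results for each source and variable, and could differ from one variable to another.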


SME target population coverage (percent) of FS and SS, in terms of number of enterprises (ENT) and number of employees (EMP) by economic activity (Year 2007)


Economic activity | FS ENT | FS EMP | SDS-F ENT | SDS-F EMP | SDS-G ENT | SDS-G EMP | TOTAL ENT | TOTAL EMP
--- | --- | --- | --- | --- | --- | --- | --- | ---
C-Mining and quarrying | 49.9 | 69.8 | 39.5 | 23.9 | 0.1 | 0.0 | 89.5 | 93.8
D-Manufacturing | 22.5 | 54.5 | 64.8 | 37.9 | 0.0 | 0.0 | 87.3 | 92.4
E-Electricity, gas and water supply | 57.5 | 81.8 | 2.1 | 0.6 | 0.1 | 0.0 | 59.7 | 82.4
F-Construction | 14.3 | 33.4 | 72.9 | 56.5 | 0.1 | 0.0 | 87.2 | 89.9
G-Wholesale and retail trade; repair of motor vehicles, motorcycles and personal and household goods | 11.1 | 30.7 | 77.0 | 60.1 | 0.1 | 0.0 | 88.2 | 90.9
H-Hotels and restaurants | 10.8 | 24.5 | 75.1 | 66.1 | 0.0 | 0.0 | 85.9 | 90.7
I-Transport, storage and communication | 16.3 | 47.6 | 68.7 | 39.9 | 1.1 | 0.3 | 86.1 | 87.8
J-Financial intermediation | 6.1 | 13.9 | 72.8 | 68.0 | 6.3 | 4.3 | 85.2 | 86.3
K-Real estate, renting and business activities | 13.9 | 31.5 | 23.9 | 22.8 | 49.9 | 34.6 | 87.7 | 88.9
M-Education | 19.2 | 45.6 | 22.6 | 15.9 | 1.5 | 0.5 | 43.3 | 62.0
N-Health and social work | 4.6 | 31.1 | 2.9 | 4.2 | 81.9 | 55.3 | 89.4 | 90.7
O-Other community, social and personal service activities | 7.8 | 26.2 | 64.5 | 53.9 | 3.7 | 1.8 | 76.0 | 81.9
TOTAL | 13.2 | 37.0 | 56.6 | 45.1 | 17.0 | 7.8 | 86.8 | 90.0
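In the coverage figures, the TOTAL columns appear to equal the sum of the FS, SDS-F and SDS-G columns up to rounding, which suggests that each enterprise is counted under one source only. This is an inference from the numbers, not something the slides state. The short check below, on a few rows copied from the table, computes the gap between the summed sources and the reported total, and the share of the SME population left uncovered. The function names are illustrative.

```python
# Coverage of the SME population by number of enterprises (percent, 2007),
# copied from the table as (FS, SDS-F, SDS-G, TOTAL).
coverage_ent = {
    "D-Manufacturing": (22.5, 64.8, 0.0, 87.3),
    "F-Construction": (14.3, 72.9, 0.1, 87.2),
    "M-Education": (19.2, 22.6, 1.5, 43.3),
    "TOTAL": (13.2, 56.6, 17.0, 86.8),
}


def total_gap(row):
    """Sum of the three source columns minus the reported TOTAL."""
    fs, sds_f, sds_g, total = row
    return round(fs + sds_f + sds_g - total, 1)


def uncovered_share(row):
    """Percent of the SME population not covered by any of the sources."""
    return round(100.0 - row[3], 1)


for activity, row in coverage_ent.items():
    print(activity, "gap:", total_gap(row), "uncovered:", uncovered_share(row))
```

Gaps of at most 0.1 percentage points are consistent with rounding in the published figures. The uncovered share is the part of the population for which direct survey data or estimation would still be needed.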


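Method | NACE | REEI | RMSE
--- | --- | --- | ---
NND (3 digits Nace+Legal form+CS sign) | 17 | 0.050 | 0.060
NND (3 digits Nace+Legal form+CS sign) | 52 | 0.060 | 0.100
NND (3 digits Nace+Legal form+CS sign) | 55 | 0.110 | 0.160
RR (2 digits Nace+Legal form) | 17 | 0.020 | 0.040
RR (2 digits Nace+Legal form) | 52 | 0.030 | 0.030
RR (2 digits Nace+Legal form) | 55 | 0.010 | 0.020
Median (3 digits Nace+Legal form+Size+CS sign) | 17 | 0.024 | 0.032
Median (3 digits Nace+Legal form+Size+CS sign) | 52 | 0.013 | 0.020
Median (3 digits Nace+Legal form+Size+CS sign) | 55 | 0.001 | 0.003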

Variable “Changes in stocks of finished products and work in progress”: potential impact of missing data imputation by domain (2 digits Nace’s code) and imputation method (Year 2007)
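As an illustration of the simplest of the compared methods, the sketch below imputes a missing value with the median of the observed values in the same imputation class. It assumes that classes are defined by crossing categorical keys such as NACE code and legal form. The field names (`nace`, `legal_form`, `csfp`) and the data are made up, and the slides do not describe the actual implementation.

```python
from collections import defaultdict
from statistics import median


def impute_class_median(records, class_keys, target):
    """Return copies of `records` in which a missing (None) `target` value
    is replaced by the median of the observed values in the same class.
    Records in classes with no observed value are left missing."""
    observed = defaultdict(list)
    for rec in records:
        if rec[target] is not None:
            observed[tuple(rec[k] for k in class_keys)].append(rec[target])
    medians = {cls: median(vals) for cls, vals in observed.items()}

    imputed = []
    for rec in records:
        new_rec = dict(rec)
        cls = tuple(rec[k] for k in class_keys)
        if new_rec[target] is None and cls in medians:
            new_rec[target] = medians[cls]
        imputed.append(new_rec)
    return imputed


# Made-up records: three respondents and one non-respondent in one class,
# plus a non-respondent in a class without any observed value.
records = [
    {"nace": "171", "legal_form": "SRL", "csfp": 10.0},
    {"nace": "171", "legal_form": "SRL", "csfp": 14.0},
    {"nace": "171", "legal_form": "SRL", "csfp": 30.0},
    {"nace": "171", "legal_form": "SRL", "csfp": None},
    {"nace": "521", "legal_form": "SPA", "csfp": None},
]
result = impute_class_median(records, ("nace", "legal_form"), "csfp")
print(result[3]["csfp"], result[4]["csfp"])  # 14.0 None
```

A production system would also need a fallback for classes with no observed values, for example collapsing to a coarser class.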

\[
\mathrm{REEI}_{D} = \frac{1}{100} \sum_{k=1}^{100} \frac{\left| \hat{T}^{\,imp}_{Csfp,k,D} - \hat{T}^{\,ori}_{Csfp,D} \right|}{\hat{T}^{\,ori}_{Csfp,D}}
\]

(Mean) Relative Estimation Error due to Imputation/Estimation
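The indicator can be sketched in code under some assumptions. REEI is taken here as the mean, over the replications, of the absolute relative deviation between the total estimated after imputation and the total estimated on the original data for a domain. The RMSE is assumed to be the root mean squared deviation expressed relative to the original total. Neither definition is spelled out in the slides, and the figures in the example are made up.

```python
import math
from typing import Sequence


def reei(imputed_totals: Sequence[float], original_total: float) -> float:
    """Mean absolute relative deviation of the imputed totals (one per
    replication) from the total estimated on the original data."""
    if original_total == 0 or not imputed_totals:
        raise ValueError("need a non-zero original total and at least one replication")
    n = len(imputed_totals)
    return sum(abs(t - original_total) for t in imputed_totals) / (n * abs(original_total))


def relative_rmse(imputed_totals: Sequence[float], original_total: float) -> float:
    """Root mean squared deviation of the imputed totals from the original
    total, divided by the original total (assumed definition)."""
    if original_total == 0 or not imputed_totals:
        raise ValueError("need a non-zero original total and at least one replication")
    n = len(imputed_totals)
    mse = sum((t - original_total) ** 2 for t in imputed_totals) / n
    return math.sqrt(mse) / abs(original_total)


# Made-up example with 4 replications instead of 100.
totals_after_imputation = [1050.0, 950.0, 1020.0, 980.0]
print(round(reei(totals_after_imputation, 1000.0), 3))           # 0.035
print(round(relative_rmse(totals_after_imputation, 1000.0), 4))  # 0.0381
```

Under these definitions the RMSE is never smaller than the REEI for the same set of replications, which matches the pattern in the results table.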


Conclusions

Results obtained so far, partially presented in the paper, confirm the high quality of the analyzed admin sources in terms of accessibility, coverage and accuracy

It is expected that, in the new SME production system, the additional validation activities required by the direct use of admin data in combination with survey data will be more than compensated by the reduction in costs and burden

Additional research is needed, particularly on the sampling strategy and questionnaire design for the new SME sample survey

The preliminary process implementation is planned by the end of 2011
