
Page 1: Harry Goossens  -  ESSnet Coordinator

ESS-net DWH

The statistical data warehouse: a central datahub, integrating new datasources and statistical output

Harry Goossens - ESSnet Coordinator

Head Data Service Centre at Statistics Netherlands

[email protected]

ESSnet on microdata linking and data warehousing in statistical production

UNECE - Seminar on New Frontiers for Data Collection

Geneva, 31 October - 2 November 2012

Page 2: Harry Goossens  -  ESSnet Coordinator

Content

Background ESS-net

Challenges

Explaining the statistical data warehouse (S-DWH)

Elements of the S-DWH

- Business architecture

- GSBPM mapping

Meta data

Page 3: Harry Goossens  -  ESSnet Coordinator

ESSnet on microdata linking and data warehousing in statistical production

Page 4: Harry Goossens  -  ESSnet Coordinator

ESSnet Partnership

ESS-net coordinator: Statistics Netherlands (CBS)

Co-partners: Estonia, Italy, Lithuania, Portugal, Sweden, UK

Starting date: 4 October 2010

SGA 1: first year, till 3 October 2011

SGA 2: last 2 years, till 3 October 2013

Page 5: Harry Goossens  -  ESSnet Coordinator

General Objectives ESSnet DWH

Provide assistance in: the development and implementation of a maximally efficient statistical process for business and trade statistics, independent of any specific (technical) architecture

Results in daily statistical practice:
- increase the efficiency of data processing in statistical production systems
- maximize the reuse of already collected data

a 'data warehouse' approach to statistics

Page 6: Harry Goossens  -  ESSnet Coordinator

The Challenges

Decrease of costs & administrative burden versus increase of efficiency & flexibility

Rapidly changing demand for information:
- growing need for more information on more topics
- decreasing lifecycle of policymakers, quicker delivery

Disclosure of all new data sources coming from the global use of modern technology

Make optimal use of all available data sources (existing & new)

Page 7: Harry Goossens  -  ESSnet Coordinator

The Statistical Data Warehouse

A central ‘statistical data store’ for managing all available data of interest, regardless of its source, enabling the NSI to produce necessary information (= statistics !) and to (re)use available data to create new data / new outputs.

A central data hub to connect and integrate all available data sources, supporting statistical production AND data collection processes by providing:
- a detailed and correct overview of, and insight into, all available data sources
- a framework for adequate data governance, including metadata management, confidentiality aspects and data authorisation
- flexible data storage and data exchange between processes
- access to registers and sampling frames (BR, etc.)

Page 8: Harry Goossens  -  ESSnet Coordinator

[Diagram of the S-DWH data flow. Columns: Input reference frame; Input data; Storage, combination; Outputs. Elements shown: Backbones (e.g. BR), BB snapshots, Selected samples, Admin data sources, Staging area, Working data, Datasets, Microdata, Data extracts, Aggregate statistics. Annotations: Rules for generating samples etc.; Rules for updating BB.]

Page 9: Harry Goossens  -  ESSnet Coordinator

Explaining the S-DWH

A system or set of integrated systems, designed to handle the processing of statistical data in the production of statistics, comprising:
- technical facilities for storing and processing data, receiving data in and producing outputs in a flexible way
- rules for updating the sources for the DWH
- definitions necessary to achieve those samples / sources

The S-DWH is a concept that provides an architectural model of the statistical data flow, from data collection to statistical output

Page 10: Harry Goossens  -  ESSnet Coordinator

The S-DWH Business Architecture

- Conceptualisation of how to build up a S-DWH
- A common model for the total statistical process and data flow
- Provide optimal organisation of all structured data, enabling re-use, creation of new data etc.
- 4 Layers, covering all statistical activities
  ‒ Sources
  ‒ Integration
  ‒ Interpretation & Analysis
  ‒ Data Access / Output
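The four layers can be pictured as an ordered sequence that a dataset moves through. The following is only a minimal sketch under stated assumptions: the names `Layer`, `Dataset` and `promote` are hypothetical, and the rule that data advances exactly one layer at a time is an illustrative simplification, not something the slides prescribe.

```python
from dataclasses import dataclass
from enum import IntEnum


class Layer(IntEnum):
    """The four S-DWH layers named on the slide, in data-flow order."""
    SOURCES = 1
    INTEGRATION = 2
    INTERPRETATION_ANALYSIS = 3
    DATA_ACCESS = 4


@dataclass(frozen=True)
class Dataset:
    """Hypothetical record: a dataset and the layer it currently sits in."""
    name: str
    layer: Layer = Layer.SOURCES


def promote(ds: Dataset) -> Dataset:
    """Return a copy of the dataset placed in the next layer.

    ASSUMPTION: strict one-step promotion; a real S-DWH may allow other flows.
    """
    if ds.layer == Layer.DATA_ACCESS:
        raise ValueError(f"{ds.name!r} is already in the last layer")
    return Dataset(ds.name, Layer(ds.layer + 1))
```

Because `Dataset` is immutable, each promotion yields a new record, which leaves the earlier state available for re-use in the sense described on the slide.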

Page 11: Harry Goossens  -  ESSnet Coordinator

[Figure: The layered architecture of the S-DWH, with focus on the data sources used in each layer]

Page 12: Harry Goossens  -  ESSnet Coordinator

Mapping the S-DWH on the GSBPM

Use the GSBPM as common language to identify and locate the various phases on the 4 S-DWH layers
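As a rough illustration of such a mapping, the sketch below lists the nine GSBPM v4.0 phases and assigns some of them to a layer. The phase names follow GSBPM v4.0; the phase-to-layer assignment and the function name `locate_phase` are assumptions made for illustration only, since the slide does not give the actual mapping. Phases without an obvious single layer are deliberately left unmapped.

```python
from typing import Optional

# GSBPM v4.0 phase numbers and names.
GSBPM_PHASES = {
    1: "Specify Needs",
    2: "Design",
    3: "Build",
    4: "Collect",
    5: "Process",
    6: "Analyse",
    7: "Disseminate",
    8: "Archive",
    9: "Evaluate",
}

# ASSUMPTION: illustrative assignment, not taken from the slides.
PHASE_TO_LAYER = {
    4: "Sources",
    5: "Integration",
    6: "Interpretation & Analysis",
    7: "Data Access / Output",
}


def locate_phase(phase: int) -> Optional[str]:
    """Return the layer assumed to host a GSBPM phase, or None if unmapped."""
    if phase not in GSBPM_PHASES:
        raise ValueError(f"unknown GSBPM phase: {phase}")
    return PHASE_TO_LAYER.get(phase)
```

Keeping the mapping in a plain dictionary makes it easy to replace the assumed assignment with the one an organisation actually agrees on.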

Page 13: Harry Goossens  -  ESSnet Coordinator

Managing the S-DWH

The S-DWH is a logically coherent central data store, not necessarily one single physical unit.

Metadata is vital in the governance, satisfying 2 essential needs:
- to guide statisticians in processing and controlling the statistical data
- to inform users by giving insight into the exact meaning of the statistical data

The vertical metadata layer makes it possible to search all (meta)data in the 4 layers and, if permitted, gives access to the data.
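The "search everything, access only if permitted" behaviour of the metadata layer could look roughly like the sketch below. All names here (`MetadataEntry`, `search`, `open_dataset`, the role strings) are hypothetical, and role-based permission is just one possible way to model data authorisation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataEntry:
    """Hypothetical catalogue record describing one dataset in some layer."""
    dataset_id: str
    layer: str
    description: str
    allowed_roles: frozenset = frozenset()


def search(catalogue, term):
    """Search descriptions across all layers; metadata is visible to everyone."""
    term = term.lower()
    return [e for e in catalogue if term in e.description.lower()]


def open_dataset(entry, role, store):
    """Return the data itself only if the role is authorised for this entry."""
    if role not in entry.allowed_roles:
        raise PermissionError(f"role {role!r} may not read {entry.dataset_id!r}")
    return store[entry.dataset_id]
```

Separating `search` from `open_dataset` mirrors the slide: finding out that data exists is independent of being allowed to read it.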

Page 14: Harry Goossens  -  ESSnet Coordinator


Meta data layer

Source Layer

Integration Layer

Interpretation and Data Analysis Layer

Data Access Layer

Page 15: Harry Goossens  -  ESSnet Coordinator

Meta data - the DNA of the S-DWH

Framework:

General meta data definitions

Meta data for the S-DWH

Use of meta data models

Meta data standards & norms

Meta data quality & governance

Categories & subsets

Minimum requirements

Page 16: Harry Goossens  -  ESSnet Coordinator

S-DWH meta data requirements

Subsets Standards & Norms

ISO 11179

Internal rules

Guidelines

Meta data model S-DWH Gatekeeper
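One possible reading of the gatekeeper role is a check that incoming data carries a minimum set of metadata before it is admitted to the S-DWH. The sketch below assumes that reading; the field list in `REQUIRED_FIELDS` and the function names are hypothetical and are not drawn from the slides or from ISO 11179.

```python
# ASSUMPTION: hypothetical minimum metadata fields, chosen for illustration.
REQUIRED_FIELDS = ("name", "definition", "unit", "source", "reference_period")


def missing_metadata(record: dict) -> list:
    """Return the required fields that are absent or blank in the record."""
    missing = []
    for field_name in REQUIRED_FIELDS:
        value = record.get(field_name)
        if value is None or not str(value).strip():
            missing.append(field_name)
    return missing


def gatekeeper_accepts(record: dict) -> bool:
    """Admit a record only when no required metadata field is missing."""
    return not missing_metadata(record)
```

Returning the list of missing fields, rather than only a yes/no answer, lets the supplier of the data see what has to be completed.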

Page 17: Harry Goossens  -  ESSnet Coordinator

Organisational aspects

Implementation of a S-DWH has huge organisational impact:

It means: moving from single operations to integrated, generic processes

It needs: a redesign of the statistical process

It requires: new IT systems, tools, high investments

It is: a new way of working

Only changing systems will not do the trick; changing people is the key to success

Page 18: Harry Goossens  -  ESSnet Coordinator


Thank you !

ESSnet on data warehousing