Harry Goossens - ESSnet Coordinator
ESSnet on microdata linking and data warehousing in statistical production. The statistical data warehouse: a central datahub, integrating new datasources and statistical output. Harry Goossens, ESSnet Coordinator, Head Data Service Centre at Statistics Netherlands.
ESS-net DWH
The statistical data warehouse: a central datahub, integrating new datasources and statistical output
Harry Goossens - ESSnet Coordinator
Head Data Service Centre at Statistics Netherlands
ESSnet on microdata linking and data warehousing in statistical production
UNECE - Seminar on New Frontiers for Data Collection
Geneva, 31 October - 2 November 2012
Content
Background ESS-net
Challenges
Explaining the statistical data warehouse (S-DWH)
Elements of the S-DWH
- Business architecture
- GSBPM mapping
Meta data
ESSnet Partnership
ESS-net coordinator: Statistics Netherlands (CBS)
Co-partners: Estonia, Italy, Lithuania, Portugal, Sweden, UK
Starting date: 4 October 2010
SGA 1: first year, until 3 October 2011
SGA 2: last 2 years, until 3 October 2013
General Objectives ESSnet DWH
Provide assistance in: the development and implementation of a maximally efficient statistical process for business and trade statistics, independent of any specific (technical) architecture
Results in daily statistical practice:
- increase the efficiency of data processing in statistical production systems
- maximize the reuse of already collected data
- a 'data warehouse' approach to statistics
The Challenges
Decrease of costs & administrative burden versus increase of efficiency & flexibility
Rapidly changing demand for information:
- growing need for more information on more topics
- decreasing lifecycle of policymakers, quicker delivery
Disclosure of all new data sources coming from global use of modern technology
Make optimal use of all available data sources (existing & new)
The Statistical Data Warehouse
A central 'statistical data store' for managing all available data of interest, regardless of its source, enabling the NSI to produce necessary information (= statistics!) and to (re)use available data to create new data / new outputs.
A central data hub to connect and integrate all available data sources, supporting statistical production AND data collection processes by providing:
- a detailed and correct overview of, and insight into, all available data sources
- a framework for adequate data governance, including metadata management, confidentiality aspects and data authorisation
- flexible data storage and data exchange between processes
- access to registers and sampling frames (BR, etc.)
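The idea of connecting sources through a register can be illustrated with a minimal sketch. It assumes that every source identifies units by the same business register ID; the identifiers, field names and the `link_sources` helper are hypothetical and not taken from the ESSnet material.

```python
# Hypothetical records: IDs and variables are illustrative assumptions.
business_register = {
    "NL001": {"name": "Alpha BV", "nace": "C10"},
    "NL002": {"name": "Beta NV", "nace": "G47"},
    "NL003": {"name": "Gamma VOF", "nace": "C10"},
}
admin_vat = {"NL001": {"turnover": 1200}, "NL002": {"turnover": 800}}
survey = {"NL002": {"employees": 15}, "NL003": {"employees": 4}}


def link_sources(backbone, *sources):
    """Attach each source's variables to the backbone unit with the same ID."""
    linked = {}
    for unit_id, frame_info in backbone.items():
        record = dict(frame_info)  # copy, so the register itself is not modified
        for source in sources:
            record.update(source.get(unit_id, {}))
        linked[unit_id] = record
    return linked


linked = link_sources(business_register, admin_vat, survey)
```

Units missing from a source simply lack that source's variables, which keeps the register as the complete frame and makes gaps in coverage visible.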
[Figure: S-DWH data flow diagram. Labels: Input reference frame, Input data, Outputs; Backbones (e.g. BR), BB snapshots, Selected sample, Admin data source, Staging area, Working data, Storage / combination, Dataset, Data extracts, Microdata, Aggregate Statistics; Rules for generating samples etc.; Rules for updating BB.]
Explaining the S-DWH
A system or set of integrated systems, designed to handle the processing of statistical data in the production of statistics, comprising:
- technical facilities for storing and processing data, receiving data and producing outputs in a flexible way
- rules for updating the sources for the DWH
- definitions necessary to achieve those samples / sources
The S-DWH is a concept that provides an architectural model of the statistical data flow, from data collection to statistical output.
The S-DWH Business Architecture
Conceptualisation of how to build up a S-DWH
A common model for the total statistical process and data flow
Provide optimal organisation of all structured data, enabling re-use, creation of new data etc.
4 Layers, covering all statistical activities
- Sources
- Integration
- Interpretation & Analysis
- Data Access / Output
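The four layers can be written down as an ordered type. The sketch below assumes that data moves through the layers strictly in the order listed; the names `Layer`, `next_layer` and `path_to_output` are hypothetical.

```python
from enum import IntEnum


class Layer(IntEnum):
    """The four S-DWH layers, numbered in the order the slide lists them."""
    SOURCES = 1
    INTEGRATION = 2
    INTERPRETATION_ANALYSIS = 3
    DATA_ACCESS = 4


def next_layer(layer):
    """Return the layer that follows `layer`, or None after the last one."""
    return Layer(layer + 1) if layer < Layer.DATA_ACCESS else None


def path_to_output(start):
    """List the layers a dataset passes through from `start` to output."""
    path = [start]
    while (following := next_layer(path[-1])) is not None:
        path.append(following)
    return path
```

For example, `path_to_output(Layer.INTEGRATION)` lists the integration, interpretation and analysis, and data access layers in that order.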
The layered architecture of the S-DWH, with focus on the data sources used in each layer
Mapping the S-DWH on the GSBPM
Use the GSBPM as common language to identify and locate the various phases on the 4 S-DWH layers
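A mapping of this kind can be held in a simple lookup table. The mapping shown on the slide is not part of this transcript, so the assignment below (one GSBPM phase per layer) is an illustrative assumption, and `GSBPM_TO_LAYER`, `layer_for` and `phases_by_layer` are hypothetical names.

```python
# Illustrative assumption only: one GSBPM phase per S-DWH layer.
# The mapping actually presented is not recoverable from the transcript.
GSBPM_TO_LAYER = {
    "Collect": "Sources",
    "Process": "Integration",
    "Analyse": "Interpretation & Analysis",
    "Disseminate": "Data Access / Output",
}


def layer_for(phase):
    """Return the S-DWH layer assumed for a GSBPM phase, or None if unmapped."""
    return GSBPM_TO_LAYER.get(phase)


def phases_by_layer(mapping):
    """Invert the mapping: layer name -> list of GSBPM phases located on it."""
    grouped = {}
    for phase, layer in mapping.items():
        grouped.setdefault(layer, []).append(phase)
    return grouped
```

Keeping the table as data rather than code makes it easy to replace the assumed entries with the agreed mapping once it is known.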
Managing the S-DWH
The S-DWH is a logically coherent central data store, not necessarily one single physical unit.
Metadata is vital in the governance, satisfying 2 essential needs:
- to guide statisticians in processing and controlling the statistical data
- to inform users by giving insight into the exact meaning of the statistical data
The vertical metadata layer makes it possible to search all (meta)data in the 4 layers and, if permitted, gives access to the data.
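The role of the vertical metadata layer, searchable across all layers with data access granted only if permitted, can be sketched as a small catalogue. The entries, roles and the `search` function are hypothetical, and the sketch assumes that authorisation is a simple per-dataset set of roles.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    """Metadata describing one dataset held in some S-DWH layer."""
    name: str
    layer: str
    description: str
    allowed_roles: set = field(default_factory=set)


# Hypothetical catalogue content, for illustration only.
CATALOGUE = [
    CatalogueEntry("vat_raw", "Sources", "VAT turnover as delivered", {"integrator"}),
    CatalogueEntry("turnover_linked", "Integration",
                   "turnover linked to the business register", {"integrator", "analyst"}),
    CatalogueEntry("turnover_index", "Data Access",
                   "published turnover index", {"analyst", "public"}),
]


def search(term, role, catalogue=CATALOGUE):
    """Find entries in any layer whose name or description mentions `term`.

    Every match is reported; `can_access` says whether `role` may open the data.
    """
    term = term.lower()
    return [
        {"name": e.name, "layer": e.layer, "can_access": role in e.allowed_roles}
        for e in catalogue
        if term in e.name.lower() or term in e.description.lower()
    ]
```

Searching returns metadata for every match regardless of role, so a user can discover that data exists even when access to the data itself has to be requested.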
Meta data layer
Source Layer
Integration Layer
Interpretation and Data Analysis Layer
Data Access Layer
Meta data - the DNA of the S-DWH
Framework:
General meta data definitions
Meta data for the S-DWH
Use of meta data models
Meta data standards & norms
Meta data quality & governance
Categories & subsets
Minimum requirements
S-DWH meta data requirements
Subsets Standards & Norms
ISO 11179
Internal rules
Guidelines
Meta data model S-DWH Gatekeeper
Organisational aspects
Implementation of a S-DWH has huge organisational impact:
It means: moving from single operations to integrated, generic processes
It needs: a redesign of the statistical process
It asks: new IT systems, tools, high investments
It is: a new way of working
Only changing systems will not do the trick; changing people is the key to success
Thank you!
ESSnet on data warehousing