Post on 18-Jan-2016

Eurostat

November 2015
Eurostat Unit B3 – IT and standards for data and metadata exchange

Jean-Francois LEBLANC
Christian SEBASTIAN

SDMX IT Tools
Introduction

Table of contents

1. Where are we?
2. Standardization
3. Why do we need a model?
4. GSBPM Generic Statistical Business Process Model
   1. Phases
   2. Key features
   3. Other uses
5. Standards – Relations
6. GSIM Generic Statistical Information Model
7. SDMX & DDI
8. SDMX
   1. Why?
   2. Benefits
   3. Costs
   4. Opportunities
   5. Impacts
   6. From 1.0 to 2.1
   7. The SDMX components
   8. SDMX in practice
9. Summary

1. Where are we?

• Dramatic changes in the environment of official statistics producers (e.g. data deluge)

• Modernization of statistical information systems seen as a question of survival for the sector of official statistics

• Standardization viewed as a key enabler for modernization

• "Standards-based" industrialization of statistical production

2. Standardization

• Why is it necessary?
  • Harmonization
  • Reusability and interoperability
  • Shared solutions across statistical institutes

• What does it imply?
  • Common processes
  • Common tools
  • Common methodologies

2. Standardization

• Industry standards
  • GSBPM - Generic Statistical Business Process Model
  • GSIM - Generic Statistical Information Model
  • SDMX - Statistical Data and Metadata eXchange
  • DDI - Data Documentation Initiative

• Other major standards
  • RDF - Resource Description Framework
  • LOD - Linked Open Data
  • JSON - JavaScript Object Notation
  • XBRL - eXtensible Business Reporting Language

3. Why do we need a model?

• To define and describe statistical processes in a coherent way
• To standardize process terminology
• To compare and benchmark processes within and between organisations
• To identify synergies between processes
• To inform decisions on systems architectures and organisation of resources

4. GSBPM Generic Statistical Business Process Model

• Applicable to all activities undertaken by producers of official statistics that result in data outputs

• Used by national and international statistical organisations

• Independent of data source; can be used for:
  • Surveys / censuses
  • Administrative sources / register-based statistics
  • Mixed sources

4.1 GSBPM - Phases

4.2 GSBPM – Key features

Not a linear model
• Sub-processes do not have to be followed in a strict order
• It is a matrix with many possible paths, including iterative loops within and between phases
• Some iterations of a regular process may skip certain sub-processes
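
The non-linear character described above can be illustrated with a minimal Python sketch. It treats one run of a statistical process as a free-form list of phases and reports which phases were repeated (iterative loops) and which were skipped. The phase names are those of GSBPM version 5.0 and are an assumption here, since the phases figure is not reproduced in this transcript; `describe_path` and the sample run are hypothetical.

```python
from collections import Counter

# Phase names as in GSBPM version 5.0 (assumed; not taken from the slides).
PHASES = (
    "Specify Needs", "Design", "Build", "Collect",
    "Process", "Analyse", "Disseminate", "Evaluate",
)


def describe_path(path):
    """Summarise one run of a process given as a list of phase names.

    Any order is accepted: phases may repeat (loops) or be absent (skips).
    Only names that are not phases at all are rejected.
    """
    unknown = [step for step in path if step not in PHASES]
    if unknown:
        raise ValueError(f"unknown phases: {unknown}")
    counts = Counter(path)
    return {
        "repeated": [p for p in PHASES if counts[p] > 1],
        "skipped": [p for p in PHASES if counts[p] == 0],
    }


# Hypothetical iteration of a regular process: no redesign, one extra
# Process/Analyse loop before dissemination.
run = ["Collect", "Process", "Analyse", "Process", "Analyse", "Disseminate"]
summary = describe_path(run)
```

In this sketch a regular production round that loops between processing and analysis is valid, which is the point of the slide: the model constrains vocabulary, not sequence.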

4.3 GSBPM – Other uses

• Harmonizing statistical computing systems
• Facilitating sharing of statistical software
• Framework for process quality management
• Structure for storage of documents
• Measuring operational costs

5. Standards - Relations

[Diagram: statistics production, from the conceptual level (GSBPM, GSIM) to the practical level (technology, methods; SDMX, DDI, RDF, ISO-11179, …). Labels: information concepts, statistical concepts, statistical how-to, production how-to.]

6. GSIM Generic Statistical Information Model

[Diagram: GSIM as the conceptual model; SDMX, DDI and other standards as implementation standards.]

7. SDMX & DDI

• DDI offers a very rich model for the documentation of micro-data

• SDMX offers a very integrated exchange platform for statistical outputs (IT architectures, tools, web services)

The combined use of both standards could allow a higher level of integration of the complete production process

8. SDMX Statistical Data and Metadata eXchange

[Logos of the SDMX sponsoring organisations, including the World Bank and UNSD]

8.1 SDMX – Why?

• The exchange of statistical data and metadata is complex, resource intensive and expensive

• In the past, national and international organisations had developed specific approaches and solutions

• Opportunities and challenges related to new technologies for machine-to-machine exchange were emerging, e.g. XML and web services.

SDMX is the global answer to this.

8.2 SDMX - Benefits

• Efficiency
• Reduced burden after low investment
• Consistent and comparable data and metadata messages produced by different organizations
• Harmonized statistical processes, offering new ways of data and metadata exchange (such as data hubs)
• Web-based dissemination formats are provided that are computer “readable” and easier to update.

8.3 SDMX - Costs

• Development/maintenance of the SDMX standards and guidelines done by the international sponsoring institutions (supported by NSIs)

• Standards are public and open source

• IT tools are created by sponsoring or other organizations and made freely available

• Capacity building by individual sponsoring institutions

• User community input by means of open process

• Low investment cost – gradual implementation

8.4 SDMX - Opportunities

• Across domains
• Across organizations

Simplification
• Streamline data flows
• Central management (SDMX Registry)

Standardization
• Software tools
• Data sharing
• Data structures

Harmonization
• Concepts
• Code lists

8.5 SDMX - Impacts

• Reduced reporting burden via common formats adopted by international organizations for data and metadata exchange

• User-friendly access when publishing national data and metadata on the web via global standards for data formats, catalogs/registries and associated services

• Improved management and analysis of data via global guidelines for metadata vocabularies and repositories in common formats

• Replicable models and tools for statistical information systems at national levels

8.6 SDMX – From 1.0 to 2.1

• September 2004: Version 1.0
• November 2005: Version 2.0
• February 2008: SDMX accepted at UN level; SDMX recognised and supported as the preferred standard
• April 2011: Version 2.1

Formats and components shown on the timeline: GESMES/TS, SDMX-EDI, SDMX-ML, SDMX Registry

8.7 The SDMX Components

• Describe statistics in a standard way: objects and their relationships
  • Data Structure Definition (DSD), Concepts, Code List
• Central management and standard access
  • SDMX Registry, SDMX Web Services
• Cross Domain Concepts, Cross Domain Code Lists, Statistical Domains, Metadata Common Vocabulary

Exchange modes
• Push: provider generates and sends file to receiver
• Pull: provider opens web service to data; receiver downloads regularly
• Hub: special case of pull; receiver downloads on end user request
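
To make the components above more concrete, here is a minimal Python sketch of how a Data Structure Definition ties concepts to code lists, and how a receiver in pull mode might address one series. This is an illustration under stated assumptions, not an SDMX implementation: the class names, the code lists `CL_FREQ` and `CL_AREA`, the structure `DSD_DEMO` and the flow id `DF_DEMO` are hypothetical, and the dot-separated key and `data/{flow}/{key}` path are modelled on the SDMX RESTful query pattern.

```python
from dataclasses import dataclass


@dataclass
class CodeList:
    """A named set of allowed codes."""
    id: str
    codes: dict  # code -> human-readable label


@dataclass
class Dimension:
    """A concept used as a dimension, constrained by a code list."""
    concept_id: str
    code_list: CodeList


@dataclass
class DataStructureDefinition:
    """Ordered dimensions that together identify one series."""
    id: str
    dimensions: list

    def validate_key(self, key):
        """Return a list of problems found in a dot-separated series key."""
        parts = key.split(".")
        if len(parts) != len(self.dimensions):
            return [f"expected {len(self.dimensions)} codes, got {len(parts)}"]
        return [
            f"{dim.concept_id}: unknown code {code!r}"
            for dim, code in zip(self.dimensions, parts)
            if code not in dim.code_list.codes
        ]


def data_query_path(flow_id, key):
    """Relative path a pull-mode receiver would request (assumed layout)."""
    return f"data/{flow_id}/{key}"


# Hypothetical code lists and structure, for illustration only.
CL_FREQ = CodeList("CL_FREQ", {"A": "Annual", "M": "Monthly"})
CL_AREA = CodeList("CL_AREA", {"DE": "Germany", "FR": "France"})
DSD_DEMO = DataStructureDefinition(
    "DSD_DEMO",
    [Dimension("FREQ", CL_FREQ), Dimension("REF_AREA", CL_AREA)],
)

problems = DSD_DEMO.validate_key("M.DE")       # valid key, no problems
path = data_query_path("DF_DEMO", "M.DE")      # what a receiver would pull
```

Because provider and receiver share the same structure and code lists, a key such as `M.DE` can be checked before any data is exchanged; in a real setup those shared definitions would come from a registry rather than being written inline.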

9. Summary

• To enable a modernized statistical production, standards are the key

• Standards at different levels are being used in an increasingly coherent way

• GSBPM and GSIM provide conceptual models and facilitate communication

• SDMX, DDI and other standards provide implementation models which can be used in a coordinated way

• There are now more technologies than just GESMES and XML: a coherent overall model is critical
