Describing Statistical registers in SDMX and DDI: A Comparison Arofan Gregory Metadata Technology Eurostat, June 4-6, 2013 Luxembourg


Describing Statistical registers in SDMX and DDI: A Comparison

Arofan Gregory
Metadata Technology

Eurostat, June 4-6, 2013
Luxembourg


Overview

• Introduction
• Technical Approach
• The SDMX Example
• The DDI Example
• Comparison
• Conclusions


Introduction

• A document is currently being prepared by Eurostat showing how statistical registers can be marked up in SDMX and also in DDI
– This is a practical work, designed to illustrate what each approach looks like
– The registers used are the Banca d’Italia Debt Securities Register and the EuroGroup Register (EGR) at Eurostat


Technical Approach

• A proposal was made about how DDI and SDMX could be used to describe the same data so that the different formats could be transformed losslessly from SDMX to DDI and back
– This came out of the UN/ECE SDMX-DDI Dialogue effort

• This approach uses a specific style of SDMX modelling

• It also uses a proposed subset of the overall elements of DDI


Technical Approach (Continued)

• The Banca d’Italia and Eurostat are both able to express their register data using SDMX

• Other registers have been documented with DDI

• The approach used here emphasized the interchange between the standards
– This may not be as optimized a use of SDMX as that of the Banca d’Italia
– It also differs from work within Eurostat modelling the EGR data with SDMX


The SDMX Example

• The EGR data set is a series of transactions which occur irregularly over time

• A large set of attributes makes up the data that must be reported
– About the financial institution
– About the debt products being registered

• At any point in time, the existing register can be understood as a rectangular data set


The SDMX Example (Continued)

Columns are SDMX measures in a “measure dimension”

Rows are individual transactions (product registrations)


Dimensions

• The approach to dimensionalizing the table is:
– The first column in the table identifies transactions (it has an “Identity” type)
– The Measures are a measure dimension
– Time would be added as a dimension

• The fact that some of the attributes could also be defined as dimensions in SDMX is ignored (this is a poor optimization)
– All attributes in the register are defined as measures

• This approach will work for almost any rectangular data file
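The dimensionalizing rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names, values, and the `dimensionalize` helper are hypothetical, and the result is a plain dictionary rather than real SDMX-ML. Each cell of the rectangular table becomes one observation keyed by the transaction identity, the measure (the column name), and a time period.

```python
from typing import Dict, List, Tuple

# Hypothetical register extract: one row per transaction; the first
# column identifies the transaction. Names and values are invented.
REGISTER_ROWS: List[Dict[str, str]] = [
    {"TRANSACTION_ID": "T001", "ISSUER": "BANK_A", "CURRENCY": "EUR", "AMOUNT": "1000"},
    {"TRANSACTION_ID": "T002", "ISSUER": "BANK_B", "CURRENCY": "USD", "AMOUNT": "250"},
]

# An observation key is (identity, measure, time period).
ObservationKey = Tuple[str, str, str]


def dimensionalize(
    rows: List[Dict[str, str]], id_column: str, time_period: str
) -> Dict[ObservationKey, str]:
    """Turn every non-identifier cell into one keyed observation."""
    observations: Dict[ObservationKey, str] = {}
    for row in rows:
        unit_id = row[id_column]
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is part of the key, not a measure
            observations[(unit_id, column, time_period)] = value
    return observations


observations = dimensionalize(REGISTER_ROWS, "TRANSACTION_ID", "2013-06")
```

Because every column other than the identifier is treated as a measure, the same function applies unchanged to any rectangular file, which is the point the last bullet makes.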


Supplemental Metadata

• There is some additional metadata we may wish to have about any given register

• This can be expressed as SDMX Reference Metadata


The Whole Package

• To fully describe the registers, we would use
– An SDMX DSD
– An SDMX Reference Metadata Structure
– An SDMX DataSet
– An SDMX Reference Metadata Report


The DDI Example

• The DDI example does not provide a dimensionalized description of the register data
– It uses the standard description for describing unit-record data
– This looks a lot like the SDMX dimensionalized description
– The first column has a transaction identifier (each row is a case)
– The columns are the “variables” in the data set

• As for SDMX, the codelists (in DDI, “Codes” and “Categories”) are described

• The dataset is not encoded in XML, but is an ASCII file
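As a rough illustration of the unit-record view, the sketch below pairs a list of variable descriptions (with codes and category labels) with a separate ASCII data file. Everything here is an assumption made for the example: the variable names, the codes, the comma-delimited layout, and the `read_cases` helper are hypothetical, and the Python dictionaries merely stand in for what a DDI XML instance would carry.

```python
import csv
import io
from typing import Dict, List, Optional

# Hypothetical variable descriptions, in column order. "codes" maps each
# code to its category label; None means the variable is not coded.
VARIABLES: List[Dict[str, object]] = [
    {"name": "TRANSACTION_ID", "label": "Transaction identifier", "codes": None},
    {"name": "CURRENCY", "label": "Currency of issue",
     "codes": {"EUR": "Euro", "USD": "US dollar"}},
    {"name": "AMOUNT", "label": "Amount issued", "codes": None},
]

# Hypothetical ASCII data file, assumed here to be comma-delimited.
ASCII_DATA = "T001,EUR,1000\nT002,USD,250\n"


def read_cases(text: str, variables: List[Dict[str, object]]) -> List[Dict[str, str]]:
    """Read one case per line and check coded values against their code lists."""
    names = [str(v["name"]) for v in variables]
    cases: List[Dict[str, str]] = []
    for fields in csv.reader(io.StringIO(text)):
        if len(fields) != len(names):
            raise ValueError(f"expected {len(names)} fields, got {len(fields)}")
        case = dict(zip(names, fields))
        for variable in variables:
            codes: Optional[dict] = variable["codes"]  # type: ignore[assignment]
            name = str(variable["name"])
            if codes is not None and case[name] not in codes:
                raise ValueError(f"unknown code {case[name]!r} for {name}")
        cases.append(case)
    return cases


cases = read_cases(ASCII_DATA, VARIABLES)
```

The metadata and the data stay in separate artifacts, mirroring the split between the DDI XML instance and the ASCII file.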


The DDI Example (Continued)

• If there were additional information which needed to be exchanged, it would be contained inside the same DDI XML document, using the explicit fields for describing it (methodology, processing information, etc.)


The DDI Package

• A DDI XML instance with all the metadata
• An ASCII file


Comparison and Consideration

• Both techniques – SDMX and DDI – provide an interchangeable way of using the standards to describe the data

• This is a very typical use of DDI – the register data is just another microdata set

• This is an expected use of SDMX: the register is seen as a dimensionalized data set
– It is a deeply cross-sectional one


Comparison and Consideration (Cont.)

• A decision to use one technique or the other would be driven by:
– What standard an organization uses
– What tools provide needed functionality on the data


Comparison and Consideration (Cont.)

• Using this approach, the data is the same whether it is in SDMX or DDI
– There are some conventions about transforming identifiers

• This approach could apply to any “microdata” set for interchange between DDI and SDMX

• “Microdata” being understood as any set of unit records with a set of attributes attached
– But only a single record structure is allowed within the file
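The claim that the data is the same in either form can be illustrated with a small round trip between unit records (the DDI-style view) and keyed observations (the SDMX-style view). This is a minimal sketch, not the transformation proposed in the Eurostat document: the function names and sample columns are hypothetical, identifiers are assumed to be unique, and the time dimension is left out for brevity. The single-record-structure restriction appears as an explicit check.

```python
from typing import Dict, List, Tuple


def check_single_record_structure(rows: List[Dict[str, str]]) -> None:
    """Reject files whose records do not all share the same set of columns."""
    if not rows:
        raise ValueError("no records")
    columns = set(rows[0])
    for row in rows:
        if set(row) != columns:
            raise ValueError("more than one record structure in the file")


def to_observations(
    rows: List[Dict[str, str]], id_column: str
) -> Dict[Tuple[str, str], str]:
    """Unit records -> observations keyed by (identifier, measure)."""
    check_single_record_structure(rows)
    return {
        (row[id_column], column): value
        for row in rows
        for column, value in row.items()
        if column != id_column
    }


def to_rows(
    observations: Dict[Tuple[str, str], str], id_column: str
) -> List[Dict[str, str]]:
    """Observations keyed by (identifier, measure) -> unit records."""
    rows: Dict[str, Dict[str, str]] = {}
    for (unit_id, column), value in observations.items():
        # Start each record with its identifier, then add measures.
        rows.setdefault(unit_id, {id_column: unit_id})[column] = value
    return list(rows.values())


# Hypothetical sample records.
records = [
    {"ID": "T001", "CURRENCY": "EUR", "AMOUNT": "1000"},
    {"ID": "T002", "CURRENCY": "USD", "AMOUNT": "250"},
]
round_tripped = to_rows(to_observations(records, "ID"), "ID")
```

If `round_tripped` equals `records`, nothing was lost in either direction, which is what lossless interchange means for this simplified case.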


Conclusions

• It is possible to describe register data with both SDMX and DDI

• The preference for one approach over the other is not based on the merits of the standards themselves, but on other considerations
– Tools and needed functionality
– Organizational competencies