![Page 1: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/1.jpg)
Better data quality through global data and metadata sharing
Agne Bikauskaite and Håkan Linden
Eurostat
European Conference on Quality in Official Statistics (Q2014), Vienna, 3-5 June 2014
![Page 2: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/2.jpg)
Outline
1. Context
2. A data sharing model
3. The necessary preconditions
4. Implementing Eurostat's data sharing strategy
5. Conclusions and outlook
![Page 3: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/3.jpg)
Context
General objectives
• Reduce the reporting burden on NSIs
• More efficient use of resources in international organisations (IOs)
• Ensure high quality and consistency of official statistics data
• Improve global data exchange and dissemination
![Page 4: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/4.jpg)
A data sharing model
European statistics: From national to Eurostat
[Diagram: EU Member States send their data through data validation to Eurostat]
![Page 5: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/5.jpg)
A data sharing model
Eurostat as international hub for European statistics
[Diagram. Data providers: EU countries; OECD countries (non-EU countries only); other countries (non-OECD countries only). Collecting organisations: Eurostat - ECB; OECD; IMF, UN, WB, ILO, BIS, other IOs. Recipients: users.]
![Page 6: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/6.jpg)
The necessary pre-conditions
• Internationally agreed technical and statistical standards
• Internationally agreed data structures
• Maintenance agreements
• Internationally agreed data validation
• Streamlined data exchange processes
![Page 7: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/7.jpg)
Statistical Data and Metadata Exchange (SDMX)
ISO IS 17369
SDMX consists of technical and statistical standards, guidelines, an IT service infrastructure and IT tools.
SDMX provides:
• technical/statistical standards
• new exchange modes (hubs)
• clear rules and responsibilities
![Page 8: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/8.jpg)
SDMX describes the data and metadata exchange
[Diagram labels: organisation scheme, concepts, concept schemes, code lists, DSDs, provision agreement, maintainer, SDMX Registry]
![Page 9: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/9.jpg)
Describing the data exchange
[Diagram: Who? What? When? Where? How?]
![Page 10: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/10.jpg)
Content-Oriented Guidelines
Recommendations to harmonise implementations
• Cross-domain concepts and code lists
• Statistical subject-matter domains
• Metadata common vocabulary
[Diagram: interoperability between Organisation 1, Organisation 2 and Organisation 3]
![Page 11: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/11.jpg)
Implementing Eurostat's data sharing strategy
Standardisation of structural metadata
• Code lists describe the dimensions in data tables, giving meaning to the data.
• Code lists are based on:
  • official statistical classifications such as NACE, NUTS, ISCO, etc.
  • the ESS and SDMX Content-Oriented Guidelines
  • domain-specific codifications
• A standard code list (SCL) is a code list that has already been harmonised.
• Standard code lists should be used throughout the statistical business process: data design, collection, aggregation, dissemination, exchange, archiving.
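As a rough illustration of how code lists give meaning to the dimensions of a data table, the sketch below ties each dimension to one code list and checks a data row against them. The code list contents, the dimension names and the helper `validate_record` are all hypothetical and heavily shortened; they are not taken from any actual ESS or SDMX artefact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeList:
    """A code list: an identifier plus the allowed codes and their labels."""
    id: str
    codes: dict[str, str]  # code -> human-readable label


# Hypothetical, shortened code lists (illustration only).
CL_GEO = CodeList("CL_GEO", {"AT": "Austria", "SE": "Sweden", "LT": "Lithuania"})
CL_FREQ = CodeList("CL_FREQ", {"A": "Annual", "Q": "Quarterly", "M": "Monthly"})

# A toy data structure: each dimension of the table refers to one code list.
DIMENSIONS = {"GEO": CL_GEO, "FREQ": CL_FREQ}


def validate_record(record: dict[str, str]) -> list[str]:
    """Return the structural errors found in one data row (empty if valid)."""
    errors = []
    for dim, code_list in DIMENSIONS.items():
        if dim not in record:
            errors.append(f"missing dimension {dim}")
        elif record[dim] not in code_list.codes:
            errors.append(f"{dim}={record[dim]!r} is not in {code_list.id}")
    return errors


def describe(record: dict[str, str]) -> dict[str, str]:
    """Translate the codes of a valid row into their labels."""
    return {dim: cl.codes[record[dim]] for dim, cl in DIMENSIONS.items()}
```

If every partner in an exchange validates against the same code lists, a row accepted by one organisation should be interpretable by the others without re-mapping.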
![Page 12: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/12.jpg)
Implementing Eurostat's data sharing strategy
Recommendations for the SCL creation

| RECOMMENDED RULES | ESS | SDMX | COMMENTS |
|---|---|---|---|
| Input: official information | ✓ | ✓ | |
| Coding | A-Z + 0-9 + - + _ | A-Z + 0-9 + _ | In SDMX "-" (dash) is not allowed (to avoid confusion with the "minus" operator) |
| Codes starting with letter | ✓ | ✓ | With some exceptions |
| Meaningful coding | ✓ | ✓ | Less homogeneity in coding in SDMX (due to involvement of several different partners) |
| Aggregates are possible | ✓ | ✓ | |
| To be used all along the statistical business process | ✓ | ✓ | |
| May be referenced by several statistical concepts | ✓ | ✓ | |
| Based on clear guidelines | ✓ | ✓ | |
| Maintenance agency | ✓ | ✓ | ESS: Eurostat Unit B5; SDMX: Statistical Working Group (SWG) |
| Versioning system | ✓ | ✓ | In future registries |
| Generic concept | ✓ | ✓ | SDMX has a special CL for generic codes; in the ESS, generic codes are implemented in each SCL when needed |
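The coding rules listed above could be checked mechanically. The following is a minimal sketch: the character sets come from the "Coding" rule (A-Z, 0-9 and underscore in both, plus the dash in the ESS only), while the function name `check_code` and the example codes are hypothetical. Since the slide notes exceptions to the "starting with letter" rule, that check is reported separately rather than enforced.

```python
import re

# Allowed characters per the "Coding" rule.
ESS_CODE = re.compile(r"[A-Z0-9_-]+")   # ESS also permits the dash
SDMX_CODE = re.compile(r"[A-Z0-9_]+")   # SDMX does not permit the dash
STARTS_WITH_LETTER = re.compile(r"[A-Z]")


def check_code(code: str) -> dict[str, bool]:
    """Report which of the recommended coding rules a candidate code satisfies."""
    return {
        "ess_chars": ESS_CODE.fullmatch(code) is not None,
        "sdmx_chars": SDMX_CODE.fullmatch(code) is not None,
        # Reported, not enforced: the rule has some exceptions.
        "starts_with_letter": STARTS_WITH_LETTER.match(code) is not None,
    }
```

For example, a hypothetical code such as `Y15-24` would pass the ESS character rule but fail the SDMX one because of the dash, which is the kind of difference that has to be resolved before a code list can be shared between the two.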
![Page 13: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/13.jpg)
Implementing Eurostat's data sharing strategy
SDMX standards into ESS structural metadata
To improve data quality, comparability and clarity are needed. This requires:
• using identical SCLs in the ESS and in SDMX
• transposing the SDMX guidelines into the ESS code lists
• adapting the ESS standard codes into the SDMX DSDs
![Page 14: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/14.jpg)
Implementing Eurostat's data sharing strategy
Overview of the ESS SCLs
• 504 ESS CLs
• 194 ESS SCLs released in Ramon
• 12 fully SDMX compliant
• 110 SDMX compliant (except generic codes)
![Page 15: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/15.jpg)
Implementing Eurostat's data sharing strategy
Standardisation of Reference Metadata
![Page 16: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/16.jpg)
Implementing Eurostat's Reference metadata sharing strategy
Over 30 Eurostat domains are in various phases of ESS Reference metadata standardisation. This concerns about 35% of all eligible Eurostat processes.
Domains shown on the slide:
• WASTE (end of life vehicles, packaging, electronic waste); WINE; FARM STRUCTURE; MIP STATISTICS
• HICP / Compliance monitoring; EHIS (Education, health and social protection); R&D (CIS 2012); Annual crops
• PRAG; ESAW; AES (Education, Science and Culture); LCI (Labour Cost Index); INFOSOC (Information Society); BUSINESS REGISTER
• HICP; LFS-Q, LFS-A; EU-SILC; FATS; STS (Short Term Statistics); WASTE; AEI (Pesticides); EDUCAT; JVC (Job Vacancy Stats); PRODCOM; EXTERNAL TRADE (3rd countries); COSAEA; URBANREG; R&D; TOURISM; PERMANENT CROPS; CENSUS; HOUSING PRICES; HPS
![Page 17: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/17.jpg)
Implementing Eurostat's data sharing strategy
The Eurostat established methodology
![Page 18: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/18.jpg)
Implementing Eurostat's data sharing strategy
in ESS
Over 20 Eurostat domains are in various phases of SDMX implementation. This concerns about 25% of all eligible Eurostat processes.
Domains shown on the slide:
GNI (Gross National Income); INT TRADE *; CENSUS; AVIATION; WATER; R&D *; FISHERIES *; JVS / LCI (Job Vacancy Statistics / Labour Cost Index); MARITIME; ORCHARDS / PEST.; EGR (Eurogroup register); IS; TIC (Trade in currency); PESTICIDES USE; ESSPROS (social protection statistics); TEC (Trade by Enterprise Characteristic); WASTE; NATIONAL A/cs *; STS (Short-term Statistics); BOP *; EDUCATION *; FDI *
![Page 19: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/19.jpg)
Implementing Eurostat's data sharing strategy
Development of the technical infrastructure
Key components:
• SDMX Registries
  • The Euro-SDMX Registry
  • The Global SDMX Registry
• SDMX Reference Infrastructure (SDMX-RI)
![Page 20: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/20.jpg)
Implementing Eurostat's data sharing strategy
What is the Euro-SDMX Registry (SER)?
• Eurostat's implementation of the SDMX Registry specifications as published by the SDMX initiative (sdmx.org).
• Based on SDMX 2.1 (as published in April 2011). Also capable of importing and exporting SDMX 2.0 artefacts.
• Allows browsing, searching, editing and subscribing to artefacts.
• Advanced access control mechanism for the distributed maintenance of artefacts, which also controls their visibility.
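To give a feel for how artefacts in an SDMX 2.1 registry can be addressed programmatically, the sketch below builds a structure query URL following the path pattern of the SDMX 2.1 RESTful web services specification (`resource/agencyID/resourceID/version`). It is a sketch under assumptions: the base URL is a placeholder, the identifiers in the usage note are illustrative, and whether the Euro-SDMX Registry exposes exactly this interface is not stated in the presentation.

```python
from urllib.parse import quote, urlencode

# Placeholder entry point; not the address of any real registry.
BASE_URL = "https://registry.example.org/rest"


def structure_query_url(
    resource: str,
    agency_id: str = "all",
    resource_id: str = "all",
    version: str = "latest",
    **params: str,
) -> str:
    """Build an SDMX 2.1 style structure query URL.

    `resource` is the artefact type (for example "codelist" or
    "datastructure"); `params` are optional query parameters such as
    `references` or `detail`.
    """
    parts = (resource, agency_id, resource_id, version)
    path = "/".join(quote(part, safe="") for part in parts)
    query = f"?{urlencode(params)}" if params else ""
    return f"{BASE_URL}/{path}{query}"
```

Calling `structure_query_url("codelist", "SDMX", "CL_FREQ")` would address the latest version of a code list named `CL_FREQ` maintained by an agency `SDMX`; adding `references="children"` asks the service to include the artefacts it refers to.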
![Page 21: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/21.jpg)
Home pageHome page
Most recent itemsMost recent items
Access to the content of the
Registry by type
Access to the content of the
Registry by type
Access to the content
of the Registry text
search
Access to the content
of the Registry text
search
Access to the content of the Registry advanced search
Access to the content of the Registry advanced search
![Page 22: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/22.jpg)
Conclusions
• International data co-operation improves the production of accurate, comparable and coherent statistics;
• SDMX promotes an incremental movement toward the data and metadata sharing model;
• The increasing use of SDMX based statistical standards improves the quality of the underlying statistical processes;
• The SDMX technical standards pave the way for simplified exchange and dissemination processes, which also helps to improve timeliness and accessibility;
• Statistical integration needs to go hand-in-hand with technical integration and standardisation.
![Page 23: Better data quality through global data and metadata sharing](https://reader036.vdocument.in/reader036/viewer/2022070417/5681553d550346895dc31275/html5/thumbnails/23.jpg)
Outlook
• Much more global data and metadata sharing in the years to come;
• Common data validation and processing procedures are required (from structural validation to content information validation);
• Better metadata-driven statistics production systems: the use of standards throughout the processes in combination with common metadata registries;
• Better harmonised international reference metadata frameworks and templates;
• Broadening the scope of SDMX (versioning of codes, disabling of dimensions, other formats like CSV, flat files etc.);
• Interoperability between information models (GSIM, SDMX, DDI etc.).