
ESS - NET ON MICRO DATA LINKING AND DATA WAREHOUSING IN PRODUCTION OF BUSINESS STATISTICS

Title: Overview metadata models
WP: 1 - Metadata
Deliverable: 1.3
Version: 0.5 (draft)
Date: 10 December 2012
Authors: Jos Dressen, Michel Lindelauf, Harry Goossens
NSI: Statistics Netherlands (CBS)


ESSnet on Data Warehousing
Harry Goossens, Jos Dressen, Michel Lindelauf (CBS)

INDEX

1. Introduction
2. The use of metadata models and standards
2.1 International models and standards
2.2 Relevance
2.3 Subset mapping
3. Best practice cases
Appendix 1

Overview and recommendations on metadata models, version 0.4 – draft / 9 May 2012


1. Introduction

In the statistical data warehouse (S-DWH) the metadata satisfies two essential needs:

a. to guide statisticians in processing and controlling the statistical production
b. to inform end users by giving them insight into the exact meaning of statistical data

In order to fulfil these two essential functions, the statistical metadata must be:

▪ correct and reliable (the metadata must give a correct picture of the statistical data)
▪ consistent and coherent (the metadata driving the statistical processes and the reporting metadata presented to the end users must be compatible with each other)
▪ standardised and coordinated (the data of different statistics are described and documented in the same standardised way)

Since the different users of the (meta)data have diverse needs, it is essential to ensure effective management of the statistical metadata in the S-DWH. To realise this, the use of a metadata model is a key element in structuring and standardising the statistical metadata within an NSI in a generic way.

In the Metadata framework (deliverable 1.1) the roles, purposes, definitions etc. of metadata in the statistical data warehouse are defined in generic terms. The framework defines a metadata model as follows:

[Def 3.6.1] A metadata model is a special case of a data model: an abstract documentation of the structure of metadata used by business processes.

In the context of the S-DWH at least two types of metadata models can be distinguished:

▪ a conceptual model, which usually gives a high-level overview of how the metadata is organised, managed, maintained etc.
▪ a physical model, which describes the details of the metadata objects and attributes, including the relations between the metadata objects.

Put more simply, a conceptual metadata model is a description of the overall metadata process(es), whereas a physical model is a structured description of the metadata elements.
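As a rough illustration of what a physical model pins down, the sketch below represents metadata objects, their attributes and the relations between them as plain Python data classes. All class names, field names and example values (MetadataObject, Attribute, Relation, the "var:turnover" identifier) are hypothetical and are not taken from the Metadata framework.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    """One attribute of a metadata object, e.g. a name or a definition."""
    name: str
    value: str


@dataclass
class MetadataObject:
    """A metadata object together with its attributes (hypothetical layout)."""
    identifier: str
    kind: str  # e.g. "variable" or "classification"
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class Relation:
    """A directed relation between two metadata objects."""
    source: str
    target: str
    role: str


def related_to(identifier: str, relations: List[Relation]) -> List[str]:
    """Return the identifiers of the objects that `identifier` points to."""
    return [r.target for r in relations if r.source == identifier]


# Illustrative content only.
turnover = MetadataObject(
    "var:turnover", "variable",
    [Attribute("definition", "Net turnover of an enterprise")],
)
activity = MetadataObject(
    "cls:activity", "classification",
    [Attribute("name", "Economic activity classification")],
)
relations = [Relation("var:turnover", "cls:activity", "broken down by")]
```

A conceptual model, by contrast, would describe who creates, maintains and uses such objects, without fixing their fields.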

In connection with the term (metadata) model, the term standard also needs to be considered, as the two are often used together or even interchangeably. The following general definition of a model is commonly accepted:

‘A model is a simplified description of an analogue part of the reality.’

For the term standard, often also norm is used as a synonym. The following general definition of standard/norm is commonly accepted:

‘A standard or norm is a document with recognized agreements, specifications or criteria about a product, service or method.’

Comparing the two terms: a standard/norm generally defines WHAT is to be done, whereas a model describes HOW to do it.



For example: the standard/norm ISO 11179 is an international standard defining the representation of metadata in a metadata registry, without a physical representation,

whereas the Nordic Metamodel provides a basis for organizing and managing metadata, as it describes the metadata systems that are being used in NSIs.

In the context of the S-DWH, a metadata model is a standardized representation used to define all necessary metadata elements of statistical information systems, based upon and using 1 or more standards/norms. In these implementations, standards act as checklists for controlling the completeness and correctness of all metadata elements as described by the model.
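The checklist role of a standard can be sketched as a simple completeness check. The required element names below are a hypothetical checklist chosen for illustration; a real one would be derived from the standard actually adopted.

```python
def missing_elements(described: dict, required: set) -> set:
    """Return the required element names that are absent or empty."""
    return {name for name in required if not described.get(name)}


# Hypothetical checklist, not taken from any specific standard.
REQUIRED_BY_STANDARD = {"name", "definition", "value_domain"}

variable = {
    "name": "Turnover",
    "definition": "Net turnover of an enterprise",
}

print(sorted(missing_elements(variable, REQUIRED_BY_STANDARD)))
# prints ['value_domain']
```

The model defines which elements exist; the check only reports where a description falls short of the checklist.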

In this document we focus on the use of metadata models and standards, providing a framework for capturing, maintaining and understanding the metadata when describing statistical data.


2. The use of metadata models and standards

In 2011 this ESSnet sent out a questionnaire to take stock of best practices in ESS member states, which also included some questions about metadata. One of the issues mentioned as most important to focus on when developing a DWH was metadata to drive the process.

Remarkably, almost all NSIs state that metadata are “important” or “extremely important” for DWH systems, yet 19 of the 24 NSIs admit that metadata are currently implemented in only a few systems. Further inquiries revealed that one reason for this apparent contradiction is that current metadata models are considered complex and cumbersome to deal with. Hence, one challenge of the ESSnet might be to provide recommendations on metadata models that are relatively easy to manage and that allow us to drive DWH systems.

From the metadata perspective the ultimate goal is to use one single model for statistical metadata, covering the total life-cycle of statistical production. But considering the great variety in statistical production processes (e.g. surveys, micro data analysis or aggregated output), all with their own requirements for handling metadata, it is very difficult, and not very likely, that one single model will be agreed upon. The biggest risk of working with several models is duplication of metadata, which should be avoided. This can best be achieved by using standards for describing and handling statistical metadata.

2.1 International models and standards

Chapter 1.3 of the Metadata Framework briefly describes the models and standards considered most relevant for an S-DWH. Part B of the Common Metadata Framework of the Metis Group gives a more complete overview of concepts, standards, and models: http://www1.unece.org/stat/platform/display/metis/The+Common+Metadata+Framework

The most important standards in relationship to the use of metadata models are:

▪ ISO/IEC 11179-3
ISO/IEC 11179 is a well-established international standard for representing metadata in a metadata registry. It has two main purposes: definition and exchange of concepts. Thus it describes the semantics and concepts, but does not handle the physical representation of the data. It aims to be a standard for metadata-driven exchange of data in heterogeneous environments, based on exact definitions of data.
In particular Part 3: Registry metamodel and basic attributes. The primary purpose of Part 3 is to specify the structure of a metadata registry and also to specify the basic attributes required to describe metadata items, which may be used in situations where a complete metadata registry is not appropriate.
References: Homepage for ISO/IEC 11179, Information Technology – Metadata registries: http://metadata-stds.org/11179/#A3

▪ Neuchâtel Model - Classifications and Variables
The main purpose of this model is to provide a common language and a common perception of the structure of classifications and the links between them. The original model was extended with variables and related concepts. The discussion includes concepts like object types, statistical unit types, statistical characteristics, value domains, populations etc. The two models together claim to provide a more comprehensive description of the structure of statistical information embodied in data items.
Intended use: For setting up metadata models and frameworks inside statistical offices several models are used as a source or starting point. The Neuchâtel model is one of those models.
References - Classifications: http://www1.unece.org/stat/platform/download/attachments/14319930/Part+I+Neuchatel_version+2_1.pdf?version=1
References - Variables: http://www1.unece.org/stat/platform/download/attachments/14319930/Neuchatel+Model+V1.pdf?version=1

▪ Corporate Metadata Repository Model (CMR)
This statistical metadata model integrates a developmental version of edition 2 of ISO/IEC 11179 and a business data model derivable from the Generic Statistical Business Process Model. It includes the constructs necessary for a registry. Forms of this model are in use at the US Census Bureau and at Statistics Canada.
Intended use: The model is a framework for managing all the statistical metadata of a statistical office. It accounts for survey, census, administrative, and derived data; and it accounts for the entire survey life-cycle.
References: http://www.unece.org/stats/documents/1998/02/metis/11.e.pdf for an overview paper on the subject. See also Gillman, D. W. "Corporate Metadata Repository (CMR) Model", Invited Paper, University of Edinburgh - Proceedings of First MetaNet Conference, Voorburg, Netherlands, 2001.
Relationships to other standards: ISO/IEC 11179 and Generic Statistical Business Process Model

▪ Nordic Metamodel, version 2.2
The Nordic Metamodel was developed by Statistics Sweden, and has become increasingly linked with their popular "PC-Axis" suite of dissemination software. It provides a basis for organizing and managing metadata for data cubes in a relational database environment.
Intended use: The Nordic Metamodel is used to describe the metadata system behind several implementations of PC-Axis in national and international statistical organizations, particularly those using MS SQL Server as a platform.
Maintenance organization: Statistics Sweden (with input from the PC-Axis Reference Group)
References: PC AXIS SQL metadata base

▪ Common Warehouse Metamodel (CWM)
Specification for the metadata in support of exchange of data between tools.
Intended use: As a means for recording the metadata to achieve data exchange between tools.
Maintenance organization: OMG - Object Management Group
ISO Standard Number: ISO/IEC 19504
References: See OMG web site (http://www.omg.org), and specifically http://www.omg.org/technology/documents/formal/cwm_mip.htm

▪ SDMX
Statistical Data and Metadata eXchange (SDMX) was initiated by seven international organisations to foster standards for the exchange of statistical information. SDMX focuses on macro data, even though the model also supports micro data. It is an adopted standard for delivering and sharing data between NSIs and Eurostat. Sharing the results from the latest Population Census is perhaps the most advanced example so far.
Recently SDMX has increasingly evolved into a framework with several sub-frameworks for specific use:
- ESMS
- SDMX-IM
- ESQRS
- MCV
- MSD
References: See the SDMX web site (http://sdmx.org), and specifically http://sdmx.org/?page_id=10 for standards

▪ DDI
The Data Documentation Initiative (DDI) has its roots in the data archive environment, but with its latest development, DDI 3 or DDI Lifecycle, it has become an increasingly interesting option for NSIs. DDI is an effort to create an international standard for describing data from the social, behavioural, and economic sciences. It is based on XML. DDI is supported by a non-profit international organisation, the DDI Alliance.
References: http://www.ddialliance.org

▪ GSIM
The Generic Statistical Information Model (GSIM) is a reference framework of information objects, which enables generic descriptions of data and metadata definition, management, and use throughout the statistical production process. As a common reference framework for information objects, the GSIM will facilitate the modernisation of statistical production by improving communication at different levels:
- Between the different roles in statistical production (statisticians, methodologists and information technology experts);
- Between the statistical subject matter domains;
- Between statistical organisations at the national and international levels.
The GSIM is designed to be complementary to other international standards, particularly the Generic Statistical Business Process Model (GSBPM). It should not be seen in isolation, and should be used in combination with other standards.
References:
Website: http://www1.unece.org/stat/platform/display/metis/Generic+Statistical+Information+Model+(GSIM)
GSIM Version 0.3: http://www1.unece.org/stat/platform/download/attachments/65373325/GSIM+v0_3.doc?version=1

▪ MMX metadata framework
The MMX metadata framework is not an international standard; it is a specific adaptation of several standards by a commercial company. The MMX Metamodel provides a storage mechanism for various knowledge models. The data model underlying the metadata framework is more abstract in nature than metadata models in general. The MMX framework is used by Statistics Estonia, so it needs to be considered from the point of view of practical experience.

Appendix 1 gives a more comprehensive and thorough overview of models and standards.
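To make the ISO/IEC 11179 entry above more concrete: the standard describes a data element as a data element concept (an object class combined with a property) paired with a value domain. The sketch below is a much simplified rendering of that idea. The class layout, the field names and the example values are assumptions made for illustration and do not reproduce the registry metamodel of Part 3.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DataElementConcept:
    """The meaning of a data element: what is described, and which aspect."""
    object_class: str    # e.g. "Enterprise"
    property_name: str   # e.g. "Size class"


@dataclass(frozen=True)
class ValueDomain:
    """The permitted values; only enumerated domains are sketched here."""
    name: str
    permitted_values: Tuple[str, ...]


@dataclass(frozen=True)
class DataElement:
    """A concept paired with the value domain used to represent it."""
    concept: DataElementConcept
    value_domain: ValueDomain

    def accepts(self, value: str) -> bool:
        """Check a value against the enumerated value domain."""
        return value in self.value_domain.permitted_values


# Illustrative content only.
size_class = DataElement(
    DataElementConcept("Enterprise", "Size class"),
    ValueDomain("Size class codes", ("small", "medium", "large")),
)
```

Note that nothing in the sketch says how the values are physically stored, which matches the point that the standard covers semantics rather than physical representation.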


2.2 Relevance

As not all the models/standards we listed are relevant in the context of the S-DWH, we selected those that are and that need to be studied in more depth. For this we used the following four selection criteria:

Topicality: What is the date of the last change or the last reference on the internet? Are there (still) new developments of the model/standard?
Support: Is there an organisation in charge of the maintenance of the standard/model?
Usage: How extensive is the usage of the model? Are there many or few users?
Usability: Is the model/standard difficult or easy to use? Do we think it is usable in the context of the S-DWH?

We made a first selection of relevance by scoring each model on the 4 categories:

Models / Standards | Topicality | Support | Usage | Usability | Advice
-------------------|------------|---------|-------|-----------|-------------
ISO/IEC 11179-3    | +          | +/-     | +/-   | +/-       | relevant
Neuchâtel Model    | +/-        | +/-     | +     | +/-       | relevant
CMR                | -          | -       | +/-   | -         | not relevant
Nordic Metamodel   | +          | +       | +     | +         | relevant
CWM                | -          | -       | +/-   | -         | not relevant
SDMX:              | +          | +       | +     | ?         | relevant
* SDMX-IM          | +/-        | +/-     | +/-   | +/-       | relevant
* EPMS             | ?          | ?       | ?     | ?         | unclear
* ESMS             | +          | ?       | ?     | ?         | relevant
* ESQRS            | +          | ?       | ?     | ?         | relevant
* MCV              | ?          | ?       | -     | ?         | not relevant
* MSD              | ?          | ?       | ?     | ?         | unclear
DDI                | +          | +/-     | +/-   | ?         | relevant
GSBPM              | +          | +/-     | +/-   | ?         | relevant
GSIM               | +          | +/-     | +/-   | ?         | relevant
MMX                | +/-        | +/-     | +/-   | +/-       | relevant

Legend: + good, +/- moderate, - not good, ? more research needed


In this first selection we applied the following considerations:
▪ A model/standard is rated relevant if it has at least 1 ‘+’
▪ A model/standard is rated not relevant if it has at least 1 ‘-’
▪ A model/standard is rated unclear if it has mainly ‘?’
▪ If a model/standard has mainly ‘+/-’ we considered the overall context:
  - SDMX-IM is rated relevant as it is a subset of SDMX
  - MMX is rated relevant as it is a key element in the new S-DWH of Statistics Estonia, and we want to consider it in the general discussion.
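The considerations above can be written down as a small rule function. The function name and the order in which the rules are tried (first ‘+’, then ‘-’, then ‘?’) are assumptions: the text does not say which rule wins when a model has both a ‘+’ and a ‘-’, and no row in the scoring table of section 2.2 has both, so the order does not change any result there.

```python
from collections import Counter


def advise(scores):
    """Rate a model/standard from its four scores ('+', '+/-', '-', '?')."""
    counts = Counter(scores)
    if counts["+"] >= 1:
        return "relevant"
    if counts["-"] >= 1:
        return "not relevant"
    if counts["?"] > len(scores) / 2:
        return "unclear"
    # Mainly '+/-': the document decides these case by case.
    return "consider context"


print(advise(["+", "+/-", "+/-", "+/-"]))      # ISO/IEC 11179-3: relevant
print(advise(["-", "-", "+/-", "-"]))          # CMR: not relevant
print(advise(["?", "?", "?", "?"]))            # EPMS: unclear
print(advise(["+/-", "+/-", "+/-", "+/-"]))    # SDMX-IM: consider context
```

For SDMX-IM and MMX the function only signals that context is needed; the rating "relevant" for both comes from the arguments given in the last consideration.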

2.3 Subset mapping

For (possible) use in the S-DWH it is first necessary to map the relevant models/standards onto the metadata subsets from the framework. The goal is to indicate for each subset which models/standards are useful and should be considered.

The GSBPM is not included in this mapping, as WP 3 has already mapped the GSBPM onto the S-DWH (deliverable 3.1).

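Metadata subsets per model/standard:

Model / Standard  | Statistical | Process | Quality | Technical | Authorisation
------------------|-------------|---------|---------|-----------|--------------
ISO/IEC 11179-3   | no          | yes     | no      | no        | ?
Neuchâtel Model   | yes         | no      | no      | no        | ?
Nordic Metamodel  | yes         | no      | no      | no        | ?
SDMX:             | yes         | ?       | no      | yes       | ?
* SDMX-IM         | no          | no      | no      | yes       | ?
* ESMS            | yes         | yes     | ?       | no        | ?
* ESQRS           | no          | no      | yes     | no        | ?
DDI               | yes         | no      | no      | yes       | ?
GSIM              | yes         | yes     | no      | ?         | ?
MMX               | yes         | yes     | no      | yes       | yes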
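Read the other way round, the mapping answers the question "which models/standards should be considered for a given subset?". The sketch below copies the rows of the mapping table into a dictionary, assuming the five values in each row follow the column order Statistical, Process, Quality, Technical, Authorisation; the names SUBSETS, MAPPING and candidates are illustrative.

```python
SUBSETS = ("statistical", "process", "quality", "technical", "authorisation")

# Rows copied from the mapping table; "?" means not yet determined.
MAPPING = {
    "ISO/IEC 11179-3":  ("no", "yes", "no", "no", "?"),
    "Neuchâtel Model":  ("yes", "no", "no", "no", "?"),
    "Nordic Metamodel": ("yes", "no", "no", "no", "?"),
    "SDMX":             ("yes", "?", "no", "yes", "?"),
    "SDMX-IM":          ("no", "no", "no", "yes", "?"),
    "ESMS":             ("yes", "yes", "?", "no", "?"),
    "ESQRS":            ("no", "no", "yes", "no", "?"),
    "DDI":              ("yes", "no", "no", "yes", "?"),
    "GSIM":             ("yes", "yes", "no", "?", "?"),
    "MMX":              ("yes", "yes", "no", "yes", "yes"),
}


def candidates(subset):
    """List the models/standards marked 'yes' for the given subset."""
    column = SUBSETS.index(subset)
    return [name for name, row in MAPPING.items() if row[column] == "yes"]


print(candidates("quality"))    # prints ['ESQRS']
print(candidates("technical"))  # prints ['SDMX', 'SDMX-IM', 'DDI', 'MMX']
```

Entries marked "?" are deliberately left out of the result, since the mapping itself treats them as open.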


3. Best practice cases

Based upon the information from the stocktaking, uniform best practice descriptions were made for specific NSIs, with special focus on the use, role and function of metadata in the S-DWH. These case studies provide good insight into the NSIs with the most developed metadata systems. For more specific information on the (possible) use of a metadata model, further research will be performed. In this section we give an overview of the best practice (BP) cases with a focus on metadata model use. In the Annex we add some complete BP case descriptions.

NSI Statistics Estonia

Metadata Model(s) : YES

- MMX MOF 3 metadata model for the central metadata repository (iMeta)
- Neuchâtel model for statistical activity codes/descriptions and variable classifiers (in code lists), for statistical reference metadata and statistical structural metadata
- XDTL metamodel for process metadata
- Relational Database Metamodel for technical metadata

Metadata System : YES

iMeta is the central metadata repository, based on the MMX MOF 3 metadata model, which makes it possible to manage several different metamodels. Statistics Estonia (SE) manages both reference metadata and structural metadata (including process metadata, technical metadata, user roles and privileges etc.).

The S-DWH (a conformed collection of datasets) consists of data processed and prepared for analysis. In the data warehouse, variables (columns) are linked with variable descriptions in iMeta. Data sets in the data warehouse are versioned. Data sets are mutually linked through common dimensions, and facts in different data sets are unique (to avoid data duplication across data sets).

NSI Office for National Statistics (ONS) - UK

Metadata Model(s) : NO
Within ONS no standardised/centralised metadata model is used. A metadata model based upon the Neuchâtel model was developed several years ago, but it was not put into operation.

At the moment a prototype S-DWH (covering multi-mode data collection and subsequent statistical phases) is being developed, and the metadata in it is closely integrated with the data objects.

There is a need to standardise the metadata and to use a metadata model.

Metadata System : NO
ONS has no central metadata system/repository. There is no standard approach to, or coordination of, metadata management yet.

There is a Standards and Guidance repository (a Lotus Notes database) which is used to document metadata about the statistical processes used across the different statistical domains, but it does not follow a specific structure or template.

Various specific systems/repositories have been developed independently for separate statistical processes and development projects. Currently, there is no coordinated approach.


NSI Statistics Netherlands (CBS)

Metadata Model(s) : YES

CBS (methodology) has developed a dedicated CBS metadata model specifically for describing the so-called steady states, inspired by the Swedish model and the Neuchâtel model (among others).

It is a generic OBJECT model, dedicated to describing statistical datasets in a conceptual, non-technical and uniform way. It treats micro data and macro data differently. It focuses on the description of structured reference (business) metadata.

Technical metadata are kept in separate, standard XML files and are not part of the metadata model. Process and quality metadata are part of the metadata model but are not yet standardised; they are free-format text or separate documents.

Metadata System : YES

The Data Service Centre (DSC) is the implementation of the CBS metadata model. The DSC concept is passive and (meta)data oriented (the steady states concept). The DSC is the central ‘data vault’ and metadata repository, linking:

▪ conceptual, descriptive metadata
▪ technical metadata, in separate, standard XML files
▪ statistical data, as standardised flat ASCII files (‘steady states’)
▪ all other documentation (Word, PDF, Excel etc.)

Basic concepts:
▪ Storage of DATA (steady states) after each processing step, WITH METADATA (no data without metadata!)
▪ Strict distinction between the statistical data that are actually processed and the metadata that describe the definitions, the quality and the process activities
▪ Steady states are explicitly designed for re-use of statistical data.
▪ The metadata are generally accessible and are standardised as much as possible.
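The rule "no data without metadata" can be illustrated with a toy store that refuses to save a steady state unless metadata is supplied. This is only a sketch of the principle: the class SteadyStateStore and its methods are hypothetical and say nothing about how the DSC is actually implemented.

```python
class SteadyStateStore:
    """Toy store that refuses data without accompanying metadata."""

    def __init__(self):
        self._states = {}

    def save(self, name, rows, metadata):
        """Store a steady state; reject it when no metadata is given."""
        if not metadata:
            raise ValueError("no data without metadata")
        # Data and metadata are kept apart but linked by the same key.
        self._states[name] = {"data": list(rows), "metadata": dict(metadata)}

    def data_for(self, name):
        """Return the stored rows of a steady state."""
        return self._states[name]["data"]

    def metadata_for(self, name):
        """Return the metadata that describes a steady state."""
        return self._states[name]["metadata"]


store = SteadyStateStore()
store.save(
    "turnover_after_editing",
    [("enterprise_1", 1200), ("enterprise_2", 850)],
    {"definition": "Net turnover", "process_step": "editing"},
)
```

Calling save with an empty metadata dictionary raises a ValueError, so a steady state can never enter the store undocumented.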


NSI Italian National institute for Statistics (Istat) - IT

Metadata Model(s) : NO
Within Istat no standardised/centralised metadata model is used. A conceptual layer model, called Osi (Objects-Information Frames), has been implemented for metadata on surveys/sources, to specify terminologies and information frames. The role of metadata is mainly descriptive.

Metadata System : NO
Istat has no central metadata system/repository. There is no standard approach to, or coordination of, metadata management yet.

A centralised quality information system for metadata management, called Sidi, manages metadata concerning the production processes of surveys. Sidi (Surveys Information System) has been designed as a tool for monitoring the quality of the Istat surveys from both a qualitative and a quantitative viewpoint. It therefore also allows standard quality indicators for surveys to be calculated and disseminated.

A centralised data management system, called Armida (ARchivio MIcro DAti – Micro data source), manages not only data but also metadata concerning surveys. Armida has been developed as a tool that allows end users of survey data to access them at micro level instead of macro level. Access is assisted with regard to confidentiality.


NSI Central Statistical Office (CSO) - Ireland

Metadata Model(s) : NO

Metadata System : NO

The CSO currently provides a significant amount of metadata with the data it disseminates. However, as no formal metadata standards or management systems have been adopted, the CSO runs the risk of disseminating poor quality metadata.

The role of metadata: mainly descriptive. So far the capturing of metadata is still minimal. Both S-DWH implementations (ADC and DMS) have their own metadata layer. In the DMS, metadata must be entered as datasets/tables are created. No specific model is being used. Some research is now being done on DDI, as it appears that DDI is being revitalised.

In 2009 CSO defined a Metadata Strategy, outlining a vision for metadata: the standardisation of metadata management across the CSO organisation.

The vision takes into account the needs of both the compilers and the users of CSO statistics. It sees providers of metadata taking responsibility for its maintenance and updating, to specified standards, using simple web-based applications. Users of metadata will have easy access to all available and relevant metadata, and that metadata will be maintained to a defined standard.

So far the strategy could not be implemented due to a lack of resources/capacity. The plan is to start in 2013.

The table below gives a short overview of the current availability of metadata at CSO.

Metadata | Is this metadata available now? | Related metadata standard/model | Is CSO metadata meeting this standard?
---------|---------------------------------|---------------------------------|---------------------------------------
1. A catalogue of releases & publications | No - not in a systematic and easily accessible format | Dublin Core | CSO does not have a central catalogue of releases and publications with Dublin Core metadata. However, all releases and publications are available on the CSO website
2. Databank | Yes | Nordic model | Yes – in place as part of CSO output databases
3. Variable definitions | No - not in a systematic and easily accessible format | Neuchatel, ISO11179 | No – no capturing of variable definitions to standard
4. Classifications | Yes | CARS | Yes
5. Quality reports | Work-in-progress | Office standard, SDDS/DQAF | Yes – work in progress. However, will need to investigate how to link Office standard to SDDS/DQAF IMF requirements
6. Questionnaires | Yes | Dublin Core | No
7. Survey methodology & system processing notes | Ad-hoc availability & standard across the Office | To be decided | No


NSI Statistics Sweden (SCB)

Metadata Model(s) : YES

The Swedish Statistics act prescribes that all official statistics must be documented in a Description of the Statistics (DoS). It is a document in two parts: General information (contents, responsible authority, purpose, use) and a Quality declaration (accuracy, timeliness, coherence, availability). The current version of the model has been in use since 1999.

SCBDOK, a metadata model for documentation of a survey round, its result (final observation register) and the production methods used (frame and sampling procedures, data collection, estimation, data processing system), was developed at SCB and has been in use for about 20 years.

Documentation of objects, populations, variables and value domains (including classifications) is a sub model of SCBDOK, called the MetaPlus model. This model was developed at SCB in 2004-06 as a replacement for an older model. It is based on ISO 11179, with some additions to support local needs. It includes a complete implementation of the Neuchâtel model for classifications.

The output databases are supported by a metadata model, MacroMeta, which was developed in-house in 1994-96. The current version is also known as the Nordic Metamodel.

Metadata System : YES

SCB does not have one single coherent metadata system, but several loosely connected ones.

DoS and SCBDOK have been implemented as documentation templates based on Microsoft Word with user instructions. Both documents are published on SCB’s web site.

The implementation of MetaPlus forms SCB’s repository for micro metadata. Currently, it is primarily a documentation system where users can search and reuse definitions and descriptions, but its design allows for it to be the basis of a metadata layer in a future data warehouse.

The implementation of the MacroMeta (the Nordic Metamodel) is an integral part of SCB’s Internet based dissemination of aggregated data or statistics (Sweden’s Statistical Databases, SSD).


Appendix 1

Comprehensive and thorough overview of metamodels and standards.
