
Page 1: Metadata projects and tasks at Statistics Finland METIS 2010 Saija Ylönen saija.ylonen@stat.fi

Metadata projects and tasks at Statistics Finland

METIS 2010

Saija Ylönen, saija.ylonen@stat.fi

Page 2:

Organizational chart

11/03/2010 Saija Ylönen

Page 3:

Co-operating parties of the metadata tasks: organizational units

IT Management: situated in the Secretariat of the Director General; co-ordinates the general information architecture, of which metadata tasks form one element

Classification and Metadata Services: situated in the IT and Statistical Methods department; operational unit; active role in the development of metadata

Dissemination Services: situated in the IT and Statistical Methods department; develops the metadata connected with dissemination

Page 4:

Metadata Co-ordination Group

Originally a co-operation group for persons working with metadata issues in the support function departments of SF

The objective at present is to intensify the co-operation between the statistics departments and the parties responsible for general metadata work

Comprises members working on metadata and permanent members from all statistics departments

The goal is to widen knowledge about metadata and metadata systems and to give the statistics departments an opportunity to discuss their metadata needs with metadata specialists

Page 5:

CoSSI Steering Group and CoSSI model

Foundation for the metadata system

Modular, XML-based model for describing statistical tables, classifications, concepts, variables, general information on statistical documents, quality, etc.

Expandable

The CoSSI Steering Group is in charge of mastering and developing the model according to user needs in a manner that will not expose its main structure to risk

Page 6:

Definition of metadata

1) Statistical metadata: variable and data descriptions; classifications, concepts

2) Statistical data quality: quality reports; statistical method descriptions

3) Metadata of statistical documents or products: producers; publication information; field or subject area

Page 7:

Definition of metadata II

4) Process metadata

a) Technical metadata: guides the workflow of data production, makes it possible to follow data production and documents the working process.

b) Conceptual process metadata: technical information on the data and variables used in producing data, e.g. minimum or maximum values, various calculation rules or use of certain classification values.
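As an illustration of conceptual process metadata, the sketch below stores minimum and maximum values and permitted classification values for a variable and checks data values against them. The class `VariableRule`, the function `check_value` and the example rules are hypothetical names invented for this sketch; they are not part of the CoSSI model or of any Statistics Finland system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VariableRule:
    """Hypothetical conceptual process metadata for one variable."""
    name: str
    minimum: Optional[float] = None            # smallest accepted value, if any
    maximum: Optional[float] = None            # largest accepted value, if any
    allowed_codes: Optional[frozenset] = None  # permitted classification values, if any


def check_value(rule: VariableRule, value) -> list:
    """Return a list of rule violations for `value` (empty if it passes)."""
    problems = []
    if rule.allowed_codes is not None and value not in rule.allowed_codes:
        problems.append(f"{rule.name}: {value!r} is not an allowed classification value")
    if rule.minimum is not None and value < rule.minimum:
        problems.append(f"{rule.name}: {value!r} is below the minimum {rule.minimum}")
    if rule.maximum is not None and value > rule.maximum:
        problems.append(f"{rule.name}: {value!r} is above the maximum {rule.maximum}")
    return problems


# Example rules (assumed values, for illustration only)
age_rule = VariableRule(name="age", minimum=0, maximum=120)
region_rule = VariableRule(name="region", allowed_codes=frozenset({"01", "02"}))

print(check_value(age_rule, 130))
print(check_value(region_rule, "01"))
```

Keeping such rules as metadata rather than hard-coding them in each production program is what allows the same checks to be reused across statistics.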

Page 8:

Metadata systems at Statistics Finland

Page 9:

Metadata systems: present situation

We are in a transitional phase from relational databases to an XML-based environment

Relational databases: classifications, concepts and definitions, archiving database

XML database (eXist): publications, classifications, concepts, data descriptions

Page 10:

Relational databases

Built in the 1990s

Used in statistics production, but not in all statistical processes or for all statistics

Classifications in the relational databases are used in SAS and Superstar

The archiving database is in use in the archiving process

Classifications and concepts are generated from the relational databases to the web pages

Page 11:

XML database

At the moment, the xml database is used mostly in the creation of publications with an Arbortext word processor

Classifications and concepts are copied to the xml database from the relational databases and are ready to use

Tools for utilising metadata objects from the xml database are being constructed

The first metadata tool linked to the xml database is the variable editor
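The copying of classifications from a relational database into XML, mentioned on this slide, could look roughly like the sketch below. It is only an illustration: an in-memory SQLite database stands in for the relational source, and the table name `classification_item` and the element names `classification` and `item` are assumptions made for this example, not the actual database schema or the CoSSI element names.

```python
import sqlite3
import xml.etree.ElementTree as ET


def classification_to_xml(conn, classification_id):
    """Read one classification from the relational source and build an XML tree."""
    rows = conn.execute(
        "SELECT code, label FROM classification_item "
        "WHERE classification_id = ? ORDER BY code",
        (classification_id,),
    ).fetchall()
    root = ET.Element("classification", {"id": classification_id})
    for code, label in rows:
        item = ET.SubElement(root, "item", {"code": code})
        item.text = label
    return root


# Stand-in relational source with a tiny example classification
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE classification_item (classification_id TEXT, code TEXT, label TEXT)"
)
conn.executemany(
    "INSERT INTO classification_item VALUES (?, ?, ?)",
    [("gender", "1", "Male"), ("gender", "2", "Female")],
)

root = classification_to_xml(conn, "gender")
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

In a real setting the resulting document would then be stored in the XML database; that step is left out here because it depends on the database interface in use.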

Page 12:

Variable editor

For creating and maintaining the descriptions of statistical data and variables

At the testing phase

Implementation begins in 2010

Descriptions are saved as XML documents conforming to the CoSSI model in the eXist XML database

Page 13:

Content and functions of the variable editor

A data description consists of a general description of the data, a list of variables and information about each individual variable

The general data description includes descriptive information on the entire data document

The variable list interleaf allows management of the list of variables in the data description and selection of the variable whose description needs editing.

Page 14:

Variable list interleaf

Page 15:

Variable metadata

Field name              Description
short name              Short identifying name of variable
long name               Name of variable in natural language
concept definition      Basic conceptual description of variable
operational definition  Verbal description of the formation of the variable
deduction rule          E.g. programming instructions, mathematical formula, etc.
classification ID       Identifier of classification. Refers to a classification in the classification database.
unit of measure         Measurement unit of variable
variable modified       Date of creation or modification of variable (yyyy-mm-dd)
start of validity       Start date of validity of variable (yyyy-mm-dd)
end of validity         End date of validity of variable (yyyy-mm-dd)
status                  Stage of editing of variable: draft, ready, validated
variable group          Name of group to which variable belongs. Makes working with long variable lists easier.
work comment            Free text field. Contains information only for the use of the maintainer of a description.
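The fields listed on this slide map naturally onto a record type. The sketch below is one possible in-memory representation, not the variable editor's actual implementation: the class name `VariableDescription`, the choice of which fields are optional, the default status and the inclusive treatment of the validity end date are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Editing stages named on the slide
STATUSES = ("draft", "ready", "validated")


@dataclass
class VariableDescription:
    """Hypothetical record holding the variable metadata fields."""
    short_name: str
    long_name: str
    concept_definition: str = ""
    operational_definition: str = ""
    deduction_rule: str = ""
    classification_id: Optional[str] = None
    unit_of_measure: str = ""
    variable_modified: Optional[date] = None
    start_of_validity: Optional[date] = None
    end_of_validity: Optional[date] = None
    status: str = "draft"          # assumed default
    variable_group: str = ""
    work_comment: str = ""

    def __post_init__(self):
        # Reject editing stages other than the three listed on the slide
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")

    def is_valid_on(self, day: date) -> bool:
        """True if `day` is inside the validity period (unset bounds are open-ended)."""
        if self.start_of_validity and day < self.start_of_validity:
            return False
        if self.end_of_validity and day > self.end_of_validity:
            return False
        return True


# Example values are invented for illustration
income = VariableDescription(
    short_name="inc_tot",
    long_name="Total income",
    unit_of_measure="EUR",
    start_of_validity=date(2010, 1, 1),
    status="ready",
)
print(income.is_valid_on(date(2010, 3, 11)))
```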

Page 16:

Results from the variable editor project

In addition to the actual variable editor application, the project also created preconditions for:

the development of a consistent information architecture

the construction of production applications in which metadata need not be separately produced or manually added to data when publishing or archiving statistics

an information service where excessive time need not be spent on searching for metadata, or on reproducing metadata for special compilation assignments

a system from which table column and row headings can be retrieved in multiple languages in tabulation applications, for all statistics using the same methods
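The retrieval of table headings in multiple languages mentioned on this slide can be pictured with the minimal sketch below. The dictionary `HEADINGS`, the functions `heading` and `table_header`, the example variables and the fallback to English are all hypothetical; a real system would read the labels from the metadata store rather than from a literal in the code.

```python
# Hypothetical store: short variable name -> {language code -> heading text}
HEADINGS = {
    "region": {"fi": "Alue", "sv": "Område", "en": "Region"},
    "year": {"fi": "Vuosi", "sv": "År", "en": "Year"},
}


def heading(short_name: str, language: str, fallback: str = "en") -> str:
    """Return the heading in `language`, or in the fallback language if missing."""
    labels = HEADINGS[short_name]
    return labels.get(language, labels[fallback])


def table_header(columns, language):
    """Build the list of column headings for a table in one language."""
    return [heading(name, language) for name in columns]


print(table_header(["region", "year"], "sv"))
```

Because every tabulation application calls the same lookup, a heading corrected once in the store appears corrected in all statistics that use the variable.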

Page 17:

Experiences gained during the variable editor project

Various questions concerning standardisation had to be addressed in the project although they were not originally within the project's scope. They had to be resolved, and they took a lot of time

Because the variable editor project was the first step in the revision of the metadata system, it was subjected to a diversity of expectations

The project was a good test run for the CoSSI model: the data content of the model proved to be exhaustive

Page 18:

The planning and building of a classification editor

Reasons for renewing the classification system:

the present way of maintaining classifications has been viewed as inflexible by the statistics departments

the Sybase relational databases are being given up

ICT strategy: in the next few years the agency will introduce a common statistical metadata system based on the CoSSI model

Classification editor project 2010: 1) definition stage, 2) construction stage

Page 19:

Goals of the classification editor project

Analyse the service needs required from a centralised classification system

Create maintenance tools for classifications in connection with the CoSSI/eXist metadata store so that the basic maintenance needs of classifications of individual statistics are met in a user-oriented manner which also allows further development of the classification system

Produce the solutions with which the interoperability of the Sybase classification database and the eXist metadatabase can be ensured

Compile user instructions for the editor

Pilot test the editor

Page 20:

Benefits of the new classification system

A classification system that serves its users well will encourage centralised and structured maintenance of classifications

The documentation of classifications will improve, making them easy to find for use in-house and for the provision of information service

The new classification system will support smooth movement between data descriptions, variable descriptions and maintenance of classifications and thus improve the efficiency of the maintenance and use of classifications in statistics

Page 21:

General benefits of the common classification system

A centralised classification system eases the workload needed to maintain classifications because classifications are only maintained in one place

Reduces the possibility of errors because classifications are documented in the system consistently so that they are accessible to everybody and easy to find

Improves the efficiency of time use because working hours need not be spent on looking for classifications and trying to find their background information

Makes the classifications used in different statistics visible to everybody and thus creates possibilities for their harmonisation

Page 22:

In conclusion: Why do some statistics departments still have their own metadata systems instead of using the centralised system?

Centralised metadata work progresses too slowly from the perspective of individual statistics – We should rethink our construction and implementation strategy

The common attitude still regards the process of an individual set of statistics as unique, and therefore incapable of exploiting systems that are meant for all statistics – We have to get quick results to prove the benefits of the system

Commitment by management and their support for the work are crucial – We have to convince them

Page 23:

THANK YOU FOR YOUR ATTENTION!
