Valentine Charles: Linking cultural heritage with KOS: the Europeana example

Linking cultural heritage with KOS: the Europeana example
Valentine Charles
Evolution and variation of classification systems – KnowEscape, Amsterdam, 05.03.2015




Context

→ Aggregates metadata from the cultural heritage sector in Europe
  • Libraries, museums, archives and audio-visual archives
  • Metadata in 33 languages
→ Provides a portal for users to access data and objects
  • http://www.europeana.eu/ in 31 languages
  • Metadata under Creative Commons Zero - public domain
  • Previews and links to source
→ Data distributed via
  • API: http://labs.europeana.eu/api/
  • Linked Data (currently being updated): http://data.europeana.eu/

Europeana.eu, Europe’s cultural heritage portal: 40M objects from 2,200 galleries, museums, archives and libraries

Create a new data framework for richer metadata

→ Europeana Data Model (EDM)
  • Re-uses several existing Semantic Web-based models: Dublin Core, OAI-ORE, SKOS, CIDOC-CRM…
  • More granular metadata
  • Links, e.g. between objects and context entities (persons, places)
  • Multilingual & semantic linked data for contextual resources (e.g. Concepts)
→ EDM gives support for contextual resources (semantic layer)

Rely on KOS to solve a problem of data integration

→ Create a “semantic layer” on top of connected cultural heritage objects
  • Include multilingual “value vocabularies”
  • From Europeana’s providers or from third-party data sources

Contextual entities

Representing (real-world) entities related to a provided object as fully fledged resources, not just strings

edm:Agent: foaf:name, skos:altLabel, rdaGr2:biographicalInformation, rdaGr2:dateOfBirth
skos:Concept: skos:prefLabel, skos:altLabel, skos:broader, skos:related, skos:definition, …
edm:TimeSpan: skos:prefLabel, dcterms:isPartOf, edm:begin, edm:end, …
edm:Place: wgs84_pos:lat, wgs84_pos:long, skos:prefLabel, skos:note, dcterms:isPartOf, …

Encourage data providers to contribute their own vocabularies

→ Benefit from data links made at data providers’ level
→ Ingestion of vocabularies is possible if the vocabularies use the data structures EDM expects
  • For instance SKOS for concepts
→ For other vocabularies, Europeana does custom mappings

An example: the integration of AAT URIs in EDM

[Diagram]
edm:ProvidedCHO urn:imss:instrument:401058 (“Hourglass”)
  dc:type → skos:Concept http://vocab.getty.edu/aat/300198626
    skos:prefLabel: hourglasses@en, uurglazen@nl, reloj de las horas@es
    skos:broader → http://vocab.getty.edu/aat/300206197
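The diagram above can be read as a small set of RDF-style triples. The following is a minimal sketch in plain Python (no RDF library), assuming the link structure suggested by the slide: the object points to the AAT concept through dc:type, and the concept carries the multilingual labels and a skos:broader link. The helper names `objects` and `labels_for_object` are illustrative, not part of EDM or any Europeana tool.

```python
# Triples as (subject, predicate, object); literals are (text, language) pairs.
# Assumption: the directions of dc:type and skos:broader are read off the
# flattened slide diagram and may not match the original figure exactly.
CHO = "urn:imss:instrument:401058"
CONCEPT = "http://vocab.getty.edu/aat/300198626"
BROADER = "http://vocab.getty.edu/aat/300206197"

triples = [
    (CHO, "rdf:type", "edm:ProvidedCHO"),
    (CHO, "dc:type", CONCEPT),
    (CONCEPT, "rdf:type", "skos:Concept"),
    (CONCEPT, "skos:prefLabel", ("hourglasses", "en")),
    (CONCEPT, "skos:prefLabel", ("uurglazen", "nl")),
    (CONCEPT, "skos:prefLabel", ("reloj de las horas", "es")),
    (CONCEPT, "skos:broader", BROADER),
]


def objects(subject, predicate):
    """Return all objects of triples with the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def labels_for_object(cho, lang):
    """Follow dc:type from the object to its concepts; return labels in `lang`."""
    found = []
    for concept in objects(cho, "dc:type"):
        for text, label_lang in objects(concept, "skos:prefLabel"):
            if label_lang == lang:
                found.append(text)
    return found


print(labels_for_object(CHO, "nl"))  # ['uurglazen']
```

Because the object links to a concept URI rather than to the string "Hourglass", a Dutch or Spanish label becomes available for the same record without touching the provider's metadata.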

Demo with AAT and PartagePlus vocabularies

→ http://www.europeana.eu/portal/search.html?query=sabliers&rows=24&qf=PROVIDER%3A%22Museo+Galileo+-+Istituto+e+Museo+di+Storia+della+Scienza%22&qt=false

→ http://www.europeana.eu/portal/search.html?query=Brooch&rows=24&qf=PROVIDER%3A%22Partage+Plus%22&qt=false

Vocabularies currently supported by Europeana

Challenge #1

→ Europeana needs to regularly check that vocabularies have not changed at source:
  • Changes in concepts’ identifiers
  • Changes in the description of concepts (which would require a new mapping)
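One way to picture this check is a comparison between a cached snapshot of a vocabulary and a freshly retrieved one. The sketch below is an assumption for illustration, not Europeana's actual procedure: the function name `diff_vocabulary`, the snapshot shape (concept URI mapped to a description dictionary) and the example.org URIs are all hypothetical.

```python
def diff_vocabulary(cached, current):
    """Compare two snapshots that map concept URI -> description dict."""
    cached_uris, current_uris = set(cached), set(current)
    return {
        # Identifiers that vanished or appeared; a renamed identifier
        # shows up as one removal plus one addition.
        "removed": sorted(cached_uris - current_uris),
        "added": sorted(current_uris - cached_uris),
        # Same identifier, different description: may require a new mapping.
        "changed": sorted(
            uri for uri in cached_uris & current_uris
            if cached[uri] != current[uri]
        ),
    }


# Hypothetical snapshots; URIs and labels are placeholders, not real data.
cached = {
    "http://example.org/vocab/1": {"prefLabel": {"en": "brooch"}},
    "http://example.org/vocab/2": {"prefLabel": {"en": "vase"}},
}
current = {
    "http://example.org/vocab/1": {"prefLabel": {"en": "brooch", "fr": "broche"}},
    "http://example.org/vocab/3": {"prefLabel": {"en": "vase"}},
}

report = diff_vocabulary(cached, current)
print(report)
```

Here concept 2 has apparently been re-identified as concept 3, which a pure URI comparison can only report as a removal and an addition; deciding that they are the same concept still needs human or rule-based review.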

Challenge #2

→ Some of the vocabularies supported by Europeana have been developed by projects
  • Issue of sustainability: who maintains the vocabulary when the project ends? What happens to the data?

Europeana also manages its own vocabulary: WWI example

→ Europeana developed a series of domain-specific “sub-sites”
→ Europeana 1914-1918 (http://www.europeana1914-1918.eu/) developed its own vocabulary based on a subset of LCSH
  • Terms translated into 10 languages and linked to id.loc.gov
  • Published in SKOS via the OpenSKOS vocabulary service

http://data.europeana.eu/concept/loc/sh85148236

Challenge #3

→ Creation of caches of existing LOD vocabularies
  • Europeana needs to keep track of the updates at the vocabulary provider side.
→ The enrichment done on the Europeana side lives separately from the source vocabulary.

Multilingual Access to Subjects (MACS)

→ The MACS project has produced manual and semi-automatic alignments between:
  • Library of Congress Subject Headings (LCSH)
  • RAMEAU
  • Schlagwortnormdatei (SWD)
→ 120,000 links created
→ MACS is integrated in The European Library as links included in all bibliographic data.

An example of a MACS record before and after additions by The European Library:
  - ARK identifiers
  - LOD URIs

Enrichments added through MACS

The subject-enriched record in EDM for delivery to Europeana

Automatic enrichment based on KOS

Goal: Contextualization which goes beyond the scope of a particular platform

Object External Dataset and Vocabulary

Automatic enrichment process in Europeana

Analysis
  • Metadata fields in resource descriptions
  • Selection of potential rules to match

Linking
  • Matching the values of the metadata fields to values of the contextual resources
  • Adding contextual links

Augmentation
  • Selecting the values from the contextual resource
  • Augmentation of the index with the labels picked from the vocabulary
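The three phases named above (analysis, linking, augmentation) can be sketched as three small functions. This is a toy illustration under stated assumptions, not the Europeana enrichment framework: the in-memory `VOCABULARY`, the `FIELDS_TO_ENRICH` rule, the example.org URI and the exact-match strategy are all hypothetical.

```python
# Hypothetical in-memory vocabulary: concept URI -> labels by language.
VOCABULARY = {
    "http://example.org/concept/hourglass": {
        "en": ["hourglass", "hourglasses"],
        "fr": ["sablier", "sabliers"],
    },
}
# Hypothetical rule: which metadata fields are candidates for matching.
FIELDS_TO_ENRICH = ("dc:subject", "dc:type")


def analyse(record):
    """Analysis: select the (field, value) pairs that the rules apply to."""
    return [(f, v) for f in FIELDS_TO_ENRICH for v in record.get(f, [])]


def link(candidates):
    """Linking: match values to vocabulary labels (exact, case-insensitive)."""
    links = []
    for field, value in candidates:
        for uri, labels in VOCABULARY.items():
            all_labels = [l.lower() for ls in labels.values() for l in ls]
            if value.lower() in all_labels:
                links.append((field, uri))
    return links


def augment(record, links):
    """Augmentation: add the linked concepts' labels to the index entry."""
    index_terms = set()
    for _, uri in links:
        for ls in VOCABULARY[uri].values():
            index_terms.update(ls)
    return {"record": record, "links": links, "index_terms": sorted(index_terms)}


record = {"dc:title": ["Hourglass"], "dc:type": ["Hourglass"]}
enriched = augment(record, link(analyse(record)))
print(enriched["links"])
print(enriched["index_terms"])
```

After augmentation the index entry also carries the French labels, so a query in French can retrieve a record described only in English.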

Vocabulary selection requirements

In the context of Europeana, a target vocabulary should be:
→ Technically available (through Linked Data or in dedicated repositories), properly documented, and in open access
→ Well connected, e.g. equivalent elements in other vocabularies are indicated
  • Key to avoid duplication and redundancy
→ Multilingual

Enrichment Types and Vocabularies

Enrichment type   Target vocabulary   Source metadata fields
Places            GeoNames            dcterms:spatial, dc:coverage
Concepts          GEMET, DBpedia      dc:subject, dc:type
Agents            DBpedia             dc:creator, dc:contributor
Time              Semium Time         dc:date, dc:coverage, dcterms:temporal, edm:year
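The table above amounts to a small configuration that tells an enrichment process which vocabularies to try for which metadata field. A minimal sketch of that configuration in Python follows; the dictionary layout and the function name `enrichments_for_field` are assumptions made for illustration, while the vocabulary and field names are taken from the table.

```python
# Enrichment type -> target vocabularies and source metadata fields,
# transcribed from the table; the data structure itself is hypothetical.
ENRICHMENT_RULES = {
    "Places": {
        "targets": ["GeoNames"],
        "fields": ["dcterms:spatial", "dc:coverage"],
    },
    "Concepts": {
        "targets": ["GEMET", "DBpedia"],
        "fields": ["dc:subject", "dc:type"],
    },
    "Agents": {
        "targets": ["DBpedia"],
        "fields": ["dc:creator", "dc:contributor"],
    },
    "Time": {
        "targets": ["Semium Time"],
        "fields": ["dc:date", "dc:coverage", "dcterms:temporal", "edm:year"],
    },
}


def enrichments_for_field(field):
    """Return (enrichment type, target vocabularies) pairs that use `field`."""
    return [
        (kind, rule["targets"])
        for kind, rule in ENRICHMENT_RULES.items()
        if field in rule["fields"]
    ]


print(enrichments_for_field("dc:coverage"))
# [('Places', ['GeoNames']), ('Time', ['Semium Time'])]
```

Note that dc:coverage feeds both place and time enrichment, so a single field value may be matched against more than one vocabulary.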

Europeana enrichment: an example

Challenge #4

→ A significant change in the target vocabulary implies
  • An update of the retrieved RDF files and a new deployment of the enrichment framework (and/or)
  • An update of the enrichment rules

Challenge #5

→ Europeana data providers might also perform enrichment on their side
→ Europeana currently has no mechanism to separate the (curated) links to contextual resources made by data providers from (automatic) enrichments by providers.

Challenge #6

→ Automatic enrichment has flaws and problems
  • For instance linking any print to the physical “pressure” concept because of its German “Druck” alternative label.
→ Incorrect enrichments lead to
  • Devaluation of curated metadata
  • Loss of trust from providers
  • Irrelevant search results
  • Bad user experiences
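The “Druck” case can be reproduced with a naive label matcher. The sketch below is illustrative only: the concept record, its example.org URI and the `match` function are hypothetical, and restricting matches to preferred labels is shown as one possible conservative setting, not as the approach Europeana uses.

```python
# Hypothetical concept record for the physical notion of pressure.
PRESSURE = {
    "uri": "http://example.org/concept/pressure",  # placeholder URI
    "prefLabel": {"en": "pressure"},
    "altLabel": {"de": ["Druck"]},  # the alternative label named on the slide
}


def match(value, concept, use_alt_labels=True):
    """Naive exact, case-insensitive match of a metadata value to a concept."""
    labels = list(concept["prefLabel"].values())
    if use_alt_labels:
        for alts in concept["altLabel"].values():
            labels.extend(alts)
    return value.lower() in [label.lower() for label in labels]


# A German record describing a print ("Druck" also means "print").
record = {"dc:type": "Druck"}

print(match(record["dc:type"], PRESSURE))                        # True
print(match(record["dc:type"], PRESSURE, use_alt_labels=False))  # False
```

The first call links the print to the pressure concept because the string alone cannot distinguish the two senses of the word; the second avoids this particular error at the cost of missing legitimate matches on alternative labels.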

To conclude

→ Europeana continues to focus on pivot vocabularies such as Wikidata and Agrovoc to improve its search and retrieval services.
→ We are now investigating how to use more domain-specific vocabularies for dedicated services.
→ We also work on the definition of best practices and evaluation methods for enrichment
  • http://pro.europeana.eu/get-involved/europeana-tech/europeanatech-task-forces/evaluation-and-enrichments

Thank you

Valentine Charles

[email protected]
