Enriching Cultural Heritage Data with DBpedia

Antoine Isaac | DBpedia Community Meeting 2016

Upload: antoine-isaac

Post on 16-Apr-2017

TRANSCRIPT

Page 1: Enriching Cultural Heritage Data with DBpedia

Enriching Cultural Heritage Data with DBpedia

Antoine Isaac | DBpedia Community Meeting 2016

Netherlands, Public Domain

1660 - 1625, Rijksmuseum

Anonymous

Arrival of a Portuguese ship

Page 2: Enriching Cultural Heritage Data with DBpedia

Europeana?

Europeana Essentials | CC BY-SA

Enriching Cultural Heritage Data with DBpedia | CC BY-SA

Europeana Collections homepage | Europeana | CC BY-SA

Page 3: Enriching Cultural Heritage Data with DBpedia

Europeana aggregation infrastructure | Europeana | CC BY-SA

Europeana?

Page 4: Enriching Cultural Heritage Data with DBpedia

Europeana has many data challenges

We aggregate very heterogeneous metadata

• More than 48M objects

• 3,500 galleries, libraries, archives and museums

• 50 languages

• From all EU countries

• Level of quality varies greatly

Page 5: Enriching Cultural Heritage Data with DBpedia

Linked Open Data

Europeana Linked Open Data video on Vimeo | Europeana | CC BY-SA

Page 6: Enriching Cultural Heritage Data with DBpedia

Europeana Linked Data Strategy

Our efforts and lines of work

• The Europeana Data Model (EDM) offers a way to represent richer (linked) data

• We apply an enrichment strategy to link source data to reference data, including DBpedia

Will be discussed in Parallel Session 2:

• We encourage data providers to contribute links between objects and (their own) vocabularies

• We encourage alignment activities between domain vocabularies

Page 7: Enriching Cultural Heritage Data with DBpedia

The Europeana Data Model

Clavecin, Bartolomeo Cristofori | Cité de la Musique, MIMO - Musical Instruments Museums Online | CC BY-NC-SA

Europeana Data Model example | Europeana | CC BY-SA

Page 8: Enriching Cultural Heritage Data with DBpedia

Create a “semantic layer” on top of cultural heritage objects

Include multilingual “value vocabularies” (e.g. thesauri represented in SKOS) from Europeana’s providers or from third-party data sources
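A value vocabulary of this kind can be pictured as a set of concepts, each with one URI and labels in several languages. The sketch below is only an illustration of that idea: the `Concept` class, the example URI and the labels are assumptions made for this example and are not taken from EDM or from any Europeana dataset.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Concept:
    """A SKOS-like concept: one URI, preferred labels keyed by language tag."""
    uri: str
    pref_labels: Dict[str, str] = field(default_factory=dict)

    def label(self, lang: str, fallback: str = "en") -> Optional[str]:
        # Prefer the requested language, then the fallback language.
        return self.pref_labels.get(lang) or self.pref_labels.get(fallback)


# Hypothetical concept and labels, for illustration only.
harpsichord = Concept(
    uri="http://example.org/vocab/harpsichord",
    pref_labels={"en": "harpsichord", "fr": "clavecin", "de": "Cembalo"},
)

print(harpsichord.label("fr"))  # label in the requested language
print(harpsichord.label("nl"))  # no Dutch label here, so the English one is used
```

Because objects point at the concept URI rather than at a string, every object linked to the concept can be shown or searched with any of its labels.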

Page 9: Enriching Cultural Heritage Data with DBpedia

Semantic enrichment, a solution for better quality data?

Automatic and manual enrichment are increasingly used in digital libraries to:

• normalise data

• “standardize data” by linking it to authority resources

• improve multilingual coverage in datasets

• contextualise resources
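The first two goals in the list, normalising values and linking them to authority resources, can be sketched in a few lines. This is a minimal sketch under assumptions: the `AUTHORITY` table and its URIs are hypothetical, and the normalisation steps (accent stripping, case folding, whitespace collapsing) are one common choice, not necessarily what Europeana applies.

```python
import unicodedata
from typing import Optional


def normalise(value: str) -> str:
    """Strip accents, fold case and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", value)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_marks.casefold().split())


# Hypothetical authority file: normalised label -> authority URI.
AUTHORITY = {
    "paris": "http://example.org/place/paris",
    "cite de la musique": "http://example.org/org/cite-de-la-musique",
}


def link_to_authority(value: str) -> Optional[str]:
    """Return the authority URI for a raw metadata value, if one matches."""
    return AUTHORITY.get(normalise(value))


print(link_to_authority("  PARIS "))
print(link_to_authority("Cité  de la Musique"))
print(link_to_authority("Lyon"))  # no entry in the table, so None
```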

Page 10: Enriching Cultural Heritage Data with DBpedia

The main components of semantic enrichment

Source: objects whose metadata is being enriched

Target: set of resources used to enrich the source metadata. Targets can be of different types, from simple uncontrolled strings to resources published as LOD

Rules: specify how the enrichment between the source and target should be executed
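The three components (source, target, rules) map naturally onto a small data structure. The following is a hedged sketch: the `Rule` class, the `exact_match` function and the example URI are hypothetical names introduced for illustration, and exact label matching stands in for whatever matching logic a real tool uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """Says which source field to read and how to compare it to a target label."""
    source_field: str
    match: Callable[[str, str], bool]


def exact_match(value: str, label: str) -> bool:
    # Case-insensitive comparison after trimming whitespace.
    return value.strip().casefold() == label.strip().casefold()


def apply_rule(rule: Rule, source: Dict[str, List[str]],
               target: Dict[str, str]) -> List[str]:
    """Return the URIs of target resources matched by the rule."""
    links = []
    for value in source.get(rule.source_field, []):
        for uri, label in target.items():
            if rule.match(value, label):
                links.append(uri)
    return links


# Source: one object's metadata. Target: hypothetical URI -> label table.
source = {"dc:creator": ["Bartolomeo Cristofori "], "dc:title": ["Clavecin"]}
target = {"http://example.org/agent/cristofori": "Bartolomeo Cristofori"}

creator_rule = Rule(source_field="dc:creator", match=exact_match)
print(apply_rule(creator_rule, source, target))
```

Keeping the rule separate from the data makes it possible to swap in a different matching function or a different field without touching the source or the target.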

Page 11: Enriching Cultural Heritage Data with DBpedia

Automatic enrichment process in Europeana

Analysis

• selection of metadata fields in descriptions

• selection of potential rules to match

Linking

• matching the values of the metadata fields to values of the contextual resources

• adding contextual links

Augmentation of search index

• selection of values from the contextual resource

• values go into the search index
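The three stages named on this slide (analysis, linking, augmentation of the search index) can be read as a small pipeline. The sketch below assumes a hypothetical in-memory contextual resource (`CONTEXT`), a fixed field selection (`FIELDS`) and exact label matching; none of these names come from Europeana's actual implementation.

```python
from typing import Dict, List, Tuple

# Analysis step: fields selected for enrichment (an assumed selection).
FIELDS = ["dc:subject", "dc:type"]

# Hypothetical contextual resources: URI -> labels keyed by language.
CONTEXT = {
    "http://example.org/concept/harpsichord": {
        "en": "harpsichord",
        "fr": "clavecin",
    },
}


def analyse(record: Dict) -> List[Tuple[str, str]]:
    """Analysis: collect (field, value) pairs from the selected fields."""
    return [(f, v) for f in FIELDS for v in record.get(f, [])]


def link(values: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Linking: match values against labels and emit (field, URI) links."""
    links = []
    for field_name, value in values:
        for uri, labels in CONTEXT.items():
            if value.casefold() in {l.casefold() for l in labels.values()}:
                links.append((field_name, uri))
    return links


def augment(record: Dict, links: List[Tuple[str, str]]) -> Dict:
    """Augmentation: copy labels of linked resources into the index document."""
    index_doc = {"id": record["id"], "links": [], "labels": []}
    for _, uri in links:
        index_doc["links"].append(uri)
        index_doc["labels"].extend(sorted(CONTEXT[uri].values()))
    return index_doc


record = {"id": "obj-1", "dc:subject": ["Clavecin"], "dc:title": ["Untitled"]}
doc = augment(record, link(analyse(record)))
print(doc)
```

After augmentation, the object described only as "Clavecin" also carries the English label, so in this sketch a search for "harpsichord" could find it.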

Page 12: Enriching Cultural Heritage Data with DBpedia

Page 13: Enriching Cultural Heritage Data with DBpedia

Vocabularies we currently enrich metadata with

Entity class | Target vocabulary | Size | Metadata fields subject to enrichment

Places | GeoNames | 140,097 | dcterms:spatial, dc:coverage

Concepts | DBpedia | 5,284 | dc:subject, dc:type

         | GEMET | 280 |

Agents | DBpedia | 161,209 | dc:creator, dc:contributor

Time | Semium Time | 2,566 | dc:coverage, dcterms:temporal, dc:date, edm:year
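The table above can be expressed as a small configuration that answers the question "which vocabularies apply to this field?". This is a sketch only: the structure and the `targets_for` helper are hypothetical, and the GEMET entry assumes that GEMET shares the fields of the Concepts row, which the table layout suggests but does not state.

```python
from typing import List, Tuple

ENRICHMENT_TARGETS = [
    {"entity_class": "Places", "vocabulary": "GeoNames",
     "fields": ["dcterms:spatial", "dc:coverage"]},
    {"entity_class": "Concepts", "vocabulary": "DBpedia",
     "fields": ["dc:subject", "dc:type"]},
    # Assumption: GEMET is applied to the same fields as the DBpedia concepts.
    {"entity_class": "Concepts", "vocabulary": "GEMET",
     "fields": ["dc:subject", "dc:type"]},
    {"entity_class": "Agents", "vocabulary": "DBpedia",
     "fields": ["dc:creator", "dc:contributor"]},
    {"entity_class": "Time", "vocabulary": "Semium Time",
     "fields": ["dc:coverage", "dcterms:temporal", "dc:date", "edm:year"]},
]


def targets_for(field_name: str) -> List[Tuple[str, str]]:
    """Return (entity class, vocabulary) pairs that apply to a metadata field."""
    return [
        (t["entity_class"], t["vocabulary"])
        for t in ENRICHMENT_TARGETS
        if field_name in t["fields"]
    ]


print(targets_for("dc:coverage"))
print(targets_for("dc:creator"))
```

Note that `dc:coverage` appears under both Places and Time, so a value in that field may be matched against two different vocabularies.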

Page 14: Enriching Cultural Heritage Data with DBpedia

Why DBpedia?

Building an ecosystem of networked references

• It offers labels in about 124 languages through all its language editions, of which 48 match the languages that Europeana supports

• It gives fairly complete and accurate descriptive metadata about entities

• It works well as a “pivot” vocabulary, providing further links to other vocabularies such as Wikidata and Freebase
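Two of these points, keeping only the labels in supported languages and using an entity as a pivot to other vocabularies, can be sketched as follows. Everything in the example is hypothetical: the entity, its URIs, the `same_as` list and the set of supported languages are placeholders, not real DBpedia or Europeana data.

```python
from typing import Dict, List, Set

# Hypothetical DBpedia-style entity with labels and equivalence links.
ENTITY = {
    "uri": "http://example.org/dbpedia/Harpsichord",
    "labels": {"en": "Harpsichord", "fr": "Clavecin", "ja": "チェンバロ"},
    "same_as": [
        "http://example.org/wikidata/Q-harpsichord",
        "http://example.org/freebase/m-harpsichord",
    ],
}

# Assumed subset of languages the portal supports.
SUPPORTED_LANGUAGES = {"en", "fr", "de"}


def usable_labels(entity: Dict, supported: Set[str]) -> Dict[str, str]:
    """Keep only the labels whose language the portal supports."""
    return {
        lang: label
        for lang, label in entity["labels"].items()
        if lang in supported
    }


def pivot(entity: Dict, namespace: str) -> List[str]:
    """Follow equivalence links into another vocabulary's namespace."""
    return [uri for uri in entity["same_as"] if uri.startswith(namespace)]


print(usable_labels(ENTITY, SUPPORTED_LANGUAGES))
print(pivot(ENTITY, "http://example.org/wikidata/"))
```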

Page 15: Enriching Cultural Heritage Data with DBpedia

Not everything is perfect

France, Public Domain

1921, National Library of France

Agence de presse Meurisse

Colombes : championnats de France d’Athlétisme : rivière, le speaker

Page 16: Enriching Cultural Heritage Data with DBpedia

Challenges of multilingual automatic enrichment

Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments

Poisonous India or the Importance of a Semantic and Multilingual Enrichment Strategy. Marlies Olensky, Juliane Stiller, Evelyn Dröge, MTSR 2012. http://link.springer.com/chapter/10.1007%2F978-3-642-35233-1_25

Page 17: Enriching Cultural Heritage Data with DBpedia

Comparative evaluation of enrichments

We ran a quantitative evaluation on a sample set enriched by 7 different tools (settings)

http://pro.europeana.eu/taskforce/evaluation-and-enrichments
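The slide does not say which measures the evaluation used. As one plausible illustration, the sketch below computes per-tool precision from manual correctness judgements on a shared sample. The tool names, object identifiers, URIs and judgements are all invented for the example and do not reflect the task force's data or results.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical judgements: (tool, object id, target URI, judged correct?).
JUDGEMENTS = [
    ("tool-a", "obj-1", "http://example.org/c/1", True),
    ("tool-a", "obj-2", "http://example.org/c/2", False),
    ("tool-b", "obj-1", "http://example.org/c/1", True),
    ("tool-b", "obj-3", "http://example.org/c/3", True),
]


def precision_by_tool(
    judgements: List[Tuple[str, str, str, bool]]
) -> Dict[str, float]:
    """Share of each tool's enrichments that were judged correct."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for tool, _object_id, _uri, is_correct in judgements:
        total[tool] += 1
        correct[tool] += int(is_correct)
    return {tool: correct[tool] / total[tool] for tool in total}


print(precision_by_tool(JUDGEMENTS))
```

Precision alone says nothing about enrichments a tool missed, so a fuller comparison would also need some estimate of coverage on the same sample.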

Page 18: Enriching Cultural Heritage Data with DBpedia

Examples of recommendations that will be explored

Define your enrichment goals

• Develop better criteria for evaluating enrichment

Choose the right service

• enrichment tool more aware of the semantics of the model

Monitor your enrichment process and re-assess

• target dataset could be richer: new terms, new languages, more granular

Enrichment using a better reference for contextual entities?

You will hear about this in the next session ☺

Page 19: Enriching Cultural Heritage Data with DBpedia

With slides from Valentine Charles, Juliane Stiller, Hugo Manguinhas and Stefan Gradmann