Do MORe with your data

LoCloud Final Conference, 5th February 2016

Dr. Dimitris Gavrilis, Digital Curation Unit - IMIS, Athena Research Center

LoCloud is funded by the European Commission's ICT Policy Support Programme

MORe Architecture

• Key characteristics:
– Fault-tolerance
– High-availability
– Elasticity
– Scalability

• Key components:
– Storage layer
– Decentralized & scalable services
– Pluggable services

Micro-service architecture

[Architecture diagram; labelled components, grouped by their order in the figure:]

• Input sources: OAI-PMH, MINT mapping tool, File-Upload
• Data access layer and storage nodes
• Core services layer: Input service mgmt, Validation service mgmt, Enrichment service mgmt, Publish serv. mgmt
• Validation micro-services: Structure, Schema, Linking, Schematron rules
• Enrichment micro-services: Language identification, Thesauri collections, Vocabulary matching, Background links, Geo normalization, Geo coding, Reverse geo-coding, Historic place names
• Publish services: Archive, Elastic Search, RDF Store, OAI-PMH, Omeka, Wikimedia, LoCloud collections

Enrichment micro-services

• 14 enrichment services so far:
– Thematic
– Spatial
– Temporal
– Other

Distributed

• Enrichment services run in:
– Austria
– Spain
– Greece
– Lithuania
– Slovenia
– Norway

Validation

• Validation schemes
– Flexibility

• Schematron rule-based validation
– No more rejected packages
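A minimal sketch of the "report instead of reject" idea behind rule-based validation. This is not a Schematron engine and not MORe's implementation: the `RULES` list, the rule ids and the sample record are assumptions made up for illustration, and each rule only checks that an element is present and non-empty.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical rules: (rule id, element tag, human-readable message).
RULES = [
    ("has-title", DC + "title", "record should have a dc:title"),
    ("has-language", DC + "language", "record should have a dc:language"),
]

def validate(record_xml):
    """Return the messages of all failed rules; an empty list means every rule passed."""
    root = ET.fromstring(record_xml)
    failures = []
    for rule_id, tag, message in RULES:
        # A rule passes when at least one matching element has non-empty text.
        if not any((el.text or "").strip() for el in root.iter(tag)):
            failures.append(f"{rule_id}: {message}")
    return failures

SAMPLE = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Old bridge</dc:title>
</record>"""

# The record is kept and annotated with its failures rather than rejected.
report = validate(SAMPLE)
print(report)
```

Because `validate` returns a report rather than raising, a package with failing records can still move on to enrichment while its problems stay visible.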

Metadata Quality

• Get completeness graphs for every:
– package
– schema
– element
– mandatory/recommended set
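As a rough illustration of what a completeness figure per element could be, the sketch below computes the fraction of records in a package that carry each element. The `MANDATORY` and `RECOMMENDED` sets and the dictionary-based records are assumptions for the example; MORe's actual metric is not described in these slides.

```python
from collections import Counter

# Hypothetical mandatory and recommended element sets, for illustration only.
MANDATORY = ["title", "rights"]
RECOMMENDED = ["description", "language"]

def completeness(records, elements):
    """Return {element: fraction of records where the element is present and non-empty}."""
    counts = Counter()
    for record in records:
        for element in elements:
            if record.get(element):
                counts[element] += 1
    total = len(records)
    return {e: (counts[e] / total if total else 0.0) for e in elements}

package = [
    {"title": "Old bridge", "rights": "CC0", "language": "en"},
    {"title": "Town hall", "description": "Built in 1901"},
]

print(completeness(package, MANDATORY))
print(completeness(package, RECOMMENDED))
```

The resulting dictionaries are the kind of data a per-element bar chart could be drawn from.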

Metadata Quality

• On-the-fly indexing, analysis and intuitive presentation of:
– Thematic information
– Spatial information
– Temporal information


Preview

Publication

• Publish your enriched data to:
– Europeana
– An RDF store as LOD
– Elastic Search
– A zip archive for download

• Publish to multiple targets simultaneously
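One way to picture publishing to several targets at once is a thread pool that runs one publisher per target. The publisher functions below are stand-ins invented for the example and do not talk to Europeana, an RDF store or Elastic Search.

```python
from concurrent.futures import ThreadPoolExecutor

def publish_to_all(records, targets):
    """Run every publisher concurrently; return {target name: result or error text}."""
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        futures = {name: pool.submit(fn, records) for name, fn in targets.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:
                # One failing target should not block the others.
                results[name] = f"failed: {exc}"
    return results

# Stand-in publishers (hypothetical); real ones would call the target systems.
def to_zip(records):
    return f"zipped {len(records)} records"

def to_index(records):
    return f"indexed {len(records)} records"

def broken(records):
    raise RuntimeError("target unreachable")

print(publish_to_all(["r1", "r2"], {"zip": to_zip, "index": to_index, "bad": broken}))
```

Collecting errors per target keeps a single unreachable endpoint from hiding the successful publications.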


Enrichment micro-services

Place names

• We have our own Geo-names server

Periods

• We have our own PeriodO database

• We have access to over 30 thesauri

AIT (Angewandte Informationstechnik Forschungsgesellschaft mbH)

Author | Name of vocabulary
University of California, Santa Barbara | Alexandria Digital Library Feature Type Thesaurus
Royal Commission on the Ancient and Historical Monuments of Scotland (RCAHMS) | Archeological Objects Thesaurus Scotland
English Heritage | Archeological Sciences Thesaurus
English Heritage | Building Materials Thesaurus
English Heritage | Components Thesaurus
American Folklore Society | Ethnographic Thesaurus
English Heritage | Event Type Thesaurus
English Heritage | Evidence Thesaurus
English Heritage | FISH Archeological Objects Thesaurus
Eionet European Environment Information and Observation Network | General Multilingual Environmental Thesaurus GEMET
Federation Internationale des Archives du Film (FIAF) | General Subject headings for Film Archives
The Discovery Programme | Irish Monuments
The Discovery Programme | Irish Periods
Royal Commission on the Ancient and Historical Monuments of Scotland (RCAHMS) | Maritime Craft Thesaurus Scotland
English Heritage | Maritime Craft Type Thesaurus
English Heritage and Royal Commission on the Historical Monuments of England | MDA Archaeological Objects Thesaurus
Royal Commission on the Ancient and Historical Monuments of Wales (RCAHMW) | Monument Thesaurus Wales
Royal Commission on the Ancient and Historical Monuments of Scotland (RCAHMS) | Monument Type Thesaurus
English Heritage | Period Thesaurus
Royal Commission on the Ancient and Historical Monuments of Wales (RCAHMW) | Period Thesaurus Wales
Bibliographic Standards Committee of the Rare Books and Manuscripts Section (ACRL/ALA) | Relator Terms for Use in Rare Book and Special Collections Cataloguing
Universidad de León | Tesauro de Ciencias de la Documentación
Library of Congress. Prints and Photographs Division | Thesaurus for Graphic Materials 1: Subject Terms
Library of Congress. Prints and Photographs Division | Thesaurus for Graphic Materials 2: Genre and Physical Characteristic Terms
Ministero per i Beni e le Attività Culturali | Thesaurus PICO 4.1
UKAT | UK Archival Thesaurus (UKAT)
UNESCO | UNESCO thesaurus

Thesauri mappings

• Map your subject terms to standardized concepts from SKOSified vocabularies:
– AAT
– Perio.do
– …
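A mapping of free-text subject terms to vocabulary concepts can be pictured as a label lookup. The sketch below assumes a tiny made-up vocabulary with `example.org` URIs and exact, case-insensitive matching on preferred and alternative labels; real SKOS vocabularies and MORe's matching logic are richer than this.

```python
# Hypothetical mini-vocabulary: concept URI -> labels (preferred label first,
# then alternative labels). The URIs are placeholders, not real AAT identifiers.
VOCAB = {
    "http://example.org/concept/castle": ["castle", "fortress"],
    "http://example.org/concept/bridge": ["bridge"],
}

def build_index(vocab):
    """Index every label (case-insensitively) by the URI of its concept."""
    index = {}
    for uri, labels in vocab.items():
        for label in labels:
            # setdefault keeps the first concept seen if two concepts share a label.
            index.setdefault(label.casefold(), uri)
    return index

def map_terms(terms, index):
    """Return {term: concept URI, or None when no label matches}."""
    return {term: index.get(term.strip().casefold()) for term in terms}

mapping = map_terms(["Castle", "FORTRESS", "windmill"], build_index(VOCAB))
print(mapping)
```

Unmatched terms come back as `None`, so they can be listed for manual mapping rather than silently dropped.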

Subject collections

• Subject collections showcase
– Publicly available subject collections

• Seamless integration with MORe
– Autocomplete search of terms within a thesaurus

• Targeted enrichment based on item-level subject terms

Metadata Enrichment

• Automatic enrichment of content with entries from:
– Wikipedia
– DBpedia
– SKOSified thesauri

UPV/EHU – Universidad del País Vasco

Developers & Creative Industries: API Integration

• The MORe API lets you run the entire aggregation engine through REST

• Developers area:
– API key generation
– API documentation with examples
– Example Java projects for the NetBeans & Eclipse IDEs
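To give a feel for key-authenticated REST calls, here is a sketch that only builds a request and does not send it. The base URL, the `packages/42/enrich` path, the `X-API-Key` header name and the payload shape are all assumptions for illustration; the real endpoints and authentication scheme are defined in the MORe API documentation.

```python
import json
import urllib.request

BASE_URL = "https://more.example.org/api"  # hypothetical base URL

def build_request(path, api_key, payload=None):
    """Build (but do not send) a JSON request carrying an API key."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(
        url=f"{BASE_URL}/{path.lstrip('/')}",
        data=data,
        method="POST" if data is not None else "GET",
    )
    request.add_header("Accept", "application/json")
    request.add_header("X-API-Key", api_key)  # hypothetical header name
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request

# Hypothetical call: ask for one enrichment service to run on one package.
req = build_request("packages/42/enrich", "my-key", {"services": ["geo-coding"]})
print(req.get_method(), req.full_url)
# Sending would be: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes the call easy to inspect and test without network access.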

Developers & Creative Industries: Plugins

• Developers can create their own enrichment micro-services on their own servers and integrate them into the enrichment process of MORe.

• To do so, developers implement a REST-based interface and declare it as an enrichment micro-service in MORe.
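A third-party enrichment micro-service could look roughly like the sketch below: an HTTP endpoint that accepts a record as JSON and returns the enriched record. The JSON contract, the `enrich` logic and the `make_server` helper are assumptions for illustration; the interface MORe actually expects is not specified in these slides.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def enrich(record):
    """Toy enrichment (placeholder logic): add a trimmed, upper-cased copy of the title."""
    enriched = dict(record)
    if "title" in record:
        enriched["title_normalized"] = record["title"].strip().upper()
    return enriched

class EnrichmentHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            record = json.loads(self.rfile.read(length) or b"{}")
            status, body = 200, enrich(record)
        except (ValueError, TypeError, AttributeError):
            # Malformed JSON, or a body that is not a JSON object with string values.
            status, body = 400, {"error": "request body must be a JSON object"}
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet

def make_server(port=0):
    """Create the server on localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), EnrichmentHandler)

# To run the service: make_server(8080).serve_forever()
```

The enrichment logic lives in a plain function, so it can be tested without starting the server.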

MORe success stories

• 10 more projects are using/evaluating MORe
– ARIADNE chose MORe as its primary aggregator: over 1 million records have been aggregated and published to the ARIADNE portal
– RDA DDRI WG uses MORe

• Zero downtime
• Zero data loss
• New metadata schemas have been integrated
• New enrichment services have been developed / integrated

Thank you

[email protected]

Funding

LoCloud is funded by the European Commission's ICT Policy Support Programme

The views and opinions expressed in this presentation are the sole responsibility of the authors and do not necessarily reflect the views of the European Commission.


Native record (OAI_DC)

EDM Record

• Missing language attributes
• Place label is a concatenated string of coordinates

Enriched EDM Record

Enrichment Plan:
– Language identification
– Vocabulary matching
– Geo-normalization
– Geo-coding
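As an illustration of what a geo-normalization step might do with a place label that is really a coordinate string, the sketch below turns such a label into numeric latitude and longitude. The "latitude, longitude" decimal format is an assumption; the slides do not show the actual label format or MORe's normalization rules.

```python
import re

# Assumed label format: two decimal numbers separated by a comma, semicolon or space.
COORD_PAIR = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*[,; ]\s*(-?\d+(?:\.\d+)?)\s*$")

def normalize_place_label(label):
    """Return {'lat': ..., 'long': ...} for a coordinate-string label, else None."""
    match = COORD_PAIR.match(label)
    if not match:
        return None
    lat, lon = float(match.group(1)), float(match.group(2))
    # Discard values outside the valid latitude/longitude ranges.
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None
    return {"lat": lat, "long": lon}

print(normalize_place_label("37.97, 23.72"))
print(normalize_place_label("Athens"))
```

Labels that are not coordinate pairs return `None`, leaving them for a separate step such as geo-coding by name.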