On Libraries & Linked Data
Antoine Isaac
UB Utrecht, April 6, 2011
Who am I?
• Europeana
• Web & Media Lab, Vrije Universiteit Amsterdam
• W3C Library Linked Data group
• (2006-2009) W3C Semantic Web Deployment group
  – SKOS
Demo
Following one’s nose to subject heading lists as linked data
• American LCSH: http://id.loc.gov/authorities/sh85145447#concept
• French RAMEAU: http://stitch.cs.vu.nl/vocabularies/rameau/ark:/12148/cb11931913j
• German SWD: http://d-nb.info/gnd/4064689-0
• Agrovoc: http://aims.fao.org/aos/agrovoc/c_8309
• STW: http://zbw.eu/stw/descriptor/14188-0
• Further on to DBPedia: http://dbpedia.org/resource/Water
Demo (fallback option)
Subject heading lists as SKOS linked data
• American LCSH: http://id.loc.gov
• French RAMEAU: http://stitch.cs.vu.nl/rameau
• German SWD: http://d-nb.info/gnd/
• mapped using manual links from the MACS project: http://macs.cenl.org
Starting from http://id.loc.gov/authorities/sh85014310#concept
Linked Data?
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names
3. When someone looks up a URI, provide useful information using standards (RDF, SPARQL)
4. Include links to other URIs, so that they can discover more things
Tim Berners-Lee, http://linkeddata.org/
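Principles 2 and 3 can be illustrated with a small sketch of how a client might prepare the lookup of one of the demo URIs. It only builds the HTTP request and does not send it. Asking for "application/rdf+xml" through the Accept header is an assumption about what a given server offers; the helper name build_lookup_request is made up for this example.

```python
from urllib.parse import urldefrag
from urllib.request import Request


def build_lookup_request(uri, accept="application/rdf+xml"):
    """Prepare an HTTP GET for a linked data URI (the request is not sent)."""
    # The fragment (e.g. "#concept") is never sent to the server,
    # so the lookup targets the document part of the URI.
    document_url, _fragment = urldefrag(uri)
    # The Accept header asks for an RDF serialization instead of an HTML page.
    return Request(document_url, headers={"Accept": accept})


req = build_lookup_request("http://id.loc.gov/authorities/sh85145447#concept")
print(req.full_url)
print(req.get_header("Accept"))
```

Passing the prepared request to urllib.request.urlopen would perform the actual lookup; whether RDF comes back depends on the server.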
(Linked) Data Representation
• That subject heading data follows a link-intensive data model
  – Uniform Resource Identifiers (URI)
  – Resource Description Framework (RDF)
(Linked) Data Representation
• Use more-or-less the same standard vocabulary
  – Simple Knowledge Organization System (SKOS): http://www.w3.org/2004/02/skos/
  – For representing thesauri, classifications, etc. on the Semantic Web
A SKOS graph
animals
cats
  UF domestic cats
  RT wildcats
  BT animals
  SN used only for domestic cats
domestic cats
  USE cats
wildcats
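As a rough sketch, the thesaurus entry above could be written as SKOS statements in the following way. Plain Python tuples stand in for RDF triples, and the http://example.org/thesaurus/ namespace is hypothetical. The mapping of UF/USE to skos:altLabel, RT to skos:related, BT to skos:broader and SN to skos:scopeNote follows the usual SKOS reading of these thesaurus codes.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/thesaurus/"  # hypothetical namespace for this sketch

# Each tuple is (subject, predicate, object), like an RDF triple.
triples = {
    (EX + "animals", SKOS + "prefLabel", "animals"),
    (EX + "wildcats", SKOS + "prefLabel", "wildcats"),
    (EX + "cats", SKOS + "prefLabel", "cats"),
    (EX + "cats", SKOS + "altLabel", "domestic cats"),   # UF / USE
    (EX + "cats", SKOS + "related", EX + "wildcats"),    # RT
    (EX + "cats", SKOS + "broader", EX + "animals"),     # BT
    (EX + "cats", SKOS + "scopeNote", "used only for domestic cats"),  # SN
}


def objects(subject, predicate):
    """All values of a property for one concept, sorted for stable output."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def resolve_label(label):
    """Find the concept a preferred or alternative label points to."""
    label_properties = {SKOS + "prefLabel", SKOS + "altLabel"}
    for s, p, o in sorted(triples):
        if p in label_properties and o == label:
            return s
    return None


print(objects(EX + "cats", SKOS + "broader"))
print(resolve_label("domestic cats"))
```

Note that "domestic cats" is not a concept of its own here: the USE reference becomes an alternative label on the "cats" concept.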
SKOS mappings
SKOS provides conceptual links to bridge across different contexts
KOS 1: animals, cats, wildcats
KOS 2: animal, human, object
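A minimal sketch of how such mapping links might be used to move from KOS 1 to KOS 2. The transcript does not say which concepts the slide actually links, so the three mappings below are assumptions, as are the example.org namespaces. skos:exactMatch and skos:broadMatch are standard SKOS mapping properties.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
KOS1 = "http://example.org/kos1/"  # hypothetical namespace
KOS2 = "http://example.org/kos2/"  # hypothetical namespace

# Assumed mappings, for illustration only.
mappings = [
    (KOS1 + "animals", SKOS + "exactMatch", KOS2 + "animal"),
    (KOS1 + "cats", SKOS + "broadMatch", KOS2 + "animal"),
    (KOS1 + "wildcats", SKOS + "broadMatch", KOS2 + "animal"),
]


def translate(concept, relations=("exactMatch",)):
    """Concepts in the other KOS reachable through the given mapping relations."""
    wanted = {SKOS + name for name in relations}
    return [o for s, p, o in mappings if s == concept and p in wanted]


print(translate(KOS1 + "animals"))
print(translate(KOS1 + "cats"))
print(translate(KOS1 + "cats", ("exactMatch", "broadMatch")))
```

Keeping the relation type explicit lets an application decide how loose a match it accepts: exact equivalents only, or broader targets as well.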
Links in the data
Growing interest for linked data in the library community
Linked Library Cloud beginning 2008
[Ross Singer, Code4Lib2010]
http://code4lib.org/conference/2010/singer
Linked Library “sector” in 2010
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
Libraries and LD, the perfect match?
• Libraries have been producing (meta)data for ages
• Libraries (often) produce high-quality metadata
Libraries and LD, the perfect match?
• Library metadata was locked in record silos
• But it maintains links to the outside world
  – Bibliographic and web references
  – Shared vocabularies
  – Same books!
Libraries and LD, the perfect match?
LD is about
• Citing objects
• Linking to them
• Re-using data
Think of web-native union catalogues
Johan Stapel, Koninklijke Bibliotheek (now bibliotheek.nl)
A vision for the Dutch National Library
A web of cultural heritage data?
The current portal
Towards semantic search: facets
• Building a search engine on top of metadata is difficult
  – Intrinsic quality problems: correctness, coverage
• Especially when data is so heterogeneous
  – 100s of formats
  – From flat 5-field records to 100-node XML trees
  – Language issue!
• We currently use a simple, flat interoperability format
  – A quick win that is quickly showing its limits
• We can make better use of institutions’ original metadata
  – Accommodate their different practices
  – Data structures and semantics
• Access objects via a semantic layer of vocabularies for subjects, persons, places…
Semantic ThoughtLab: experimenting with solutions
Towards semantics-enabled search
Building a "semantic layer" to help access content
Towards semantics-enabled search
• Enhance access to Europeana content by semantics
  – Query expansion, clustering of results
• Exploiting various types of relations
  – "located in", "lived in", "is more specific concept"…
• Semantics are already there, in metadata and "controlled vocabularies" used in metadata
  – Thesauri, classifications…
• Requires making it properly machine-accessible
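Query expansion over "is more specific concept" relations could look roughly like the sketch below. The Egypte/Giza link is the Rijksmuseum example used elsewhere in this talk; the "Giza pyramids" level and the three records are hypothetical and only there to show the mechanism. This is not the Europeana implementation.

```python
from collections import deque

# "narrower" links between place concepts.
narrower = {
    "Egypte": ["Giza"],          # example from this talk
    "Giza": ["Giza pyramids"],   # hypothetical extra level
}

# Hypothetical object records with a controlled place value.
records = [
    {"id": "r1", "place": "Giza"},
    {"id": "r2", "place": "Egypte"},
    {"id": "r3", "place": "Paris"},
]


def expand_query(term):
    """Return the term followed by all of its narrower concepts, breadth-first."""
    seen = [term]
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in narrower.get(current, []):
            if child not in seen:
                seen.append(child)
                queue.append(child)
    return seen


def search(term):
    """Ids of records whose place is the term or one of its narrower concepts."""
    terms = set(expand_query(term))
    return [r["id"] for r in records if r["place"] in terms]


print(expand_query("Egypte"))
print(search("Egypte"))
```

A search for "Egypte" then also returns the object described with "Giza", which a plain string match would miss.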
Europeana Data Model
Trying to evolve towards RDF and Linked Data
• Representing objects, persons, places, etc. as resources
• Linking and re-using external sources
• (Re-using) richer data modeling features
  – SKOS, CIDOC-CRM, OAI-ORE
• Enabling domain-specific data profiles
• Separating original data from enrichments
http://version1.europeana.eu/web/europeana-project/technicaldocuments/
Prototype: Europeana Thought Lab
http://europeana.eu/portal/thought-lab.html
Clustering of results
Baseline: matching concepts' labels
Controlled place name from a vocabulary at the Rijksmuseum
Metadata for the object
A "more specific Egypte"?
Metadata for the object
A place more specific than the Egypt one
Semantic information on the Giza place in the Rijksmuseum Vocabulary
Following other relations
Following other relations - creator
Metadata for the object
Controlled person name from a vocabulary at the Rijksmuseum
Following other relations - match
Information on Gustave Le Gray from the Rijksmuseum Vocabulary
Matched to a "Gustave Le Gray" from another Vocabulary
Enabling bits & pieces
• Exploiting semantic links in CH vocabularies
  – Concept “Giza” narrower than concept “Egypte”
• Mapping/alignment between CH vocabularies
  – Louvre’s “Égypte” equivalent to Rijksmuseum’s “Egypte”
• Enrichment of existing metadata
  – The string “Egypt” in a metadata record indicates the concept of Egypt defined in Rijksmuseum thesaurus
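The enrichment and mapping pieces can be sketched together as follows. All URIs are hypothetical, and it is an assumption that the Rijksmuseum concept carries both "Egypte" and "Egypt" as labels. The enriched record keeps the original string and adds the concept link next to it, so the provider's data stays untouched.

```python
RM = "http://example.org/rijksmuseum/"  # hypothetical namespace
LOUVRE = "http://example.org/louvre/"   # hypothetical namespace

# Labels per concept (assumed for this sketch).
labels = {
    RM + "egypte": {"Egypte", "Egypt"},
    RM + "giza": {"Giza"},
}

# Equivalence mapping from the slide; the URIs themselves are made up.
exact_match = {RM + "egypte": LOUVRE + "egypte"}


def enrich(record):
    """Copy the record and add 'place_concept' when the place string matches a label."""
    value = record.get("place", "").strip().casefold()
    enriched = dict(record)
    for concept, names in labels.items():
        if value in {name.casefold() for name in names}:
            enriched["place_concept"] = concept
            break
    return enriched


def equivalents(concept):
    """The concept itself plus its mapped equivalent in the other vocabulary, if any."""
    if concept in exact_match:
        return [concept, exact_match[concept]]
    return [concept]


enriched = enrich({"id": "obj1", "place": "Egypt"})
print(enriched)
print(equivalents(enriched["place_concept"]))
```

Once the string is tied to a concept, the narrower and equivalence links of that concept become usable for search across collections.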
Challenge #1: Linking
Manual mapping of large vocabularies is labour-intensive
• LCSH, RAMEAU and SWD mapped in the MACS project: http://macs.cenl.org
• SWD and DDC mapped in the CRISS-CROSS project: http://linux2.fbi.fh-koeln.de/crisscross/
Automatic linking is not perfect but can help
• STW, AGROVOC…
• Some studies (and further pointers) for automatic library thesaurus alignment in the STITCH project: http://stitch.cs.vu.nl
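To give an idea of what the simplest automatic linking can look like, here is a sketch of a purely lexical baseline: two concepts become a candidate match when their labels are identical after removing case and accents. This is not the method of the STITCH project or of any project named above, and the short identifiers are hypothetical; the Égypte/Egypte pair is the example used earlier in this talk.

```python
import unicodedata


def normalize(label):
    """Lowercase, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def align(vocab_a, vocab_b):
    """Candidate (a, b) pairs whose normalized labels are identical."""
    index = {}
    for uri, label in vocab_b.items():
        index.setdefault(normalize(label), []).append(uri)
    candidates = []
    for uri, label in vocab_a.items():
        for match in index.get(normalize(label), []):
            candidates.append((uri, match))
    return candidates


# Hypothetical identifiers; labels taken from the talk where possible.
louvre = {"louvre:egypte": "Égypte", "louvre:paris": "Paris"}
rijksmuseum = {"rm:egypte": "Egypte", "rm:giza": "Giza"}

print(align(louvre, rijksmuseum))
```

Such candidates still need checking: identical labels can name different things, and true equivalents often have different labels, which is one reason automatic linking is "not perfect".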
Challenge #1: Linking
• (Semi-)automatic techniques are necessary to
  – Connect objects to vocabularies (esp. for legacy data)
  – Connect objects themselves together
• Crowdsourcing?
• Making the way librarians create metadata evolve?
Linking strategy for libraries?
• Links to library-originated sources
  – VIAF, LCSH, DDC, UDC, Worldcat, PND…
• Links to resources from cultural environment
  – Museums, archives
  – Scientific communities: bibliographic data & research data
  – Publishers
  – Europeana and other aggregators
Semantic Annotation
Conclusion?
• Linked Data won’t solve everything right now
• Just a set of techniques and a vision for better sharing, cross-linking and re-use of data, fitting the web
• Which is not bad!
If we stop here, thanks for your attention!
Any (more) questions?
Some references
W3C Library LD Incubator
http://www.w3.org/2005/Incubator/lld
• 1-year group
• OCLC, LC, VU Amsterdam, DNB, etc.
• help increase global interoperability of library data on the Web
• bringing together people involved in Linked Data—in the library community and beyond
• building on existing initiatives and collaboration tracks for the future
Library LD Use Cases
• LLD use cases and case studies (work in progress)
• JISC cases for open bibliographic data http://obd.jisc.ac.uk
http://www.w3.org/2005/Incubator/lld/wiki/UseCases
Useful vocabularies to express data
• Dublin Core: dublincore.org/
• SKOS: www.w3.org/2004/02/skos/
• BIBO: bibliontology.com/
• OAI-ORE: www.openarchives.org/ore/
• FOAF: www.foaf-project.org/
• MADS: www.loc.gov/standards/mads/rdf/
In progress
• RDA vocabularies: metadataregistry.org/rdabrowse.htm
• FRBR@IFLA
Cf. Linked Open Vocabularies: labs.mondeca.com/dataset/lov/
Note: vocabularies can be combined and articulated together
Datasets
• Controlled vocabularies (thesauri, etc.)
  – LCSH, DDC, Agrovoc, VIAF, GND
• Bibliographic data
  – Nat. Libraries of Hungary, Sweden
• Trying to keep track of some on CKAN: http://ckan.net/group/lld
In the Netherlands
• DEN, Bibliotheek.nl, KB, Vrije Universiteit Amsterdam, Beeld en Geluid, UvA Library
• Amsterdam Museum as Linked Data http://semanticweb.cs.vu.nl/lod/am/
• Dutch Culture Link http://sites.google.com/site/dclod11/
• Dublin Core 2011 http://dcevents.dublincore.org/index.php/IntConf/dc-2011
Pictures
• http://www.europeana.eu/portal/record/03903/8C5C6AEFF6B50DCCEDF6A23A99DD3A2D66AEB2CC.html
• http://www.europeana.eu/portal/record/03912/E9666896A50FDDE5F7F15A17C11219A7FBCBBC50.html
(Europeana links give access to resources on original sites)
First Demo pointers
• American LCSH: http://id.loc.gov
• French RAMEAU: http://stitch.cs.vu.nl/rameau
• German SWD: http://d-nb.info/gnd/
• Agrovoc: http://aims.fao.org/
• STW: http://zbw.eu/stw/
• DBPedia: http://dbpedia.org/