on libraries & linked data antoine isaac ub utrecht, april 6, 2011


Post on 31-Mar-2015






  • Slide 1

On Libraries & Linked Data
Antoine Isaac
UB Utrecht, April 6, 2011

  • Slide 2

Who am I?
Europeana
Web & Media Lab, Vrije Universiteit Amsterdam
W3C Library Linked Data group
W3C Semantic Web Deployment group (2006-2009)
SKOS
[email protected]

  • Slide 3

Demo
Following one's nose to subject heading lists as linked data
American LCSH: http://id.loc.gov/authorities/sh85145447#concept
French RAMEAU: http://stitch.cs.vu.nl/vocabularies/rameau/ark:/12148/cb11931913j
German SWD: http://d-nb.info/gnd/4064689-0
Agrovoc: http://aims.fao.org/aos/agrovoc/c_8309
STW: http://zbw.eu/stw/descriptor/14188-0
Further on to DBpedia: http://dbpedia.org/resource/Water

  • Slide 4

Demo (fallback option)
Subject heading lists as SKOS linked data
American LCSH: http://id.loc.gov
French RAMEAU: http://stitch.cs.vu.nl/rameau
German SWD: http://d-nb.info/gnd/
Mapped using manual links from the MACS project: http://macs.cenl.org
Starting from http://id.loc.gov/authorities/sh85014310#concept

  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10

Linked Data?
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names
3. When someone looks up a URI, provide useful information using standards (RDF, SPARQL)
4. Include links to other URIs, so that they can discover more things
Tim Berners-Lee, http://linkeddata.org/

  • Slide 11

(Linked) Data Representation
That subject heading data follows a link-intensive data model
Uniform Resource Identifiers (URI)
Resource Description Framework (RDF)

  • Slide 12

(Linked) Data Representation
Use more-or-less the same standard vocabulary
Simple Knowledge Organization System (SKOS): http://www.w3.org/2004/02/skos/
For representing thesauri, classifications, etc. on the Semantic Web

  • Slide 13

A SKOS graph
animals
cats
  UF domestic cats
  RT wildcats
  BT animals
  SN used only for domestic cats
domestic cats
  USE cats
wildcats

  • Slide 14

SKOS mappings
SKOS provides conceptual links to bridge across different contexts
KOS 1: animals, cats, wildcats
KOS 2: animal, human, object

  • Slide 15

Links in the data

  • Slide 16
  • Slide 17

Growing interest in linked data in the library community

  • Slide 18

Linked Library Cloud beginning 2008
[Ross Singer, Code4Lib2010] http://code4lib.org/conference/2010/singer

  • Slide 19

Linked Library sector in 2010
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/

  • Slide 20

Libraries and LD, the perfect match?
Libraries have been producing (meta)data for ages
Libraries (often) produce high-quality metadata

  • Slide 21

Libraries and LD, the perfect match?
Library metadata was locked in record silos
But it maintains links to the outside world
  Bibliographic and web references
  Shared vocabularies
  Same books!

  • Slide 22

Libraries and LD, the perfect match?
LD is about
  Citing objects
  Linking to them
  Re-using data
Think of web-native union catalogues

  • Slide 23

Johan Stapel, Koninklijke Bibliotheek (now bibliotheek.nl)
A vision for the Dutch National Library

  • Slide 24

A web of cultural heritage data?
?

  • Slide 25

?

  • Slide 26

The current portal

  • Slide 27
  • Slide 28

Towards semantic search: facets

  • Slide 29

Building a search engine on top of metadata is difficult
Intrinsic quality problems: correctness, coverage
Especially when data is so heterogeneous
  100s of formats
  From flat 5-field records to 100-node XML trees
  Language issue!
We currently use a simple, flat interoperability format
A quick win that is quickly showing its limits

  • Slide 30

We can better use institutions' original metadata
Accommodate their different practices
  Data structures and semantics
Access objects via a semantic layer of vocabularies for subjects, persons, places
Semantic ThoughtLab: experimenting with solutions

  • Slide 31

Towards semantics-enabled search
Building a "semantic layer" to help access content

  • Slide 32

Towards semantics-enabled search
Enhance access to Europeana content by semantics
  Query expansion, clustering of results
Exploiting various types of relations
  "located in", "lived in", "is more specific concept"
Semantics are already there, in metadata and "controlled vocabularies" used in metadata
  Thesauri, classifications
Requires making them properly machine-accessible

  • Slide 33

Europeana Data Model
Trying to evolve towards RDF and Linked Data
  Representing objects, persons, places, etc. as resources
  Linking and re-using external sources
(Re-using) richer data modeling features
  SKOS, CIDOC-CRM, OAI-ORE
Enabling domain-specific data profiles
Separating original data from enrichments
http://version1.europeana.eu/web/europeana-project/technicaldocuments/

  • Slide 34

Prototype: Europeana Thought Lab
http://europeana.eu/portal/thought-lab.html

  • Slide 35

Clustering of results

  • Slide 36

Baseline: matching concepts' labels
Controlled place name from a vocabulary at the Rijksmuseum
Metadata for the object

  • Slide 37

A "more specific Egypte"?
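
The query expansion idea from Slide 32 ("is more specific concept") can be sketched in a few lines of Python. This is only an illustrative sketch, not Europeana's implementation: the tiny vocabulary, the record identifiers and the function names `expand` and `search` are assumptions invented for the example. The Egypte/Giza pair follows the Rijksmuseum example used in the slides.

```python
# Hypothetical mini-vocabulary: each concept lists its narrower concepts,
# in the spirit of skos:narrower links.
NARROWER = {
    "Egypte": ["Giza"],
    "Giza": [],
}

# Hypothetical object records: identifier -> place concept in the metadata.
RECORDS = {
    "object-1": "Egypte",
    "object-2": "Giza",
    "object-3": "Amsterdam",
}


def expand(concept, narrower):
    """Return the concept followed by every concept reachable via narrower links."""
    seen = []
    stack = [concept]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles in the vocabulary
        seen.append(current)
        stack.extend(narrower.get(current, []))
    return seen


def search(concept, narrower, records):
    """Return ids of records whose place is the concept or a narrower one."""
    wanted = set(expand(concept, narrower))
    return sorted(rid for rid, place in records.items() if place in wanted)


if __name__ == "__main__":
    print(search("Egypte", NARROWER, RECORDS))
```

With this toy data, a query for "Egypte" also retrieves the object indexed only with "Giza", which a plain label match would miss.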
  • Slide 38

Metadata for the object

  • Slide 39

A place more specific than the Egypt one
Semantic information on the Giza place in the Rijksmuseum vocabulary

  • Slide 40

Following other relations

  • Slide 41

Following other relations - creator
Metadata for the object
Controlled person name from a vocabulary at the Rijksmuseum

  • Slide 42

Following other relations - match
Information on Gustave Le Gray from the Rijksmuseum vocabulary
Matched to a "Gustave Le Gray" from another vocabulary

  • Slide 43

Enabling bits & pieces
Exploiting semantic links in CH vocabularies
  Concept "Giza" narrower than concept "Egypte"
Mapping/alignment between CH vocabularies
  Louvre's "Égypte" equivalent to Rijksmuseum's "Egypte"
Enrichment of existing metadata
  The string "Egypt" in a metadata record indicates the concept of Egypt defined in the Rijksmuseum thesaurus

  • Slide 44

Challenge #1: Linking

  • Slide 45

Manual mapping of large vocabularies is labour-intensive
  LCSH, RAMEAU and SWD mapped in the MACS project: http://macs.cenl.org
  SWD and DDC mapped in the CRISS-CROSS project: http://linux2.fbi.fh-koeln.de/crisscross/
Automatic linking is not perfect but can help
  STW, AGROVOC
  Some studies (and further pointers) for automatic library thesaurus alignment in the STITCH project: http://stitch.cs.vu.nl

  • Slide 46

Challenge #1: Linking
(Semi-)automatic techniques are necessary to
  Connect objects to vocabularies (esp. for legacy data)
  Connect objects themselves together
Crowdsourcing?
Making the way librarians create metadata evolve?

  • Slide 47

Linking strategy for libraries?

  • Slide 48

Links to library-originated sources
  VIAF, LCSH, DDC, UDC, Worldcat, PND
Links to resources from cultural environment
  Museums, archives
  Scientific communities: bibliographic data & research data
  Publishers
  Europeana and other aggregators

  • Slide 49

Semantic Annotation

  • Slide 50

Conclusion?
Linked Data won't solve everything right now
Just a set of techniques and a vision for better sharing, cross-linking and re-use of data, fitting the web
Which is not bad!

  • Slide 51

If we stop here, thanks for your attention!
Any (more) questions?

  • Slide 52

Some references

  • Slide 53

W3C Library LD Incubator
http://www.w3.org/2005/Incubator/lld
1-year group
OCLC, LC, VU Amsterdam, DNB, etc.
Help increase global interoperability of library data on the Web
  Bringing together people involved in Linked Data in the library community and beyond
  Building on existing initiatives
  Collaboration tracks for the future

  • Slide 54

Library LD Use Cases
LLD use cases and case studies (work in progress): http://www.w3.org/2005/Incubator/lld/wiki/UseCases
JISC cases for open bibliographic data: http://obd.jisc.ac.uk

  • Slide 55

Useful vocabularies to express data
Dublin Core: dublincore.org/
SKOS: www.w3.org/2004/02/skos/
BIBO: bibliontology.com/
OAI-ORE: www.openarchives.org/ore/
FOAF: www.foaf-project.org/
MADS: www.loc.gov/standards/mads/rdf/
In progress
RDA vocabularies [email protected]: metadataregistry.org/rdabrowse.htm
Cf. Linked Open Vocabularies: labs.mondeca.com/dataset/lov/
Note: vocabularies can be combined and articulated together

  • Slide 56

Datasets
Controlled vocabularies (thesauri, etc.)
  LCSH, DDC, Agrovoc, VIAF, GND
Bibliographic data
  National Libraries of Hungary, Sweden
Trying to keep track of some on CKAN: http://ckan.net/group/lld

  • Slide 57

In the Netherlands
DEN, Bibliotheek.nl, KB, Vrije Universiteit Amsterdam, Beeld en Geluid, UvA Library
Amsterdam Museum as Linked Data: http://semanticweb.cs.vu.nl/lod/am/
Dutch Culture Link: http://sites.google.com/site/dclod11/
Dublin Core 2011: http://dcevents.dublincore.org/index.php/IntConf/dc-2011

  • Slide 58

Pictures
http://www.europeana.eu/portal/record/03903/8C5C6AEFF6B50DCCEDF6A23A99DD3A2D66AEB2CC.html
http://www.europeana.eu/portal/record/03912/E9666896A50FDDE5F7F15A17C11219A7FBCBBC50.html
(Europeana links give access to resources on original sites)

  • Slide 59

First Demo pointers
American LCSH: http://id.loc.gov
French RAMEAU: http://stitch.cs.vu.nl/rameau
German SWD: http://d-nb.info/gnd/
Agrovoc: http://aims.fao.org/
STW: http://zbw.eu/stw/
DBpedia: http://dbpedia.org/
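
As a recap of the thesaurus entry on Slide 13, here is a minimal Python sketch of how its UF/RT/BT/SN/USE relations might be written as SKOS-style triples. It uses the usual SKOS correspondences (UF as `skos:altLabel`, RT as `skos:related`, BT as `skos:broader`, SN as `skos:scopeNote`). The `http://example.org/kos/` namespace and the helper names `objects` and `find_concept` are assumptions made for illustration; none of the vocabularies listed above is actually queried.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/kos/"  # hypothetical namespace for this sketch

# (subject, property, object) triples for the Slide 13 example.
TRIPLES = [
    (EX + "animals", SKOS + "prefLabel", "animals"),
    (EX + "cats", SKOS + "prefLabel", "cats"),
    (EX + "cats", SKOS + "altLabel", "domestic cats"),                 # UF
    (EX + "cats", SKOS + "related", EX + "wildcats"),                  # RT
    (EX + "cats", SKOS + "broader", EX + "animals"),                   # BT
    (EX + "cats", SKOS + "scopeNote", "used only for domestic cats"),  # SN
    (EX + "wildcats", SKOS + "prefLabel", "wildcats"),
]


def objects(subject, prop):
    """Return all values of a SKOS property for a given concept URI."""
    return [o for s, p, o in TRIPLES if s == subject and p == SKOS + prop]


def find_concept(label):
    """Resolve a preferred or alternative label to a concept URI (the USE lookup)."""
    for s, p, o in TRIPLES:
        if p in (SKOS + "prefLabel", SKOS + "altLabel") and o == label:
            return s
    return None


if __name__ == "__main__":
    cats = find_concept("domestic cats")  # "domestic cats USE cats"
    print(cats, objects(cats, "broader"))
```

The non-preferred entry "domestic cats" does not get a URI of its own here: it is just an alternative label on the concept for cats, which is how SKOS models a USE reference.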
