Linked Open Data: Opportunities & Barriers for Archives
Adrian Stevenson, LOCAH Project Manager
UKOLN, University of Bath, UK
Archives 360, Society of American Archivists, Chicago, USA
26th August 2011
“The goal of Linked Data is to enable people to share structured data on the Web as easily as they can share documents today.”
Bizer/Cyganiak/Heath, Linked Data Tutorial, linkeddata.org
Linked Data Design Issues
• URIs
• LD Design Issues
• Triples
http://www.w3.org/DesignIssues/LinkedData.html
Triples
• Triple statements
– ‘Things’ have ‘properties’ with ‘values’
– Subject – Predicate – Object
• Triples are the basis of RDF and Linked Data
[Diagram: two example triples]
Repository – Provides Access To – ArchivalResource
Keith Richards – Is Member Of – The Rolling Stones
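The two example statements above can be sketched in code as plain subject, predicate, object records. This is a minimal illustration only: the `Triple` type, the camel-cased predicate names and the `objects_of` helper are assumptions made for the example, not part of the LOCAH data or any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: a subject has a property (predicate) with a value (object)."""
    subject: str
    predicate: str
    obj: str


# The two statements from the slide, written as triples.
triples = [
    Triple("Repository", "providesAccessTo", "ArchivalResource"),
    Triple("Keith Richards", "isMemberOf", "The Rolling Stones"),
]


def objects_of(subject, predicate, data):
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in data if t.subject == subject and t.predicate == predicate]


print(objects_of("Keith Richards", "isMemberOf", triples))
```

In real RDF the subject and predicate would be URIs rather than plain labels, which is what makes the statements linkable across datasets.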
LOCAH Project
• Linked Open Copac and Archives Hub
• Funded by #JiscEXPO 2/10 ‘Expose’ call
– 1 year project. Started August 2010
• Partners & Consultants:
– UKOLN, Mimas, Eduserv, Talis, OCLC, Ed Summers
• http://blogs.ukoln.ac.uk/locah/
What is LOCAH Doing?
• Part 1: Exposing Archives Hub & Copac data as Linked Data
• Part 2: Creating a prototype visualisation
• Part 3: Reporting on opportunities and barriers
Archives Hub Model
[Diagram: entities include ArchivalResource, Finding Aid, EAD Document, Biographical History, Agent, Family, Person, Organisation, Repository (Agent), Place, Concept, ConceptScheme, Genre, Function, Book, Language, Level, Object, PostcodeUnit, Extent, Creation, Birth, Death and TemporalEntity. Relationships include maintainedBy/maintains, origination, associatedWith, accessProvidedBy/providesAccessTo, administeredBy/administers, hasBiogHist/isBiogHistFor, hasPart/partOf, encodedAs/encodes, topic/page, foaf:focus, inScheme, representedBy, level, language, extent, participates in, product of, at time and in.]
We’re Linking Data!
• If something is identified, it can be linked to
• We take items from our datasets and link them to items from other datasets
[Diagram: linked datasets: BBC, VIAF, DBPedia, Archives Hub, Copac, GeoNames]
Enhancing our data
• Already have some links:
– Time - reference.data.gov.uk URIs
– Location - UK Postcodes URIs and Ordnance Survey URIs
– Names - Virtual International Authority File
• VIAF matches and links widely-used authority files - http://viaf.org/
– Names - DBPedia
• Also looking at:
– Subjects - Library of Congress Subject Headings and DBPedia
– Open Calais for entity extraction – from ‘bioghist’ field
http://data.archiveshub.ac.uk/
http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer
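A Linked Data URI such as the person URI above is normally dereferenced over HTTP, with the client stating which representation it wants. The sketch below only builds such a request with Python's standard library and does not send it. It assumes the server supports content negotiation for `application/rdf+xml`, which the slides do not confirm, and the service may no longer be online; `build_rdf_request` is a hypothetical helper name.

```python
from urllib.request import Request

# Person URI shown on the slide.
PERSON_URI = (
    "http://data.archiveshub.ac.uk/id/person/nra/"
    "webbmarthabeatrice1858-1943socialreformer"
)


def build_rdf_request(uri, media_type="application/rdf+xml"):
    """Build (but do not send) an HTTP request that asks for an RDF representation.

    Assumption: the server honours the Accept header (content negotiation).
    """
    return Request(uri, headers={"Accept": media_type})


request = build_rdf_request(PERSON_URI)
print(request.full_url)
print(request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual fetch.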
Visualisation Prototype
• Using Timemap
– Google Maps and Simile
– http://code.google.com/p/timemap/
• Early stages with this
• Will give location and ‘extent’ of archive
• Will link through to Archives Hub
Key Benefit of Linked Data
• API based mashups work against a fixed set of data sources
• Hand crafted by humans
• Don’t integrate well
• Linked Data promises an unbound global data space
• Easy dataset integration
• Generic ‘mesh-up’ tools
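The "easy dataset integration" point can be illustrated with a small sketch: because every statement is a triple keyed on URIs, two datasets merge by simple set union, and a link between two URIs lets a generic function gather facts from both. Everything except the Archives Hub person URI is an assumption for illustration: the `example.org` URI, the predicate names and the literal values are hypothetical, and the `describe` helper is not a real tool.

```python
SAME_AS = "sameAs"  # hypothetical stand-in for a link predicate such as owl:sameAs

HUB_WEBB = (
    "http://data.archiveshub.ac.uk/id/person/nra/"
    "webbmarthabeatrice1858-1943socialreformer"
)
OTHER_WEBB = "http://example.org/people/beatrice-webb"  # hypothetical external URI

# Illustrative statements only; not real Archives Hub output.
hub_data = {
    (HUB_WEBB, "name", "Webb, Martha Beatrice"),
    (HUB_WEBB, SAME_AS, OTHER_WEBB),
}
other_data = {
    (OTHER_WEBB, "birthYear", "1858"),
}


def describe(uri, graph):
    """Collect (predicate, object) pairs for a URI and for URIs it is linked to."""
    linked = {uri} | {o for s, p, o in graph if s == uri and p == SAME_AS}
    return {(p, o) for s, p, o in graph if s in linked and p != SAME_AS}


# Integration is a set union: no dataset-specific glue code is needed.
merged = hub_data | other_data
print(sorted(describe(HUB_WEBB, merged)))
```

An API-based mashup would instead need code written against each source's own interface.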
Linked Open Data
• Data can be open or closed
• Linked Data can be open or closed
• Most benefit gained when data is open
Some challenges
Data Modelling
• Steep learning curve
– RDF terminology “confusing”
– Lack of archival examples
• Complexity
– Archival description is hierarchical and multi-level
• ‘Dirty’ Data
Linking Subjects
Linking Places
Sustainability
• Can you rely on data sources long-term?
• Ed Summers at the Library of Congress created http://lcsh.info
• Linked Data interface for LOC subject headings
• People started using it
Library of Congress Subject Headings
Scalability / Provenance
Example by Bradley Allen, Elsevier at LOD LAM Summit, SF, USA, June 2011
• Same issue with attribution
• Solutions: Named graphs? Quads?
• Best Practice
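One way to picture the named graphs / quads idea is to add a fourth element to each triple that names the graph (in effect, the source) the statement came from. The sketch below is a plain-Python illustration under that assumption; the graph names, subjects and values are hypothetical `example.org` placeholders, and `statements_by_source` is not part of any library.

```python
from collections import defaultdict

# Hypothetical graph names identifying where each statement came from.
HUB_GRAPH = "http://example.org/graph/archives-hub"
OTHER_GRAPH = "http://example.org/graph/other-source"

# Each quad is (subject, predicate, object, graph).
quads = [
    ("ex:collection1", "title", "Papers of an example person", HUB_GRAPH),
    ("ex:collection1", "extent", "12 boxes", HUB_GRAPH),
    ("ex:collection1", "topic", "ex:socialReform", OTHER_GRAPH),
]


def statements_by_source(data):
    """Group triples by the named graph that asserted them."""
    grouped = defaultdict(list)
    for subject, predicate, obj, graph in data:
        grouped[graph].append((subject, predicate, obj))
    return dict(grouped)


for graph, statements in statements_by_source(quads).items():
    print(graph, len(statements))
```

Keeping the graph name alongside each statement means attribution survives when data from several sources is merged.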
Licensing
• Ownership of data often not clear
• Hard to track attribution
• CC0 for Archives Hub and Copac data
Is Linked Data the Way?
• Enables ‘straightforward’ integration of wide variety of data sources
• Archival data can ‘work harder’
• New channels into your data
• Researchers are more likely to discover sources
• ‘Hidden’ archive collections become part of the Web
Attribution and CC License
• Sections of this presentation adapted from materials created by other members of the LOCAH Project
• This presentation is available under a Creative Commons Non Commercial-Share Alike license:
http://creativecommons.org/licenses/by-nc/2.0/uk/