Publishing and Using Linked Open Data - Day 1
A Gentle Introduction to Linked Open Data. Linked Data Use Cases
#lod4h
Publishing and Using Linked Open Data
Richard J. Urban, Ph.D.
School of Library and Information Studies
Florida State University
[email protected]
@musebrarian
January 7, 2013
Monday’s Schedule
• 9:30-10:00 am Class Session: Participant Introductions
• 10:00-10:45 am Class Session: A Gentle Introduction to Linked Data
• 10:45-11:00 am Break
• 11:00 am-Noon Class Session: Exploring Linked Data Use Cases
• Noon-1:00 pm Lunch (on your own)
• 1:00-2:30 pm Class Session: A Gentle Introduction to Linked Data (cont'd)
• 2:30-2:45 pm Break
• 2:45-3:45 pm Class Session: Participant Project Kick-off
• 4:00-5:00 pm Lecture: Seb Chan - Location: Ulrich Recital Hall, Tawes Fine Arts Building
• 5:30 pm-7:00 pm Graduate Student Networking Event Hosted by CUNY and MITH
Location: MITH
0301 Hornbake Library (inside Non-Print Media)
Refreshments Provided
PARTICIPANT INTRODUCTIONS
A GENTLE INTRODUCTION TO LINKED DATA: PART I
A Web of Documents
Berners-Lee, T. (1989) Information Management: A Proposal http://goo.gl/xh36K
World Wide Web
WWW - documents with simple relationships
An HTML tree
Document semantics
• XML (and HTML) provides a descriptive markup for documents (including metadata records)
• Even for more complex XML, like TEI, the meaning of many elements depends on their context within a document instance.
• Interpreting this context requires human intervention.
Organizing the Web
• Human organization
• Crawl and Index
– Uses many of the methods used by digital humanities scholars to extract information from web documents.
• PageRank
– Inferring importance from links
Database-driven Web: Silos of Data
#lod4h 11
Data-driven documents
WWW - documents with simple relationships
[Diagram: four separate “Data” sources]
#lod4h 12
• Federated (Z39.50)
• Aggregated (Open Archives Initiative – Protocol for Metadata Harvesting)
• Application Programming Interface (API) (service specific)
Data Semantics
• Often dependent on human interpretation of documents/standards.
• Local data-provider interpretations not always documented or available to data consumers.
LINKED OPEN DATA
A New Vision in Two Parts
Linked Data Principles
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
4. Include links to other URIs, so that they can discover more things.
“Things” = Resources
A resource can be anything that has identity.
Familiar examples include an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can also be considered resources.
The resource is the conceptual mapping to an entity or set of entities, not necessarily the entity which corresponds to that mapping at any particular instance in time. Thus, a resource can remain constant even when its content---the entities to which it currently corresponds---changes over time, provided that the conceptual mapping is not changed in the process.
http://www.ietf.org/rfc/rfc2396.txt
Uniform Resource Identifiers
• More than a Uniform Resource Locator (URL)
• Provides a mechanism to name resources in a way that works at Internet scale.
http://en.wikipedia.org/wiki/Uniform_resource_identifier
De-referencing URIs
• When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
• URIs can be used to name non-networked resources (concepts, people, physical objects, etc.)
• Useful if information about these objects can be returned when the name is used.
• Cool URIs for the Semantic Web http://www.w3.org/TR/cooluris/
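One common way to look up a URI and get data back instead of a web page is HTTP content negotiation: the client sends an Accept header naming an RDF media type. The Python sketch below only builds such a request and does not send it. The DBpedia resource URI and the helper name build_rdf_request are assumptions chosen for illustration, and whether a given server honors the header depends on how it is configured.

```python
from urllib.request import Request


def build_rdf_request(uri, media_type="text/turtle"):
    """Build (but do not send) a GET request that asks for RDF.

    `media_type` is the RDF serialization the client would prefer;
    "text/turtle" is the registered media type for Turtle.
    """
    return Request(uri, headers={"Accept": media_type})


# Illustrative URI (an assumption, not taken from the slides)
req = build_rdf_request("http://dbpedia.org/resource/Abraham_Lincoln")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would perform the actual lookup; that step is left out so the sketch runs without a network connection.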
Resource Description Framework
• A model for representing data
– An artificial language with a formal semantic model
– Can be expressed using multiple syntaxes
– Simple grammar
• RDF “Triple”
– <subject> <predicate> <object>
– NAME verb Object
– Mona Lisa painted by Leonardo da Vinci
It’s a graph!
• That uses URIs
<http://ex.org/monaLisa#> <http://purl.org/dc/terms/creator> <http://ex.org/daVinci#>
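To make the triple idea concrete, here is a small Python sketch that holds the Mona Lisa statement as a three-part tuple and writes it out as one N-Triples line. The Triple class and the to_ntriples helper are hypothetical names for illustration only; a real project would normally use an RDF library rather than hand-built strings.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# The statement from the slide, using the example URIs shown there
mona_lisa = Triple(
    "http://ex.org/monaLisa#",
    "http://purl.org/dc/terms/creator",
    "http://ex.org/daVinci#",
)


def to_ntriples(t):
    """Serialize a triple whose three terms are all URIs as one N-Triples line."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


print(to_ntriples(mona_lisa))
```

Because the object of one triple can be the subject of another, a collection of such tuples forms a graph.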
From a simple language, we can say complex things.
RDF Data Modeling
• RDF can be used with multiple tools for modeling data
• Simple: RDF Schema (RDFS)
• Robust: Web Ontology Language (OWL)
– OWL-Lite
– OWL-Full
Limitations
• Best used for simple declarative statements
– Difficult to express meta-assertions, e.g. “john believes that sally is 5’ tall”
– Data provenance/trust
– Negation “sally is not 5’ tall”
– Tenseless (need to explicitly model time)
– Modeling a “record” (named graphs)
SPARQL
• SPARQL Protocol and RDF Query Language
– A query language for RDF
– Similar to SQL
– Implemented by RDF publication software (triplestores)
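At its core a SPARQL query is a triple pattern with variables in it, and the triplestore returns every way of filling in those variables. The Python sketch below imitates that idea over a tiny in-memory list. It is not SPARQL and not how a triplestore is implemented. The match function, the abbreviated names such as ex:monaLisa, and the ex:lastSupper triple are all assumptions made for the example.

```python
def match(pattern, triples):
    """Yield one {variable: value} binding per triple that fits the pattern.

    Pattern terms that start with "?" are variables; anything else must
    equal the corresponding term of the triple exactly.
    """
    for triple in triples:
        binding = {}
        for want, have in zip(pattern, triple):
            if want.startswith("?"):
                # A variable used twice must take the same value both times
                if binding.setdefault(want, have) != have:
                    break
            elif want != have:
                break
        else:
            yield binding


# Hypothetical data, written with prefixed names for brevity
GRAPH = [
    ("ex:monaLisa", "dcterms:creator", "ex:daVinci"),
    ("ex:lastSupper", "dcterms:creator", "ex:daVinci"),
    ("ex:daVinci", "foaf:name", "Leonardo da Vinci"),
]

# Roughly: SELECT ?work WHERE { ?work dcterms:creator ex:daVinci }
works = [b["?work"] for b in match(("?work", "dcterms:creator", "ex:daVinci"), GRAPH)]
print(works)
```

A real query engine also joins several patterns together, which is where the comparison with SQL comes from.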
Link to Other Resources
• Include links to other URIs, so that they can discover more things.
– Link to controlled vocabularies/ontologies
– Use existing RDFS/OWL schemas
– Link different representations of the same resources together
• Associate annotations with resources
The Linked Data so far
Linked Open Data Criteria
★ Available on the web (whatever format), but with an open license
★★ Available as machine-readable structured data (e.g. Excel instead of an image scan of a table)
★★★ As (2), plus a non-proprietary format (e.g. CSV instead of Excel)
★★★★ All the above, plus: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff
★★★★★ All the above, plus: link your data to other people’s data to provide context
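The star scheme is cumulative: a dataset only earns a star if it has already earned all the earlier ones. A minimal Python sketch of that rule follows. The function name and its five yes/no parameters are assumptions made for illustration, not part of any official scoring tool.

```python
def star_rating(open_license, structured, non_proprietary, w3c_standards, linked):
    """Count stars cumulatively: each star requires all of the earlier ones."""
    stars = 0
    for criterion in (open_license, structured, non_proprietary, w3c_standards, linked):
        if not criterion:
            break
        stars += 1
    return stars


# An openly licensed CSV file that uses no RDF and has no outbound links
print(star_rating(True, True, True, False, False))
```

Note that publishing RDF without an open license scores zero under this reading, because the first star is about openness, not format.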
LINKED DATA USE CASES
Use Cases
• Linked Library Use Cases http://www.w3.org/2005/Incubator/lld/XGR-lld-usecase-20111025/
• DHWI Examples http://www.diigo.com/user/musebrarian/dhwi_example
A GENTLE INTRODUCTION TO LINKED DATA: PART II
A Simple Start
• Friend of a Friend (FOAF) http://www.foaf-project.org/
• A simple RDF vocabulary for describing people and their relationships.
FOAF (Turtle) Syntax
# FOAF property names are case-sensitive; literals use straight double quotes.
@prefix : <http://xmlns.com/foaf/0.1/> .
<http://chi.cci.fsu.edu/person/rurban#>
    :name "Richard J. Urban" ;
    :givenName "Richard" ;
    :familyName "Urban" ;
    :homepage <http://chi.cci.fsu.edu/> ;
    :workplaceHomepage <http://slis.fsu.edu> ;
    # workInfoHomepage is the closest FOAF term for a staff directory page
    :workInfoHomepage <http://directory.cci.fsu.edu/richard-urban/> ;
    :publications <http://chi.cci.fsu.edu/person/rurban/publications> ;
    # the SHA1 hash is a string literal, not a URI
    :mbox_sha1sum "e122ce3b5475f25d5824e02574806b5e116b2662" ;
    :weblog <http://www.inherentvice.net> .
• http://goo.gl/PgdqN
FOAF (RDF/XML) Syntax
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <rdf:Description rdf:about="http://chi.cci.fsu.edu/person/rurban#">
    <foaf:name>Richard J. Urban</foaf:name>
    <foaf:givenName>Richard</foaf:givenName>
    <foaf:familyName>Urban</foaf:familyName>
    <foaf:homepage rdf:resource="http://chi.cci.fsu.edu/" />
    <foaf:workplaceHomepage rdf:resource="http://slis.fsu.edu" />
    <!-- workInfoHomepage is the closest FOAF term for a staff directory page -->
    <foaf:workInfoHomepage rdf:resource="http://directory.cci.fsu.edu/richard-urban/" />
    <foaf:publications rdf:resource="http://chi.cci.fsu.edu/person/rurban/publications" />
    <!-- the SHA1 hash is a string literal, not a URI -->
    <foaf:mbox_sha1sum>e122ce3b5475f25d5824e02574806b5e116b2662</foaf:mbox_sha1sum>
    <foaf:weblog rdf:resource="http://www.inherentvice.net" />
  </rdf:Description>
</rdf:RDF>
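The Turtle and XML versions are two spellings of the same triples. As a rough illustration, the Python sketch below reads a shortened copy of the XML with the standard library's XML parser and lists the triples it contains. It only understands this flat rdf:Description shape, not RDF/XML in general, and the function name triples_from_rdfxml is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# A shortened copy of the FOAF description
RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <rdf:Description rdf:about="http://chi.cci.fsu.edu/person/rurban#">
    <foaf:name>Richard J. Urban</foaf:name>
    <foaf:weblog rdf:resource="http://www.inherentvice.net" />
  </rdf:Description>
</rdf:RDF>"""


def triples_from_rdfxml(text):
    """Yield (subject, predicate, object) for flat rdf:Description blocks only."""
    root = ET.fromstring(text)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree writes tags as "{namespace}local"; joining the two
            # parts gives the full predicate URI
            predicate = prop.tag[1:].replace("}", "", 1)
            # The object is either a linked resource or the element's text
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "")
            yield (subject, predicate, obj)


for triple in triples_from_rdfxml(RDF_XML):
    print(triple)
```

Each printed tuple corresponds to one property line in the Turtle version, with the prefix expanded to the full FOAF namespace.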
Basic Turtle
• Terse RDF Triple Language http://www.w3.org/TeamSubmission/turtle/
• Always start with a @prefix to declare a namespace for each schema you will use in your graph
– Can mix/match any published RDF schema
@prefix : <http://xmlns.com/foaf/0.1/> .
FOAF Properties
http://xmlns.com/foaf/spec/
• FOAF Core: Agent, Person, name, title, img, depiction (depicts), familyName, givenName, knows, based_near, age, made (maker), primaryTopic (primaryTopicOf), Project, Organization, Group, member, Document, Image
• Social Web: nick, mbox, homepage, weblog, openid, jabberID, mbox_sha1sum, interest, topic_interest, topic (page), workplaceHomepage, workInfoHomepage, schoolHomepage, publications, currentProject, pastProject, account, OnlineAccount, accountName, accountServiceHomepage, PersonalProfileDocument, tipjar, sha1, thumbnail, logo
Get Yourself a URI
• Can use a Cool URI based on your homepage
• A mailto:[email protected]
• A “blank node” _:me (although these are discouraged for Linked Data)
@prefix : <http://xmlns.com/foaf/0.1/> .
<http://chi.cci.fsu.edu/person/rurban#>
    :name "Richard Urban" ;
    :homepage <http://chi.cci.fsu.edu> .
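URIs are always enclosed in angle brackets.
Properties start with a colon (the empty prefix declared with @prefix). Strings are in straight double quotes.
Each property/value pair for the same subject ends with a semicolon,
except the last one, which ends with a period.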
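The punctuation rules above can be sketched as a small Python function that writes one subject and its properties as Turtle. This is a minimal illustration, not a full Turtle writer: the function name to_turtle and the (name, value, is_uri) tuple layout are assumptions, and only backslashes and double quotes inside strings are escaped.

```python
def to_turtle(subject, properties, prefix="http://xmlns.com/foaf/0.1/"):
    """Write one subject and its properties as Turtle.

    `properties` is a list of (name, value, is_uri) tuples.
    """
    if not properties:
        raise ValueError("need at least one property")
    lines = [f"@prefix : <{prefix}> .", f"<{subject}>"]
    last = len(properties) - 1
    for i, (name, value, is_uri) in enumerate(properties):
        if is_uri:
            obj = f"<{value}>"  # URIs go in angle brackets
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'  # strings go in straight double quotes
        # Semicolon between pairs, period after the last one
        terminator = "." if i == last else ";"
        lines.append(f"    :{name} {obj} {terminator}")
    return "\n".join(lines)


doc = to_turtle(
    "http://chi.cci.fsu.edu/person/rurban#",
    [
        ("name", "Richard Urban", False),
        ("homepage", "http://chi.cci.fsu.edu", True),
    ],
)
print(doc)
```

Generating the file this way avoids the curly quotes that word processors and slide software tend to insert, which Turtle parsers reject.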
Hands-on
• Open a text editor.
• Write a FOAF description for yourself using the Turtle Syntax.
– http://xmlns.com/foaf/spec/
– http://www.w3.org/TeamSubmission/turtle/
• Save the file with .ttl extension
– yourName.ttl
Publishing Your FOAF
• Put the file online, link it from your website.
• Publish using an RDF Triplestore
• Use FOAF-based plugins for WordPress, Drupal, etc.
Sesame Triple Store
• Let’s use my sandbox for this week:
– http://goo.gl/PgdqN
• Select the DHWI repository
• Select ADD
• Context baseURL: http://chi.cci.fsu/dhwi
• Paste your Turtle into the RDF box.
• All of us together:
Linking our FOAF together.
• I know we just met, and this is crazy, but…
:knows <http://chi.cci.fsu.edu/person/rurban#>
• Add the URI of anyone else in the class you know.
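Linking works because two separately written descriptions use the same URI for the same person. The Python sketch below merges two small sets of triples and follows a knows link from one description into the other. The participant ALICE, her URI, and the helper names_known_by are hypothetical, made up for the example.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RICHARD = "http://chi.cci.fsu.edu/person/rurban#"
ALICE = "http://example.org/people/alice#"  # hypothetical participant

# Two descriptions written independently
richard_graph = {(RICHARD, FOAF + "name", "Richard J. Urban")}
alice_graph = {
    (ALICE, FOAF + "name", "Alice Example"),
    (ALICE, FOAF + "knows", RICHARD),
}

# Merging RDF graphs amounts to taking the union of their triples
merged = richard_graph | alice_graph


def names_known_by(person, graph):
    """Return the names of everyone `person` says they know, sorted."""
    known = {o for s, p, o in graph if s == person and p == FOAF + "knows"}
    return sorted(o for s, p, o in graph if s in known and p == FOAF + "name")


print(names_known_by(ALICE, alice_graph))  # the name is not in this graph alone
print(names_known_by(ALICE, merged))
```

Loading everyone's Turtle into the same repository has a similar effect: the shared URIs join the individual descriptions into one graph.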
Some FOAF Humanities Use Cases
• Virtual International Authority File http://www.viaf.org
• Social Networks and Archival Context http://socialarchive.iath.virginia.edu/
• Linking Lives http://data.archiveshub.ac.uk/page/person/ncarules/skinnerbeverley1938-1999artist
• DBpedia http://dbpedia.org/data/Abraham_Lincoln.n3
Beyond FOAF
• Organization Ontology http://www.w3.org/TR/vocab-org/
• Encoded Archival Context - Corporate Bodies, Persons, and Families Ontology http://goo.gl/oFIkW
• Other domain ontologies with representations of people.
BREAK
Participant Projects
• What’s a small linked data project you can complete in the next few days?
– Explore modeling questions
  • Identify existing models
– Create/transform some data
  • What data is already out there?
– Publish some examples
– Explore potential applications
Tonight’s Events
• 4:00-5:00 pm Lecture: Seb Chan
– Location: Ulrich Recital Hall in Tawes Fine Arts Building
• 5:30-7:00 pm Graduate Student Networking Event
– Hosted by CUNY and MITH; Location: MITH, 0301 Hornbake Library (inside Non-Print Media)
– Refreshments Provided