madrid building blocks of linked data
Linked Data Building blocks
Victor de Boer
With slides from Knud Hinnerk Moller, Kasper Brandt, Christophe Gueret
Web of Documents (WWW): Linked Documents
From text to data > increased semantics
More and more structured data available online
• Governments
• Social web data
• Medical data
• Museums
• Research data
Web of Documents vs Web of Data
• People are often not interested in documents; they are interested in things (information)
– Humans are very good at reading (web) documents and distilling information
• Computers are very good at calculating, combining and filtering information, but they are very bad at reading documents
– We need to help machines understand web data
– Write it down in a way that they can understand
LINKED DATA!!
Web of Documents (WWW): Linked Documents
Web of Data: Linked Data
Without Linked Data
Slide stolen from Christophe Gueret
with Linked Data
Slide stolen from Christophe Gueret
http://info.cern.ch/Proposal.html
Tim Berners-Lee (the inventor of the Web and the Semantic Web)
What is Linked Open Data?
Intermezzo
Open Data is about licenses to allow reuse
Linked Data is about technology for interoperability
★ Available on the web (whatever format), but with an open license
★★ Available as machine-readable structured data (e.g. Excel instead of an image scan of a table)
★★★ As (2), plus a non-proprietary format (e.g. CSV instead of Excel)
★★★★ All the above, plus: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff
★★★★★ All the above, plus: link your data to other people’s data to provide context
www.w3.org/designissues/linkeddata.html
Linked Data five star system (TBL)
http://lod-cloud.net/
Examples of Linked Data
• Academia, Research
• Community
• Libraries, Museums, Cultural Heritage
• Government and public institutions (Open Data)
• Media
• Business
How does all this work?
• Data, not documents
• Structured data
• Graph (networked) data!
• W3C Web standards stack
– URIs, HTTP, RDF, RDFa, RDFS, OWL, SPARQL, etc.
Four rules of Linked Data
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF)
4. Include links to other URIs, so that they can discover more things.
http://www.w3.org/DesignIssues/LinkedData.html
Resource Description Framework (RDF)
Semantic Web standard for writing down data, information
(Subject, Relation, Object)
Painting001 has_location Amsterdam .
Painting001 title “Nachtwacht” .
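The (Subject, Relation, Object) idea can be tried out without any RDF tooling. What follows is only a minimal sketch in plain Python: the `Triple` type and the `describe` helper are illustrative names, not part of any RDF library, and the short names stand in for what would be full URIs in real data.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: (subject, relation, object)."""
    subject: str
    predicate: str
    obj: str


# The two statements from the slide. In real RDF the names would be URIs.
triples = [
    Triple("Painting001", "has_location", "Amsterdam"),
    Triple("Painting001", "title", '"Nachtwacht"'),
]


def describe(subject: str, graph: list[Triple]) -> dict[str, list[str]]:
    """Collect every relation/object pair recorded for one subject."""
    result: dict[str, list[str]] = {}
    for t in graph:
        if t.subject == subject:
            result.setdefault(t.predicate, []).append(t.obj)
    return result


description = describe("Painting001", triples)
```

Because every statement has the same three-part shape, adding a new kind of fact about Painting001 needs no schema change, only one more triple.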
Use HTTP URIs for Things
• Uniform Resource Identifier (URI) is a string of characters used to identify a resource
• http://rijksmuseum.nl/data/schilderij1
• I can go there (dereference) and then I get information about it
– HTML page for humans
– RDF data for machines
CURIEs
• Compact URIs
• replace URI up to last element with prefix
• define prefix in Turtle:
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
• example: http://www.w3.org/2001/XMLSchema#date becomes xsd:date
– xsd is the prefix; it stands for the “namespace” http://www.w3.org/2001/XMLSchema#
http://www.w3.org/TR/curie/
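The prefix mechanism is plain string substitution, which a few lines of Python can illustrate. This is a sketch only: `PREFIXES`, `expand` and `compact` are hypothetical helper names, and real CURIE processing has more rules than shown here.

```python
# Prefix table, equivalent to the @prefix line in Turtle.
PREFIXES = {"xsd": "http://www.w3.org/2001/XMLSchema#"}


def expand(curie: str, prefixes: dict[str, str] = PREFIXES) -> str:
    """Turn a compact URI such as xsd:date into the full URI."""
    prefix, _, local = curie.partition(":")
    return prefixes[prefix] + local


def compact(uri: str, prefixes: dict[str, str] = PREFIXES) -> str:
    """Abbreviate a full URI using the longest matching namespace, if any."""
    best = None
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            if best is None or len(namespace) > len(prefixes[best]):
                best = prefix
    if best is None:
        return uri  # no known namespace: leave the URI untouched
    return f"{best}:{uri[len(prefixes[best]):]}"


full = expand("xsd:date")
short = compact("http://www.w3.org/2001/XMLSchema#date")
```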
Blank Nodes
• Resources without a URI
– Might have local identifiers
• Used for grouping, reification, …
• Hard to use in Linked Data context
[Diagram: Painting001 has_creator a blank node; the blank node has name “El Greco”, Creator_type Painter, Probability unlikely]
Links
• Link your data to other data
– By establishing RDF triples that point to other people’s data
– By reusing other people’s URIs
Example: Link to Geonames
[Diagram: IDS document 0002 (Country: “Gambia”) linked to Geonames:Gambia, which supplies Region: Africa, population: 1593256, coordinates N 13° 30' 0'' W 15° 30' 0'']
Turtle Syntax
@prefix data: <http://data.example.org/> .
@prefix vocab: <http://voc.example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
data:shanghai
    vocab:located_in data:peoples_republic_of_china ;
    vocab:name "Shang-hai"@ga, "Shanghai"@en, "上海"@zh ;
    vocab:population "23019148"^^xsd:int .

data:sjtu
    vocab:located_in data:shanghai ;
    vocab:name "Shanghai Jiao Tong University"@en .
Things to note in the example:
• define prefixes
• abbreviate URIs as CURIEs
• group triples with same subject (;)
• group triples with same subject and predicate (,)
• Unicode
http://www.w3.org/TR/turtle/
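To make the two grouping rules concrete, here is a small sketch that takes flat triples and writes them out in the grouped Turtle style. It is not a full serializer: `to_turtle` is a hypothetical helper, it assumes the terms are already valid CURIEs or literals, and it does not emit the @prefix lines.

```python
def to_turtle(triples: list[tuple[str, str, str]]) -> str:
    """Group triples by subject (;) and by subject plus predicate (,)."""
    grouped: dict[str, dict[str, list[str]]] = {}
    for s, p, o in triples:
        grouped.setdefault(s, {}).setdefault(p, []).append(o)

    blocks = []
    for s, predicates in grouped.items():
        # One line per predicate; objects sharing it are joined with commas.
        lines = [f"    {p} {', '.join(objs)}" for p, objs in predicates.items()]
        # Predicate lines of one subject are joined with semicolons.
        blocks.append(s + "\n" + " ;\n".join(lines) + " .")
    return "\n".join(blocks)


triples = [
    ("data:sjtu", "vocab:located_in", "data:shanghai"),
    ("data:sjtu", "vocab:name", '"Shanghai Jiao Tong University"@en'),
    ("data:shanghai", "vocab:name", '"Shanghai"@en'),
    ("data:shanghai", "vocab:name", '"上海"@zh'),
]
turtle = to_turtle(triples)
print(turtle)
```

Four flat triples become two subject blocks, and the two names of data:shanghai end up on one line separated by a comma.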
Other Syntaxes
• RDF/XML
– XML-based syntax
– still widely used, but less readable than Turtle
• RDFa
– RDF embedded in HTML, using element attributes
• JSON-LD
– JSON serialisation
• Named Graph Support: TriG (Turtle), TriX (RDF/XML), N-Quads (N-Triples)
Named Graphs
• divide RDF graph in a dataset into several subgraphs
• each subgraph labelled with a URI
• useful for keeping track of provenance, timestamps, versioning, etc.
• not officially part of the standard, but widely supported by tools (and by SPARQL, see tomorrow)
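One way to picture named graphs is as a fourth element attached to each triple. The sketch below assumes a tiny in-memory `Dataset` class and two made-up graph URIs (`.../graph/import-a` and `.../graph/import-b`); none of these names come from a real triple store.

```python
from collections import defaultdict


class Dataset:
    """A set of triples per graph URI (illustrative, not a real store)."""

    def __init__(self) -> None:
        self.graphs: dict[str, set[tuple[str, str, str]]] = defaultdict(set)

    def add(self, graph_uri: str, s: str, p: str, o: str) -> None:
        self.graphs[graph_uri].add((s, p, o))

    def graphs_stating(self, s: str, p: str, o: str) -> list[str]:
        """Provenance lookup: which named graphs contain this triple?"""
        return sorted(g for g, ts in self.graphs.items() if (s, p, o) in ts)


# Hypothetical graph URIs, one per import run.
GRAPH_A = "http://data.example.org/graph/import-a"
GRAPH_B = "http://data.example.org/graph/import-b"

ds = Dataset()
ds.add(GRAPH_A, "data:shanghai", "vocab:located_in", "data:peoples_republic_of_china")
ds.add(GRAPH_B, "data:shanghai", "vocab:located_in", "data:peoples_republic_of_china")
ds.add(GRAPH_B, "data:sjtu", "vocab:located_in", "data:shanghai")

sources = ds.graphs_stating("data:sjtu", "vocab:located_in", "data:shanghai")
```

Keeping each import in its own graph means a whole batch can be inspected, replaced or dropped by its graph URI.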
RDF – Summary
• Graph data model for the Web
• Triples (or “statements”):
– <subject> <predicate> <object>
– (or <thing> <relationship> <thing>)
• Resources
– Things about which we want to make statements
– URIs (ideally HTTP URIs)
• Literals:
– Values like strings, numbers, dates, booleans, …
– Either language tag (zh, en, …) or XML Schema datatype
• Subjects and predicates are always resources
• Objects can be resources or literals
• Named Graphs (not standard): divide graph into subgraphs
Some Terms to Define Terms (RDF(S))
• rdf:type (or just a in Turtle)
– special property to say what kind of a thing ("class") a resource is
• rdfs:label, rdfs:comment
– documentation for humans
• rdfs:Class, owl:Class
– this term is a class
• rdf:Property, owl:DatatypeProperty, owl:ObjectProperty
– this term is a property, special kind of property
Some Terms to Define Terms (RDF(S))
• rdfs:subClassOf
– defining class hierarchies
• rdfs:subPropertyOf
– defining property hierarchies
• rdfs:definedBy
– where is this term defined, where can I get the specification?
Reuse things: Vocabularies
• FOAF (Friend of a Friend): People, Organisations, Social Networks
• Dublin Core (Bibliographic): publications, authors, media, etc.
• schema.org (Google, Yahoo!, Bing, Yandex): cross-domain, what search engines are interested in (people, events, products, locations)
• Good Relations: business, products, etc.
rijks:Painting001 <http://purl.org/dc/terms/spatial> Amsterdam
Reuse things: Datasets
• GeoNames: Geographical data
• DBPedia: RDF version of Wikipedia (also in Dutch)
• GTAA (Gemeenschappelijke Thesaurus Audiovisuele Archieven): Persons, topics, AV-terms
• VIAF: Persons
rijks:Painting001 <http://purl.org/dc/terms/spatial> <http://sws.geonames.org/2759794/>
Publishing Linked Data
Four rules of Linked Data
1. Use URIs to identify things (Resources).
2. Use HTTP URIs so that these things can be referred to and looked up ("dereference") by people and user agents
3. Provide useful information (i.e., a structured description: metadata) about the thing when its URI is dereferenced.
4. Include links to other, related URIs in the exposed data to improve discovery of other related information on the Web.
www.w3.org/DesignIssues/LinkedData.html
So that means that
When I ask for a URI
dbpedia:Amsterdam
I want some data back, describing that resource
Content negotiation
Reply based on the preference expressed in the HTTP request header (Accept:)
GET /resource/Amsterdam HTTP/1.1
Host: dbpedia.org
Accept: text/html;q=0.5, application/rdf+xml
I’m ok with HTML… …but I really prefer RDF
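How a server might read that Accept header can be sketched as follows. This is a simplified illustration, not a complete implementation of HTTP content negotiation: it assumes a well-formed header, ignores wildcards such as */*, and the function names `parse_accept` and `choose` are made up for this example.

```python
def parse_accept(header: str) -> list[tuple[str, float]]:
    """Return (media type, q) pairs, most preferred first."""
    prefs = []
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0]
        if not media_type:
            continue
        q = 1.0  # a type without an explicit q counts as fully preferred
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs.append((media_type, q))
    # sorted() is stable, so equal q values keep their header order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def choose(header: str, available: list[str]):
    """Pick the most preferred media type that the server can produce."""
    for media_type, q in parse_accept(header):
        if q > 0 and media_type in available:
            return media_type
    return None


header = "text/html;q=0.5, application/rdf+xml"
best = choose(header, ["text/html", "application/rdf+xml"])
```

With the header from the slide, RDF/XML wins because it carries the implicit q=1.0, while HTML is only accepted at q=0.5.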
text/html
<body onload="init();" about="dbpedia:Amsterdam">
<div id="header">
<div id="hd_l">
<h1 id="title">About: <a href="dbpedia:Amsterdam">Amsterdam</a></h1>
<div id="homelink">
<!--?vsp if (white_page = 0) http (txt); ?-->
</div>
<div class="page-resource-uri">
An Entity of Type : <a href="http://dbpedia.org/ontology/City">city</a>,
from Named Graph : <a href="http://dbpedia.org">http://dbpedia.org</a>,
within Data Space : <a href="http://dbpedia.org">dbpedia.org</a>
</div>
</div> <!-- hd_l -->
<div id="hd_r">
<a href="http://wiki.dbpedia.org/Imprint" title="About DBpedia">
<img src="/statics/dbpedia_logo.png" height="64" alt="About DBpedia"/>
</a>
</div> <!-- hd_r -->
</div> <!-- header -->
<div id="content">
<p>Amsterdam is de hoofdstad en grootste gemeente van Nederland. De stad, in het Amsterdams ook Mokum genoemd, ligt in de provincie Noord-Holland, aan de monding van de Amstel en aan het IJ. De naam van de stad komt van de ligging bij een in de 13e eeuw aangelegde dam in de Amstel. De plaats kreeg stadsrechten rond 1300 en groeide tot één van de grootste handelssteden ter wereld in de Gouden Eeuw.</p>
application/rdf+xml
<rdf:Description rdf:about="dbpedia:Amsterdam">
<rdf:type rdf:resource="http://schema.org/City" />
<rdf:type rdf:resource="http://dbpedia.org/ontology/City" />
<rdf:type rdf:resource="http://dbpedia.org/class/yago/GeoclassCapitalOfAPoliticalEntity" />
<rdf:type rdf:resource="http://dbpedia.org/ontology/Place" />
<rdf:type rdf:resource="http://dbpedia.org/class/yago/CitiesInTheNetherlands" />
<rdf:type rdf:resource="http://dbpedia.org/class/yago/PortCitiesAndTownsInTheNetherlands" />
<rdf:type rdf:resource="http://dbpedia.org/class/yago/PortCitiesAndTownsOfTheNorthSea" />
<rdf:type rdf:resource="http://umbel.org/umbel/rc/Location_Underspecified" />
<rdf:type rdf:resource="http://dbpedia.org/ontology/Settlement" />
…
application/x-turtle
<dbpedia:Amsterdam> <dbprop:/subdivisionName> "Amsterdam"@en .
<dbprop:/aprSun> "183"^^<http://www.w3.org/2001/XMLSchema#int> .
<http://www.w3.org/2000/01/rdf-schema#comment> "Amsterdam \u2013 najwi\u0119ksze miasto Holandii i jej stolica konstytucyjna. Wszystkie instytucje rz\u0105dowe …."@pl .
<http://dbpedia.org/ontology/timeZone> <dbpedia:Central_European_Summer_Time> .
<http://xmlns.com/foaf/0.1/name> "Amsterdam"@en .
<http://www.georss.org/georss/point> "52.37305555555555 4.892222222222222"@en .
<dbprop:/yearSun> "1662"^^<http://www.w3.org/2001/XMLSchema#int> .
<http://dbpedia.org/ontology/leaderTitle> "Secretary"@en .
….
What actually should happen
GET /resource/Amsterdam HTTP/1.1
Host: dbpedia.org
Accept: text/html;q=0.5, application/rdf+xml

HTTP/1.1 303 See Other
Location: http://dbpedia.org/data/Amsterdam
Vary: Accept

GET /data/Amsterdam HTTP/1.1
Host: dbpedia.org
Accept: text/html;q=0.5, application/rdf+xml

HTTP/1.1 200 OK
Content-Type: application/rdf+xml;charset=utf-8

<?xml version="1.0"?>
<rdf:RDF
xmlns:units="http://dbpedia.org/units/"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:geon="http://www.geonames.org/ontology#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
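The two-step exchange above (303 See Other, then 200 OK) can be modelled as a pure function from request to response. The sketch below is an assumption-laden illustration rather than DBpedia's actual server logic: `respond` and `preferred` are hypothetical names, and the /page/ path for HTML documents is an assumption added for the example; only /resource/ and /data/ appear in the exchange.

```python
def preferred(accept: str) -> str:
    """Return the media type with the highest q value (no wildcards)."""
    best_type, best_q = "", -1.0
    for item in accept.split(","):
        media_type, _, params = item.strip().partition(";")
        q = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name == "q":
                q = float(value)
        if q > best_q:
            best_type, best_q = media_type.strip(), q
    return best_type


def respond(path: str, accept: str) -> tuple[int, dict[str, str]]:
    """Return (status, headers) for a request, following the 303 pattern."""
    if path.startswith("/resource/"):
        # The URI names a thing, not a document: redirect to a document
        # about the thing, chosen by the client's Accept preference.
        name = path[len("/resource/"):]
        wants_rdf = preferred(accept) == "application/rdf+xml"
        target = "/data/" if wants_rdf else "/page/"  # /page/ is assumed
        return 303, {
            "Location": f"http://dbpedia.org{target}{name}",
            "Vary": "Accept",
        }
    if path.startswith("/data/"):
        return 200, {"Content-Type": "application/rdf+xml;charset=utf-8"}
    if path.startswith("/page/"):
        return 200, {"Content-Type": "text/html;charset=utf-8"}
    return 404, {}


accept = "text/html;q=0.5, application/rdf+xml"
first = respond("/resource/Amsterdam", accept)
second = respond("/data/Amsterdam", accept)
```

The redirect keeps the identifier of the city itself separate from the identifiers of the documents that describe it.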
Which part of the graph?
Concise bounded description
This notion is also known as “the bnode-closure of a resource”.
Symmetric concise bounded description
Similar to CBD, but also includes the triples in which the dereferenced URI is the object, not only those in which it is the subject.
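Both descriptions can be computed with a short graph walk. The sketch below is a simplification under stated assumptions: triples are plain tuples, blank nodes are strings starting with "_:", reification is ignored, and the symmetric variant only adds the incoming triples themselves. The Exhibition001 triple is invented for the example.

```python
Triple = tuple[str, str, str]


def is_bnode(term: str) -> bool:
    return term.startswith("_:")


def cbd(resource: str, graph: list[Triple]) -> set[Triple]:
    """Triples with the resource as subject, plus the closure over any
    blank nodes reached as objects."""
    result: set[Triple] = set()
    todo, seen = [resource], set()
    while todo:
        node = todo.pop()
        if node in seen:
            continue
        seen.add(node)
        for triple in graph:
            s, _, o = triple
            if s == node:
                result.add(triple)
                if is_bnode(o):
                    todo.append(o)  # keep following blank nodes
    return result


def scbd(resource: str, graph: list[Triple]) -> set[Triple]:
    """CBD plus the triples that point at the resource (simplified)."""
    incoming = {t for t in graph if t[2] == resource}
    return cbd(resource, graph) | incoming


graph = [
    ("Painting001", "has_location", "Amsterdam"),
    ("Painting001", "has_creator", "_:c1"),
    ("_:c1", "name", '"El Greco"'),
    ("_:c1", "Probability", "unlikely"),
    ("Amsterdam", "name", '"Amsterdam"'),
    ("Exhibition001", "shows", "Painting001"),  # hypothetical incoming link
]
outgoing = cbd("Painting001", graph)
both_ways = scbd("Painting001", graph)
```

Note that the description of Amsterdam is not included: it has its own URI, so a client can dereference it separately.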
Recipes for publishing Linked Data
1. Serving Linked Data as Static RDF/XML Files
2. Serving Linked Data as RDF Embedded in HTML Files
3. Serving RDF and HTML with Custom Server-Side Scripts
4. Serving Linked Data from Relational Databases
5. Serving Linked Data by Wrapping Existing Application or Web APIs
6. Serving Linked Data from RDF Triple Stores
Tom Heath, Chris Bizer http://linkeddatabook.com/
ClioPatria Triple store
ClioPatria UI
cliopatria.swi-prolog.org powered by
Statistics: Named Graphs
Statistics: predicates in a Named Graph
Local view of a resource
Get external RDF data from the Linked Data Cloud
Query the Linked Data cloud
ClioPatria allows you to dereference external links and load the RDF that is referenced in a separate named graph
The data is only “Linked” if it contains references to resources outside of your domain.
– Reuse Properties and Classes from ontologies such as RDF(S), SKOS, OWL
– Link to outside resources (DBPedia, GeoNames,…)
SPARQL endpoint and web interface
Serving Linked Data
• LOD module in ClioPatria
– Server responds with description of requested resource
– Content negotiation
• HTML when text/html
• RDF/XML, JSON, Turtle when the Accept header asks for it
– Return either Concise Bounded Description or Symmetric Concise Bounded description
PURL.ORG URIs
PURL.org is a service that allows you to redirect URIs to another server.
In this case, we do a partial redirect to our prefixed server
From http://purl.org/collections/example/
to http://eculture.cs.vu.nl:1234/example/lod/collections/example/
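A partial redirect only swaps the front of the URI and keeps the rest. As a rough sketch of that rewriting step (the function name `partial_redirect` and the `painting1` path segment are made up for illustration; PURL.org itself is configured through its own service, not through code like this):

```python
# The two prefixes from the slide.
PURL_PREFIX = "http://purl.org/collections/example/"
TARGET_PREFIX = "http://eculture.cs.vu.nl:1234/example/lod/collections/example/"


def partial_redirect(uri: str):
    """Map a PURL to the server that actually hosts the data.

    Everything after the registered prefix is carried over unchanged.
    Returns None when the URI is outside the registered prefix.
    """
    if not uri.startswith(PURL_PREFIX):
        return None
    return TARGET_PREFIX + uri[len(PURL_PREFIX):]


target = partial_redirect("http://purl.org/collections/example/painting1")
```

If the data later moves to another host, only the target prefix changes; the published purl.org URIs stay the same.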
Example: Dutch Ships and Sailors
Source datasets:
• KB Delpher
• Dutch-Asiatic Shipping (DAS): Voyages (Huygens ING)
• “VOC Opvarenden”: Mustering and payroll information (DANS Easy)
[Diagram: the Dutch Ships and Sailors data cloud. Datasets: DAS, GZMVOC, MDB, VOCOPVBegunstigden, VOCOPVSoldijboeken, VOCOPVOpvarenden. External vocabularies: PROV, AAT, foaf. Link types: owl:sameAs, dss:hasKBLink, rdfs:subClassOf, rdfs:subPropertyOf, dss:DAS link, skos:exactMatch]
Modeling in collaboration with historians (1)
[Diagram: RDF graph of one muster-roll record. Most nodes and edges carry both a shared dss: term and a dataset-specific mdb: term.]
Nodes (classes: instance):
• dss:Record, mdb:Aanmonstering: mdb:aanmonstering-del_gem-1879-101
• dss:Record, mdb:PersoonsContract: mdb:persoonscontract-del_gem-1879-101-16858-Pieter_Hoekstra
• dss:Schip, mdb:Schip: mdb:schip-del_gem-1879-101-Isadora, mdb:schip-del_gem-1879-137-Isadora
• dss:ShipType, mdb:ScheepsType: mdb:schoener
• foaf:Person, dss:Person, mdb:Person: mdb:persoon-del_gem-1879-101-16858
• dss:Rank, mdb:Rang: mdb:matroos
Properties:
• dss:ship, mdb:ship
• rdfs:label, dss:shipname, mdb:scheepsnaam
• dss:shiptype, mdb:scheepstype
• dcterms:identifier, mdb:inventarisnummer
• mdb:has_KB_article
• owl:sameAs
• dss:has_aanmonstering
• mdb:has_person
• dss:rank, mdb:rank
• mdb:maandgage
• foaf:firstname, mdb:voornaam
• foaf:lastname, mdb:achternaam
Literals and links: “1870-1894”, "Isadora", “32”, “Pieter”, “Hoekstra”, <http://resolver.kb.nl/resolve?urn=ddd:010063756:mpeg21:a0045:ocr>
Jur Leinenga (Huygens ING): Muster-rolls Northern Provinces 1803-1937
Modeling in collaboration with historians (2)
[Diagram: RDF graph of one payroll count record. Most nodes and edges carry both a shared dss: term and a dataset-specific gzmvoc: term.]
Nodes (classes: instance):
• dss:Record, gzmvoc:Telling: gzmvoc:telling-1046-De_Berkel
• gzmvoc:aziatischeBemanning: __bnode_1
• dss:Ship, gzmvoc:Schip: gzmvoc:schip-1046-De_Berkel
• dss:ShipType, gzmvoc:Scheepstype: gzmvoc:type-Ship
• dss:Record, das:Voyage: das:voyage-1918_61
Properties:
• dss:has_ship, gzmvoc:schip
• rdfs:label
• dss:scheepsnaam, gzmvoc:scheepsnaam
• dss:has_shiptype, gzmvoc:has_shiptype, gzmvoc:scheepstype
• dss:azRegistratieKop
• gzmvoc:azAantalMatrozen
• gzmvoc:telling
• gzmvoc:heeft DAS heenreis
Literals: "1046", “Schip”, “De Berkel”, “21”, “Moorsemattroosen”
Matthias van Rossum (VU-hist): Payroll information for European vs Asiatic Sailors (17th / 18th C)
Link properties and classes to interoperability layer
mdb:Schip1 mdb:scheepsType mdb:Kof .
das:ShipX das:typeOfShip das:Kofship .
mdb:scheepsType rdfs:subPropertyOf dss:has_shipType .
das:typeOfShip rdfs:subPropertyOf dss:has_shipType .
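The practical payoff of the rdfs:subPropertyOf links is that one query against the shared dss: property finds data from both datasets. The following is a minimal sketch of that inference, assuming each property has at most one super-property; `SUB_PROPERTY_OF`, `super_properties` and `query` are illustrative names, not ClioPatria or SPARQL functions.

```python
# Mapping from dataset-specific properties to the interoperability layer.
SUB_PROPERTY_OF = {
    "mdb:scheepsType": "dss:has_shipType",
    "das:typeOfShip": "dss:has_shipType",
}

data = [
    ("mdb:Schip1", "mdb:scheepsType", "mdb:Kof"),
    ("das:ShipX", "das:typeOfShip", "das:Kofship"),
]


def super_properties(prop):
    """Yield the property itself and every property above it."""
    seen = set()
    while prop is not None and prop not in seen:
        seen.add(prop)
        yield prop
        prop = SUB_PROPERTY_OF.get(prop)


def query(predicate, triples):
    """Find (subject, object) pairs whose property is the given predicate
    or one of its sub-properties."""
    return [(s, o) for s, p, o in triples if predicate in super_properties(p)]


ships = query("dss:has_shipType", data)
```

Each dataset keeps its own vocabulary, and the original property names stay queryable alongside the shared one.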
Vocabulary Links
mdb:Schip1 mdb:scheepsType mdb:Kof .
das:ShipX das:typeOfShip das:Kofship .
[Diagram: the ship types mdb:Kof and das:Kofship are connected through three skos:exactMatch links; the Getty AAT concepts shown are Aat:Kof and Aat:Platbodems]
• Links to DBPedia (Ship types, places, ranks)
• Links to Getty AAT (Ship types, ranks)
• Links to GeoNames (Places)
DAS (Dutch Asiatic Shipping) examples
http://resources.huygens.knaw.nl/retroboeken/das
http://purl.org/collections/nl/dss/das/voyage-5580_1
http://purl.org/collections/nl/dss/das/voyage-5580_1.ttl
http://purl.org/collections/nl/dss/das/voyage-5580_1.json
http://purl.org/collections/nl/dss/das/voyage-5580_1.rdf