madrid building blocks of linked data

Post on 15-Jul-2015


Linked Data Building blocks

Victor de Boer

With slides from Knud Hinnerk Moller, Kasper Brandt, Christophe Gueret

Web of Documents (WWW): Linked Documents

From text to data > increased semantics

More and more structured data available online

• Governments

• Social web data

• Medical data

• Museums

• Research data


Web of Documents vs Web of Data

• People are often not interested in documents; they are interested in things (information)

– Humans are very good at reading (web) documents and distilling information

• Computers are very good at calculating, combining and filtering information, but they are very bad at reading documents

– We need to help machines understand web data

– Write it down in a way that they can understand

LINKED DATA!!

Web of Documents (WWW): Linked Documents

Web of Data: Linked Data

[Figures: the Web without Linked Data and with Linked Data. Slides stolen from Christophe Gueret]

http://info.cern.ch/Proposal.html

Tim Berners-Lee (the inventor of the Web) and the Semantic Web

What is Linked Open Data?

Intermezzo

Open Data is about licenses to allow reuse

Linked Data is about technology for interoperability

★ Available on the web (whatever format), but with an open license

★★ Available as machine-readable structured data (e.g. Excel instead of an image scan of a table)

★★★ As (2), plus non-proprietary format (e.g. CSV instead of Excel)

★★★★ All the above, plus: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff

★★★★★ All the above, plus: link your data to other people’s data to provide context

http://www.w3.org/DesignIssues/LinkedData.html

Linked Data five star system (TBL)

http://lod-cloud.net/

Examples of Linked Data

• Academia, Research

• Community

• Libraries, Museums, Cultural Heritage

• Government and public institutions (Open Data)

• Media

• Business

How does all this work?

• Data, not documents

• Structured data

• Graph (networked) data!

• W3C Web standards stack

– URIs, HTTP, RDF, RDFa, RDFS, OWL, SPARQL, etc.

Four rules of Linked Data

1. Use URIs as names for things

2. Use HTTP URIs so that people can look up those names.

3. When someone looks up a URI, provide useful information, using the standards (RDF)

4. Include links to other URIs, so that they can discover more things.

http://www.w3.org/DesignIssues/LinkedData.html

Semantic Web standard for writing down data, information

(Subject, Relation, Object)

Resource Description Framework (RDF)

Painting001 has_location Amsterdam .

Painting001 title “Nachtwacht” .
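The two statements above can be held in a program as plain (subject, relation, object) tuples. The following is only a minimal sketch: the strings stand in for real URIs, and the helper names `objects` and `describe` are made up for this illustration.

```python
# Each RDF statement is a (subject, relation, object) triple.
# Plain strings stand in for real URIs in this sketch.
triples = [
    ("Painting001", "has_location", "Amsterdam"),
    ("Painting001", "title", "Nachtwacht"),
]


def objects(graph, subject, relation):
    """Return every object asserted for the given subject and relation."""
    return [o for s, p, o in graph if s == subject and p == relation]


def describe(graph, subject):
    """Return all (relation, object) pairs known about a subject."""
    return [(p, o) for s, p, o in graph if s == subject]


location = objects(triples, "Painting001", "has_location")
description = describe(triples, "Painting001")
```

Because every statement has the same three-part shape, adding a new fact about the painting is just appending another tuple; no schema change is needed.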

Use HTTP URIs for Things

• A Uniform Resource Identifier (URI) is a string of characters used to identify a resource

• http://rijksmuseum.nl/data/schilderij1

• I can go there (dereference) and then I get information about it

– HTML page for humans

– RDF data for machines

CURIEs

• Compact URIs

• replace URI up to last element (the “namespace”) with prefix

• define prefix in Turtle:

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

http://www.w3.org/2001/XMLSchema#date becomes xsd:date (prefix: xsd)

http://www.w3.org/TR/curie/
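The prefix mechanism is a simple string substitution in both directions. A minimal sketch, assuming a prefix table that holds only the xsd namespace from the slide; the function names `expand` and `compact` are illustrative and not part of any standard API.

```python
# Prefix table: the xsd entry is taken from the slide.
PREFIXES = {"xsd": "http://www.w3.org/2001/XMLSchema#"}


def expand(curie, prefixes=PREFIXES):
    """Turn a CURIE such as 'xsd:date' into the full URI."""
    prefix, _, local = curie.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix}")
    return prefixes[prefix] + local


def compact(uri, prefixes=PREFIXES):
    """Turn a full URI into a CURIE, or return it unchanged if no prefix fits."""
    best = None
    for prefix, namespace in prefixes.items():
        # Prefer the longest namespace that matches the start of the URI.
        if uri.startswith(namespace) and (
            best is None or len(namespace) > len(prefixes[best])
        ):
            best = prefix
    if best is None:
        return uri
    return f"{best}:{uri[len(prefixes[best]):]}"


full = expand("xsd:date")
short = compact("http://www.w3.org/2001/XMLSchema#date")
```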

Blank Nodes

• Resources without a URI

– Might have local identifiers

• Used for grouping, reification, …

• Hard to use in Linked Data context

[Diagram: Painting001 has_creator a blank node; the blank node has name “El Greco”, Creator_type Painter, Probability unlikely]

Links

• Link your data to other data

– By establishing RDF triples that point to other people’s data

– By reusing other people’s URIs

Example: Link to Geonames

IDS: document 0002, Country: “Gambia”

Geonames:Gambia

Region: Africa

Population: 1593256

Coordinates: N 13° 30' 0'' W 15° 30' 0''

Turtle Syntax

@prefix data: <http://data.example.org/> .

@prefix vocab: <http://voc.example.org/> .

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

data:shanghai

vocab:located_in data:peoples_republic_of_china ;

vocab:name "Shang-hai"@ga, "Shanghai"@en, "上海"@zh ;

vocab:population "23019148"^^xsd:int .

data:sjtu

vocab:located_in data:shanghai ;

vocab:name "Shanghai Jiao Tong University"@en .

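Annotations on the example:

• define prefixes (@prefix)

• abbreviate URIs as CURIEs

• group triples with same subject and predicate (separate objects with a comma)

• group triples with same subject (separate predicate-object pairs with a semicolon)

• Unicode

http://www.w3.org/TR/turtle/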
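The two grouping rules of Turtle can be made concrete with a small serializer. This is a rough sketch, not a full Turtle writer: it assumes every term is already written in Turtle form (a CURIE or a quoted literal), it omits the @prefix lines, and the function name `to_turtle` is made up for the example.

```python
def to_turtle(triples):
    """Group triples by subject, then by predicate, in Turtle style."""
    grouped = {}
    for s, p, o in triples:
        grouped.setdefault(s, {}).setdefault(p, []).append(o)

    blocks = []
    for subject, predicates in grouped.items():
        # Same subject and predicate: objects are separated by ","
        lines = [f"    {p} {', '.join(objs)}" for p, objs in predicates.items()]
        # Same subject: predicate-object groups are separated by ";"
        blocks.append(subject + "\n" + " ;\n".join(lines) + " .")
    return "\n\n".join(blocks)


triples = [
    ("data:sjtu", "vocab:located_in", "data:shanghai"),
    ("data:sjtu", "vocab:name", '"Shanghai Jiao Tong University"@en'),
    ("data:shanghai", "vocab:name", '"Shanghai"@en'),
    ("data:shanghai", "vocab:name", '"上海"@zh'),
]
turtle = to_turtle(triples)
print(turtle)
```

Four flat triples come out as two subject blocks, which is the same information in fewer characters.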

Other Syntaxes

• RDF/XML

– XML-based syntax

– still widely used, but less readable than Turtle

• RDFa

– RDF embedded in HTML, using element attributes

• JSON-LD

– JSON serialisation

• Named Graph Support: TriG (Turtle), TriX (RDF/XML), N-Quads (N-Triples)

Named Graphs

• divide RDF graph in a dataset into several subgraphs

• each subgraph labelled with a URI

• useful for keeping track of provenance, timestamps, versioning, etc.

• not part of the original RDF standard (RDF 1.1 added them as RDF datasets), but widely supported by tools (and by SPARQL, see tomorrow)
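One way to picture named graphs is as triples with a fourth element, the URI of the subgraph they belong to (this is the idea behind N-Quads). A minimal sketch, in which the graph URIs under http://data.example.org/graphs/ and the helper names are hypothetical:

```python
# Each entry is (subject, predicate, object, graph URI).
# The graph URIs are invented for this illustration.
CENSUS = "http://data.example.org/graphs/census"
NAMES = "http://data.example.org/graphs/names"

quads = [
    ("data:shanghai", "vocab:population", '"23019148"', CENSUS),
    ("data:shanghai", "vocab:name", '"Shanghai"@en', NAMES),
    ("data:sjtu", "vocab:located_in", "data:shanghai", NAMES),
]


def graph_names(dataset):
    """List the URIs of all named graphs in the dataset."""
    return sorted({g for _, _, _, g in dataset})


def triples_in(dataset, graph):
    """Return the triples that belong to one named graph."""
    return [(s, p, o) for s, p, o, g in dataset if g == graph]


def provenance(dataset, triple):
    """Return the graphs in which a given triple is asserted."""
    return [g for s, p, o, g in dataset if (s, p, o) == triple]


where = provenance(quads, ("data:shanghai", "vocab:population", '"23019148"'))
```

Statements about a graph URI (who loaded it, and when) can then be stored as ordinary triples, which is what makes named graphs useful for provenance and versioning.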

RDF – Summary

• Graph data model for the Web

• Triples (or “statements”):

– <subject> <predicate> <object>

– (or <thing> <relationship> <thing>)

• Resources

– Things about which we want to make statements

– URIs (ideally HTTP URIs)

• Literals:

– Values like strings, numbers, dates, booleans, …

– Either language tag (zh, en, …) or XML Schema datatype

• Subjects and predicates are always resources

• Objects can be resources or literals

• Named Graphs (not in the original RDF standard): divide graph into subgraphs

Some Terms to Define Terms (RDF(S))

• rdf:type (or just a in Turtle)

– special property to say what kind of a thing ("class") a resource is

• rdfs:label, rdfs:comment

– documentation for humans

• rdfs:Class, owl:Class

– this term is a class

• rdf:Property, owl:DatatypeProperty, owl:ObjectProperty

– this term is a property, special kind of property

Some Terms to Define Terms (RDF(S))

• rdfs:subClassOf

– defining class hierarchies

• rdfs:subPropertyOf

– defining property hierarchies

• rdfs:definedBy

– where is this term defined, where can I get the specification?
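A class hierarchy built with rdfs:subClassOf lets a program infer types that were never stated directly. The sketch below follows subclass links transitively; the ex: terms and the example hierarchy are invented for illustration, and a real application would normally leave this to an RDFS reasoner or triple store.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

# Hypothetical vocabulary and data, for illustration only.
graph = [
    ("ex:Painter", SUBCLASS, "ex:Artist"),
    ("ex:Artist", SUBCLASS, "foaf:Person"),
    ("ex:ElGreco", TYPE, "ex:Painter"),
]


def superclasses(graph, cls):
    """All direct and indirect superclasses of cls (transitive closure)."""
    found, todo = set(), [cls]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS and o not in found:
                found.add(o)
                todo.append(o)
    return found


def types_of(graph, resource):
    """Stated types of a resource plus everything implied by the hierarchy."""
    direct = {o for s, p, o in graph if s == resource and p == TYPE}
    inferred = set()
    for cls in direct:
        inferred |= superclasses(graph, cls)
    return direct | inferred


el_greco_types = types_of(graph, "ex:ElGreco")
```

rdfs:subPropertyOf works the same way, with properties in place of classes.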

Reuse things: Vocabularies

• FOAF (Friend of a Friend): People, Organisations, Social Networks

• Dublin Core (Bibliographic): publications, authors, media, etc.

• schema.org (Google, Yahoo!, Bing, Yandex): cross-domain, what search engines are interested in (people, events, products, locations)

• Good Relations: business, products, etc.

rijks:Painting001 <http://purl.org/dc/terms/spatial> Amsterdam

Reuse things: Datasets

• GeoNames: Geographical data

• DBpedia: RDF version of Wikipedia (also in Dutch)

• GTAA (Gemeenschappelijke Thesaurus Audiovisuele Archieven): Persons, topics, AV-terms

• VIAF: Persons

rijks:Painting001 <http://purl.org/dc/terms/spatial> <http://sws.geonames.org/2759794/>

Publishing Linked Data

Four rules of Linked Data

1. Use URIs to identify things (Resources).

2. Use HTTP URIs so that these things can be referred to and looked up ("dereference") by people and user agents

3. Provide useful information (i.e., a structured description, or metadata) about the thing when its URI is dereferenced.

4. Include links to other, related URIs in the exposed data to improve discovery of other related information on the Web.

www.w3.org/DesignIssues/LinkedData.html

So that means that

When I ask for a URI

dbpedia:Amsterdam

I want some data back, describing that resource

Content negotiation

Reply based on the preference expressed in the HTTP request header (Accept:)

GET /resource/Amsterdam HTTP/1.1

Host: dbpedia.org

Accept: text/html;q=0.5, application/rdf+xml

I’m ok with HTML… …but I really prefer RDF
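How a server might read that Accept header can be sketched in a few lines. This is a simplified illustration, not a complete implementation of HTTP content negotiation: it ignores wildcards such as */* and any parameters other than q, and the function names are made up.

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type, q = parts[0], 1.0  # q defaults to 1 when omitted
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        if media_type:
            prefs.append((media_type, q))
    # sorted() is stable, so equal q values keep their header order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def negotiate(header, available):
    """Pick the most preferred media type that the server can produce."""
    for media_type, q in parse_accept(header):
        if q > 0 and media_type in available:
            return media_type
    return None


accept = "text/html;q=0.5, application/rdf+xml"
chosen = negotiate(accept, ["text/html", "application/rdf+xml"])
fallback = negotiate(accept, ["text/html"])
```

With both formats available the RDF/XML representation wins, because its implicit q of 1 beats the 0.5 given to HTML; a server that only has HTML still returns HTML.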

text/html

<body onload="init();" about="dbpedia:Amsterdam">

<div id="header">

<div id="hd_l">

<h1 id="title">About: <a href="dbpedia:Amsterdam">Amsterdam</a></h1>

<div id="homelink">

<!--?vsp if (white_page = 0) http (txt); ?-->

</div>

<div class="page-resource-uri">

An Entity of Type : <a href="http://dbpedia.org/ontology/City">city</a>,

from Named Graph : <a href="http://dbpedia.org">http://dbpedia.org</a>,

within Data Space : <a href="http://dbpedia.org">dbpedia.org</a>

</div>

</div> <!-- hd_l -->

<div id="hd_r">

<a href="http://wiki.dbpedia.org/Imprint" title="About DBpedia">

<img src="/statics/dbpedia_logo.png" height="64" alt="About DBpedia"/>

</a>

</div> <!-- hd_r -->

</div> <!-- header -->

<div id="content">

<p>Amsterdam is de hoofdstad en grootste gemeente van Nederland. De stad, in het Amsterdams ook Mokum genoemd, ligt in de provincie Noord-Holland, aan de monding van de Amstel en aan het IJ. De naam van de stad komt van de ligging bij een in de 13e eeuw aangelegde dam in de Amstel. De plaats kreeg stadsrechten rond 1300 en groeide tot één van de grootste handelssteden ter wereld in de Gouden Eeuw.</p>

text/html

application/rdf+xml

<rdf:Description rdf:about="dbpedia:Amsterdam">

<rdf:type rdf:resource="http://schema.org/City" />

<rdf:type rdf:resource="http://dbpedia.org/ontology/City" />

<rdf:type rdf:resource="http://dbpedia.org/class/yago/GeoclassCapitalOfAPoliticalEntity" />

<rdf:type rdf:resource="http://dbpedia.org/ontology/Place" />

<rdf:type rdf:resource="http://dbpedia.org/class/yago/CitiesInTheNetherlands" />

<rdf:type rdf:resource="http://dbpedia.org/class/yago/PortCitiesAndTownsInTheNetherlands" />

<rdf:type rdf:resource="http://dbpedia.org/class/yago/PortCitiesAndTownsOfTheNorthSea" />

<rdf:type rdf:resource="http://umbel.org/umbel/rc/Location_Underspecified" />

<rdf:type rdf:resource="http://dbpedia.org/ontology/Settlement" />

application/x-turtle

<dbpedia:Amsterdam> <dbprop:/subdivisionName> "Amsterdam"@en ;

<dbprop:/aprSun> "183"^^<http://www.w3.org/2001/XMLSchema#int> ;

<http://www.w3.org/2000/01/rdf-schema#comment> "Amsterdam \u2013 najwi\u0119ksze miasto Holandii i jej stolica konstytucyjna. Wszystkie instytucje rz\u0105dowe …."@pl ;

<http://dbpedia.org/ontology/timeZone> <dbpedia:Central_European_Summer_Time> ;

<http://xmlns.com/foaf/0.1/name> "Amsterdam"@en ;

<http://www.georss.org/georss/point> "52.37305555555555 4.892222222222222"@en ;

<dbprop:/yearSun> "1662"^^<http://www.w3.org/2001/XMLSchema#int> ;

<http://dbpedia.org/ontology/leaderTitle> "Secretary"@en ;

….

What actually should happen

GET /resource/Amsterdam HTTP/1.1
Host: dbpedia.org
Accept: text/html;q=0.5, application/rdf+xml

HTTP/1.1 303 See Other
Location: http://dbpedia.org/data/Amsterdam
Vary: Accept

GET /data/Amsterdam HTTP/1.1
Host: dbpedia.org
Accept: text/html;q=0.5, application/rdf+xml

HTTP/1.1 200 OK
Content-Type: application/rdf+xml;charset=utf-8

<?xml version="1.0"?>
<rdf:RDF
xmlns:units="http://dbpedia.org/units/"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:geon="http://www.geonames.org/ontology#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
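The exchange above can be mimicked with a small routing function: a request for the thing itself (/resource/…) is answered with a 303 redirect to a document about it, and only the document URL returns data. This is a sketch under assumptions: the /data/ path and the headers come from the slide, the /page/ path for HTML is an assumption of this example, and the Accept handling is deliberately crude.

```python
def preferred(accept):
    """Return the media type with the highest q value in an Accept header."""
    best_type, best_q = None, 0.0
    for item in accept.split(","):
        media_type, _, params = item.strip().partition(";")
        q = 1.0  # q defaults to 1 when omitted
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name == "q":
                q = float(value)
        if q > best_q:
            best_type, best_q = media_type.strip(), q
    return best_type


def respond(path, accept):
    """Return (status, headers) for a request, following the 303 pattern."""
    if path.startswith("/resource/"):
        name = path[len("/resource/"):]
        # A city is not a document, so redirect to a document describing it.
        # "/page/" for HTML is an assumption made for this sketch.
        target = "/page/" if preferred(accept) == "text/html" else "/data/"
        return 303, {"Location": "http://dbpedia.org" + target + name,
                     "Vary": "Accept"}
    if path.startswith("/data/"):
        return 200, {"Content-Type": "application/rdf+xml;charset=utf-8"}
    if path.startswith("/page/"):
        return 200, {"Content-Type": "text/html;charset=utf-8"}
    return 404, {}


accept = "text/html;q=0.5, application/rdf+xml"
first = respond("/resource/Amsterdam", accept)
second = respond("/data/Amsterdam", accept)
```

The Vary: Accept header tells caches that the redirect target depends on the Accept header of the request.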

Which part of the graph?

Concise Bounded Description (CBD)

This notion is also known as “the bnode-closure of a resource”.

Symmetric Concise Bounded Description (SCBD)

Similar to CBD, but includes triples in which the dereferenced URI is the object as well as those in which it is the subject.
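Both answers to "which part of the graph?" can be sketched over a list of triples. The version below is simplified: blank nodes are assumed to be strings starting with "_:", reified statements (which the full CBD definition also covers) are ignored, and the symmetric variant only adds the incoming triples without expanding blank-node subjects. The example data is hypothetical.

```python
def is_bnode(term):
    """Blank nodes are written with a '_:' prefix in this sketch."""
    return term.startswith("_:")


def cbd(graph, uri):
    """Triples with uri as subject, expanded through blank-node objects."""
    result, todo, seen = [], [uri], {uri}
    while todo:
        node = todo.pop()
        for s, p, o in graph:
            if s == node:
                result.append((s, p, o))
                # A blank node has no URI to look up, so include its description.
                if is_bnode(o) and o not in seen:
                    seen.add(o)
                    todo.append(o)
    return result


def scbd(graph, uri):
    """CBD plus the triples that point at uri (uri in object position)."""
    result = cbd(graph, uri)
    for triple in graph:
        if triple[2] == uri and triple not in result:
            result.append(triple)
    return result


graph = [
    ("Painting001", "has_creator", "_:b1"),
    ("_:b1", "name", "El Greco"),
    ("_:b1", "Probability", "unlikely"),
    ("Painting001", "has_location", "Amsterdam"),
    ("Amsterdam", "name", "Amsterdam"),
    ("Painting002", "has_location", "Amsterdam"),
]
painting = cbd(graph, "Painting001")
city = scbd(graph, "Amsterdam")
```

The description of Painting001 pulls in the blank node for its creator but stops at Amsterdam, because Amsterdam has its own URI that a client can dereference separately.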

Recipes for publishing Linked Data

1. Serving Linked Data as Static RDF/XML Files

2. Serving Linked Data as RDF Embedded in HTML Files

3. Serving RDF and HTML with Custom Server-Side Scripts

4. Serving Linked Data from Relational Databases

5. Serving Linked Data by Wrapping Existing Application or Web APIs

6. Serving Linked Data from RDF Triple Stores

Tom Heath, Chris Bizer http://linkeddatabook.com/

ClioPatria Triple store

ClioPatria UI

cliopatria.swi-prolog.org

Statistics: Named Graphs

Statistics: predicates in a Named Graph

Local view of a resource

Get external RDF data from the Linked Data Cloud

Query the Linked Data cloud

ClioPatria allows you to dereference external links and load the RDF that is referenced in a separate named graph

The data is only “Linked” if it contains references to resources outside of your domain.

– Reuse Properties and Classes from ontologies such as RDF(S), SKOS, OWL

– Link to outside resources (DBpedia, GeoNames, …)

SPARQL endpoint and web interface

Serving Linked Data

• LOD module in ClioPatria

– Server responds with description of requested resource

– Content negotiation

• HTML when text/html

• RDF/XML, JSON, Turtle when the Accept header asks for them

– Return either Concise Bounded Description or Symmetric Concise Bounded description

PURL.ORG URIs

PURL.org is a service that allows you to redirect URIs to another server.

In this case, we do a partial redirect to our prefixed server:

From http://purl.org/collections/example/

to http://eculture.cs.vu.nl:1234/example/lod/collections/example/
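A partial redirect swaps the registered prefix for the target prefix and keeps whatever follows it. A minimal sketch of that rewriting step; the two prefixes are the ones on the slide, while the function name and the trailing path `painting1` are hypothetical.

```python
PURL_PREFIX = "http://purl.org/collections/example/"
TARGET_PREFIX = "http://eculture.cs.vu.nl:1234/example/lod/collections/example/"


def partial_redirect(uri, source=PURL_PREFIX, target=TARGET_PREFIX):
    """Rewrite a URI under the source prefix to the same path under the target."""
    if not uri.startswith(source):
        return None  # not covered by this redirect rule
    return target + uri[len(source):]


# "painting1" is a made-up local name for illustration.
location = partial_redirect("http://purl.org/collections/example/painting1")
```

The published URIs stay stable even if the server behind them moves: only the target prefix in the redirect rule has to change.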

http://cliopatria.swi-prolog.org

Example: Dutch Ships and Sailors

KB Delpher

Dutch-Asiatic Shipping (DAS): Voyages (Huygens ING)

“VOC Opvarenden”: Mustering and payroll information (DANS Easy)

[Diagram: the Dutch Ships and Sailors data cloud. Datasets: DAS, GZMVOC, MDB, VOCOPV Begunstigden, VOCOPV Soldijboeken, VOCOPV Opvarenden. External vocabularies: PROV, AAT, foaf. Link types: owl:sameAs, dss:hasKBLink, rdfs:subClassOf, rdfs:subPropertyOf, dss:DAS link, skos:exactMatch]

Modeling in collaboration with historians (1)

[Diagram: a muster-roll record from the MDB dataset modeled as RDF. Labels recovered from the figure; terms that appear side by side in the figure are separated by “/”]

Classes: dss:Record / mdb:Aanmonstering; dss:Record / mdb:PersoonsContract; dss:Schip / mdb:Schip; dss:ShipType / mdb:ScheepsType; foaf:Person / dss:Person / mdb:Person; dss:Rank / mdb:Rang

Resources: mdb:aanmonstering-del_gem-1879-101; mdb:persoonscontract-del_gem-1879-101-16858-Pieter_Hoekstra; mdb:schip-del_gem-1879-101-Isadora; mdb:schip-del_gem-1879-137-Isadora; mdb:persoon-del_gem-1879-101-16858; mdb:schoener; mdb:matroos; <http://resolver.kb.nl/resolve?urn=ddd:010063756:mpeg21:a0045:ocr>

Properties: dss:ship / mdb:ship; rdfs:label / dss:shipname / mdb:scheepsnaam; dss:shiptype / mdb:scheepstype; dcterms:identifier / mdb:inventarisnummer; mdb:has_KB_article; owl:sameAs; dss:has_aanmonstering; mdb:has_person; dss:rank / mdb:rank; mdb:maandgage; foaf:firstname / mdb:voornaam; foaf:lastname / mdb:achternaam

Literals: “1870-1894”, “Isadora”, “32”, “Pieter”, “Hoekstra”

Source: Jur Leinenga (Huygens ING), Muster-rolls Northern Provinces 1803-1937

Modeling in collaboration with historians (2)

[Diagram: a payroll count record from the GZMVOC dataset modeled as RDF. Labels recovered from the figure; terms that appear side by side in the figure are separated by “/”]

Classes: dss:Record / gzmvoc:Telling; dss:Ship / gzmvoc:Schip; dss:ShipType / gzmvoc:Scheepstype; dss:Record / das:Voyage

Resources: gzmvoc:telling-1046-De_Berkel; __bnode_1; gzmvoc:schip-1046-De_Berkel; gzmvoc:type-Ship; das:voyage-1918_61

Properties: gzmvoc:aziatischeBemanning; dss:has_ship / gzmvoc:schip; rdfs:label; dss:scheepsnaam / gzmvoc:scheepsnaam; dss:has_shiptype / gzmvoc:has_shiptype; gzmvoc:scheepstype; dss:azRegistratieKop; gzmvoc:azAantalMatrozen; gzmvoc:telling; gzmvoc:heeft DAS heenreis

Literals: “1046”, “Schip”, “De Berkel”, “21”, “Moorsemattroosen”

Source: Matthias van Rossum (VU-hist), Payroll information for European vs Asiatic Sailors (17th / 18th C)

mdb:Schip1 mdb:scheepsType mdb:Kof .

das:ShipX das:typeOfShip das:Kofship .

mdb:scheepsType rdfs:subPropertyOf dss:has_shipType .

das:typeOfShip rdfs:subPropertyOf dss:has_shipType .

Link properties and classes to interoperability layer
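The benefit of such an interoperability layer is that one query on the shared property finds data from every dataset. The sketch below reads the figure as two dataset-specific properties that are each an rdfs:subPropertyOf of dss:has_shipType; that reading, and the helper names, are assumptions of this example.

```python
SUBPROPERTY = "rdfs:subPropertyOf"

# Triples as read from the figure (an interpretation, not source data).
graph = [
    ("mdb:Schip1", "mdb:scheepsType", "mdb:Kof"),
    ("das:ShipX", "das:typeOfShip", "das:Kofship"),
    ("mdb:scheepsType", SUBPROPERTY, "dss:has_shipType"),
    ("das:typeOfShip", SUBPROPERTY, "dss:has_shipType"),
]


def subproperties(graph, prop):
    """The property itself plus all direct and indirect subproperties."""
    found, todo = {prop}, [prop]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if p == SUBPROPERTY and o == current and s not in found:
                found.add(s)
                todo.append(s)
    return found


def query(graph, prop):
    """(subject, object) pairs linked by prop or any of its subproperties."""
    props = subproperties(graph, prop)
    return [(s, o) for s, p, o in graph if p in props]


ship_types = query(graph, "dss:has_shipType")
```

Neither dataset had to rename its own property; the two mapping triples are enough for the shared query to reach both.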

mdb:Schip1 mdb:scheepsType mdb:Kof .

das:ShipX das:typeOfShip das:Kofship .

[Diagram: the ship types mdb:Kof and das:Kofship are connected by skos:exactMatch links to the AAT concepts Aat:Kof and Aat:Platbodems]

Vocabulary Links

Links to DBpedia (Ship types, places, ranks)

Links to Getty AAT (Ship types, ranks)

Links to GeoNames (Places)

DAS (Dutch Asiatic Shipping) examples

http://resources.huygens.knaw.nl/retroboeken/das

http://purl.org/collections/nl/dss/das/voyage-5580_1

http://purl.org/collections/nl/dss/das/voyage-5580_1.ttl

http://purl.org/collections/nl/dss/das/voyage-5580_1.json

http://purl.org/collections/nl/dss/das/voyage-5580_1.rdf

Thank you!

Victor de Boer

http://victordeboer.com

v.de.boer@vu.nl
