creating linked data 2/5 semtech2011

Creating Linked Data Juan F. Sequeda Semantic Technology Conference June 2011

Upload: juan-sequeda

Post on 11-May-2015

TRANSCRIPT

Page 1: Creating Linked Data 2/5 Semtech2011

Creating Linked Data

Juan F. Sequeda

Semantic Technology Conference

June 2011

Page 2: Creating Linked Data 2/5 Semtech2011

Linked Data is a set of best practices to publish and interlink data on the web

Page 3: Creating Linked Data 2/5 Semtech2011

Linked Data Principles

1. Use URIs as names for things

2. Use HTTP URIs so that people can look up (dereference) those names.

3. When someone looks up a URI, provide useful information.

4. Include links to other URIs so that they can discover more things.

Page 4: Creating Linked Data 2/5 Semtech2011

1) Use URIs as names for things

Page 5: Creating Linked Data 2/5 Semtech2011

1) Use URIs as names for things

• Uniform Resource Identifiers identify real-world objects and abstract concepts
  – Not only web documents and digital content
  – People, places, locations, my car
  – Know somebody, from somewhere

Page 6: Creating Linked Data 2/5 Semtech2011

1) Use URIs as names for things

<http://juansequeda.com/foaf.rdf#me> <http://xmlns.com/foaf/0.1/knows> <http://www.w3.org/People/Berners-Lee/card#i>

Page 7: Creating Linked Data 2/5 Semtech2011

1) Use URIs as names for things

• http://juansequeda.com/foaf.rdf#me
  – Identifies the person

• http://juansequeda.com/foaf.rdf
  – Identifies an RDF document
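
The relationship between the two URIs can be sketched in Python. An HTTP client does not send the fragment (`#me`) to the server, so stripping it gives the URI of the document that is actually fetched. This is a minimal illustration using only the standard library; the variable names are ours, not the slides'.

```python
from urllib.parse import urldefrag

# URI that names the person (taken from the slide above).
person_uri = "http://juansequeda.com/foaf.rdf#me"

# urldefrag splits a URI into the part before '#' and the fragment after it.
document_uri, fragment = urldefrag(person_uri)

print(document_uri)  # the RDF document that describes the person
print(fragment)      # the local name of the person inside that document
```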

Page 8: Creating Linked Data 2/5 Semtech2011

2) Use HTTP URIs so that people can look up (dereference) those names.

Page 9: Creating Linked Data 2/5 Semtech2011

2) Use HTTP URIs so that people can look up (dereference) those names.

• The HTTP protocol is the Web’s universal access mechanism

• Linked Data only uses HTTP URIs
  – URI: unique name
  – HTTP URI: universal means of access to the URI

• HTTP URIs should be dereferenceable
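
As a rough sketch of what dereferencing looks like from the client side, the snippet below prepares an HTTP GET request that asks for RDF instead of HTML. The DBpedia URI is only an example, the helper name `build_rdf_request` is hypothetical, and the request is built but not sent, so the code runs without network access.

```python
import urllib.request

def build_rdf_request(uri: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request that asks for RDF/XML."""
    return urllib.request.Request(uri, headers={"Accept": "application/rdf+xml"})

request = build_rdf_request("http://dbpedia.org/resource/London")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))

# To actually dereference the URI (requires network access):
# with urllib.request.urlopen(request) as response:
#     print(response.geturl(), response.headers.get("Content-Type"))
```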

Page 10: Creating Linked Data 2/5 Semtech2011

Dereference a URI?

Page 16: Creating Linked Data 2/5 Semtech2011

What’s with the redirection?

Page 28: Creating Linked Data 2/5 Semtech2011

RDFa

<html>…
<div xmlns:dc="http://purl.org/dc/elements/1.1/">
<h2 property="dc:title">The trouble with Bob</h2>
<h3 property="dc:creator">Alice</h3>
….
</div>…
</html>

Page 30: Creating Linked Data 2/5 Semtech2011

Minting HTTP URIs

• If you own the domain name and run a web server at that location, mint URIs in this namespace

• I own the domain mycompany.com
• I run a web server at http://mycompany.com
• I can now mint URIs in this namespace:
  – http://mycompany.com/person/Juan-Sequeda

Page 31: Creating Linked Data 2/5 Semtech2011

Create Cool URIs

• If you don’t control a namespace, don’t misuse it
  – http://www.imdb.com/title

• Avoid implementation details
  – http://foo.mycompany.com:8080/person.php?id=123&format=rdf

• Use natural keys within the URI
  – http://mycompany.com/person/Juan-Sequeda
  – http://mycompany.com/person/123
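
A minimal sketch of minting such URIs from a natural key, assuming the `http://mycompany.com` namespace from the slides. The function name `mint_uri` and the hyphen-for-space convention are our assumptions; the point is that the result carries no port, script name, or query string.

```python
import re
from urllib.parse import quote

BASE = "http://mycompany.com"  # namespace used on the slides

def mint_uri(kind: str, natural_key: str) -> str:
    """Build a URI such as http://mycompany.com/person/Juan-Sequeda."""
    # Collapse runs of whitespace into single hyphens.
    slug = re.sub(r"\s+", "-", natural_key.strip())
    # Percent-encode anything that is not safe inside a path segment.
    return f"{BASE}/{quote(kind, safe='')}/{quote(slug, safe='')}"

print(mint_uri("person", "Juan Sequeda"))
```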

Page 32: Creating Linked Data 2/5 Semtech2011

Three different URIs

• URI for the real-world object (non-information resource)
  – http://dbpedia.org/resource/London
  – http://id.mycompany.com/person/Juan-Sequeda
  – http://mycompany.com/person/Juan-Sequeda
  – http://www.juansequeda.com/foaf.rdf#me

• URI for the HTML document (information resource) that describes the real-world object
  – http://dbpedia.org/page/London
  – http://pages.mycompany.com/person/Juan-Sequeda
  – http://mycompany.com/person/Juan-Sequeda.html

• URI for the RDF document (information resource) that describes the real-world object
  – http://dbpedia.org/data/London
  – http://data.mycompany.com/Juan-Sequeda
  – http://mycompany.com/person/Juan-Sequeda.rdf
  – http://www.juansequeda.com/foaf.rdf
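
One common way to tie the three URIs together is a 303 See Other redirect combined with content negotiation: a request for the thing URI is redirected to the HTML or the RDF document depending on the Accept header. The transcript lists the URIs but not the mechanism, so the following is only a sketch under that assumption, using the DBpedia URI pattern above. The function name `see_other` is hypothetical, and real servers also weigh q-values, which this ignores.

```python
from typing import Tuple

RESOURCE_PREFIX = "http://dbpedia.org/resource/"

def see_other(resource_uri: str, accept: str) -> Tuple[int, str]:
    """Return the (status, Location) a server could answer with for a thing URI."""
    if not resource_uri.startswith(RESOURCE_PREFIX):
        raise ValueError("not a URI in the resource namespace")
    name = resource_uri[len(RESOURCE_PREFIX):]
    # Very rough content negotiation: RDF if asked for, HTML otherwise.
    wants_rdf = "application/rdf+xml" in accept or "text/turtle" in accept
    target = "data" if wants_rdf else "page"
    return 303, f"http://dbpedia.org/{target}/{name}"

print(see_other("http://dbpedia.org/resource/London", "text/html"))
print(see_other("http://dbpedia.org/resource/London", "application/rdf+xml"))
```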

Page 33: Creating Linked Data 2/5 Semtech2011

3) Provide useful information

Page 34: Creating Linked Data 2/5 Semtech2011

3) Provide useful information

• How do we provide useful information in document form on the web? HTML

• How do we provide useful information in data form on the web? RDF

• Different ways of serializing RDF
  – RDF/XML
  – RDFa
  – N3
  – Turtle

Page 35: Creating Linked Data 2/5 Semtech2011

RDF

subject – predicate – object

Coldplay is the artist of Viva la Vida

Subject: http://dbpedia.org/resource/Coldplay

Predicate: http://dbpedia.org/ontology/artist

Object: http://dbpedia.org/resource/Viva_la_Vida

Page 36: Creating Linked Data 2/5 Semtech2011

prefix dbpedia-owl: <http://dbpedia.org/ontology/>
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix dbprop: <http://dbpedia.org/property/>
prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>

<http://dbpedia.org/resource/Coldplay> dbpedia-owl:artist <http://dbpedia.org/resource/Viva_la_Vida>
<http://dbpedia.org/resource/Coldplay> foaf:name "Coldplay"
<http://dbpedia.org/resource/Coldplay> dbprop:origin <http://dbpedia.org/resource/London>
<http://dbpedia.org/resource/London> geo:lat 51.507778
<http://dbpedia.org/resource/London> geo:long -0.128056

Page 37: Creating Linked Data 2/5 Semtech2011

N-Triples

<http://dbpedia.org/resource/Coldplay> <http://dbpedia.org/ontology/artist> <http://dbpedia.org/resource/Viva_la_Vida> .
<http://dbpedia.org/resource/Coldplay> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Band> .

RDF/XML

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <dbpedia-owl:Band xmlns:dbpedia-owl="http://dbpedia.org/ontology/" rdf:about="http://dbpedia.org/resource/Coldplay">
    <dbpedia-owl:artist rdf:resource="http://dbpedia.org/resource/Viva_la_Vida"/>
  </dbpedia-owl:Band>
</rdf:RDF>

Turtle

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<http://dbpedia.org/resource/Coldplay>
    a <http://dbpedia.org/ontology/Band> ;
    <http://dbpedia.org/ontology/artist> <http://dbpedia.org/resource/Viva_la_Vida> .
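
To make the N-Triples form concrete, here is a small sketch that writes the same two statements from Python tuples. It handles only triples whose three terms are all URIs (no literals or blank nodes), and the helper name `to_ntriples` is ours; a real project would normally use an RDF library instead.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]

def to_ntriples(triples: Iterable[Triple]) -> str:
    """Serialize URI-only triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)

coldplay = "http://dbpedia.org/resource/Coldplay"
triples = [
    (coldplay, "http://dbpedia.org/ontology/artist",
     "http://dbpedia.org/resource/Viva_la_Vida"),
    (coldplay, "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "http://dbpedia.org/ontology/Band"),
]
print(to_ntriples(triples))
```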

Page 38: Creating Linked Data 2/5 Semtech2011

HTML

<div>
My name is Bob Smith, but people call me Smithy.
Here is my home page: <a href="http://www.example.com">www.example.com</a>.
I live in Albuquerque, NM and work as an engineer at ACME Corp.
My friends: <a href="http://darryl-blog.example.com">Darryl</a>, <a href="http://edna-blog.example.com">Edna</a>
</div>

Page 39: Creating Linked Data 2/5 Semtech2011

RDFa (RDF in HTML)

<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Person">
My name is <span property="v:name">Bob Smith</span>, but people call me <span property="v:nickname">Smithy</span>.
Here is my homepage: <a href="http://www.example.com" rel="v:url">www.example.com</a>.
I live in
<span rel="v:address">
  <span typeof="v:Address">
    <span property="v:locality">Albuquerque</span>, <span property="v:region">NM</span>
  </span>
</span>
and work as an <span property="v:title">engineer</span> at <span property="v:affiliation">ACME Corp</span>.
My friends: <a href="http://darryl-blog.example.com" rel="v:friend">Darryl</a>, <a href="http://edna-blog.example.com" rel="v:friend">Edna</a>
</div>
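
To show that the `property` attributes make the text machine-readable, the sketch below pulls out (property, value) pairs with Python's standard HTML parser. It is not an RDFa processor: it ignores `rel`, `typeof`, and prefix resolution, and it assumes each element with a `property` attribute contains only text. The class name `PropertyCollector` is ours.

```python
from html.parser import HTMLParser

class PropertyCollector(HTMLParser):
    """Collects (property, text) pairs from elements with a `property` attribute."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._current = None  # property name of the element being read
        self._buffer = []     # text seen inside that element

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop is not None:
            self._current, self._buffer = prop, []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Assumption: the first end tag after a property start tag closes it.
        if self._current is not None:
            self.pairs.append((self._current, "".join(self._buffer).strip()))
            self._current = None

markup = (
    '<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Person">'
    'My name is <span property="v:name">Bob Smith</span>, but people call me '
    '<span property="v:nickname">Smithy</span>.</div>'
)
collector = PropertyCollector()
collector.feed(markup)
collector.close()
print(collector.pairs)
```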

Page 40: Creating Linked Data 2/5 Semtech2011

What to publish?

• Literal Triples
<http://www.bbc.co.uk/music/artists/cc197bad-dc9c-440d-a5b5-d52ba2e14234#artist> foaf:name "Coldplay"

• Outgoing Links
<http://www.bbc.co.uk/music/artists/cc197bad-dc9c-440d-a5b5-d52ba2e14234#artist> owl:sameAs <http://dbpedia.org/resource/Coldplay>

• Incoming Links
<http://www.bbc.co.uk/music/artists/18690715-59fa-4e4d-bcf3-8025cf1c23e0#artist> mo:member_of <http://www.bbc.co.uk/music/artists/cc197bad-dc9c-440d-a5b5-d52ba2e14234#artist>

Page 41: Creating Linked Data 2/5 Semtech2011

What to publish?

• Description of the data set
  – Semantic Sitemaps
  – voiD (Vocabulary of Interlinked Datasets)

• Provenance Metadata
• License Information

Page 42: Creating Linked Data 2/5 Semtech2011

Vocabularies (or Schemas or Ontologies)

• Create your own using
  – Simple Knowledge Organization System (SKOS)
    • Taxonomy
  – RDF Vocabulary Description Language (RDF Schema)
    • Lightweight vocabularies
  – Web Ontology Language (OWL)
    • Highly expressive and capable of inferencing

Page 43: Creating Linked Data 2/5 Semtech2011

Vocabularies (or Schemas or Ontologies)

• Reuse vocabularies
  – Dublin Core: metadata attributes
  – Friend of a Friend (FOAF): persons and relationships
  – Semantically Interlinked Online Communities (SIOC): describing users, posts, blogs, etc.
  – Description of a Project (DOAP)
  – Music Ontology
  – Programmes Ontology: TV and radio programs
  – GoodRelations: describing products and services
  – Review Vocabulary
  – Basic Geo (WGS84) Vocabulary

Page 44: Creating Linked Data 2/5 Semtech2011

4) Include links to other things

Page 45: Creating Linked Data 2/5 Semtech2011

4) Include links to other things

• Set external RDF links into other data sources on the Web
  – Subject of the triple is in the namespace of one data set
  – Object of the triple is a URI in the namespace of another data set
• Connect siloed data islands
• Enable discovery
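
The definition of an external link can be written as a small check. This is a sketch under two assumptions of ours: a data set's namespace is identified by a URI prefix, and literal objects are plain strings that do not start with `http://` or `https://`. The function name `is_external_link` is hypothetical.

```python
from typing import Tuple

Triple = Tuple[str, str, str]

def is_external_link(triple: Triple, local_namespace: str) -> bool:
    """True when the subject is local and the object is a URI from elsewhere."""
    subject, _predicate, obj = triple
    object_is_uri = obj.startswith("http://") or obj.startswith("https://")
    return (
        subject.startswith(local_namespace)
        and object_is_uri
        and not obj.startswith(local_namespace)
    )

BBC = "http://www.bbc.co.uk/music/"
artist = "http://www.bbc.co.uk/music/artists/cc197bad-dc9c-440d-a5b5-d52ba2e14234#artist"

link = (artist, "http://www.w3.org/2002/07/owl#sameAs", "http://dbpedia.org/resource/Coldplay")
name = (artist, "http://xmlns.com/foaf/0.1/name", "Coldplay")

print(is_external_link(link, BBC))  # object lives in the DBpedia namespace
print(is_external_link(name, BBC))  # object is a literal, not a link
```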

Page 46: Creating Linked Data 2/5 Semtech2011

4) Include links to other things

• Relationship Links
<http://www.bbc.co.uk/music/artists/cc197bad-dc9c-440d-a5b5-d52ba2e14234#artist> <http://xmlns.com/foaf/0.1/based_near> <http://dbpedia.org/resource/London>

• Identity Links
<http://www.bbc.co.uk/music/artists/cc197bad-dc9c-440d-a5b5-d52ba2e14234#artist> <http://www.w3.org/2002/07/owl#sameAs> <http://dbpedia.org/resource/Coldplay>

• Vocabulary Links
<http://purl.org/ontology/mo/image> <http://www.w3.org/2000/01/rdf-schema#subPropertyOf> <http://xmlns.com/foaf/0.1/depiction>
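
A vocabulary link pays off when a consumer applies it: every statement made with `mo:image` also holds for `foaf:depiction`. The sketch below applies one level of `rdfs:subPropertyOf` to a list of triples. The subject and image URIs under `example.org` are hypothetical, and the function name `infer_superproperties` is ours; a full RDFS reasoner would also follow chains of sub-properties.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]

MO_IMAGE = "http://purl.org/ontology/mo/image"
FOAF_DEPICTION = "http://xmlns.com/foaf/0.1/depiction"

def infer_superproperties(triples: Iterable[Triple],
                          sub_property_of: Dict[str, str]) -> Set[Triple]:
    """Return the extra triples entailed by one level of rdfs:subPropertyOf."""
    inferred = set()
    for s, p, o in triples:
        if p in sub_property_of:
            inferred.add((s, sub_property_of[p], o))
    return inferred

# Hypothetical data: an artist with an image.
data = [("http://example.org/artist/1", MO_IMAGE, "http://example.org/img/1.jpg")]
print(infer_superproperties(data, {MO_IMAGE: FOAF_DEPICTION}))
```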

Page 47: Creating Linked Data 2/5 Semtech2011

Which predicate for linking to choose?

• Depends on your domain
• Is it widely used?
  – owl:sameAs
  – foaf:knows
  – foaf:based_near
  – …

• If you create your own, relate it to a widely used predicate

Page 48: Creating Linked Data 2/5 Semtech2011

How to create the links?

• Manually
  – Works for small and static data sets
  – I want to find another URI that identifies the same real-world object that I have
    • Sindice and Falcons provide an index of URIs by keyword

• (Semi) Automatic
  – Record Linkage/Identity Resolution/Co-reference
  – Silk: http://www4.wiwiss.fu-berlin.de/bizer/silk/
  – LIMES: http://aksw.org/Projects/limes
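
As a toy illustration of the (semi) automatic approach, the sketch below proposes `owl:sameAs` links by comparing labels with a string-similarity score. It is not how Silk or LIMES work internally; the function names, the 0.9 threshold, and the choice of `difflib` are all our assumptions, and candidate links from such a heuristic would still need review.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

def normalize(label: str) -> str:
    """Lower-case a label and collapse its whitespace."""
    return " ".join(label.lower().split())

def propose_links(local: Dict[str, str], remote: Dict[str, str],
                  threshold: float = 0.9) -> List[Tuple[str, str, str]]:
    """Pair URIs whose labels are at least `threshold` similar."""
    links = []
    for local_uri, local_label in local.items():
        for remote_uri, remote_label in remote.items():
            score = SequenceMatcher(
                None, normalize(local_label), normalize(remote_label)
            ).ratio()
            if score >= threshold:
                links.append((local_uri, OWL_SAME_AS, remote_uri))
    return links

local = {
    "http://www.bbc.co.uk/music/artists/cc197bad-dc9c-440d-a5b5-d52ba2e14234#artist": "Coldplay",
}
remote = {
    "http://dbpedia.org/resource/Coldplay": "coldplay ",
    "http://dbpedia.org/resource/London": "London",
}
for link in propose_links(local, remote):
    print(link)
```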