the web of linked data -...

54
WebDB 2010 June 6 th 2010 Indianapolis USA June 6 th , 2010, Indianapolis, USA The Web of Linked Data The Web of Linked Data A global public dataspace on the Web A global public dataspace on the Web Christian Bizer Freie Universität Berlin

Upload: others

Post on 14-Aug-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

WebDB 2010June 6th 2010 Indianapolis USAJune 6th, 2010, Indianapolis, USA

The Web of Linked DataThe Web of Linked DataA global public dataspace on the WebA global public dataspace on the Web

Christian Bizer Freie Universität Berlin

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 2: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Outline

1. Foundations of Dataspaces and Linked Data Where do they overlap?

2. The Web of Linked Data What data is out there?

3. Linked Data Applications Wh t i b i d ith th d t ? What is being done with the data?

4 Remarks on4. Remarks on Identity

Self-descriptive Data Self-descriptive Data

Pay-as-you-go Integration

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 3: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

The Dataspace Vision

Alternative to classic data integration systems in

P ti f d t

order to cope with growing number of data sources.

Properties of dataspaces may contain any kind of data

(structured semi-structured unstructured)(structured, semi structured, unstructured)

require no upfront investment into a global schema

provide for data-coexistence provide for data coexistence

give best-effort answers to queries

rely on pay-as-you-go data integration rely on pay as you go data integration

Franklin M Halevy A and Maier D : From Databases to DataspacesFranklin, M., Halevy, A., and Maier, D.: From Databases to Dataspaces A new Abstraction for Information Management, SIGMOD Rec. 2005.

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 4: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Dataspace Architecture

Christian Bizer: The Web of Linked Data (6/6/2010)Source: Franklin et al: From Databases to Dataspaces, SIGMOD Rec. 2005.

Page 5: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Linked Data Principles

Set of best practices for publishingSet of best practices for publishing structured data on the Web in accordance with the general architecture of the Web.

1 Use URIs as names for things1. Use URIs as names for things.

2. Use HTTP URIs so that people can look up those names.

3. When someone looks up a URI, provide useful RDFinformation.

4. Include RDF statements that link to other URIs so that they can discover related things.

Tim Berners-Lee, http://www.w3.org/DesignIssues/LinkedData.html, 2006

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 6: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Architecture of the classic Web

Single global information spaceWeb

BrowsersSearch Engines

Single global information space

S ll t f i l t d dSmall set of simple standards1. HTML as document format2 HTTP URL

HTTP

HTML HTMLHTML

2. HTTP URLs as globally unique IDs retrieval mechanism

hyper-links

retrieval mechanism

3. Hyperlinks to connect everything

B CA B CA

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 7: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Web 2.0 APIs and Mashups

No single global dataspaceMashup

No single global dataspace

Sh t iShortcomings

1. APIs have proprietary interfaces

WebAPI

2. Mashups are based on a fixed set of data sources

3 Y t t h li k

WebAPI

WebAPI

WebAPI

3. You can not set hyperlinks between data items within different APIs

A B C DA B C D

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 8: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Web APIs slice the Web into Walled Gardens

Christian Bizer: The Web of Linked Data (6/6/2010)Image: Bob Jagensdorf, http://flickr.com/photos/darwinbell/, CC-BY

Page 9: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Linked Data

Extend the Web with a single global dataspace1. by using RDF to publish structured data on the Web2. by setting links between data items within different

data sources

RDF RDF RDF RDF RDFRDF

RDF

RDF

RDF

RDF

RDF RDF

RDF

RDF

RDF

RDFlink

RDFlinks

RDFlinks

RDFlinks

B CA D E

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 10: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

The RDF Data Model

rdf:type

f f

foaf:Personrdf:type

pd:cygri

Richard Cyganiakfoaf:name

foaf:based neardbpedia:Berlin

foaf:based_near

Flexible graph-based data model.

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 11: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Entities are identified with HTTP URIs

rdf:typepd:cygri

f f

foaf:Personrdf:type

Richard Cyganiakfoaf:name

foaf:based neardbpedia:Berlin

foaf:based_near

HTTP URIs take the role of global primary keys.

d i htt // i h d i k d /f f df# ipd:cygri = http://richard.cyganiak.de/foaf.rdf#cygridbpedia:Berlin = http://dbpedia.org/resource/Berlin

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 12: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Resolving URIs over the Web

rdf:type

3 405 259f f

foaf:Personrdf:type

pd:cygri

3.405.259dp:populationRichard Cyganiak

foaf:name

foaf:based near

skos:subject

dbpedia:Berlinfoaf:based_near

d Citi i G

skos:subject

dp:Cities_in_Germany

The HTTP protocol brings together identification and retrie al again

Christian Bizer: The Web of Linked Data (6/6/2010)

retrieval again.

Page 13: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Following Links deeper into the Web

rdf:type

3 405 259f f

foaf:Personrdf:type

pd:cygri

3.405.259dp:populationRichard Cyganiak

foaf:name

foaf:based near

skos:subject

dbpedia:Berlinfoaf:based_near

d Citi i G

skos:subject

db di H bskos:subject

dp:Cities_in_Germanydbpedia:Hamburg

dbpedia:Muenchen skos:subject

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 14: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

The Disco – Hyperdata Browser

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 15: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 16: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Properties of the Web of Linked Data

Global, distributed dataspace built on a simple set of standards RDF, URIs, HTTP

Entities are connected by links creating a global data graph that spans data sources and

enables the discovery of new data sources.

Provides for data-coexistence Everyone can publish data to the Web of Linked Data Everyone can publish data to the Web of Linked Data

Everyone can express their personal view on things

Everybody can use the schemata that they like for this Everybody can use the schemata that they like for this

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 17: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

2. Linked Data Deployment on the Web

Is this real?

RDF RDF RDF RDF RDFRDF

RDF

RDF

RDF

RDF

RDF RDF

RDF

RDF

RDF

RDFlink

RDFlinks

RDFlinks

RDFlinks

B CA D E

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 18: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

W3C Linking Open Data Project

Grassroots community effort toy publish existing open license datasets as Linked Data on the Web interlink things between different data sources

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 19: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

LOD Datasets on the Web: May 2007

Over 500 million RDF triples

Christian Bizer: The Web of Linked Data (6/6/2010)

p Around 120,000 RDF links between data sources

Page 20: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

LOD Datasets on the Web: September 2008

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 21: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

LOD Datasets on the Web: July 2009

Christian Bizer: The Web of Linked Data (6/6/2010)

Over 13.1 billion RDF triples Over 142 million RDF links between data sources

Page 22: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

DBpedia – An Interlinking Hub in the Web of Data

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 23: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

DBpedia

community effort to extract structured information from Wikipediainformation from Wikipedia.

provides data about 3.4 million things 312 000 persons 312,000 persons 140,000 organizations 413,000 places413,000 places 94,000 music albums 49,000 films 146,000 species …

provides identifiers for many common things http://dbpedia.org/resource/Calgary

overlaps with many other data sources on the Web

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 24: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

The LOD effort is losing track with the diagram :-)

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 25: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 26: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 27: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Uptake in Life Sciences

W3C Linking Open Drug Data Effort

Bio2RDF Project

Allen Brain Atlas

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 28: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Uptake in the Libraries Community

Institutions publishing Linked Data Library of Congress (subject headings)

German National Library (PND dataset and subject headings)

Swedish National Library (Libris - catalog)

Hungarian National Library (OPAC and Digital Library)

German Central Library of Economics (subject headings)

Workshop: Semantic Web in Bibliotheken (SWIB09) Köln, 24. und 25. November 2009

http://www.swib09.de/

W3C Library Linked Data Incubator Group

Open Archives Object Reuse and Exchange (OAI-ORE) Standard

Christian Bizer: The Web of Linked Data (6/6/2010)

p j g ( )

Page 29: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Uptake in the Media Industry

publish data as RDF/XML and/or

embed data into HTML using RDFa embed data into HTML using RDFa

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 30: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

The Structural Continuum

The Web of Linked Data is interwoven with the classic Web.

Unstructured data: HTML

The Web of Linked Data is interwoven with the classic Web.

Unstructured data: HTML

Semi-structured data: RDFa embed into HTML

Structured data: RDF/XML

Services using named entity recognition to annotate texts with Linked Data URIsto annotate texts with Linked Data URIs Open Calais (Thomsons Reuters) for news

Z t ( t t ) f bl t Zemanta (startup) for blog posts

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 31: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

3. Linked Data Applications

What can I do with this?

Search Engines

Linked DataMashups

Linked DataBrowsers EnginesMashupsBrowsers

Thing Thing Thing Thing ThingThing

Thing

Thing

Thing

Thing

Thing Thing

Thing

Thing

Thing

typedlinks

typedlinks

typedlinks

typedlinks

B CA D E

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 32: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Linked Data Browsers

P id f i ti b t d tProvide for navigating between data sources in order to explore the dataspace.

Tabulator Browser (MIT, USA)

Marbles (FU Berlin, DE)

OpenLink RDF Browser (OpenLink, UK)p ( p )

Zitgist RDF Browser (Zitgist, USA)

Di H d t B (FU B li DE)Disco Hyperdata Browser (FU Berlin, DE)

Fenfire (DERI, Irland)

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 33: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 34: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

DBpedia Mobile

Displays DBpedia data on a mapon a map

Provides for navigating into other data sources

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 35: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Web of Data Search Engines

C l th d t d id b t ff tCrawl the dataspace and provide best-effort query answers over crawled data.

Falcons (IWS, China)

Sig.ma (DERI, Ireland)

Swoogle (UMBC, USA)

VisiNav (DERI, Ireland)

W t (O U i it UK)Watson (Open University, UK)

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 36: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 37: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 38: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 39: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

What are the big players doing?

Yahoo! and Google have started to crawl Linked Data in its RDFa serialization as well as MicroformatsRDFa serialization as well as Microformats.

Yahoo! provides access to crawled data through the Yahoo BOSS API

is using the data within Yahoo Search Monkey to make search results f l d i ll limore useful and visually appealing.

Google uses crawled RDF data for its Social Graph API

uses crawled data to enhance search results snippets f i d lfor reviews and people.

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 40: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Yahoo! Search Monkey

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 41: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Facebook’s Open Graph Protocol

Facebook imports RDFa data from external web sites.

For instance: IMDb, Microsoft, NHL, Posterous

Rotten Tomatoes, TIME, Yelp

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 42: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

4. Remarks on

1 Identif1. Identify

2. Self-descriptive Data p

3. Pay-as-you-go Integration

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 43: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Identity

Real world objects are identified with multiple URIs.

Coupling of identification and retrieval.

Data-coexistence: Everybody can say everything Data-coexistence: Everybody can say everything about everything.

Wrapper around the

Linked Data websiteof our research group

Wrapper around the DBLP bibliography

http://www4.wiwiss.fu-berlin.de/is-group/resource/persons/Person4

http://dblp.l3s.de/d2r/resource/authors/Christian_Bizer

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 44: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Identity Resolution

Publication of owl:sameAs links on the Web.

<http://www4.wiwiss.fu-berlin.de/is-group/resource/persons/Person4>

owl:sameAs

Pay as you go Identity Management

<http://dblp.l3s.de/d2r/resource/authors/Christian_Bizer> .

Pay-as-you-go Identity Management Cheap to set up: Just put a wrapper in front of your DB

(for instance using D2R Server)(for instance using D2R Server)

Later: You or somebody else invests effort into identity resolution

related approach: iTrail hints (Vas Salles et al., VLDB 06, 07) related approach: iTrail hints (Vas Salles et al., VLDB 06, 07)

How to create owl:sameAs links? A t ti b d d l ti t hi d i ti Automatic based on declarative matching descriptions

(for instance using the Silk Linking Framework )

Manually (for instance like within Ueberblick org)

Christian Bizer: The Web of Linked Data (6/6/2010)

Manually (for instance like within Ueberblick.org)

Page 45: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Pay-As-You-Go Data Integration

1. Stage: RAW DATA NOW! don’t care too much about the schema don t care too much about the schema

just publish your data as RDF on the Web

http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html

2 Stage: Increase the usefulness of your data and2. Stage: Increase the usefulness of your data and ease data integration by making it self-descriptive.

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 46: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Enable Clients to retrieve the Schema

Clients can resolve the URIs that identify vocabulary terms in order to get their RDFS or OWL definitions.

Some data on the Web<http://richard.cyganiak.de/foaf.rdf#cygri>

foaf:name "Richard Cyganiak" ;

Some data on the Web

rdf:type <http://xmlns.com/foaf/0.1/Person> .

Resolve unknown term

RDFS or OWL definition

Resolve unknown term http://xmlns.com/foaf/0.1/Person

<http://xmlns.com/foaf/0.1/Person>

rdf:type owl:Class ;

RDFS or OWL definition

rdfs:label "Person";

rdfs:subClassOf <http://xmlns.com/foaf/0.1/Agent> ;

rdfs:subClassOf <http://xmlns.com/wordnet/1.6/Agent> .

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 47: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Reuse Terms from Common Vocabularies

Common Vocabularies Friend-of-a-Friend for describing people and their social network

SIOC for describing forums and blogs

SKOS for representing topic taxonomies

Organization Ontology for describing the structure of organizations

GoodRelations for describing products and business entities

Music Ontology for describing artists, albums, and performances

Review Vocabulary provides terms for representing reviews

Common sources of identifiers (URIs) for real world objects LinkedGeoData and Geonames: Locations

GeneID and UniProt: Life science identifiers

Dbpedia: Wide range of things

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 48: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Publish Schema Mappings on the Web

<http://xmlns com/foaf/0 1/Person>

Schema Mapping<http://xmlns.com/foaf/0.1/Person>

owl:equivalentClass

<http://dbpedia.org/ontology/Person> .

Simple Mappings: OWL owl:equivalentClass, owl:equivalentProperty

Complex Mappings: R2Rg provides value transformation functions

structural transformations

Pay-as-you-go Aspecty y g p1. Use a mix of common vocabularies and proprietary terms

2. You or somebody else publishes schema mappings afterwards

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 49: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Somebody-Pays-As-You-Go

The overall data integration effort is Fix 

Overall Data  Integration

split between the data publisher, the data consumer and third parties.

Data Publisher publishes data as RDF

IntegrationEffort

publishes data as RDF

publishes data in a self-descriptive fashion

sets links and publishes mappings sets links and publishes mappings

Third PartiesThird 

Publisher‘s set links pointing at your data

publish mappings to the Web

Party Effort

Publisher‘sEffort

Data Consumer has to do the rest

Consumer‘sEffort

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 50: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Hands on: How to play around with Linked Data

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 51: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Hands on: How to play around with Linked Data

1. Get some data using a crawler for instance: LDspider (GPL license)

http://code.google.com/p/ldspider/

2. Store the data using for instance: Virtuoso (GPL), Sesame (BSD), Jena TDB (BSD)

or any relational database or column store you like

decision help: Berlin SPARQL Benchmark (Nov 2009)

3. Query and analyze the data using the SPARQL query language using the SPARQL query language

SPARQL 1.1 adds support for aggregates, subqueries, negation

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 52: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Shortcut: Billion Triples Challenge Dataset

Download the Billion Triples Challenge Dataset 3.2 billion triples (27GB gzipped)

crawled from the public Web of Linked Data in March/April 2010

http://challenge.semanticweb.org/

If you do something interesting with the data submit your results to the challenge until October 1st

present your results at the 9th International Semantic Web Conference (ISWC2010), November 2010, Shanghai, China

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 53: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Summary

Linked Data moves the dataspace vision to a global scaleand adds the social/community aspect to it.

The Web of Linked Data is growing rapidlyg g p y active deployment communities in different domains

might have exceeded the critical massg

Great playground for experimentation dataspace profiling dataspace profiling

probabilistic and approximate schema mapping

data fusion data quality and trust data fusion, data quality, and trust

What will the user interfaces look like?

Will search engines turn into answer engines? Will search engines turn into answer engines?

Christian Bizer: The Web of Linked Data (6/6/2010)

Page 54: The Web of Linked Data - uni-mannheim.dewifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-WebDB-WebOf… · LOD Datasets on the Web: July 2009 Christian Bizer: The Web of Linked

Thanks!

References Overview Article Overview Article

Christian Bizer, Tom Heath, Tim Berners-Lee: Linked Data – The Story So Farhttp://tomheath.com/papers/bizer-heath-berners-lee-ijswis-linked-data.pdf

Linking Open Data Project Wiki http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData

Tutorial on How to Publish Linked Data on the Web Tutorial on How to Publish Linked Data on the Webhttp://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/

3rd Linked Data on the Web Workshop at WWW2010

Christian Bizer: The Web of Linked Data (6/6/2010)

3 Linked Data on the Web Workshop at WWW2010http://events.linkeddata.org/ldow2010/