Linked Data Technology and Status

Linked Data & Semantic Web Technology Linked Data Technology & Status Dr. Myungjin Lee

Upload: myungjin-lee

Post on 19-Jan-2015


Page 1: Linked Data Technology and Status

Linked Data & Semantic Web Technology

Linked Data

Technology & Status

Dr. Myungjin Lee

Page 2: Linked Data Technology and Status


The Semantic Web

• XML: an elemental syntax for content structure within documents
• RDF: a simple language for expressing data models, which refer to objects ("resources") and their relationships
• OWL: more vocabulary for describing properties and classes
• RDF Schema: a vocabulary for describing properties and classes of RDF-based resources
• SPARQL: a protocol and query language for semantic web data sources
• RIF: to exchange rules between many "rules languages"
• URI: a string of characters used to identify a name or a resource

http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/#(24)

Page 3: Linked Data Technology and Status


What is Linked Data?

Linked data describes a method of publishing structured data so that it can be interlinked and become more useful.

"The Semantic Web isn't just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related, data."
- A roadmap to the Semantic Web by Tim Berners-Lee

http://www.w3.org/DesignIssues/LinkedData.html

Page 4: Linked Data Technology and Status


Four Principles of Linked Data

1. Use URIs to identify things.
2. Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents.
3. Provide useful information about the thing when its URI is dereferenced, using standard formats such as RDF/XML.
4. Include links to other, related URIs in the exposed data to improve discovery of other related information on the Web.

http://www.w3.org/DesignIssues/LinkedData.html
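
Principles 2 and 3 amount to an ordinary HTTP request that asks for a machine-readable format. A minimal sketch, assuming the kdata.kr URI used later in this deck and a hypothetical helper name (`build_dereference_request`); the request is only built, not sent:

```python
from urllib.request import Request

def build_dereference_request(uri: str, want_rdf: bool = True) -> Request:
    """Build (but do not send) an HTTP GET request for a Linked Data URI."""
    # The Accept header tells the server which representation the client prefers.
    accept = "application/rdf+xml" if want_rdf else "text/html"
    return Request(uri, headers={"Accept": accept}, method="GET")

req = build_dereference_request("http://data.kdata.kr/resource/Namdaemun")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual lookup; whether the server still answers is not something this sketch assumes.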

Page 5: Linked Data Technology and Status


5 Star Linked Data

★ Available on the web (whatever format) but with an open licence, to be Open Data
★★ Available as machine-readable structured data (e.g. Excel instead of an image scan of a table)
★★★ As (2), plus a non-proprietary format (e.g. CSV instead of Excel)
★★★★ All the above, plus: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff
★★★★★ All the above, plus: link your data to other people’s data to provide context

http://www.w3.org/DesignIssues/LinkedData.html

Page 6: Linked Data Technology and Status


The Basic Requirements for Linked Data

• XML: an elemental syntax for content structure within documents
• RDF: a simple language for expressing data models, which refer to objects ("resources") and their relationships
• RDF Schema: a vocabulary for describing properties and classes of RDF-based resources
• SPARQL: a protocol and query language for semantic web data sources
• URI: a string of characters used to identify a name or a resource

Page 7: Linked Data Technology and Status

http://www.google.co.kr/search?q=namdeamun

Page 8: Linked Data Technology and Status


URI, Thing, and Representation

Thing

URI

Representation

http://data.kdata.kr/resource/Namdaemun

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">

<head>

<title>Namdaemun | kdata.kr</title>

<link rel="alternate" type="application/rdf+xml" href="http://data.kdata.kr/data/Namdaemun" title="RDF" />

</head>

<body onLoad="init();">

<div id="header">

<div>

<h1 id="title">Namdaemun</h1>

<div id="homelink"> &nbsp;at <a href="http://kdata.kr">kdata.kr</a>

[Diagram: Thing, URI, and Representation. Relations shown: identifies and names, represents, looks up, links, refers. Actors: Person, Machine. Other URIs shown: http://dbpedia.org/resource/Namdaemun and http://data.kdata.kr/resource/Sungnyemun]

http://www.slideshare.net/lysander07/open-hpi-semweb02part1

Page 9: Linked Data Technology and Status

http://www.w3.org/TR/cooluris/

Page 10: Linked Data Technology and Status


URIs for Real-World Objects

• Be on the Web
– Given only a URI, machines and people should be able to retrieve a description about the resource identified by the URI from the Web.
• Be unambiguous
– There should be no confusion between identifiers for Web documents and identifiers for other resources.

http://www.w3.org/TR/cooluris/

Page 11: Linked Data Technology and Status


URIs for Real-World Objects

<URI-of-alice> a foaf:Person;
    foaf:name "Alice";
    foaf:mbox <mailto:[email protected]>;
    foaf:homepage <http://www.example.com/people/alice> .

• ID: resource identifier (URI)
• RDF: RDF document URI, for semantic web applications
• HTML: HTML document URI, for web browsers

http://www.w3.org/TR/cooluris/

Page 12: Linked Data Technology and Status


Distinguishing between Representations and Descriptions

• Thing: http://data.kdata.kr/resource/Namdaemun
– a request for the thing is answered with a 303 redirect to the generic document
• Generic Document: http://data.kdata.kr/page/Namdaemun
– content negotiation selects one of its representations
• RDF: http://data.kdata.kr/page/Namdaemun.rdf (application/rdf+xml)
• HTML: http://data.kdata.kr/page/Namdaemun.html (text/html)
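
The 303 redirect followed by content negotiation can be sketched as a pure function over the slide's URIs. This is an illustration only: the `resolve` function and its routing constants are hypothetical, and a real server would also handle quality values in the Accept header.

```python
# Hypothetical routing constants mirroring the slide's URIs.
THING_PREFIX = "http://data.kdata.kr/resource/"
PAGE_PREFIX = "http://data.kdata.kr/page/"

# Media type -> file extension of the concrete representation.
REPRESENTATIONS = {
    "application/rdf+xml": ".rdf",
    "text/html": ".html",
}

def resolve(uri: str, accept: str = "text/html"):
    """Return (status, location, content_type) for one request step."""
    if uri.startswith(THING_PREFIX):
        # A real-world thing cannot be sent over HTTP, so answer 303 See Other
        # and point at the generic document that describes it.
        name = uri[len(THING_PREFIX):]
        return 303, PAGE_PREFIX + name, None
    if uri.startswith(PAGE_PREFIX):
        # Content negotiation: pick the representation from the Accept header,
        # falling back to HTML for unknown media types.
        media_type = accept if accept in REPRESENTATIONS else "text/html"
        return 200, uri + REPRESENTATIONS[media_type], media_type
    return 404, None, None

status, location, _ = resolve("http://data.kdata.kr/resource/Namdaemun")
print(status, location)
print(resolve(location, accept="application/rdf+xml"))
```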

Page 13: Linked Data Technology and Status


Cool URIs

• Simplicity

– short and mnemonic

• Stability

– remain as long as possible

• Manageability

– issue your URIs in a way that you can manage

http://www.w3.org/TR/cooluris/

Page 14: Linked Data Technology and Status


Designing URI Sets for the UK Public Sector

• URIs:
– name the set and describe its characteristics
– identify the real-world ‘Things’ of a single concept
– provide a means of looking up data on the web
– provide mechanisms to:
• look up an Identifier URI and be redirected to its Document URI
• discover and get each of the Representation URIs

URI Type | URI structure | Example
Identifier | http://{domain}/id/{concept}/{reference} | http://education.data.gov.uk/id/school/78

https://www.gov.uk/government/publications/designing-uri-sets-for-the-uk-public-sector

http://data.gov.uk/resources/uris
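
The Identifier template and the redirect to a Document URI can be sketched as follows. The function names are hypothetical, and the `/doc/` path for Document URIs is an assumption here, since the slide only shows the Identifier row.

```python
from urllib.parse import urlsplit

def identifier_uri(domain: str, concept: str, reference: str) -> str:
    """Fill the slide's template http://{domain}/id/{concept}/{reference}."""
    return f"http://{domain}/id/{concept}/{reference}"

def document_uri(identifier: str) -> str:
    """Map an Identifier URI to a Document URI (assumed /doc/ convention)."""
    parts = urlsplit(identifier)
    segments = parts.path.strip("/").split("/")
    if len(segments) < 3 or segments[0] != "id":
        raise ValueError(f"not an Identifier URI: {identifier}")
    return f"{parts.scheme}://{parts.netloc}/doc/" + "/".join(segments[1:])

school = identifier_uri("education.data.gov.uk", "school", "78")
print(school)
print(document_uri(school))
```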

Page 15: Linked Data Technology and Status


URI Design Principles:

Creating Unique URIs for Government Linked Data

• URI Template: 'http://' BASE '/' 'id' '/' ORG '/' CATEGORY ( '/' TOKEN )+
• States and Territories
– Owner: federal
– Suggested: http://BASE/id/us/state/NAME
– Example: http://logd.tw.rpi.edu/id/us/state/Vermont

http://logd.tw.rpi.edu/instance-hub-uri-design
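
One way to check a URI against the template is a regular expression. This is a sketch under the assumption that every component (BASE, ORG, CATEGORY, TOKEN) is a non-empty run of characters without a slash; `parse_instance_uri` is a hypothetical helper name.

```python
import re

# 'http://' BASE '/' 'id' '/' ORG '/' CATEGORY ( '/' TOKEN )+
# Assumption: each component is a non-empty run of characters without '/'.
SEGMENT = r"[^/\s]+"
URI_TEMPLATE = re.compile(
    rf"^http://(?P<base>{SEGMENT})/id/(?P<org>{SEGMENT})/(?P<category>{SEGMENT})"
    rf"(?P<tokens>(?:/{SEGMENT})+)$"
)

def parse_instance_uri(uri: str):
    """Split a URI into the template's parts, or return None if it does not fit."""
    match = URI_TEMPLATE.match(uri)
    if match is None:
        return None
    parts = match.groupdict()
    parts["tokens"] = parts["tokens"].strip("/").split("/")
    return parts

print(parse_instance_uri("http://logd.tw.rpi.edu/id/us/state/Vermont"))
print(parse_instance_uri("http://logd.tw.rpi.edu/id/us/state"))  # no TOKEN part
```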

Page 16: Linked Data Technology and Status


XML (Extensible Markup Language)

• a textual data format for the representation of arbitrary data structures over the Internet
• both human-readable and machine-readable

<title>W3C Demonstrates …</title>
<date>12 February 2013</date>
<body>W3C invites media, analysts, and other attendees of Mobile World Congress</body>

[Diagram "Concept": Content (title, date, body, bold1, bold2), Structure (title, date, body, bold1, bold2), Presentation]

Related Recommendations: XML DTD, XML Schema, XSLT, XSL-FO, XPath

http://en.wikipedia.org/wiki/Xml

Page 17: Linked Data Technology and Status


Data Representation of XML

• Various ways to represent data using XML
– Myungjin Lee is Hye-jin’s husband.

<conjugalrelation>
  <husband>Myungjin Lee</husband>
  <wife>Hye-jin Han</wife>
</conjugalrelation>

<conjugalrelation husband="Myungjin Lee">
  <wife>Hye-jin Han</wife>
</conjugalrelation>

<conjugalrelation husband="Myungjin Lee" wife="Hye-jin Han" />

• We need a method to represent data at an abstract level.
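
The cost of having several XML encodings for one fact shows up in the reading code, which has to know every layout. A small sketch using Python's standard `xml.etree.ElementTree`; the `read_couple` helper is hypothetical:

```python
import xml.etree.ElementTree as ET

# The three encodings of the same fact (written with straight quotes).
VARIANTS = [
    "<conjugalrelation><husband>Myungjin Lee</husband>"
    "<wife>Hye-jin Han</wife></conjugalrelation>",
    '<conjugalrelation husband="Myungjin Lee">'
    "<wife>Hye-jin Han</wife></conjugalrelation>",
    '<conjugalrelation husband="Myungjin Lee" wife="Hye-jin Han" />',
]

def read_couple(xml_text: str):
    """Return (husband, wife), checking attributes first, then child elements."""
    root = ET.fromstring(xml_text)

    def field(name: str):
        if name in root.attrib:
            return root.attrib[name]
        return root.findtext(name)

    return field("husband"), field("wife")

couples = [read_couple(v) for v in VARIANTS]
print(couples)
```

All three documents yield the same pair, but only because the reader was written with each layout in mind.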

Page 18: Linked Data Technology and Status


RDF (Resource Description Framework)

• a general method for conceptual description or modeling of information that is implemented in web resources, using a variety of syntax formats
– Myungjin Lee is Hye-jin’s husband.

[Diagram: Myungjin Lee, hasWife, Hye-jin Han]

http://en.wikipedia.org/wiki/Resource_Description_Framework

Page 19: Linked Data Technology and Status


Data Representation of RDF

• Subject (URI reference): http://semantics.kr/myungjinlee
• Predicate (URI reference): http://semantics.kr/rel/hasWife
• Object (URI reference or Literal): http://semantics.kr/hye-jinhan

Subject, predicate, and object together form a Triple.
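
The triple can be held in a small data structure and written out as one N-Triples line. A minimal sketch: the `Triple` class and `to_ntriples` function are hypothetical, and literal escaping covers only backslashes and double quotes.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str    # URI reference
    predicate: str  # URI reference
    obj: str        # URI reference, or a literal when is_literal is True
    is_literal: bool = False

def to_ntriples(t: Triple) -> str:
    """Serialize one triple as a single N-Triples line."""
    if t.is_literal:
        # Minimal escaping only: backslash and double quote.
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = f"<{t.obj}>"
    return f"<{t.subject}> <{t.predicate}> {obj} ."

married = Triple(
    "http://semantics.kr/myungjinlee",
    "http://semantics.kr/rel/hasWife",
    "http://semantics.kr/hye-jinhan",
)
print(to_ntriples(married))
```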

Page 20: Linked Data Technology and Status


RDF Example

Subject: http://www.cars.com/car#A6

• http://www.w3.org/1999/02/22-rdf-syntax-ns#type → http://www.cars.com/car#Car
• http://www.cars.com/car#transmission → http://www.cars.com/car#Auto_8-Speed
• http://www.cars.com/car#wheelbase → 115”
• http://www.cars.com/car#engine → http://www.cars.com/car#GDI
• http://www.cars.com/car#fuel → http://www.cars.com/car#Gasoline
• http://www.cars.com/car#drivetrain → http://www.cars.com/car#AWD
• http://www.cars.com/car#doors → 4
• http://www.cars.com/car#body_style → http://www.cars.com/car#Sedan

Page 21: Linked Data Technology and Status


RDF Serialization

• N-Triples – RDF Test Cases, W3C Recommendation, 10 February 2004
– a line-based, plain text serialization format for storing and transmitting RDF data
• Notation 3 (N3)
– a shorthand non-XML serialization of RDF models, designed with human-readability in mind
– much more compact and readable than XML RDF notation
• Turtle (Terse RDF Triple Language) – W3C Candidate Recommendation, 19 February 2013
– a format for expressing data in the Resource Description Framework (RDF) data model
– a subset of Notation3 (N3) language, and a superset of the minimal N-Triples format
• RDF/XML – W3C Recommendation, 10 February 2004
– an XML syntax for writing down and exchanging RDF graphs

http://en.wikipedia.org/wiki/N-Triples

http://en.wikipedia.org/wiki/Notation3

http://en.wikipedia.org/wiki/Turtle_(syntax)

Page 22: Linked Data Technology and Status


N-Triples:
<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/title> "Tony Benn" .
<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/publisher> "Wikipedia" .

RDF/XML:
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
  </rdf:Description>
</rdf:RDF>

N3:
@prefix dc: <http://purl.org/dc/elements/1.1/>.
<http://en.wikipedia.org/wiki/Tony_Benn> dc:title "Tony Benn";
    dc:publisher "Wikipedia".

Turtle:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix ex: <http://example.org/stuff/1.0/> .

<http://www.w3.org/TR/rdf-syntax-grammar>
    dc:title "RDF/XML Syntax Specification (Revised)" ;
    ex:editor [ ex:fullname "Dave Beckett";
                ex:homePage <http://purl.org/net/dajobe/>
              ] .
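
How the compact notations relate to N-Triples can be illustrated by grouping the Tony Benn triples by subject and abbreviating the predicates with a prefix. This is a toy converter, not a parser: the regular expression handles only the simple `<s> <p> "literal" .` shape used here, and the function names are hypothetical.

```python
import re

NTRIPLES = '''\
<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/title> "Tony Benn" .
<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/publisher> "Wikipedia" .
'''
PREFIXES = {"dc": "http://purl.org/dc/elements/1.1/"}

# Matches only the simple shape used above: <s> <p> "plain literal" .
LINE = re.compile(r'^<([^>]+)> <([^>]+)> ("[^"]*") \.$')

def shorten(iri: str) -> str:
    """Abbreviate an IRI with a known prefix, else keep it in angle brackets."""
    for prefix, namespace in PREFIXES.items():
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
    return f"<{iri}>"

def ntriples_to_turtle(text: str) -> str:
    by_subject = {}
    for line in text.splitlines():
        s, p, o = LINE.match(line).groups()
        by_subject.setdefault(s, []).append(f"{shorten(p)} {o}")
    out = [f"@prefix {p}: <{ns}> ." for p, ns in PREFIXES.items()]
    for s, pairs in by_subject.items():
        # ';' repeats the subject for the next predicate-object pair.
        out.append(f"<{s}> " + " ;\n    ".join(pairs) + " .")
    return "\n".join(out)

turtle = ntriples_to_turtle(NTRIPLES)
print(turtle)
```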

Page 23: Linked Data Technology and Status

http://www.w3.org/TR/rdf11-concepts/

Page 24: Linked Data Technology and Status


RDF 1.0 vs RDF 1.1

Feature | RDF 1.0 | RDF 1.1
Resource Identification | URI | IRI (Internationalized Resource Identifier)
Multiple RDF Graphs | No | Yes
HTML content for literal value | No | rdf:HTML
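
The practical difference between a URI and an IRI is that an IRI may contain non-ASCII characters directly. A minimal sketch of mapping an IRI to a URI by percent-encoding its UTF-8 bytes; the example IRI is hypothetical, and internationalized host names (which need IDNA) are not handled.

```python
from urllib.parse import quote, unquote

def iri_to_uri(iri: str) -> str:
    """Percent-encode the non-ASCII characters of an IRI as UTF-8 bytes."""
    # Keep URI delimiters and existing percent escapes untouched.
    return quote(iri, safe=":/?#[]@!$&'()*+,;=%-._~")

iri = "http://example.org/resource/남대문"   # hypothetical IRI
uri = iri_to_uri(iri)
print(uri)
print(unquote(uri) == iri)
```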

Page 25: Linked Data Technology and Status


Recommendations of RDF

http://www.w3.org/standards/techs/rdf#w3c_all

Page 26: Linked Data Technology and Status


RDF Schema

• W3C Recommendation, 10 February 2004

• to define classes and properties that may be used to describe classes, properties and other resources
• RDF Schema allows
– Definition of Classes
– Definition of Properties and Restrictions
– Definition of Hierarchies

http://www.slideshare.net/lysander07/openhpi-22

Page 27: Linked Data Technology and Status


RDF Schema Example

[Diagram: an RDF Schema graph split into a TBox (terminological component) and an ABox (assertion component). Nodes: car:Vehicle, car:Car, car:Style, car:A6, car:Sedan, rdfs:Class, rdf:Property. Edges: rdfs:subClassOf (car:Car to car:Vehicle), rdfs:domain and rdfs:range (on car:body_style), car:body_style, and several rdf:type edges]
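
What the TBox adds to the ABox can be shown with the RDFS domain and range rules. In this sketch the domain (car:Car) and range (car:Style) of car:body_style are assumptions, since the transcript does not preserve which classes the diagram's arrows point to.

```python
RDF_TYPE, DOMAIN, RANGE = "rdf:type", "rdfs:domain", "rdfs:range"

# TBox: schema statements (the domain and range targets are assumptions).
tbox = {
    ("car:body_style", DOMAIN, "car:Car"),
    ("car:body_style", RANGE, "car:Style"),
}
# ABox: one assertion about an individual.
abox = {("car:A6", "car:body_style", "car:Sedan")}

def infer_types(tbox, abox):
    """Apply the RDFS domain and range rules once and return the new triples."""
    inferred = set()
    for s, p, o in abox:
        for prop, kind, cls in tbox:
            if prop != p:
                continue
            if kind == DOMAIN:
                inferred.add((s, RDF_TYPE, cls))   # subject gets the domain class
            elif kind == RANGE:
                inferred.add((o, RDF_TYPE, cls))   # object gets the range class
    return inferred

print(sorted(infer_types(tbox, abox)))
```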

Page 28: Linked Data Technology and Status


RDF Semantics

• to provide a formal meaning for RDF's abstract syntax, based on a model-theoretic semantics

<x, y> is in IEXT(I(rdfs:subClassOf)) if and only if x and y are in IC and ICEXT(x) is a subset of ICEXT(y)

Example: car:A6 rdf:type car:Car and car:Car rdfs:subClassOf car:Vehicle, therefore car:A6 rdf:type car:Vehicle.
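
The subset condition on class extensions is what licenses the inferred rdf:type. A minimal sketch of that entailment, with hypothetical function names and the slide's car: terms as plain strings:

```python
SUBCLASS_OF = {"car:Car": {"car:Vehicle"}}   # car:Car rdfs:subClassOf car:Vehicle
TYPES = {"car:A6": {"car:Car"}}              # asserted rdf:type

def superclasses(cls: str, seen=None) -> set:
    """All classes reachable from cls via rdfs:subClassOf (transitive)."""
    seen = set() if seen is None else seen
    for parent in SUBCLASS_OF.get(cls, ()):
        if parent not in seen:
            seen.add(parent)
            superclasses(parent, seen)
    return seen

def entailed_types(individual: str) -> set:
    """Asserted types plus those entailed because ICEXT(sub) is a subset of ICEXT(super)."""
    result = set(TYPES.get(individual, ()))
    for cls in list(result):
        result |= superclasses(cls)
    return result

print(sorted(entailed_types("car:A6")))
```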

Page 29: Linked Data Technology and Status


SPARQL

• Why do we need a query language for RDF?
– Why do we need a query language for an RDB?
– to get at the knowledge held in RDF
• SPARQL Protocol and RDF Query Language
– to retrieve and manipulate data stored in Resource Description Framework format
– to use SPARQL via HTTP

http://www.slideshare.net/lysander07/openhpi-semweb03part1

Page 30: Linked Data Technology and Status


SPARQL Example

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?email
WHERE {
  ?person a foaf:Person .
  ?person foaf:name ?name .
  ?person foaf:mbox ?email .
}

Result from the RDF Knowledge Base:
?name | ?email
Myungjin Lee | [email protected]
Gildong Hong | [email protected]
Grace Byun | [email protected]
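
How the three triple patterns produce rows can be shown with a tiny pattern matcher over an in-memory list of triples. This is only an illustration of the matching idea, not a SPARQL engine; the data, including the mailbox address, is made up for the example.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical data: the address is a placeholder, not taken from the slide.
TRIPLES = [
    ("ex:p1", RDF_TYPE, FOAF + "Person"),
    ("ex:p1", FOAF + "name", "Myungjin Lee"),
    ("ex:p1", FOAF + "mbox", "mailto:p1@example.org"),
    ("ex:p2", RDF_TYPE, FOAF + "Person"),
    ("ex:p2", FOAF + "name", "Gildong Hong"),  # no mbox, so no result row
]

PATTERNS = [
    ("?person", RDF_TYPE, FOAF + "Person"),
    ("?person", FOAF + "name", "?name"),
    ("?person", FOAF + "mbox", "?email"),
]

def match(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new

def select(patterns, triples, variables):
    bindings = [{}]
    for pattern in patterns:  # join the patterns one after another
        bindings = [b2 for b in bindings for t in triples
                    if (b2 := match(pattern, t, b)) is not None]
    return [tuple(b[v] for v in variables) for b in bindings]

rows = select(PATTERNS, TRIPLES, ["?name", "?email"])
print(rows)
```

The second person drops out because all three patterns must match; in SPARQL itself, an OPTIONAL block would keep such rows.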

Page 31: Linked Data Technology and Status


SPARQL Query Forms

• SELECT query

– Used to extract raw values from a SPARQL endpoint; the results are returned in a table format.

• CONSTRUCT query

– Used to extract information from the SPARQL endpoint and transform the results into valid RDF.

• ASK query

– Used to provide a simple True/False result for a query on a SPARQL endpoint.

• DESCRIBE query

– Used to extract an RDF graph from the SPARQL endpoint, the contents of which are left to the endpoint to decide based on what the maintainer deems useful information.

http://en.wikipedia.org/wiki/SPARQL
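
Any of the four query forms can be sent over HTTP by passing the query text in the `query` parameter of a GET request, as the SPARQL Protocol defines. A sketch that only builds the URLs; the endpoint address and the example queries are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://example.org/sparql"   # hypothetical endpoint

QUERIES = {
    "SELECT": "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5",
    "CONSTRUCT": "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o } LIMIT 5",
    "ASK": "ASK { ?s ?p ?o }",
    "DESCRIBE": "DESCRIBE <http://example.org/resource/thing>",
}

def query_url(endpoint: str, query: str) -> str:
    """Build the GET URL for a query, URL-encoding the query text."""
    return endpoint + "?" + urlencode({"query": query})

for form, text in QUERIES.items():
    print(form, query_url(ENDPOINT, text))

# Decoding the URL again recovers the original query text.
ask_url = query_url(ENDPOINT, QUERIES["ASK"])
print(parse_qs(urlsplit(ask_url).query)["query"][0])
```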

Page 32: Linked Data Technology and Status


OWL (Web Ontology Language)

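• a family of knowledge representation languages for authoring ontologies
• Use OWL if you need more expressiveness, such as:
– disjoint classes: Man ∩ Woman = Ø
– a descendant property chained across Persons (Person, descendant, Person, descendant, Person)
– a 1:1 relation between Husband and Wife

[Diagram: _01, Action, ActionMovie, and Genre, connected by hasGenre, subClassOf, and type]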

Page 33: Linked Data Technology and Status


Linked Data Service

What more do we need?

[Diagram: data sources (Triple Store, RDBMS, HTML documents) feeding a Linked Data Service. Technologies shown: SPARQL, R2RML, Linked Data Platform, RDFa, GRDDL. Output: RDF, Knowledge]

Page 34: Linked Data Technology and Status


R2RML

• RDB to RDF Mapping Language
• W3C Recommendation 27 September 2012
• a language for expressing customized mappings from relational databases to RDF datasets

RDB: table "EMP"

R2RML:
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.com/ns#>.

<#TriplesMap1>
    rr:logicalTable [ rr:tableName "EMP" ];
    rr:subjectMap [
        rr:template "http://data.example.com/employee/{EMPNO}";
        rr:class ex:Employee;
    ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "ENAME" ];
    ].

Result:
<http://data.example.com/employee/7369> rdf:type ex:Employee.
<http://data.example.com/employee/7369> ex:name "SMITH".

http://www.w3.org/TR/r2rml/
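
What the mapping does to each row can be imitated in a few lines. This is not an R2RML processor: the dictionary below is a hand-written stand-in for the triples map, and the EMP row is a hypothetical one chosen so that the output matches the result shown on the slide.

```python
# Hypothetical row standing in for the EMP table.
EMP_ROWS = [{"EMPNO": 7369, "ENAME": "SMITH"}]

# Hand-written stand-in for <#TriplesMap1>.
MAPPING = {
    "subject_template": "http://data.example.com/employee/{EMPNO}",
    "class": "ex:Employee",
    "predicate_object": [("ex:name", "ENAME")],  # (predicate, column)
}

def apply_mapping(rows, mapping):
    """Generate triples as Turtle-like strings, one subject per row."""
    triples = []
    for row in rows:
        # The template's {COLUMN} placeholders are filled from the row.
        subject = "<" + mapping["subject_template"].format(**row) + ">"
        triples.append(f"{subject} rdf:type {mapping['class']}.")
        for predicate, column in mapping["predicate_object"]:
            triples.append(f'{subject} {predicate} "{row[column]}".')
    return triples

for line in apply_mapping(EMP_ROWS, MAPPING):
    print(line)
```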

Page 35: Linked Data Technology and Status


Linked Data Platform

• A set of best practices and a simple approach for a read-write Linked Data architecture, based on HTTP access to web resources that describe their state using RDF
• W3C Working Draft 25 October 2012

http://www.w3.org/TR/ldp/

Page 36: Linked Data Technology and Status


RDFa (Resource Description Framework in Attributes)

• W3C Recommendation, 07 June 2012
• to express machine-readable data in Web documents like HTML, SVG, and XML

Example:
<p vocab="http://schema.org/" resource="#manu" typeof="Person">
  My name is
  <span property="name">Manu Sporny</span>
  and you can give me a ring via
  <span property="telephone">1-800-555-0199</span>.
  <img property="image" src="http://manu.sporny.org/images/manu.png" />
</p>

http://www.w3.org/TR/xhtml-rdfa-primer/
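
The idea that the data lives in attributes can be demonstrated with Python's standard `html.parser`. This is a deliberately small sketch, not a conforming RDFa processor: it only collects the text of elements carrying a `property` attribute (so the `img` element, whose value is its `src`, is left out), and the `PropertyCollector` class is hypothetical.

```python
from html.parser import HTMLParser

SNIPPET = """
<p vocab="http://schema.org/" resource="#manu" typeof="Person">
  My name is <span property="name">Manu Sporny</span>
  and you can give me a ring via
  <span property="telephone">1-800-555-0199</span>.
</p>
"""

class PropertyCollector(HTMLParser):
    """Collect the text content of elements that carry a 'property' attribute."""

    def __init__(self):
        super().__init__()
        self.vocab = ""
        self.current = None      # property currently being read, if any
        self.values = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        self.vocab = attrs.get("vocab", self.vocab)
        if "property" in attrs:
            # The property name is resolved against the vocabulary in scope.
            self.current = self.vocab + attrs["property"]
            self.values[self.current] = ""

    def handle_data(self, data):
        if self.current is not None:
            self.values[self.current] += data

    def handle_endtag(self, tag):
        self.current = None

collector = PropertyCollector()
collector.feed(SNIPPET)
print(collector.values)
```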

Page 37: Linked Data Technology and Status


GRDDL (Gleaning Resource Descriptions from Dialects of Languages)

• a markup format and mechanism for obtaining RDF triples out of XML documents, including XHTML

HTML:
<html xmlns:grddl='http://www.w3.org/2003/g/data-view#'
      grddl:transformation="glean_title.xsl getAuthor.xsl">
  <head>
    <title>Are You Experienced?</title>
  </head>
  ...

glean_title.xsl:
<xsl:stylesheet version="1.0">
  <xsl:template match="/">
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <rdf:Description rdf:about="{$subject}">
        <dc:title>
          <xsl:value-of select="/html:html/html:head/html:title"/>
        </dc:title>
      </rdf:Description>
    </rdf:RDF>
  </xsl:template>
</xsl:stylesheet>

RDF:
<rdf:RDF>
  <rdf:Description rdf:about="">
    <dc:title>Are You Experienced?</dc:title>
  </rdf:Description>
</rdf:RDF>

http://www.w3.org/TR/grddl/

Page 38: Linked Data Technology and Status


Jena Platform

[Diagram: Jena components mapped onto the Linked Data Service architecture (Triple Store, RDBMS, HTML, SPARQL): TDB & SDB, Jena API, Fuseki, ARQ & LARQ]

http://jena.apache.org/

Page 39: Linked Data Technology and Status


OpenLink Virtuoso

• a middleware and database engine hybrid that combines the functionality of a traditional RDBMS, ORDBMS, RDF, XML, etc.
– Relational Data Management
– RDF Data Management
– XML Data Management
– Free Text Content Management & Full Text Indexing

– Document Web Server

– Linked Data Server

– Web Application Server

– Web Services Deployment (SOAP or REST)

http://virtuoso.openlinksw.com/

Page 40: Linked Data Technology and Status


OpenLink Virtuoso Coverage

[Diagram: OpenLink Virtuoso components mapped onto the Linked Data Service architecture (Triple Store, RDBMS, HTML, SPARQL): Sponger, SPARQL Server, Storage and Inference]

Page 41: Linked Data Technology and Status


The Linking Open Data cloud diagram


http://lod-cloud.net/

Page 42: Linked Data Technology and Status


Domain | Number of datasets | Triples | (Out-)Links
Media | 25 | 1,841,852,061 | 50,440,705
Geographic | 31 | 6,145,532,484 | 35,812,328
Government | 49 | 13,315,009,400 | 19,343,519
Publications | 87 | 2,950,720,693 | 139,925,218
Cross-domain | 41 | 4,184,635,715 | 63,183,065
Life Sciences | 41 | 3,036,336,004 | 191,844,090
User-generated Content | 20 | 134,127,413 | 3,449,143
Total | 295 | 31,634,213,770 | 503,998,829

http://www.slideshare.net/lysander07/13-semantic-web-technologies-linked-data-semantic-search

Page 43: Linked Data Technology and Status


KDATA (Linked Data for Korea): triples per domain

[Table: only two domain labels survive in this transcript, WiFi and KDATA.]
3,899; 44,278; 2,969; 126,469; 1,130; 2,833; 5,539; 47,340; 228,872; 4,450; 5,392; 109,101; 1,155; WiFi 1,671; KDATA 808; 4,535; 10,605; 80,156; 49,799; 3,256; 9,418; 2,429; 16,212; 14,300; 6,931; 39,218; 115,099; 139,608
Total: 1,077,472

http://kdata.kr/index.jsp

Page 44: Linked Data Technology and Status


http://data.kdata.kr/resource/Namdaemun (available as HTML and as RDF)

RDF:
<rdf:RDF>
  <rdf:Description rdf:about="http://data.kdata.kr/data/Namdaemun?output=rdfxml">
    <rdfs:label>RDF description of Namdaemun</rdfs:label>
    <foaf:primaryTopic>
      <kdc:StateDesignatedHeritage rdf:about="http://data.kdata.kr/resource/Namdaemun">
        <rdfs:label>남대문</rdfs:label>
        <rdfs:label>숭례문</rdfs:label>
        <foaf:depiction rdf:resource="20060227132556895000.jpg"/>
        <owl:sameAs rdf:resource="http://dbpedia.org/resource/Namdaemun"/>
        ...
</rdf:RDF>

SPARQL:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s
WHERE {
  ?s rdf:type <http://data.kdata.kr/class/NationalTreasure> .
  ?s rdfs:label "남대문" .
}

Page 45: Linked Data Technology and Status

Contents Search on the Semantic Web

Dr. Myungjin Lee

e-Mail : [email protected] Twitter : http://twitter.com/MyungjinLee

Facebook : http://www.facebook.com/mjinlee SlideShare : http://www.slideshare.net/onlyjiny/