The Semantic Web Riccardo Rosati Dottorato in Ingegneria Informatica Sapienza Università di Roma a.a. 2006/07


The Semantic Web

Riccardo Rosati

Dottorato in Ingegneria Informatica, Sapienza Università di Roma

a.a. 2006/07


Overview

The aim of this course is to provide an introduction to the Semantic Web, with emphasis on two aspects:
• AI-related aspects, in particular knowledge representation and reasoning
• database-related aspects, in particular data integration


Overview

• Lecture 1: Introduction to the Semantic Web

• Lecture 2: The XML layer

• Lecture 3: The RDF layer

• Lecture 4: The Ontology layer 1
– Description Logics, OWL

• Lecture 5: The Ontology layer 2
– reasoning in OWL, OWL species


• Lecture 6: The Ontology layer 3
– OWL technologies: OWL Tools, QuOnto

• Lecture 7: The rule layer

• Lecture 8: The RDF layer 2
– RDF semantics

• Lecture 9: The Ontology layer 4
– query answering in OWL

• Lecture 10: Handling inconsistency (Jan Chomicki)


Lecture 1

Introduction to the Semantic Web


What is the Semantic Web?

• “The Semantic Web is a Web of actionable information—information derived from data through a semantic theory for interpreting the symbols.”

• “The semantic theory provides an account of ‘meaning’ in which the logical connection of terms establishes interoperability between systems”

(Shadbolt, Hall, Berners-Lee, The Semantic Web revisited, IEEE Intelligent Systems, May 2006)


The Semantic Web: why?

• search on the Web: problems...
• ...due to the way in which information is stored on the Web
• Problem 1: web documents do not distinguish between information content and presentation (“solved” by XML)
• Problem 2: different web documents may represent in different ways semantically related pieces of information

• this leads to hard problems for “intelligent” information search on the Web


Separating content and presentation

Problem 1: web documents do not distinguish between information content and presentation

• problem due to the HTML language
• problem “solved” by current technology
– stylesheets (HTML, XML)
– XML

• stylesheets allow for separating formatting attributes from the information presented


Separating content and presentation

• XML: eXtensible Mark-up Language
• XML documents are written through a user-defined set of tags
• tags are used to express the “semantics” of the various pieces of information


XML: example

HTML:

<H1>Seminari di Ingegneria del Software</H1>
<UL>
  <LI>Teacher: Giuseppe De Giacomo
  <LI>Room: 7
  <LI>Prerequisites: none
</UL>

XML:

<course>
  <title>Seminari di Ingegneria del Software</title>
  <teacher>Giuseppe De Giacomo</teacher>
  <room>1AI, 1I</room>
  <prereq>none</prereq>
</course>
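To illustrate why the XML version is easier for a program to use than the HTML one, here is a minimal sketch that reads the course record with Python's standard `xml.etree.ElementTree` module. The helper name `course_fields` is an assumption made for this illustration; the XML is the one from the slide.

```python
import xml.etree.ElementTree as ET

# The XML course description from the slide above.
COURSE_XML = """
<course>
  <title>Seminari di Ingegneria del Software</title>
  <teacher>Giuseppe De Giacomo</teacher>
  <room>1AI, 1I</room>
  <prereq>none</prereq>
</course>
"""


def course_fields(xml_text: str) -> dict:
    """Map each child tag of the root element to its text content."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: (child.text or "").strip() for child in root}


fields = course_fields(COURSE_XML)
print(fields["teacher"])
print(fields["room"])
```

Each piece of information is reachable by its tag name, whereas in the HTML version the teacher would have to be recovered by parsing the text "Teacher: ..." inside a list item.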


Limitations of XML

XML does not solve all the problems:
• legacy HTML documents
• different XML documents may express information with the same meaning using different tags
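The second limitation can be sketched in a few lines. The two documents below are hypothetical: the second one uses invented tags (`class`, `lecturer`) for the same information, so a query written against the first vocabulary silently misses it.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents carrying the same information with different tags.
DOC_A = "<course><teacher>Giuseppe De Giacomo</teacher></course>"
DOC_B = "<class><lecturer>Giuseppe De Giacomo</lecturer></class>"


def find_teachers(xml_text: str) -> list:
    """Return the text of every <teacher> element in the document."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter("teacher")]


hits_a = find_teachers(DOC_A)  # the query matches this vocabulary
hits_b = find_teachers(DOC_B)  # same meaning, different tag: nothing is found
print(hits_a, hits_b)
```

Nothing in XML itself states that `teacher` and `lecturer` mean the same thing; that missing link is what a shared semantics is meant to supply.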


The need for a “Semantic” Web

Problem 2: different web documents may represent in different ways semantically related pieces of information

• different XML documents do not share the “semantics” of information

• idea: annotate (mark-up) pieces of information to express the “meaning” of such a piece of information

• the meaning of such tags is shared! => shared semantics


The Semantic Web initiative

viewpoint:

the Web = a web of data

goal:

to provide a common framework to share data on the Web across application boundaries

main ideas:
• ontology
• standards
• “layers”


The Semantic Web Tower


The Semantic Web Layers

• XML layer

• RDF + RDFS layer

• Ontology layer

• Proof-rule layer

• Trust layer


The XML layer

• XML (eXtensible Markup Language)
– user-definable and domain-specific markup

• URI (Uniform Resource Identifier)
– universal naming for Web resources

– same URI = same resource

– URIs are the “ground terms” of the SW

• W3C standards


The RDF + RDFS layer

RDF = a simple conceptual data model
W3C standard (1999)

RDF model = set of RDF triples

triple = expression (statement)

(subject, predicate, object)

• subject = resource
• predicate = property (of the resource)
• object = value (of the property)

=> an RDF model is a graph


The RDF + RDFS layer

Example of RDF graph:

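[Figure: one subject node with three outgoing labelled edges]

http://www.w3.org/TR/REC-rdf-syntax/ --dc:Creator--> “Ora Lassila”
http://www.w3.org/TR/REC-rdf-syntax/ --dc:Date--> “1999-02-22”
http://www.w3.org/TR/REC-rdf-syntax/ --dc:Publisher--> “W3C”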
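The example graph can be written down directly as a set of (subject, predicate, object) triples. The sketch below uses plain Python tuples rather than any RDF library; the helper names `objects` and `outgoing_edges` are assumptions made for this illustration.

```python
# The subject of the example graph.
DOC = "http://www.w3.org/TR/REC-rdf-syntax/"

# An RDF model is a set of (subject, predicate, object) triples.
model = {
    (DOC, "dc:Creator", "Ora Lassila"),
    (DOC, "dc:Date", "1999-02-22"),
    (DOC, "dc:Publisher", "W3C"),
}


def objects(triples, subject, predicate):
    """All values of `predicate` for `subject`."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def outgoing_edges(triples, subject):
    """Graph view: labelled edges leaving the node `subject`."""
    return sorted((p, o) for s, p, o in triples if s == subject)


print(objects(model, DOC, "dc:Creator"))
print(outgoing_edges(model, DOC))
```

Reading each triple as an edge from subject to object labelled by the predicate is what makes an RDF model a graph.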


The RDF + RDFS layer

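• RDFS = RDF Schema
• “vocabulary” for RDF
• W3C standard (2004)

example:

[Figure: RDFS schema in the upper part, RDF data in the lower part]

RDFS: classes Person, Student, Researcher; Student subClassOf Person; Researcher subClassOf Person; property hasSuperVisor with a domain and a range among these classes

RDF: individuals Jeen and Frank, each linked to a class by type, and linked to each other by hasSuperVisor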
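What the schema level adds can be sketched as a small inference procedure. The flattened figure does not show which way the edges point, so the code assumes that hasSuperVisor has domain Student and range Researcher and that Jeen hasSuperVisor Frank; these directions are assumptions, as is the function name `rdfs_closure`. Only four inference patterns are covered (domain, range, type inheritance along subClassOf, transitivity of subClassOf), not the full RDFS semantics.

```python
# Schema-level triples (edge directions for domain/range are assumed).
SCHEMA = {
    ("Student", "subClassOf", "Person"),
    ("Researcher", "subClassOf", "Person"),
    ("hasSuperVisor", "domain", "Student"),
    ("hasSuperVisor", "range", "Researcher"),
}

# Data-level triple (direction assumed).
DATA = {("Jeen", "hasSuperVisor", "Frank")}


def rdfs_closure(triples):
    """Apply a few RDFS-style inference patterns until nothing new is derived."""
    facts = set(triples)
    while True:
        sub = {(s, o) for s, p, o in facts if p == "subClassOf"}
        dom = {(s, o) for s, p, o in facts if p == "domain"}
        rng = {(s, o) for s, p, o in facts if p == "range"}
        new = set()
        for s, p, o in facts:
            # Using a property types its subject (domain) and object (range).
            new |= {(s, "type", c) for prop, c in dom if prop == p}
            new |= {(o, "type", c) for prop, c in rng if prop == p}
            if p == "type":
                # Instances of a class are instances of its superclasses.
                new |= {(s, "type", sup) for c, sup in sub if c == o}
            if p == "subClassOf":
                # subClassOf is transitive.
                new |= {(s, "subClassOf", sup) for c, sup in sub if c == o}
        if new <= facts:
            return facts
        facts |= new


closure = rdfs_closure(SCHEMA | DATA)
print(sorted(t for t in closure if t[1] == "type"))
```

Under these assumptions the single data triple is enough to derive that Jeen is a Student, Frank is a Researcher, and both are Persons, even though no type triple was stated.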


The Ontology layer

ontology = shared conceptualization
=> conceptual model (more expressive than RDF + RDFS)
=> expressed in a true knowledge representation language

OWL (Web Ontology Language) = standard language for ontologies


The proof/rule layer

beyond OWL:
• proof/rule layer
• rule: informal notion
• rules are used to perform inference over ontologies
• rules as a tool for capturing further knowledge (not expressible in OWL ontologies)
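Since the slide leaves the notion of rule informal, the following is only one possible reading: a naive forward-chaining evaluator for Datalog-style rules. The facts, the "uncle" rule, and the names `match` and `forward_chain` are all hypothetical; the uncle rule is a commonly cited example of knowledge that OWL as standardized in 2004 does not state directly.

```python
def match(pattern, fact, binding):
    """Extend `binding` so that `pattern` equals `fact`, or return None."""
    if len(pattern) != len(fact):
        return None
    out = dict(binding)
    for term, value in zip(pattern, fact):
        if term.startswith("?"):  # a variable
            if out.setdefault(term, value) != value:
                return None
        elif term != value:  # a constant that does not match
            return None
    return out


def forward_chain(facts, rules):
    """Apply all rules repeatedly until no new fact can be derived."""
    known = set(facts)
    while True:
        derived = set()
        for head, body in rules:
            bindings = [{}]
            for atom in body:
                extended = []
                for binding in bindings:
                    for fact in known:
                        result = match(atom, fact, binding)
                        if result is not None:
                            extended.append(result)
                bindings = extended
            for binding in bindings:
                derived.add(tuple(binding.get(term, term) for term in head))
        if derived <= known:
            return known
        known |= derived


# Hypothetical facts and rule: hasUncle(x, z) <- hasParent(x, y), hasBrother(y, z)
FACTS = {("hasParent", "mary", "john"), ("hasBrother", "john", "bob")}
RULES = [
    (("hasUncle", "?x", "?z"),
     [("hasParent", "?x", "?y"), ("hasBrother", "?y", "?z")]),
]

result = forward_chain(FACTS, RULES)
print(sorted(result))
```

The evaluator is deliberately naive (it rescans all facts on every round); it is meant to show what "performing inference with rules" amounts to, not how a rule engine is implemented.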


The Trust layer

• SW top layer:
• support for provenance/trust
• provenance:
– where does the information come from?
– how has this information been obtained?
– can I trust this information?
• largely unexplored issue
• no standardization effort


The Semantic Web: main ingredients

• underlying web layer (URI, XML)
– reusing and extending web technologies
• basic conceptual modeling language (RDF)
• ontology language (OWL)
• rules/proof
• reusing and extending AI technologies
– knowledge representation
– automated reasoning
• ...and database technologies
– data integration


The Semantic Web from an AI perspective

• the notion of ontology
• the role of logic and Description Logics
• the role of rule-based formalisms
• ... (agent technology)


The notion of ontology

• ontology = shared conceptualization of a domain of interest

• shared vocabulary => simple (shallow) ontology
• (complex) relationships between “terms” => deep ontology
• AI view:
– ontology = logical theory (knowledge base)
• DB view:
– ontology = conceptual model


Ontologies: example

class-def animal                       % animals are a class
class-def plant                        % plants are a class
  subclass-of NOT animal               % that is disjoint from animals
class-def tree
  subclass-of plant                    % trees are a type of plants
class-def branch
  slot-constraint is-part-of           % branches are parts of some tree
    has-value tree
    max-cardinality 1
class-def defined carnivore            % carnivores are animals
  subclass-of animal
  slot-constraint eats                 % that eat any other animals
    value-type animal
class-def defined herbivore            % herbivores are animals
  subclass-of animal, NOT carnivore    % that are not carnivores, and
  slot-constraint eats                 % they eat plants or parts of plants
    value-type plant OR (slot-constraint is-part-of has-value plant)
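One way to read these definitions is as constraints on sets of individuals. The sketch below checks part of the ontology against a small interpretation; every individual in it (lion, zebra, oak, grass, leaf) is made up, the branch class is left out, and reading `value-type` as "everything eaten belongs to the class" is the interpretation assumed here.

```python
# A small hypothetical interpretation: all individuals below are invented.
animal = {"lion", "zebra"}
plant = {"oak", "grass"}
tree = {"oak"}
eats = {("lion", "zebra"), ("zebra", "grass")}
is_part_of = {("leaf", "oak")}


def eats_only(x, allowed):
    """True if everything x eats is in `allowed` (vacuously true if x eats nothing)."""
    return all(y in allowed for a, y in eats if a == x)


def primitive_axioms_hold():
    """plant is disjoint from animal, and every tree is a plant."""
    return plant.isdisjoint(animal) and tree <= plant


# "defined" classes are computed from their definitions rather than asserted.
carnivore = {x for x in animal if eats_only(x, animal)}
plant_or_part = plant | {x for x, y in is_part_of if y in plant}
herbivore = {x for x in animal - carnivore if eats_only(x, plant_or_part)}

print(primitive_axioms_hold(), sorted(carnivore), sorted(herbivore))
```

Membership in carnivore and herbivore is never stated in the data: it follows from the definitions, which is the practical difference between a defined class and a primitive one.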


Ontologies: the role of logic

• ontology = logical theory
• why?

– declarative

– formal semantics

– reasoning (sound and complete inference techniques)

• well-established correspondence between conceptual modeling formalisms and logic


Ontologies and Description Logics

• OWL is based on a fragment of first-order predicate logic (FOL)

• Description Logics (DLs) = subclasses of FOL
– only unary and binary predicates
– function-free
– quantification allowed only in restricted form
– (variable-free syntax)
– decidable reasoning

• DLs are one of the most prominent languages for Knowledge Representation
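As a sketch of the correspondence with FOL, the following writes a few axioms in the usual variable-free DL notation next to their standard first-order translations. The class and role names are borrowed from the earlier animal/plant example; the choice of axioms is only illustrative.

```latex
\begin{align*}
\mathit{tree} \sqsubseteq \mathit{plant}
  &\;\mapsto\; \forall x\,\bigl(\mathit{tree}(x) \rightarrow \mathit{plant}(x)\bigr) \\
\mathit{plant} \sqsubseteq \neg\,\mathit{animal}
  &\;\mapsto\; \forall x\,\bigl(\mathit{plant}(x) \rightarrow \neg\,\mathit{animal}(x)\bigr) \\
\mathit{branch} \sqsubseteq \exists\,\mathit{isPartOf}.\mathit{tree}
  &\;\mapsto\; \forall x\,\bigl(\mathit{branch}(x) \rightarrow \exists y\,(\mathit{isPartOf}(x,y) \wedge \mathit{tree}(y))\bigr) \\
\mathit{carnivore} \sqsubseteq \forall\,\mathit{eats}.\mathit{animal}
  &\;\mapsto\; \forall x\,\bigl(\mathit{carnivore}(x) \rightarrow \forall y\,(\mathit{eats}(x,y) \rightarrow \mathit{animal}(y))\bigr)
\end{align*}
```

Classes become unary predicates, roles become binary predicates, and each quantifier on the right is guarded by a role atom, which is the restricted form of quantification mentioned in the list above.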


Ontologies and Description Logics

• expressive abilities of DLs have been widely explored

• reasoning in DLs has been extensively studied
• DL reasoners have been developed and optimized

=> DLs as a central technology for the SW


Rule-based formalisms

• Prolog
• Logic programming
• Constraint (logic) programming
• Production rules
• Datalog
• ...

Rule language for SW not standardized yet

RIF (Rule Interchange Format) W3C working group


The SW from a database perspective

• ontology as a virtual database schema
• ontology-based information access
• the Semantic Web as a framework for data integration


Virtual data integration

abstract model of a virtual data integration system: triple (G,S,M)
• G = global information schema
• S = set of data sources
• M = mapping between sources and global schema

1. the data sources are autonomous and heterogeneous information systems

2. the global schema is a virtual conceptualization of the domain of interest

3. the mapping is a declarative specification of the relationship between sources and global schema
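The triple (G,S,M) can be made concrete with a toy example. Everything below is a hypothetical sketch: the two sources, their layouts, and the global relation `teaches` are invented, and the mapping is written in a global-as-view style (each global relation defined as a view over the sources), which is one possible choice that the slide does not prescribe.

```python
# S: two hypothetical autonomous sources with different layouts.
SOURCE_1 = [
    {"name": "Seminari di Ingegneria del Software", "prof": "Giuseppe De Giacomo"},
]
SOURCE_2 = [
    ("The Semantic Web", "Riccardo Rosati"),
]


# G: the global schema has one virtual relation, teaches(teacher, course).
# M: the mapping defines that relation as a view over the sources.
def teaches():
    """Yield (teacher, course) pairs computed from the sources on demand."""
    for row in SOURCE_1:
        yield (row["prof"], row["name"])
    for course, teacher in SOURCE_2:
        yield (teacher, course)


MAPPING = {"teaches": teaches}


def courses_of(teacher):
    """A query posed on the global schema, answered by unfolding the mapping."""
    return sorted(c for t, c in MAPPING["teaches"]() if t == teacher)


print(courses_of("Riccardo Rosati"))
print(courses_of("Giuseppe De Giacomo"))
```

The global relation is never materialized: the query is phrased over `teaches` only, and the data stays in the sources, which is what "virtual" refers to.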


Ontology-based information access

• ontology = virtual global information schema
• ontology language = conceptual modeling language
• the Web = distributed, heterogeneous, autonomous set of data sources

but:
• data integration considers different formalisms for expressing schema and data
• the architecture of a data integration system is centralized


Ontology-based information access

• the data integration approach can be generalized to non-centralized architectures
=> peer data management systems
• despite the differences in the formalisms adopted, there is a tight relationship between the SW and data integration
=> dealing with incomplete information


The Semantic Web and data integration

similarity between data integration and the SW:
• global schema = ontology
• sources = web sites (data)
• mapping = ??

however:
• different languages
• different technologies

=> the SW is essentially a data integration technology


Very quick overview of SW technologies

main current technologies and standards for the SW:
• RDF
• RDFS
• OWL


RDF

• RDF = Resource Description Framework
• RDF data model is an abstract, conceptual layer (independent of XML)
• RDF data model = set of RDF triples
• triple = (subject, predicate, object)
– subject = resource
– predicate = property (of the resource)
– object = value (of the property)
• standardized in 1999


RDFS

• RDFS = RDF Schema
• set of predefined predicates:
– subClassOf
– subPropertyOf
– domain
– range
– ...
• ...with predefined semantics!
• standardized in 2004


OWL

• OWL = Web Ontology Language
• the OWL family consists of 3 different languages (with different expressive power):
– OWL Full
– OWL-DL
– OWL-Lite
• technology at an early stage
– standardized in 2004
– reasoning techniques and tools are very recent
– “optimization” of reasoning is largely unexplored


OWL vs. RDFS

RDF(S):
• class-def
• subclass-of
• property-def
• subproperty-of
• domain
• range

OWL (the RDF(S) constructs above, plus):
• class-expressions
– AND, OR, NOT
• role-constraints
– has-value, value-type
– cardinality
• role-properties
– trans, symm...


From RDFS to OWL Full

with respect to the relative expressive power:

OWL Full > OWL-DL > OWL-Lite > RDFS