Dr. Alexandra I. Cristea http://www.dcs.warwick.ac.uk/~acristea/ Semantic Web


Post on 17-Jan-2016


Page 1: Dr. Alexandra I. Cristea acristea/ Semantic Web

Dr. Alexandra I. Cristea

http://www.dcs.warwick.ac.uk/~acristea/

Semantic Web

Requirements of the WWW
• The internet - already there
• HTML programmers
• Search engines
• Core weight of interest


Why do we need the Semantic Web?

I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web –– the content, links, and transactions between people and computers.

...the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines.

Tim Berners-Lee (1999) Weaving the Web


• Realising the complete “vision” is too hard for now (probably)

• But we can make a start by adding semantic annotation to web resources

Scientific American, May 2001:


Where we (still) are Today: the Syntactic Web

[Hendler & Miller 02]

The Syntactic Web is…
• A hypermedia, a digital library
  – A library of documents (called web pages) interconnected by a hypermedia of links
• A database, an application platform
  – A common portal to applications accessible through web pages, and presenting their results as web pages
• A platform for multimedia
  – BBC Radio 4 anywhere in the world! Terminator 3 trailers!
• A naming scheme
  – Unique identity for those documents

A place where computers do the presentation (easy) and people do the linking and interpreting (hard).

Why not get computers to do more of the hard work?

[Goble 03]


Hard Work using the Syntactic Web…

Find images of Peter Patel-Schneider, Frank van Harmelen and Alan Rector…

Rev. Alan M. Gates, Associate Rector of the Church of the Holy Spirit, Lake Forest, Illinois

Impossible(?) via the Syntactic Web…
• Complex queries involving background knowledge
  – Find information about “animals that use sonar but are not either bats or dolphins”, e.g., Barn Owl
• Locating information in data repositories
  – Travel enquiries
  – Prices of goods and services
  – Results of human genome experiments
• Finding and using “web services”
  – Visualise surface interactions between two proteins
• Delegating complex tasks to web “agents”
  – Book me a holiday next weekend somewhere warm, not too far away, and where they speak French or English

What is the Problem?
• Consider a typical web page:
• Markup consists of:
  – Rendering information (e.g., font size and colour)
  – Hyper-links to related content
• Semantic content is accessible to humans but not (easily) to computers…

What information can we see…

WWW2002
The eleventh international world wide web conference
Sheraton waikiki hotel
Honolulu, hawaii, USA
7-11 may 2002
1 location 5 days learn interact
Registered participants coming from
australia, canada, chile, denmark, france, germany, ghana, hong kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire
Register now
On the 7th May Honolulu will provide the backdrop of the eleventh international world wide web conference. This prestigious event …
Speakers confirmed
Tim berners-lee

What information can a machine see…

WWW2002
The eleventh international world wide web conference
Sheraton waikiki hotel
Honolulu, hawaii, USA
7-11 may 2002
1 location 5 days learn interact
Registered participants coming from
australia, canada, chile, denmark, france, germany, ghana, hong kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire
Register now
On the 7th May Honolulu will provide

Solution: XML markup with “meaningful” tags?

<name>WWW2002 The eleventh international world wide webcon</name>
<location>Sheraton waikiki hotel Honolulu, hawaii, USA</location>
<date>7-11 may 2002</date>
<slogan>1 location 5 days learn interact</slogan>
<participants>Registered participants coming from australia, canada, chile, denmark, france, germany, ghana, hong

But What About…

<conf>WWW2002 The eleventh international world wide webcon</conf>
<place>Sheraton waikiki hotel Honolulu, hawaii, USA</place>
<date>7-11 may 2002</date>
<slogan>1 location 5 days learn interact</slogan>
<participants>Registered participants coming from australia, canada, chile, denmark,

Machine sees…

<name>WWW2002 The eleventh international world wide webc</name>
<location>Sheraton waikiki hotel Honolulu, hawaii, USA</location>
<date>7-11 may 2002</date>
<slogan>1 location 5 days learn interact</slogan>
<participants>Registered participants coming from australia, canada, chile, denmark, france, germany, ghana, hong kong, india, ireland, italy,
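The point of the last three slides can be sketched in a few lines of Python: to a program, tag names are opaque strings, so <name>/<location> and <conf>/<place> do not line up unless someone supplies an agreed mapping. The <event> wrapper element, the shortened field values and the TAG_MAP table are assumptions made for this illustration; they are not part of the original slides.

```python
import xml.etree.ElementTree as ET

# Two hypothetical annotations of the same conference with different tag vocabularies.
DOC_A = ("<event><name>WWW2002</name>"
         "<location>Honolulu, hawaii, USA</location>"
         "<date>7-11 may 2002</date></event>")
DOC_B = ("<event><conf>WWW2002</conf>"
         "<place>Honolulu, hawaii, USA</place>"
         "<date>7-11 may 2002</date></event>")


def fields(xml_text):
    """Return {tag: text} for the direct children of the root element."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


a, b = fields(DOC_A), fields(DOC_B)

# Without extra knowledge, only the tag that happens to be spelled the same matches.
shared = sorted(set(a) & set(b))

# An externally agreed mapping (hypothetical) is what makes the two documents comparable.
TAG_MAP = {"conf": "name", "place": "location"}
b_aligned = {TAG_MAP.get(tag, tag): text for tag, text in b.items()}

print(shared)
print(a == b_aligned)
```

The mapping table plays the role that a shared vocabulary or ontology plays later in the lecture: the XML syntax alone does not carry it.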

A more current scenario

• What are you doing on Burns night?
  – Google “burns”
  – Wikipedia articles on Robert Burns
  – Amazon listing of books by Burns
  – Google Maps to look at birthplace of Burns

Combining Information


Combining one source with a service from another

Web APIs
• A large and growing number of web data sources provide program-accessible interfaces (APIs).
• The web site http://www.programmableweb.com currently (October 2015) lists over 14,123 APIs.
• Most popular Web APIs are:

Limitations of Web APIs
• The interfaces are non-uniform: REST, RPC (e.g., SOAP) and hybrid
• The results are returned in a variety of formats: XML, JSON, Atom
• The data schemas tend to be provider-specific
• This militates against the development of portable, generic methods of accessing and using data.
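A minimal sketch of what "provider-specific" costs in practice: every provider needs its own adapter before the data can be used together. Both payloads, their field names and the two adapter functions are invented for illustration and do not correspond to any real API.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical responses from two providers describing the same book.
JSON_RESPONSE = '{"title": "Poems", "author": {"name": "Robert Burns"}}'
XML_RESPONSE = "<book><name>Poems</name><writer>Robert Burns</writer></book>"


def from_provider_a(payload):
    """Adapter for the (hypothetical) JSON provider."""
    data = json.loads(payload)
    return {"title": data["title"], "author": data["author"]["name"]}


def from_provider_b(payload):
    """Adapter for the (hypothetical) XML provider."""
    root = ET.fromstring(payload)
    return {"title": root.findtext("name"), "author": root.findtext("writer")}


# Each new provider means another hand-written adapter like the two above.
records = [from_provider_a(JSON_RESPONSE), from_provider_b(XML_RESPONSE)]
print(records[0] == records[1])
```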

History of the (Semantic) Web
• Web was “invented” by Tim Berners-Lee (amongst others), a physicist working at CERN
• TBL’s original vision of the Web was much more ambitious than the reality of the existing (syntactic) Web:

“... a goal of the Web was that, if the interaction between person and hypertext could be so intuitive that the machine-readable information space gave an accurate representation of the state of people's thoughts, interactions, and work patterns, then machine analysis could become a very powerful management tool, seeing patterns in our work and facilitating our working together through the typical problems which beset the management of large organizations.”

TBL (and others) have since been working towards realising this vision, which has become known as the Semantic Web

E.g., article in May 2001 issue of Scientific American…

The semantic web
• Invented by Tim Berners-Lee and others. W3C driving organisation.
  – Web of machine-readable data
• What are the main aims of the SW?
  – Automated query-answering
  – Automated use of the data (reasoning, planning, acting, etc.)

WWW v Semantic Web
• WWW is a web of documents
• SW is a web of data
• WWW documents are human readable
• SW data is machine readable (in theory at least)
• Shared AAA principle: Anyone can say Anything, Anywhere.

Why the Semantic Web?

I don’t think [the Semantic Web is] a very good name but we’re stuck with it now. The word semantics is used by different groups to mean different things ...I think we could have called it the Data Web. ...it connects all applications together or gives [people] access to data across the company ...

Tim Berners-Lee (2007), Interview in Business Week

Why the Semantic Web?
• Syntax / semantics distinction: long history in philosophy of language, linguistics, formal logic
• Syntax is concerned with the arrangement of symbols
• Semantics is concerned with the relation between symbol strings and the world: what things actually mean.

What can the Semantic Web actually do?
• Query answering:
  – IBM’s Watson: beats human competitors at Jeopardy
  – but
    • specifically trained for this task (including looking at a decade’s worth of past Jeopardy answers)
    • sort of cheating (its reaction time means it always gets first go!)

What can the Semantic Web actually do?

• Query answering:
  – Wolfram-alpha: does complex query-answering and solves mathematical problems
  – but
    • hand-curated database - not the Semantic Web
    • hugely labour-intensive to develop and cannot take advantage of new knowledge

What can the Semantic Web actually do?
• Query answering:
  – Other systems:
    • considerable progress
    • current state-of-the-art is extremely useful
  – but
    • the general case is hard!

What can the Semantic Web actually do?

• Automated use of data:
  – works well in constrained circumstances:
    • for example: Google maps can automatically combine information about maps, speed limits, current road usage, etc., to get estimates of journey time
  – very hard in unconstrained circumstances:
    • classic SW example of an automated travel agent still far from achievable

What are the requirements of the Semantic Web?

• Large numbers of users to make their data:
  – available
  – in an appropriate machine-readable format

  This is happening now: open government data (esp. in UK and US) and many other organisations and individuals:
  https://www.data.gov.uk/
  https://www.data.gov/
  >> find more open data repositories as homework!

• Good query-answering systems
• The ability to automatically interpret and use data

Need to Add “Semantics”
• External agreement on meaning of annotations
  – E.g., Dublin Core
    • Agree on the meaning of a set of annotation tags
  – Problems with this approach
    • Inflexible
    • Limited number of things can be expressed
• Use Ontologies to specify meaning of annotations
  – Ontologies provide a vocabulary of terms
  – New terms can be formed by combining existing ones
  – Meaning (semantics) of such terms is formally specified
  – Can also specify relationships between terms in multiple ontologies

Ontology Languages for the Semantic Web

What is an ontology?
• Originally: a definitive account of what exists (derived from metaphysics).
• Therefore, we can create a single ontology that describes the world,
• maybe dividing it into smaller sub-ontologies as necessary.
• But this is completely misconceived!

Same world-view?

• As homework, look up other definitions of the word ‘ontology’ via Google.

• Hence ‘ontology merging’ is a hot research area!

Ontologies in the SW
• A way of encoding domain knowledge, linking the knowledge, which allows for reasoning with the data
• Dictionary/Vocabulary ⊂ Taxonomy ⊂ Ontology
• Ontologies allow for data integration and inference, for automated query-answering and automated use of data

Page 40: Dr. Alexandra I. Cristea acristea/ Semantic Web

Why Semantic Web ontologies?• data integration

40

Page 41: Dr. Alexandra I. Cristea acristea/ Semantic Web

• data integration• inference

41

Why Semantic Web ontologies?

William Burnes is the father of Robert Burns.…

Father is a subclass of parent.…

William Burnes is the parent of Robert Burns.
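The inference step above can be sketched as a small forward-chaining loop over triples. This is an illustrative toy, not an RDFS reasoner: the property names fatherOf and parentOf are made up for the example, and "father is a subclass of parent" is modelled here as a sub-property relation between the two, which is one possible reading of the slide.

```python
# One asserted fact and one piece of background knowledge (names are hypothetical).
FACTS = {("William Burnes", "fatherOf", "Robert Burns")}
SUB_PROPERTY_OF = {("fatherOf", "parentOf")}


def infer(facts, sub_property_of):
    """If p is a sub-property of q and (s, p, o) holds, add (s, q, o); repeat to a fixpoint."""
    inferred = set(facts)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            for sub, sup in sub_property_of:
                if p == sub and (s, sup, o) not in inferred:
                    inferred.add((s, sup, o))
                    changed = True
    return inferred


closure = infer(FACTS, SUB_PROPERTY_OF)
for triple in sorted(closure):
    print(triple)
```

The derived triple was never stated by anyone; it follows from the fact plus the ontology, which is the sense of "inference" used on this slide.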

Why Semantic Web ontologies?
• data integration
• inference
• Automated query-answering
• Automated use of data

Example Ontologies (source: http://semanticweb.org/)

Ontology                         Language    Swoogle hits  Revised
Dublin Core                      RDF         1,364,337     28 October 2006
FOAF                             OWL DL      1,194,871     27 July 2005
TrackBack                        RDF         502,401
MetaVocab                        RDF         441,790       16 February 2002
Basic Geo Vocabulary             RDF Schema  248,130       1 February 2006
BIO                              RDF         220,228       5 March 2004
RSS 1.0                          RDF Schema  201,786       6 December 2000
VCard RDF                        RDF         181,962       22 February 2001
Creative Commons metadata        RDF Schema  112,216
WOT                              OWL DL      97,292        23 February 2004
SIOC                             OWL DL      42,911        11 April 2008
GoodRelations                    OWL DL      5,000         1 October 2011
DOAP                             RDF Schema  1,442         5 November 2005
Programmes Ontology              OWL 2       943           7 September 2009
Music Ontology                   OWL 2       646           14 February 2010
OpenGUID                         RDF Schema  1             24 September 2008
Provenance Vocabulary            OWL DL      1             25 August 2009
Pedagogical diagnosis            OWL DL      1             1 April 2012
DILIGENT Argumentation Ontology  OWL 2       1             13 September 2006

Structure of an Ontology
Ontologies typically have two distinct components:
• Names for important concepts in the domain
  – Elephant is a concept whose members are a kind of animal
  – Herbivore is a concept whose members are exactly those animals who eat only plants or parts of plants
  – Adult_Elephant is a concept whose members are exactly those elephants whose age is greater than 20 years
• Background knowledge/constraints on the domain
  – Adult_Elephants weigh at least 2,000 kg
  – All Elephants are either African_Elephants or Indian_Elephants
  – No individual can be both a Herbivore and a Carnivore
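The two components can be mimicked in ordinary Python as a rough sketch: a defined concept becomes a membership test, and each background constraint becomes a check that reports a violation. The Animal record, its fields and the two sample individuals are assumptions for illustration; a real ontology language states these axioms declaratively and a reasoner derives consequences from them, which this sketch does not attempt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Animal:
    name: str
    kinds: frozenset   # asserted concept names, e.g. {"Elephant", "African_Elephant"}
    age_years: int
    weight_kg: float


def is_adult_elephant(a):
    """Defined concept: exactly those elephants whose age is greater than 20 years."""
    return "Elephant" in a.kinds and a.age_years > 20


def violations(a):
    """Return the background constraints from the slide that this individual breaks."""
    problems = []
    if is_adult_elephant(a) and a.weight_kg < 2000:
        problems.append("Adult_Elephants weigh at least 2,000 kg")
    if "Elephant" in a.kinds and not a.kinds & {"African_Elephant", "Indian_Elephant"}:
        problems.append("All Elephants are either African_Elephants or Indian_Elephants")
    if {"Herbivore", "Carnivore"} <= a.kinds:
        problems.append("No individual can be both a Herbivore and a Carnivore")
    return problems


# Two hypothetical individuals: one consistent, one breaking all three constraints.
clyde = Animal("clyde", frozenset({"Elephant", "African_Elephant", "Herbivore"}), 25, 3500.0)
odd = Animal("odd", frozenset({"Elephant", "Herbivore", "Carnivore"}), 30, 1500.0)

print(violations(clyde))
print(violations(odd))
```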


Example Ontology

A Semantic Web: First Steps
• Extend existing rendering markup with semantic markup
  – Metadata annotations that describe content/function of web accessible resources
• Use Ontologies to provide vocabulary for annotations
  – “Formal specification” is accessible to machines
• A prerequisite is a standard web ontology language
  – Need to agree common syntax before we can share semantics
  – Syntactic web based on standards such as HTTP and HTML

Make web resources more accessible to automated processes

Ontology Design and Deployment
• Given key role of ontologies in the Semantic Web, it is essential to provide tools and services to help users:
  – Design and maintain high quality ontologies, e.g.:
    • Meaningful: all named classes can have instances
    • Correct: captured intuitions of domain experts
    • Minimally redundant: no unintended synonyms
    • Richly axiomatised: (sufficiently) detailed descriptions
  – Store (large numbers) of instances of ontology classes, e.g.:
    • Annotations from web pages
  – Answer queries over ontology classes and instances, e.g.:
    • Find more general/specific classes
    • Retrieve annotations/pages matching a given description
  – Integrate and align multiple ontologies (merging)

The Semantic Web

Shared ontologies help to exchange data and meaning between web-based services

(Image by Jim Hendler)

Wine Example Scenario

Tell me what wines I should buy to serve with each course of the following menu.

Wine Agent

Grocery Agent

Books Agent

I recommend Chardonnay or DryRiesling


Ontologies in the Semantic Web

• Provide shared data structures to exchange information between agents

• Can be explicitly used as annotations in web sites

• Can be used for knowledge-based services using other web resources

• Can help to structure knowledge to build domain models (for other purposes)

Ontology Languages
• Wide variety of languages for “Explicit Specification”
  – Graphical notations
    • Semantic networks
    • Topic Maps (see http://www.topicmaps.org/)
    • UML
    • RDF
  – Logic based
    • Description Logics (e.g., OIL, DAML+OIL, OWL)
    • Rules (e.g., RuleML, Prolog)
    • First Order Logic (e.g., KIF)
    • Conceptual graphs
    • (Syntactically) higher order logics (e.g., LBase)
    • Non-classical logics (e.g., Flogic, Non-Mon, modalities)
  – Probabilistic/fuzzy
• Degree of formality varies widely
  – Increased formality makes languages more amenable to machine processing (e.g., automated reasoning)

Many languages use “OO” model based on:
• Objects/Instances/Individuals
  – Elements of the domain of discourse
  – Equivalent to constants in FOL
• Types/Classes/Concepts
  – Sets of objects sharing certain characteristics
  – Equivalent to unary predicates in FOL
• Relations/Properties/Roles
  – Sets of pairs (tuples) of objects
  – Equivalent to binary predicates in FOL
• Such languages are/can be:
  – Well understood
  – Formally specified
  – (Relatively) easy to use
  – Amenable to machine processing
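The correspondence with first-order logic can be made concrete with plain Python sets, as a sketch: individuals are constants, a class is the set of individuals for which a unary predicate holds, and a property is a set of pairs for which a binary predicate holds. All names (clyde, grass, Elephant, Plant, eats) are invented for the example.

```python
# Constants: the elements of the domain of discourse.
individuals = {"clyde", "grass"}

# Unary predicates: each class is the set of individuals it is true of.
classes = {"Elephant": {"clyde"}, "Plant": {"grass"}}

# Binary predicates: each property is a set of (subject, object) pairs.
properties = {"eats": {("clyde", "grass")}}


def holds_class(cls, x):
    """Elephant(x): is x a member of the class?"""
    return x in classes.get(cls, set())


def holds_property(prop, x, y):
    """eats(x, y): is the pair (x, y) in the property?"""
    return (x, y) in properties.get(prop, set())


def eats_only(x, cls):
    """For all y: eats(x, y) implies cls(y)."""
    return all(holds_class(cls, y) for s, y in properties["eats"] if s == x)


print(holds_class("Elephant", "clyde"))
print(holds_property("eats", "clyde", "grass"))
print(eats_only("clyde", "Plant"))
```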

Web “Schema” Languages
• Existing Web languages extended to facilitate content description
  – XML → XML Schema (XMLS)
  – RDF → RDF Schema (RDFS)
• XMLS not an ontology language
  – Changes format ~ DTDs (document schemas) for XML
  – Adds an extensible type hierarchy
    • Integers, Strings, etc.
    • Can define sub-types, e.g., positive integers
• RDFS is recognisable as an ontology language
  – Classes and properties
  – Sub/super-classes (and properties)
  – Range and domain (of properties)
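What makes RDFS recognisable as an ontology language is that its declarations license new conclusions. The sketch below imitates three of them in plain Python: a property's domain types the subject, its range types the object, and class membership propagates up the subclass hierarchy. The vocabulary (eats, Animal, Food, Elephant) is hypothetical, the "type" predicate stands in for rdf:type, and the code is a toy rather than a complete RDFS entailment procedure.

```python
# Hypothetical schema: one subclass link, one property with a domain and a range.
SUBCLASS_OF = {"Elephant": "Animal"}
DOMAIN = {"eats": "Animal"}
RANGE = {"eats": "Food"}

# Hypothetical instance data as (subject, predicate, object) triples.
triples = {("clyde", "eats", "grass"), ("dumbo", "type", "Elephant")}


def rdfs_types(data):
    """Return every (individual, class) pair that is asserted or derivable."""
    types = {(s, o) for s, p, o in data if p == "type"}
    for s, p, o in data:
        if p in DOMAIN:
            types.add((s, DOMAIN[p]))   # domain: the subject belongs to this class
        if p in RANGE:
            types.add((o, RANGE[p]))    # range: the object belongs to this class
    changed = True
    while changed:                      # propagate membership up the subclass hierarchy
        changed = False
        for individual, cls in list(types):
            parent = SUBCLASS_OF.get(cls)
            if parent and (individual, parent) not in types:
                types.add((individual, parent))
                changed = True
    return types


derived = rdfs_types(triples)
for pair in sorted(derived):
    print(pair)
```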

(In)famous “Layer Cake”

[Figure: Semantic Web layer cake; recoverable labels: Data Exchange, Semantics+reasoning, Relational Data; several layers marked “?”]

• Relationship between layers is not clear
• OWL DL extends “DL subset” of RDF


Linked Data

Semantic web: Linked Data
• Isn’t just about putting data on the Web
• It’s about making links
• Web of Hypertext -> Web of Data


Linked Data: The four rules

1. Use URIs as names for things.

2. Use HTTP URIs so that people can look up those names.

3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).

4. Include links to other URIs, so that they can discover more things.

Why HTTP URIs?
• Globally unique names
  – can be created in a decentralised fashion by domain name owners;
  – no central naming authority is required.
• Not just a name, but a means of accessing information describing the identified entity. (URL)

URIs

Homepage of the Department of Computer Science
http://www.dcs.warwick.ac.uk/

Homepage of Alexandra Cristea
http://www2.warwick.ac.uk/fac/sci/dcs/people/Alexandra_Cristea

• These URIs point to web documents - or, in the terminology of WebArch, information resources.
  – by definition, all the essential characteristics of an information resource can be conveyed in a message
• Web clients request a representation of a resource
• One and the same resource might have different representations; e.g., text in English, Greek, Chinese, etc.

Content Negotiation
• HTTP clients send HTTP headers with each request to indicate what kinds of documents they prefer.
• A client can say it prefers language X over Y.
• Or that it prefers RDF over HTML.
• Servers inspect headers and select an appropriate response.

Header of GET Requests

GET /fac/sci/dcs/people/Alexandra_Cristea HTTP/1.1
Host: www2.warwick.ac.uk
Accept: text/html, application/xhtml+xml
Accept-Language: en, el, zh

Server's Response

HTTP/1.1 200 OK
Content-Type: text/html
Content-Language: en
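The server-side half of this exchange ("inspect headers and select an appropriate response") can be sketched as follows. This is a deliberately simplified illustration: it honours q-values but ignores wildcards such as */* and language ranges, which a real server would also handle. The function names parse_accept and negotiate are made up for the example.

```python
def parse_accept(header):
    """Parse an Accept-style header into (value, q) pairs, most preferred first."""
    prefs = []
    for part in header.split(","):
        value, _, params = part.strip().partition(";")
        q = 1.0                                   # default quality when no q is given
        for param in params.split(";"):
            key, _, val = param.strip().partition("=")
            if key == "q":
                q = float(val)
        prefs.append((value.strip(), q))
    # sorted() is stable, so entries with equal q keep their order in the header.
    return sorted(prefs, key=lambda pair: -pair[1])


def negotiate(header, available):
    """Return the client's most preferred value that the server can produce, or None."""
    for value, q in parse_accept(header):
        if q > 0 and value in available:
            return value
    return None


print(negotiate("text/html, application/xhtml+xml", ["application/rdf+xml", "text/html"]))
print(negotiate("application/rdf+xml, text/html;q=0.5", ["text/html", "application/rdf+xml"]))
print(negotiate("en, el, zh", ["zh", "el"]))
```

The second call is the Linked Data case: the same URI yields RDF for a client that asks for it and HTML for a browser.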

URIs for Things
• We need mechanisms to ensure that when URIs are dereferenced,
  – real-world objects are not confused with documents that describe them, and
  – humans as well as machines can retrieve appropriate representations.

RDF for Linked Data

• RDF is standardly used for Linked Data. Advantages include:
  – Easy to insert RDF links between data from different sources.
  – Information from different sources can be combined by graph merging.
  – Information using different schemas can be expressed in a single graph, i.e., by mixing different vocabularies.
  – Data can be tightly or loosely structured.
• Features of RDF that are avoided:
  – Reification (hard to query with SPARQL)
  – Collections and containers (ditto). Use multiple triples with the same predicate instead.
  – Blank nodes: they make merging less effective.
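"Combined by graph merging" has a very simple core, sketched here with triples as Python tuples: when both sources name the same thing with the same URI, merging is just set union, and duplicate statements collapse. The ex: names are placeholders invented for the example, and prefixed strings stand in for full URIs.

```python
# Two hypothetical sources that describe the same resource, ex:RobertBurns.
source_a = {
    ("ex:RobertBurns", "foaf:name", "Robert Burns"),
    ("ex:RobertBurns", "ex:bornIn", "dbpedia:Alloway"),
}
source_b = {
    ("ex:RobertBurns", "foaf:name", "Robert Burns"),
    ("ex:RobertBurns", "ex:wrote", "ex:TamOShanter"),
}

# Merging two graphs that share URIs is set union; the repeated triple appears once.
merged = source_a | source_b


def describe(graph, subject):
    """All (predicate, object) pairs known about a subject, in a stable order."""
    return sorted((p, o) for s, p, o in graph if s == subject)


for predicate, obj in describe(merged, "ex:RobertBurns"):
    print(predicate, obj)
```

This also hints at why blank nodes are avoided: a node without a URI cannot be matched across sources, so the union would keep it as a separate, unconnected description.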

Kinds of Links
• Relationship Links
  – related things in other data sources (≈ hyperlinks in a web document).
  – e.g. foaf:based_near dbpedia:Edinburgh
• Identity Links
  – URI aliases of other data sources for the same (real-world/abstract) object.
• Vocabulary Links
  – definitions of vocabulary terms used to represent the data.

Identity Links
• Different URIs may refer to the same real-world object.
  – Standard for equivalence: http://www.w3.org/2002/07/owl#sameAs
• Motivations for this approach:
  – Different aliases can be dereferenced to different descriptions of the same resource (AAA principle).
  – Supports provenance: trace back to the publisher of a URI.
  – A canonical URI would need a centralised naming authority, which would be a barrier to the spread of the web of data.
• Potential problems:
  – Identity may be context dependent
  – Facts vs. opinions

Is Your Data 5-★?

Reflecting on Linked Data
• Structured data
  – available on web (i.e. open) in many formats:
  – CSV, Excel, HTML Microdata (e.g. http://schema.org/), web APIs, PDF tables (shudder), ...
• Advantages of Linked Data:
  – A unifying data model (RDF)
  – A standardised data access mechanism (HTTP)
  – Hyperlink-based data discovery: links connect all Linked Data into a single global data space and enable Linked Data applications to discover new data sources at run-time.
  – Self-descriptive data: vocabulary definitions are recoverable like other data, and vocabulary terms can be linked to one another.

Reflecting on Linked Data
• Linked data adopts perspective of data integration.
  – Not (necessarily) interested in reasoning aspect of Semantic Web.
• http://blog.paulwalk.net/2009/11/11/linked-open-semantic/:
  – Data can be open, while not being linked.
  – Data can be linked, while not being open.
  – Data which is both open and linked is increasingly viable.
  – The Semantic Web can only function with data which is both open and linked.


Web of Data (Linked Data)

Summary Linked Data

• Linked Data principles
  – Naming things with URIs
  – Making URIs dereferenceable
  – Providing useful RDF information
  – Including links to other things

Acknowledgements

Thanks to various people from whom I “borrowed” material:

– Jeen Broekstra
– Carole Goble
– Frank van Harmelen
– Austin Tate
– Raphael Volz

And thanks to all the people from whom they borrowed it

Finding out more on SW
• Course website and recommended reading
• Do your homework!
• There is lots of relevant literature online: try to explore it
• Also a lot of informal discussion on Twitter, newsgroups, YouTube, etc.