© 2009 Franz J. Kurfess Semantic Web 1

CPE/CSC 481: Knowledge-Based Systems

Dr. Franz J. Kurfess

Computer Science Department

Cal Poly

Usage of the Slides

these slides are intended for the students of my CPE/CSC 481 “Knowledge-Based Systems” class at Cal Poly SLO
  if you want to use them outside of my class, please let me know ([email protected])
I usually put together a subset for each quarter as a “Custom Show”
  to view these, go to “Slide Show => Custom Shows”, select the respective quarter, and click on “Show”
to print them, I suggest using the “Handout” option
  4, 6, or 9 per page works fine
  Black & White should be fine; there are few diagrams where color is important

Course Overview

Introduction
Knowledge Representation
  Semantic Nets, Frames, Logic
Reasoning and Inference
  Predicate Logic, Inference Methods, Resolution
Reasoning with Uncertainty
  Probability, Bayesian Decision Making
Expert System Design
  ES Life Cycle
CLIPS Overview
  Concepts, Notation, Usage
Pattern Matching
  Variables, Functions, Expressions, Constraints
Expert System Implementation
  Salience, Rete Algorithm
Expert System Examples
Semantic Web & Knowledge
Conclusions and Outlook

Overview

Introduction
Knowledge Processing
  Knowledge Acquisition, Representation and Manipulation
Knowledge Organization
  Classification, Categorization
  Ontologies, Taxonomies, Thesauri
Knowledge Retrieval
  Information Retrieval
  Knowledge Navigation
Knowledge Presentation
  Knowledge Visualization
Knowledge Exchange
  Knowledge Capture, Transfer, and Distribution
Usage of Knowledge
  Access Patterns, User Feedback
Knowledge Management Techniques
  Topic Maps, Agents
Knowledge Management Tools
Knowledge Management in Organizations

Overview Semantic Web

Motivation
Objectives
Semantic Web Introduction
  World Wide Web
  “Deep Web”
  Knowledge and the Web
Syntactic vs. Semantic Web
  human view of documents
  computer view of documents
Knowledge Representation on the Web
  XML + Meta-tags
  RDF
Knowledge Organization with Ontologies
  Conceptual building blocks
  Web Ontology Language (OWL)
Reasoning with Ontologies
  Description Logics
  Reasoning with OWL
Important Concepts and Terms
Chapter Summary

Logistics

Introductions
Course Materials
  textbook
  handouts
  Web page
  CourseInfo/Blackboard System and Alternatives
Term Project
Lab and Homework Assignments
Exams
Grading

Bridge-In

Pre-Test

Motivation

Objectives

Semantic Web Introduction

World Wide Web

“Deep Web”

Knowledge and the Web

History of the Semantic Web

the Web was “invented” by Tim Berners-Lee (amongst others), a physicist working at CERN
TBL’s original vision of the Web was much more ambitious than the reality of the existing (syntactic) Web:

“... a goal of the Web was that, if the interaction between person and hypertext could be so intuitive that the machine-readable information space gave an accurate representation of the state of people's thoughts, interactions, and work patterns, then machine analysis could become a very powerful management tool, seeing patterns in our work and facilitating our working together through the typical problems which beset the management of large organizations.”

TBL (and others) have since been working towards realising this vision, which has become known as the Semantic Web
  e.g., article in the May 2001 issue of Scientific American

Horrocks & Rector, 2004

Semantic Web – Scientific American

Realising the complete “vision” is too hard for now (probably)
But we can make a start by adding semantic annotation to web resources

Scientific American, May 2001

Where we are Today: the Syntactic Web

[Hendler & Miller 02]

Horrocks & Rector, 2004

The Syntactic Web is…

A place where computers do the presentation (easy) and people do the linking and interpreting (hard).
A hypermedia, a digital library
  A library of documents (called web pages) interconnected by a hypermedia of links
A database, an application platform
  A common portal to applications accessible through web pages, and presenting their results as web pages
A platform for multimedia
  BBC Radio 4 anywhere in the world! Terminator 3 trailers!
A naming scheme
  Unique identity for those documents (URLs)

Why not get computers to do more of the hard work?
[Goble 03]

Horrocks & Rector, 2004

Hard Work using the Syntactic Web…

Find images of Steve Furber

Rev. Alan M. Gates, Associate Rector of the Church of the Holy Spirit, Lake Forest, Illinois

Carole Goble

… Alan Rector…

Horrocks & Rector, 2004

Impossible (?) using the Syntactic Web…

Complex queries involving background knowledge
  Find information about “animals that use sonar but are not either bats or dolphins”, e.g., Barn Owl
Locating information in data repositories
  Travel enquiries
  Prices of goods and services
  Results of human genome experiments
Finding and using “web services”
  Visualise surface interactions between two proteins
Delegating complex tasks to web “agents”
  Book me a holiday next weekend somewhere warm, not too far away, and where they speak French or English

Horrocks & Rector, 2004

Syntactic vs. Semantic Web

human view of documents

computer view of documents

What is the Problem?

Web pages contain
  content (text, images, music)
  markup (HTML, XHTML)
  hyperlinks
  code (JavaScript)
Content
  is most critical for humans
  but meaningless to computers
  requires interpretation (understanding)

What information can we see…

WWW2002
The eleventh international world wide web conference
Sheraton waikiki hotel
Honolulu, hawaii, USA
7-11 may 2002
1 location 5 days learn interact

Registered participants coming from
australia, canada, chile denmark, france, germany, ghana, hong kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire

Register now
On the 7th May Honolulu will provide the backdrop of the eleventh international world wide web conference. This prestigious event …

Speakers confirmed
Tim berners-lee
  Tim is the well known inventor of the Web, …
Ian Foster
  Ian is the pioneer of the Grid, the next generation internet …

Horrocks & Rector, 2004

What information can a machine see…

Horrocks & Rector, 2004

Solution: XML markup with “meaningful” tags?

<name> </name>
<location> </location>
<date> </date>
<slogan> </slogan>
<participants> </participants>
<introduction> </introduction>
<speaker> </speaker>
<bio> </bio>
…

Horrocks & Rector, 2004

Still the Machine only sees…

< > </ >
< > </ >
< > </ >
< > </ >
< > </ >
< > </ >
< > </ >
< > </ >
< > </ >
< > </ >

Horrocks & Rector, 2004
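
The point of the two slides above can be tried out with a small sketch. It assumes a made-up conference snippet (the wrapper element `conference` and the opaque names `t0` to `t3` are invented for illustration) and shows that a generic XML parser recovers exactly the same structure whether or not the tag names mean anything to a human.

```python
import xml.etree.ElementTree as ET

# Hypothetical snippet using "meaningful" tags like those on the slide.
MEANINGFUL = """
<conference>
  <name>WWW2002</name>
  <location>Honolulu, Hawaii, USA</location>
  <date>7-11 May 2002</date>
</conference>
"""

# The same data with arbitrary tag names (invented for this sketch).
OPAQUE = """
<t0>
  <t1>WWW2002</t1>
  <t2>Honolulu, Hawaii, USA</t2>
  <t3>7-11 May 2002</t3>
</t0>
"""


def shape(xml_text):
    """Return nesting structure and text of a document, ignoring tag names."""
    def walk(element):
        text = (element.text or "").strip()
        return (text, [walk(child) for child in element])
    return walk(ET.fromstring(xml_text))


# To the parser both documents are the same tree; the tag names
# carry meaning only for a human reader.
same_structure = shape(MEANINGFUL) == shape(OPAQUE)
print(same_structure)
```

Without an external agreement on what `location` or `date` mean, a program has no more to go on than it has with `t2` or `t3`.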

Need to Add “Semantics”

External agreement on meaning of annotations
  E.g., Dublin Core for annotation of library/bibliographic information
  Agree on the meaning of a set of annotation tags
  Problems with this approach
    Inflexible
    Limited number of things can be expressed
Use Ontologies to specify meaning of annotations
  Ontologies provide a vocabulary of terms
  New terms can be formed by combining existing ones
    “Conceptual Lego”
  Meaning (semantics) of such terms is formally specified
  Can also specify relationships between terms in multiple ontologies

Horrocks & Rector, 2004

Meanwhile related developments

Object oriented programming
  Simula, Smalltalk, … Java
Object oriented design
  Entity relationship diagrams… UML
SGML, HTML, XML and the web
  Including RDF and Topic Maps

Horrocks & Rector, 2004

Knowledge Representation on the Web

knowledge is primarily enclosed in documents
  Web pages
use of HTML offers no clear separation of content and presentation (formatting)
HTML is very limited in its expressiveness
  fixed vocabulary

HTML + Meta-tags

intention was to use meta-tags to describe the contents of Web pages
  was quickly abused to increase the relevance rankings of pages
meta-tags are just labels on the documents
  no structure to the labels
  free (unlimited, uncontrolled) vocabulary

XML + Meta-tags

XML offers much better expressiveness
  XML itself is not a KR language
  offers facilities to define KR languages
much better separation of content and presentation
XML allows the definition of schemata
  customized naming and structure of tags
flexible transformations for variable presentations
  e.g. XSLT to create different versions of documents

Microformats

Background
Purpose
  limitations of current approaches
  human-readable and machine-readable
Usage
  basics
  customization
Limits of microformats
Outlook

Cook & Troughton, 2007

MicroFormats

Michael Cook
Caleb Troughton
5/3/2007

What are Microformats?

A standard set of HTML/XML semantics
A set of tag classes and patterns used for storing information in web pages
  Microformat tags are typically kept brief and descriptive
Open format
  Microformats are not controlled by any one company or individual
  Anyone can suggest new Microformats or request revisions

Cook & Troughton, 2007

More about Microformats

typically written in XHTML
  visible representation of the data
  semantic data hidden in the code
implicit interpretation
  similar to how “int” implies that the data element will be an integer
  example: “hCard/adr” implies that the data element will contain an address.

Cook & Troughton, 2007

What current systems exist?

no standard set of web markup techniques
  XML is standardized, but it’s a markup specification language
commonly used markup techniques do exist
  many Microformats are derived directly from methods already used

“Microformats are semantics with momentum, a codification of what everyone did anyway.”
Derrick Pallas, Alexa Internet, Inc.

Cook & Troughton, 2007

Current Conventions

lower readability by humans
  information scattered around a page or several pages
  data may be formatted using sloppy markup or inconsistent patterns
lower readability by machines
  require scraping an entire page in search of patterns
  information less trustworthy
    unexpected information, such as contact information for a person you aren’t looking for
    may be categorized incorrectly
      805 Santa Barbara: is this a street address, or an area code and city name?

Cook & Troughton, 2007

Human or Machine Use?

both: humans first, machines second
commonly used tags to embed information
  designed to be brief, descriptive
  easy for a human to interpret embedded data from code
adding semantics becomes second nature
  such as “blockquote”, Microformat class names

Cook & Troughton, 2007

Use of Microformats

conventions written in current standard markup languages
  an experienced programmer will easily interpret the Microformat syntax and utilize implications of the language
implementation
  current markup languages allow Microformats to be quickly and easily implemented on machines

Cook & Troughton, 2007

Who should use Microformats?

programmer
  web code with embedded information
  a user or search engine might extract it
  typically information that will be represented or extracted often
    can contain anything from name and address, to business affiliation, to contact information
also used to aid search engines
  extracting information by supplementing Metadata
  deter crawlers from following links with a rel="nofollow" attribute.

Cook & Troughton, 2007

Benefits of Microformats 1

programmers can easily read raw markup language code
  naming conventions allow others to easily extract data from the raw language
  readily recognizable names of data members.
easily edited by other programmers
  the information is easily readable
  potentially abstract data member naming follows a standardized convention
  in the future someone may need to edit this information

Cook & Troughton, 2007

Benefits of Microformats 2

universal modules in code.
  creation of scripts
  Microformats naming conventions are standardized
easy to integrate this code into web pages
  modular development of pages
  development of library utilities

Cook & Troughton, 2007

Benefits of Microformats 3

unit conversion and currency conversion is trivial
  Web browser contains a user’s default preferences
  makes conversions on the fly
information extraction
  Web crawler searching for an address or email.
  accuracy is dramatically improved
    the programmer specified “this is the email” or “this is the address”
  crawler speed is much faster
    it no longer has to scan the entire page or rely on a regular expression to extract information

Cook & Troughton, 2007

Microformat Usage

easy, intuitive specifications
format corresponds to the data type you wish to represent in your code
  specifications can be found on the Microformats Wiki
  sometimes little difference between Microformats and code you might see on a web page today.
benefits the readability by humans and accessibility of code by machines

Cook & Troughton, 2007

Code Differences

Commonplace vCard:

BEGIN:VCARD
VERSION:3.0
N:Çelik;Tantek
FN:Tantek Çelik
URL:http://tantek.com
ORG:Technorati
END:VCARD

Microformat hCard:

<div class="vcard">
  <a class="url fn" href="http://tantek.com/">
    Tantek Çelik
  </a>
  <div class="org">Technorati</div>
</div>

Cook & Troughton, 2007
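
To illustrate how a program might read the hCard above, here is a minimal sketch using only Python's standard `html.parser`. The class `HCardParser` and its handling of just three properties (`fn`, `org`, `url`) are assumptions made for this example; this is not an official Microformats parser and it ignores most of the hCard specification.

```python
from html.parser import HTMLParser

HCARD = """
<div class="vcard">
  <a class="url fn" href="http://tantek.com/">
    Tantek Çelik
  </a>
  <div class="org">Technorati</div>
</div>
"""


class HCardParser(HTMLParser):
    """Collect a few hCard properties (fn, org, url) by class name.

    Simplification: assumes every start tag has a matching end tag,
    so void elements such as <br> are not handled.
    """

    TEXT_PROPERTIES = {"fn", "org"}

    def __init__(self):
        super().__init__()
        self.card = {}
        self._open = []  # per open element: the text properties it carries

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = set((attributes.get("class") or "").split())
        if "url" in classes and attributes.get("href"):
            self.card["url"] = attributes["href"]
        self._open.append(classes & self.TEXT_PROPERTIES)

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Text belongs to every property of the enclosing elements.
        for properties in self._open:
            for name in properties:
                self.card[name] = (self.card.get(name, "") + " " + text).strip()


parser = HCardParser()
parser.feed(HCARD)
print(parser.card)
```

The class names do the work that regular expressions over free text would otherwise have to guess at: the author has already stated which text is the name and which is the organization.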

Customization

Identify a situation in which a Microformat would provide a solution
  no existing Microformats or XML markup addresses it
Propose the Microformat to the Microformats Wiki
Work with other contributors to develop a draft version of the format.
  without community involvement, the new format will not be adopted
Submit the final draft to the Wiki
  the community will accept the format as a standard if it becomes more and more common in practice

Cook & Troughton, 2007

Web Page Update

2 methods of converting existing web data to Microformatted data
By hand
  large amount of manual labor
  complicated when extracting fields from arbitrary structures
By machine
  applications that attempt to convert commonplace semantics such as vCard to the appropriate Microformat, hCard
  ineffective unless the original content follows commonplace trends
  can become error prone in interpreting existing field names

Cook & Troughton, 2007

Microformat Limitations

acceptance and use in practice
  only successful if widely used
existing vs. new content
  since conversion is tedious at best, advocacy of Microformats in new content is arguably more important than content conversion
confusion
  short class names can lead to possibly ambiguous titles

Cook & Troughton, 2007

Example

<div class="vcard">
  <span class="fn">John Smith</span>,
  <div class="adr">
    <div class="street-address">1 Seaview Lane</div>,
    <span class="locality">Mousehole</span>,
    <span class="region">Cornwall</span>,
    <span class="country-name">UK</span>
  </div>
</div>

query to Google Maps
  “1 Seaview Lane, Mousehole, Cornwall, UK”
  combination of the above
interpretation
  “locality” is “Mousehole”
  a county name? A province name?
  ambiguous information

Cook & Troughton, 2007
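
The "combination of the above" step can be sketched as follows. This assumes the snippet is well-formed XHTML so that the standard XML parser can read it; the helper `adr_parts` and the search URL (host `maps.example.com`, parameter `q`) are placeholders invented for this example, not a documented mapping API.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

HCARD = """
<div class="vcard">
  <span class="fn">John Smith</span>,
  <div class="adr">
    <div class="street-address">1 Seaview Lane</div>,
    <span class="locality">Mousehole</span>,
    <span class="region">Cornwall</span>,
    <span class="country-name">UK</span>
  </div>
</div>
"""

# Address fields in the order they should appear in the query.
ADR_FIELDS = ["street-address", "locality", "region", "country-name"]


def adr_parts(xhtml):
    """Map each address class name found in the markup to its text."""
    parts = {}
    for element in ET.fromstring(xhtml).iter():
        for class_name in element.get("class", "").split():
            if class_name in ADR_FIELDS:
                parts[class_name] = (element.text or "").strip()
    return parts


parts = adr_parts(HCARD)
query = ", ".join(parts[field] for field in ADR_FIELDS if field in parts)

# Placeholder URL: host and parameter name are assumptions for illustration.
url = "https://maps.example.com/search?" + urlencode({"q": query})
print(query)
print(url)
```

The sketch can assemble the query mechanically, but it cannot tell whether "Cornwall" is a county or a province; that ambiguity is exactly the limitation the slide points out.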

Tools

Web browsers
  FireFox 3 has Microformat copy/paste support.
Web browser extensions
  “Microformats Extensions”, Operator
  recognize Microformatted code
  allow users to perform copy operations from within a web browser
Dreamweaver extensions
  easy implementation of Microformats in new web pages
calendar, address book, and email utilities
  ability to copy paste Microformatted data, preserving fields

Cook & Troughton, 2007

Future

Microformats will hopefully become a standard in Web development

Advanced search engines will use Microformats to directly extract information

Search engines will use Microformats to establish relations between data types and values

Cook & Troughton, 2007

How Can You Help?

being a member of the Microformats community helps the development of Microformats.
suggest a new Microformat for your favorite items
  most likely somebody else was faster
help influence proposed Microformats
  structure, usage, implementation
help translate Microformats into other languages
advocate the use of Microformats

Cook & Troughton, 2007

Sources

Digital Web Magazine. “Microformats Primer.” http://www.digital-web.com/articles/microformats_primer/.
  Describes and demonstrates why and how microformats are used. Argues that microformats will aid programmers intending to generate CSS code, insert Metadata, or implement plug and play Javascript.
Official Microformats Home Page. http://microformats.org/. Updated 4/17/2007.
  Provides up to date information on the implementation of microformats, via a Web Blog. An overview of the set of current supported microformats, and proposed new formats.
Official Microformats Wiki. http://microformats.org/wiki/Main_Page. Updated 4/17/2007.
  Allows anyone to contribute ideas to the official Microformat team. Provides detailed specification of existing elements, allows the public to submit new elements for consideration, and demonstrates examples of microformat use.
Wikipedia. “Microformats.” http://en.wikipedia.org/wiki/Microformats. Created 3/1/2007.
  Good overview.
XML.com. “Microformats.” http://www.xml.com/pub/a/2005/03/23/deviant.html. Updated 3/23/2007.
  General overview of intended uses for microformats. Demonstrates simple examples.

Cook & Troughton, 2007

Resource Description Framework (RDF)

grammar for encoding relationships
  RDF triples as basic building blocks
an RDF triple has three components:
  subject
  predicate (or verb)
  object
  each can be expressed as a resource on the Web (URI)
far less ambiguous than encoding data in random XML documents
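
As a rough illustration of the triple as a data structure, the following sketch stores triples as named tuples and looks them up by subject and predicate. The names `Triple`, `graph`, and `objects` are invented for this example; the URIs are the ones used in the Yangtze example of this deck, and the literal values are kept as plain strings.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement; field names are assumptions of this sketch."""
    subject: str    # URI of the thing described
    predicate: str  # URI of the property
    obj: str        # URI of another resource, or a literal value


RIVER = "http://www.geodesy.org/river#"
YANGTZE = "http://www.china.org/geography/rivers#Yangtze"

graph = {
    Triple(YANGTZE, RIVER + "length", "6300 kilometers"),
    Triple(YANGTZE, RIVER + "endingLocation", "East China Sea"),
}


def objects(triples, subject, predicate):
    """All values stated for a given subject and predicate."""
    return sorted(t.obj for t in triples
                  if t.subject == subject and t.predicate == predicate)


length_values = objects(graph, YANGTZE, RIVER + "length")
print(length_values)
```

Because subject and predicate are full URIs, two documents that use the same URIs are talking about the same thing, which is what makes this less ambiguous than ad hoc XML element names.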

What is the Purpose of RDF?

The purpose of RDF (Resource Description Framework) is to give a standard way of specifying data "about" something.

Here's an example of an XML document that specifies data about China's Yangtze river:

<?xml version="1.0"?>
<River id="Yangtze" xmlns="http://www.geodesy.org/river">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

"Here is data about the Yangtze River. It has a length of 6300 kilometers. Its startingLocation is western China's Qinghai-Tibet Plateau. Its endingLocation is the East China Sea."

XML --> RDF

Modify the following XML document so that it is also a valid RDF document:

XML (Yangtze.xml)

<?xml version="1.0"?>
<River id="Yangtze" xmlns="http://www.geodesy.org/river">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

"convert to"

RDF (Yangtze.rdf)

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

The RDF Format

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

1. RDF provides an ID attribute for identifying the resource being described.
2. The ID attribute is in the RDF namespace.
3. Add the "fragment identifier symbol" to the namespace.

The RDF Format (cont.)

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

1. Identifies the type (class) of the resource being described.
2. Identifies the resource being described. This resource is an instance of River.
3. These are properties, or attributes, of the type (class).
4. Values of the properties
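
The four parts called out above can be picked apart with a standard XML parser. The following is a minimal sketch using `xml.etree.ElementTree`; the helper `split_name` and the variable names are assumptions for this example, and a real application would more likely use a dedicated RDF library.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

YANGTZE_RDF = """<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>
"""


def split_name(qualified):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    namespace, _, local = qualified[1:].partition("}")
    return namespace, local


root = ET.fromstring(YANGTZE_RDF)

# 1: the type (class) is the name of the outer element.
type_namespace, type_name = split_name(root.tag)

# 2: the resource is named by the ID attribute in the RDF namespace.
resource_id = root.get("{%s}ID" % RDF_NS)

# 3 and 4: child element names are properties, their text is the value.
properties = {split_name(child.tag)[1]: child.text for child in root}

print(type_name, resource_id, properties)
```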

Namespace Convention

Best Practice

Question: Why was "#" placed onto the end of the namespace? E.g.,

xmlns="http://www.geodesy.org/river#"

Answer: RDF is very concerned about uniquely identifying things - uniquely identifying the type (class) and uniquely identifying the properties.

If we concatenate the namespace with the type then we get a unique identifier for the type, e.g.,

http://www.geodesy.org/river#River

If we concatenate the namespace with a property then we get a unique identifier for the property, e.g.,

http://www.geodesy.org/river#length
http://www.geodesy.org/river#startingLocation
http://www.geodesy.org/river#endingLocation

Thus, the "#" symbol is simply a mechanism for separating the namespace from the type name and the property name.

The RDF Format

<?xml version="1.0"?>
<Class rdf:ID="Resource"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="uri">
  <property>value</property>
  <property>value</property>
  ...
</Class>

Advantage of using the RDF Format

You may ask: "Why should I bother designing my XML to be in the RDF format?"
Answer: there are numerous benefits:

The RDF format, if widely used, will help to make XML more interoperable:
  Tools can instantly characterize the structure, "this element is a type (class), and here are its properties".
  RDF promotes the use of standardized vocabularies ... standardized types (classes) and standardized properties.
The RDF format gives you a structured approach to designing your XML documents. The RDF format is a regular, recurring pattern.
It enables you to quickly identify weaknesses and inconsistencies of non-RDF-compliant XML designs. It helps you to better understand your data!
You reap the benefits of both worlds:
  You can use standard XML editors and validators to create, edit, and validate your XML.
  You can use the RDF tools to apply inferencing to the data.
It positions your data for the Semantic Web!

Network effect
Interoperability

Disadvantage of using the RDF Format

Constrained: the RDF format constrains you on how you design your XML (i.e., you can't design your XML in any arbitrary fashion).

RDF uses namespaces to uniquely identify types (classes), properties, and resources. Thus, you must have a solid understanding of namespaces.

Another XML vocabulary to learn: to use the RDF format you must learn the RDF vocabulary.

Uniquely Identify the Resource

Earlier we said that RDF is very concerned about uniquely identifying the type (class) and the properties. RDF is also very concerned about uniquely identifying the resource, e.g.,

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

This is the resource being described. We want to uniquely identify this resource.

Triple -> resource/property/value

http://www.china.org/geography/rivers#Yangtze has a http://www.geodesy.org/river#length of 6300 kilometers
  resource: http://www.china.org/geography/rivers#Yangtze
  property: http://www.geodesy.org/river#length
  value: 6300 kilometers

http://www.china.org/geography/rivers#Yangtze has a http://www.geodesy.org/river#startingLocation of western China's ...
  resource: http://www.china.org/geography/rivers#Yangtze
  property: http://www.geodesy.org/river#startingLocation
  value: western China's ...

http://www.china.org/geography/rivers#Yangtze has a http://www.geodesy.org/river#endingLocation of East China Sea
  resource: http://www.china.org/geography/rivers#Yangtze
  property: http://www.geodesy.org/river#endingLocation
  value: East China Sea
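
The triples above can be derived mechanically from the Yangtze document. The sketch below assumes the document is published at the base URI `http://www.china.org/geography/rivers`, which is inferred from the resource URI on the slide; the function names `to_uri` and `triples` are invented for this example.

```python
import xml.etree.ElementTree as ET

RDF_ID = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}ID"

# Assumption: the location of the document, inferred from the slide's URIs.
BASE_URI = "http://www.china.org/geography/rivers"

YANGTZE_RDF = """<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>
"""


def to_uri(qualified):
    """Turn ElementTree's '{namespace}local' into namespace + local."""
    namespace, _, local = qualified[1:].partition("}")
    return namespace + local


def triples(rdf_xml, base_uri):
    """Yield one (resource, property, value) triple per child element."""
    root = ET.fromstring(rdf_xml)
    # An rdf:ID is resolved against the base URI as a fragment identifier.
    resource = base_uri + "#" + root.get(RDF_ID)
    for child in root:
        yield (resource, to_uri(child.tag), child.text)


result = list(triples(YANGTZE_RDF, BASE_URI))
for triple in result:
    print(triple)
```

Note how the property URIs come out of plain concatenation of the namespace and the element name, which only works cleanly because the namespace ends in "#".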

The RDF Format = triples!

XML data are structured as resource/property/value triples
value of a property can be a literal
  length has a value of 6300 kilometers
value of a property can be a resource
  property-A has a value of Resource-B
  property-B has a value of Resource-C
the RDF design pattern is an alternating sequence of resource-property pairs
  known as "striping"

“Striped” RDF Triples

<?xml version="1.0"?>
<Resource-A>
  <property-A>
    <Resource-B>
      <property-B>
        <Resource-C>
          <property-C>
            Value-C
          </property-C>
        </Resource-C>
      </property-B>
    </Resource-B>
  </property-A>
</Resource-A>

[Figure: brackets mark the value of property-A, the value of property-B, and the value of property-C.]

Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
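
A short sketch of how the striping pattern can be unrolled into triples: element levels alternate between resources and properties, so a recursive walk can emit one triple per property. This is a simplification made for illustration; it uses the element names themselves as stand-ins for resources, whereas real RDF/XML identifies resources through attributes such as rdf:ID.

```python
import xml.etree.ElementTree as ET

STRIPED = """<?xml version="1.0"?>
<Resource-A>
  <property-A>
    <Resource-B>
      <property-B>
        <Resource-C>
          <property-C>Value-C</property-C>
        </Resource-C>
      </property-B>
    </Resource-B>
  </property-A>
</Resource-A>
"""


def stripe_triples(resource):
    """Yield (resource, property, value) for an element and all nested stripes."""
    for prop in resource:
        nested = list(prop)
        if nested:
            # The value of this property is another resource: go one stripe down.
            for child_resource in nested:
                yield (resource.tag, prop.tag, child_resource.tag)
                yield from stripe_triples(child_resource)
        else:
            # The value of this property is a literal.
            yield (resource.tag, prop.tag, (prop.text or "").strip())


result = list(stripe_triples(ET.fromstring(STRIPED)))
for triple in result:
    print(triple)
```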

SPARQL

SPARQL Protocol and RDF Query Language

KR on the Web Requirements

high expressiveness
  difficult to predict the knowledge that will be represented
open, distributed repository
  new representation mechanisms may be introduced
  knowledge may be distributed across multiple sites
syntactic interoperability
  easy access to knowledge in repositories
  facilitated via APIs and libraries
semantic interoperability
  interpretation of knowledge is compatible across repositories

XML as KR on the Web

expressiveness
  anything for which a grammar can be defined can be encoded in XML
syntactic interoperability
  an XML parser can parse any XML data
  it is usually a reusable component
semantic interoperability
  major limitation: it just describes grammars
  no way to recognize a semantic unit from a particular domain
    because XML aims at document structure
  no common interpretation of the document content

RDF as KR on the Web

expressiveness
  nested object-attribute-value structure satisfies the universal expressive power requirement
syntactic interoperability
  application-independent RDF parsers are available
semantic interoperability
  object-attribute structure provides natural semantic units
    all objects are independent entities
  a domain model (defining objects and relationships) can be represented naturally in RDF
  translation steps are not necessary as they are with XML

Ontology: Origins and History

Ontology in Philosophy
  a philosophical discipline: a branch of philosophy that deals with the nature and the organisation of reality
Science of Being (Aristotle, Metaphysics, IV, 1)
Tries to answer the questions:
  What characterizes being?
  Eventually, what is being?
  How should things be classified?

Horrocks & Rector, 2004

Ontology in Linguistics

[Figure: semiotic triangle. The Form “Tank” activates a Concept; the Concept relates to a Referent; the Form stands for the Referent.]

[Ogden, Richards, 1923]

Horrocks & Rector, 2004

Classification: An Old Problem

“On those remote pages it is written that animals are divided into:
a. those that belong to the Emperor
b. embalmed ones
c. those that are trained
d. suckling pigs
e. mermaids
f. fabulous ones
g. stray dogs
h. those that are included in this classification
i. those that tremble as if they were mad
j. innumerable ones
k. those drawn with a very fine camel's hair brush
l. others
m. those that have just broken a flower vase
n. those that resemble flies from a distance"

From The Celestial Emporium of Benevolent Knowledge, Borges

Horrocks & Rector, 2004

Ontology in Computer Science

An ontology is an engineering artifact:
  It is constituted by a specific vocabulary used to describe a certain reality, plus
  a set of explicit assumptions regarding the intended meaning of the vocabulary.
    Almost always including how concepts should be classified
describes a formal specification of a certain domain
  Shared understanding of a domain of interest
  Formal and machine manipulable model of a domain of interest
  explicit specification of a conceptualisation [Gruber93]

Horrocks & Rector, 2004


Example Ontology

Horrocks & Rector, 2004


Ontology Classified Logically

Horrocks & Rector, 2004


Where else are ontologies used?
Bioinformatics
  The Gene Ontology
  The Protein Ontology (MGED)
Medicine
  “The terminology wars”
Linguistics
Database integration
User interface design
Fractal Indexing

Horrocks & Rector, 2004


Ontologies as Conceptual Lego
[Figure: concepts assembled from building blocks, e.g. “Hand which is anatomically normal” and “Manchester Postgraduate Student taking CS626”]

Horrocks & Rector, 2004


User Interfaces using conceptual Lego
[Figure: structured data entry window (menus File, Edit, Help) for FRACTURE SURGERY. Buttons offer bones (Tibia, Fibula, Ankle, Radius, Ulna, Wrist, Humerus, Femur, More...), sides (Left, Right), parts (Shaft, Neck, Gt Troch, More...), procedures (Reduction, Fixation) and fracture types (Open, Closed). The selections Femur, Left, Neck, Open, Fixation compose the description below.]
Fixation of open fracture of neck of left femur

Horrocks & Rector, 2004

Semantic Web Challenge
[AKT 2003]

Horrocks & Rector, 2004


So why is it hard?
Ontology languages are tricky
  “All tractable languages are useless; all useful languages are intractable”
Ontologies are tricky
  People do it too easily; people are not logicians
  Intuitions are hard to formalise
The evidence
  The problem has been about for 3000 years
But now it matters!
  The semantic web means knowledge representation matters
The goal
  Make it easier

Horrocks & Rector, 2004


Structure of an Ontology
Ontologies typically have two distinct components:
  Names for important concepts in the domain
  Background knowledge/constraints on the domain

Horrocks & Rector, 2004


Concept Names
Elephant
  a concept whose members are a kind of animal
Herbivore
  a concept whose members are exactly those animals who eat only plants or parts of plants
Adult_Elephant
  a concept whose members are exactly those elephants whose age is greater than 20 years

Horrocks & Rector, 2004
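As a hedged illustration, the three concept names could be written as description logic axioms. The names Animal, Plant, eats, partOf and hasAge are assumptions introduced for this sketch, and the notation for the age restriction is informal.

```latex
\begin{align*}
\mathit{Elephant} &\sqsubseteq \mathit{Animal} \\
\mathit{Herbivore} &\equiv \mathit{Animal} \sqcap \forall \mathit{eats}.(\mathit{Plant} \sqcup \exists \mathit{partOf}.\mathit{Plant}) \\
\mathit{Adult\_Elephant} &\equiv \mathit{Elephant} \sqcap \exists \mathit{hasAge}.(> 20)
\end{align*}
```

"A kind of" gives only a necessary condition, hence the inclusion; "exactly those" gives necessary and sufficient conditions, hence the equivalences.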


Domain Knowledge
Adult_Elephants weigh at least 2,000 kg
All Elephants are either African_Elephants or Indian_Elephants
No individual can be both a Herbivore and a Carnivore

Horrocks & Rector, 2004
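One possible formalisation of these three constraints is sketched below. The role name hasWeight and the informal datatype restriction are assumptions, and "either ... or" is read here as plain coverage, without also asserting that the two kinds of elephant are disjoint.

```latex
\begin{align*}
\mathit{Adult\_Elephant} &\sqsubseteq \exists \mathit{hasWeight}.(\geq 2000) \\
\mathit{Elephant} &\sqsubseteq \mathit{African\_Elephant} \sqcup \mathit{Indian\_Elephant} \\
\mathit{Herbivore} \sqcap \mathit{Carnivore} &\sqsubseteq \bot
\end{align*}
```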


Tools and Services
We need to provide tools and services to:
  Design and maintain high quality ontologies, e.g.:
    Meaningful: all named classes can have instances
    Correct: captures the intuitions of domain experts
    Minimally redundant: no unintended synonyms
    Richly axiomatised: (sufficiently) detailed descriptions
  Store (large numbers of) instances of ontology classes, e.g.:
    Annotations from web pages
  Answer queries over ontology classes and instances, e.g.:
    Find more general/specific classes
    Retrieve annotations/pages matching a given description
  Integrate and align multiple ontologies

Horrocks & Rector, 2004


OWL as (Description) Logic
XMLS datatypes as well as classes in $\forall P.C$ and $\exists P.C$
  E.g., $\exists \mathit{hasAge}.\mathit{nonNegativeInteger}$
Arbitrarily complex nesting of constructors
  E.g., $\mathit{Person} \sqcap \forall \mathit{hasChild}.(\mathit{Doctor} \sqcup \exists \mathit{hasChild}.\mathit{Doctor})$

Horrocks & Rector, 2004
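To make the nesting of constructors concrete, the following sketch evaluates a concept expression over a small finite interpretation. The tuple encoding, the function name `extension`, and the individuals are assumptions made up for this example; it is not how an OWL reasoner works, only a direct reading of the constructors as set operations.

```python
def extension(concept, domain, concepts, roles):
    """Return the set of individuals that belong to a concept expression.

    concept  -- a name (str) or a tuple: ("and", C, D), ("or", C, D),
                ("not", C), ("some", role, C), ("all", role, C)
    domain   -- set of all individuals
    concepts -- dict: concept name -> set of individuals
    roles    -- dict: role name -> set of (subject, object) pairs
    """
    def ext(c):
        return extension(c, domain, concepts, roles)

    if isinstance(concept, str):
        return set(concepts.get(concept, set()))
    op = concept[0]
    if op == "not":
        return domain - ext(concept[1])
    if op == "and":
        return ext(concept[1]) & ext(concept[2])
    if op == "or":
        return ext(concept[1]) | ext(concept[2])
    if op in ("some", "all"):
        pairs = roles.get(concept[1], set())
        filler = ext(concept[2])
        result = set()
        for x in domain:
            successors = {y for (s, y) in pairs if s == x}
            if op == "some" and successors & filler:
                result.add(x)      # at least one successor is in the filler
            if op == "all" and successors <= filler:
                result.add(x)      # every successor is in the filler (vacuously true if none)
        return result
    raise ValueError(f"unknown constructor: {op!r}")


# Hypothetical interpretation (all names below are made up for illustration).
domain = {"alice", "bob", "carol", "dave", "eve", "frank"}
concepts = {"Person": set(domain), "Doctor": {"bob", "eve"}}
roles = {"hasChild": {("alice", "bob"), ("alice", "carol"),
                      ("carol", "eve"), ("dave", "frank")}}

# Person AND ALL hasChild.(Doctor OR SOME hasChild.Doctor)
query = ("and", "Person",
         ("all", "hasChild",
          ("or", "Doctor", ("some", "hasChild", "Doctor"))))
members = extension(query, domain, concepts, roles)
```

In this interpretation only `dave` falls outside the concept, because his child `frank` is neither a doctor nor the parent of one; individuals without children satisfy the universal restriction vacuously.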


Ontologies as DL Knowledge Bases
An OWL ontology maps to a DL knowledge base $\mathcal{K} = \langle \mathcal{T}, \mathcal{A} \rangle$
$\mathcal{T}$ (Tbox) is a set of axioms of the form:
  $C \sqsubseteq D$, $C \equiv D$ (concept inclusion/equivalence)
  $R \sqsubseteq S$, $R \equiv S$ (role inclusion/equivalence)
  $R^+ \sqsubseteq R$ (role transitivity)
$\mathcal{A}$ (Abox) is a set of axioms of the form:
  $x \in D$ (concept instantiation)
  $\langle x, y \rangle \in R$ (role instantiation)
Two sorts of Tbox axioms are often distinguished:
  “Definitions”
    $C \sqsubseteq D$ or $C \equiv D$ where $C$ is a concept name
  General Concept Inclusion axioms (GCIs)
    $C \sqsubseteq D$ where $C$ is an arbitrary concept

Horrocks & Rector, 2004


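Knowledge Base Semantics
An interpretation $\mathcal{I}$ satisfies (models) an axiom $A$ ($\mathcal{I} \models A$):
  $\mathcal{I} \models C \sqsubseteq D$ iff $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$
  $\mathcal{I} \models C \equiv D$ iff $C^{\mathcal{I}} = D^{\mathcal{I}}$
  $\mathcal{I} \models R \sqsubseteq S$ iff $R^{\mathcal{I}} \subseteq S^{\mathcal{I}}$
  $\mathcal{I} \models R \equiv S$ iff $R^{\mathcal{I}} = S^{\mathcal{I}}$
  $\mathcal{I} \models R^+ \sqsubseteq R$ iff $(R^{\mathcal{I}})^+ \subseteq R^{\mathcal{I}}$
  $\mathcal{I} \models x \in D$ iff $x^{\mathcal{I}} \in D^{\mathcal{I}}$
  $\mathcal{I} \models \langle x, y \rangle \in R$ iff $(x^{\mathcal{I}}, y^{\mathcal{I}}) \in R^{\mathcal{I}}$
$\mathcal{I}$ satisfies a Tbox $\mathcal{T}$ ($\mathcal{I} \models \mathcal{T}$) iff $\mathcal{I}$ satisfies every axiom $A$ in $\mathcal{T}$
$\mathcal{I}$ satisfies an Abox $\mathcal{A}$ ($\mathcal{I} \models \mathcal{A}$) iff $\mathcal{I}$ satisfies every axiom $A$ in $\mathcal{A}$
$\mathcal{I}$ satisfies a KB $\mathcal{K}$ ($\mathcal{I} \models \mathcal{K}$) iff $\mathcal{I}$ satisfies both $\mathcal{T}$ and $\mathcal{A}$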

Horrocks & Rector, 2004
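The satisfaction conditions can be checked mechanically for a finite interpretation. The sketch below is a simplification under stated assumptions: axioms mention only concept and role names (no complex concepts), every individual name denotes itself, and the axiom encoding and the names `satisfies` and `is_model` are invented for this example.

```python
def satisfies(axiom, concepts, roles):
    """Check one axiom against an interpretation given as two dicts.

    concepts -- dict: concept name -> set of individuals
    roles    -- dict: role name -> set of (subject, object) pairs
    """
    kind = axiom[0]
    if kind == "sub":        # C subsumed by D: extension of C is a subset of that of D
        return concepts[axiom[1]] <= concepts[axiom[2]]
    if kind == "equiv":      # C equivalent to D: equal extensions
        return concepts[axiom[1]] == concepts[axiom[2]]
    if kind == "rsub":       # role inclusion
        return roles[axiom[1]] <= roles[axiom[2]]
    if kind == "trans":      # the transitive closure adds nothing new
        r = roles[axiom[1]]
        return all((a, d) in r for (a, b) in r for (c, d) in r if b == c)
    if kind == "inst":       # concept instantiation: x in D
        return axiom[1] in concepts[axiom[2]]
    if kind == "rinst":      # role instantiation: <x, y> in R
        return (axiom[1], axiom[2]) in roles[axiom[3]]
    raise ValueError(f"unknown axiom kind: {kind!r}")


def is_model(tbox, abox, concepts, roles):
    """An interpretation satisfies a KB iff it satisfies every Tbox and Abox axiom."""
    return all(satisfies(a, concepts, roles) for a in list(tbox) + list(abox))


# Hypothetical knowledge base and interpretation.
tbox = [("sub", "Elephant", "Animal"),
        ("rsub", "parentOf", "ancestorOf"),
        ("trans", "ancestorOf")]
abox = [("inst", "dumbo", "Elephant"),
        ("rinst", "jumbo", "dumbo", "parentOf")]
concepts = {"Elephant": {"dumbo", "jumbo"},
            "Animal": {"dumbo", "jumbo", "felix"}}
roles = {"parentOf": {("jumbo", "dumbo")},
         "ancestorOf": {("jumbo", "dumbo")}}

ok = is_model(tbox, abox, concepts, roles)
broken = is_model(tbox + [("sub", "Animal", "Elephant")], abox, concepts, roles)
```

Here `ok` is true, while `broken` is false because `felix` is an Animal that is not an Elephant, so the added inclusion fails.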


Services as Reasoning (1)
Knowledge is meaningful (classes can have instances)
  $C$ is satisfiable w.r.t. $\mathcal{K}$ iff there exists some model $\mathcal{I}$ of $\mathcal{K}$ s.t. $C^{\mathcal{I}} \neq \emptyset$
Knowledge is correct (captures intuitions)
  $C$ is subsumed by $D$ w.r.t. $\mathcal{K}$ iff for every model $\mathcal{I}$ of $\mathcal{K}$, $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$
Knowledge is minimally redundant (no unintended synonyms)
  $C$ is equivalent to $D$ w.r.t. $\mathcal{K}$ iff for every model $\mathcal{I}$ of $\mathcal{K}$, $C^{\mathcal{I}} = D^{\mathcal{I}}$

Horrocks & Rector, 2004


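Services as Reasoning (2)
Querying knowledge
  $x$ is an instance of $C$ w.r.t. $\mathcal{K}$ iff for every model $\mathcal{I}$ of $\mathcal{K}$, $x^{\mathcal{I}} \in C^{\mathcal{I}}$
  $\langle x, y \rangle$ is an instance of $R$ w.r.t. $\mathcal{K}$ iff for every model $\mathcal{I}$ of $\mathcal{K}$, $(x^{\mathcal{I}}, y^{\mathcal{I}}) \in R^{\mathcal{I}}$
All of the above problems are reducible to Knowledge Base consistency
  A KB $\mathcal{K}$ is consistent iff there exists some model $\mathcal{I}$ of $\mathcal{K}$
KB consistency is reducible to concept consistency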

Horrocks & Rector, 2004
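The reductions behind these services can be spelled out as follows. This is a sketch that assumes a logic with full concept negation; $a$ stands for a fresh individual name not occurring in $\mathcal{K}$, and adding an assertion to $\mathcal{K}$ means adding it to the Abox.

```latex
\begin{align*}
\mathcal{K} \models C \sqsubseteq D &\iff C \sqcap \neg D \text{ is unsatisfiable w.r.t. } \mathcal{K} \\
\mathcal{K} \models C \equiv D &\iff \mathcal{K} \models C \sqsubseteq D \text{ and } \mathcal{K} \models D \sqsubseteq C \\
C \text{ is satisfiable w.r.t. } \mathcal{K} &\iff \mathcal{K} \cup \{a \in C\} \text{ is consistent} \\
\mathcal{K} \models x \in C &\iff \mathcal{K} \cup \{x \in \neg C\} \text{ is inconsistent}
\end{align*}
```

A single consistency checker is therefore enough to answer satisfiability, subsumption, equivalence and instance queries.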


Results for Margherita Pizza
What it means
  All Margherita_pizzas (amongst other things)
    Are Pizzas
    have_topping some Tomato_topping
    have_topping some Mozzarella_topping
    & because they are Pizzas
      have_base some Pizza_base
[Figure: screenshot showing the someValuesFrom restrictions; the Properties subpane shows an alternative ‘frame’ view]

Horrocks & Rector, 2004
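Read as description logic axioms, each "some" restriction (someValuesFrom) becomes an existential restriction. The following sketch uses the names from the slide:

```latex
\begin{align*}
\mathit{Margherita\_pizza} &\sqsubseteq \mathit{Pizza} \sqcap \exists \mathit{have\_topping}.\mathit{Tomato\_topping} \sqcap \exists \mathit{have\_topping}.\mathit{Mozzarella\_topping} \\
\mathit{Pizza} &\sqsubseteq \exists \mathit{have\_base}.\mathit{Pizza\_base}
\end{align*}
```

The restriction on have_base is not stated for Margherita_pizza directly; it follows from the first axiom's inclusion in Pizza together with the second axiom.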


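What it Means
[Figure: diagram of classes and their instances. Classes shown: Pizzas, Margherita_pizzas (instances aMP1, aMP2, ..., aMPi), Pizza_base (instances aPB1, aPB2, ..., aPBj), Pizza_toppings, Mozzarella_Toppings (instances aMZ1 to aMZ4) and Tomato_toppings (instances aT1 to aT4, ..., aTk).]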

Horrocks & Rector, 2004


DL Reasoning (1)
Tableau algorithms used to test satisfiability (consistency)
Try to build a tree-like model $\mathcal{I}$ of the input concept $C$
Decompose $C$ syntactically
  Apply tableau expansion rules
  Infer constraints on elements of model
Tableau rules correspond to constructors in logic ($\sqcap$, $\sqcup$ etc.)
  Some rules are nondeterministic (e.g., $\sqcup$, $\leq$)
  In practice, this means search

Horrocks & Rector, 2004


DL Reasoning (2)
Stop when no more rules are applicable or a clash occurs
  Clash is an obvious contradiction, e.g., $A(x)$, $\neg A(x)$
Cycle check (blocking) may be needed for termination
$C$ is satisfiable iff rules can be applied such that a fully expanded clash free tree is constructed

Horrocks & Rector, 2004
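A minimal tableau-style satisfiability test can be sketched for a small fragment of the logic. The assumptions are significant: only the constructors and, or, not, some and all are handled (no number restrictions, no Tbox), so no blocking is needed and the search always terminates. The tuple encoding and the function names are invented for this sketch and are not taken from any particular reasoner.

```python
def nnf(c):
    """Rewrite a concept into negation normal form (negation only on names)."""
    if isinstance(c, str):
        return c
    op = c[0]
    if op == "not":
        d = c[1]
        if isinstance(d, str):
            return c
        if d[0] == "not":
            return nnf(d[1])
        if d[0] == "and":
            return ("or", nnf(("not", d[1])), nnf(("not", d[2])))
        if d[0] == "or":
            return ("and", nnf(("not", d[1])), nnf(("not", d[2])))
        if d[0] == "some":
            return ("all", d[1], nnf(("not", d[2])))
        if d[0] == "all":
            return ("some", d[1], nnf(("not", d[2])))
        raise ValueError(f"unknown constructor: {d[0]!r}")
    if op in ("and", "or"):
        return (op, nnf(c[1]), nnf(c[2]))
    if op in ("some", "all"):
        return (op, c[1], nnf(c[2]))
    raise ValueError(f"unknown constructor: {op!r}")


def is_op(c, op):
    return not isinstance(c, str) and c[0] == op


def expand(label):
    """Try to build a clash-free tree node whose label contains these concepts."""
    label = set(label)
    # Clash: a name and its negation in the same node.
    if any(isinstance(c, str) and ("not", c) in label for c in label):
        return False
    # Conjunction rule (deterministic): add both conjuncts.
    for c in label:
        if is_op(c, "and") and not (c[1] in label and c[2] in label):
            return expand(label | {c[1], c[2]})
    # Disjunction rule (nondeterministic): search over both choices.
    for c in label:
        if is_op(c, "or") and c[1] not in label and c[2] not in label:
            return expand(label | {c[1]}) or expand(label | {c[2]})
    # Existential rule: one successor per "some", which also receives
    # the fillers of every "all" restriction on the same role.
    for c in label:
        if is_op(c, "some"):
            successor = {c[2]} | {d[2] for d in label
                                  if is_op(d, "all") and d[1] == c[1]}
            if not expand(successor):
                return False
    return True


def satisfiable(concept):
    return expand({nnf(concept)})


def subsumed_by(c, d):
    """C is subsumed by D iff (C and not D) is unsatisfiable."""
    return not satisfiable(("and", c, ("not", d)))
```

For example, `satisfiable(("and", ("some", "r", "A"), ("all", "r", ("not", "A"))))` is false: the successor required by the existential restriction must be both an A and a non-A, which is a clash.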


Highly Optimised Implementation
Naive implementation leads to effective non-termination
Modern systems include MANY optimisations
  examples
    classification
    subsumption

Horrocks & Rector, 2004


Optimised classification
compute partial ordering
use enhanced traversal
exploit information from previous tests
use structural information to select classification order

Horrocks & Rector, 2004
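The ideas on this slide can be illustrated with a toy classifier. It is a sketch under strong assumptions: each concept is just a conjunction of atomic features, so the subsumption test is a subset check standing in for a real reasoner call, and equivalent concepts are not merged. The function name `classify` and the example concepts are made up. Concepts are inserted in order of definition size (structural information), and the top-down search skips everything below a node that fails the test (information from previous tests).

```python
def classify(definitions):
    """Build a hierarchy and count subsumption tests.

    definitions -- dict: concept name -> set of atomic features (a conjunction)
    Returns (parents, tests): the direct parents of each concept and the
    number of subsumption tests that were actually performed.
    """
    defs = dict(definitions)
    defs["Top"] = frozenset()
    children = {"Top": set()}
    parents = {}
    tests = 0
    # Structural ordering: fewer conjuncts first, so every subsumer of a
    # concept is already in the hierarchy when the concept is inserted.
    order = sorted(definitions, key=lambda n: (len(definitions[n]), n))
    for name in order:
        cache = {}                       # results of tests for this concept

        def subsumes(node):
            nonlocal tests
            if node not in cache:
                tests += 1
                cache[node] = defs[node] <= defs[name]
            return cache[node]

        found, seen, stack = set(), set(), ["Top"]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            lower = [ch for ch in children[node] if subsumes(ch)]
            if lower:
                stack.extend(lower)      # descend only below subsumers
            else:
                found.add(node)          # most specific subsumer on this path
        parents[name] = found
        children[name] = set()
        for p in found:
            children[p].add(name)
    return parents, tests


# Hypothetical concepts, each a conjunction of made-up features.
definitions = {
    "Animal": {"animal"},
    "Plant": {"plant"},
    "Herbivore": {"animal", "eats_plants"},
    "Carnivore": {"animal", "eats_meat"},
    "Elephant": {"animal", "eats_plants", "trunk"},
    "Oak": {"plant", "tree"},
}
parents, tests = classify(definitions)
```

With six concepts, comparing every ordered pair would take 30 subsumption tests; the traversal above needs fewer because, for instance, Oak is never compared with Elephant once Plant has failed to subsume it.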


Optimised subsumption testing
search for models
normalisation and simplification of concepts
absorption (rewriting) of general axioms
Davis-Putnam style semantic branching search
dependency directed backtracking
caching of satisfiability results and (partial) models
heuristic ordering of propositional and modal expansion
…

Horrocks & Rector, 2004

Reference [Dieng et al. 1999]


Reference [Sommerville 01]


Post-Test


Important Concepts and Terms
agent
automated reasoning
belief network
cognitive science
computer science
hidden Markov model
intelligence
knowledge representation
linguistics
Lisp
logic
machine learning
microworlds
natural language processing
neural network
predicate logic
propositional logic
rational agent
rationality
Turing test


Our goal, by the end of the course…
You should be able to understand the similarities and differences amongst the related methodologies
Understand the logical foundations
Have the vocabulary and basic skills to know when and how to use modern ontology tools
… and when not to!


Summary Chapter-Topic


Resources
Presentation
  Ian Horrocks and Alan Rector: The Semantic Web: Ontologies and OWL. CS64, University of Manchester, Manchester, UK

Presentation James Hendler:
