Post on 21-Dec-2015
CSIT600f: Introduction to Semantic Web
Dickson K.W. Chiu, PhD, SMIEEE
Text: Antoniou & van Harmelen: A Semantic Web Primer
Ref: Ivan Herman: Tutorial on Semantic Web Technology
Dickson Chiu 2006 CSIT600b s1-2
Towards a Semantic Web
• The WWW is an impressive success:
    • amount of available information (> 1 Giga-page)
    • number of human users (> 200 Mega-users)
• The current Web represents information using
    • natural language (English, Hungarian, Chinese, …)
    • graphics, multimedia, page layout
• Humans can process this easily
    • can deduce facts from partial information
    • can create mental associations
    • are used to various sensory information
    • (well, sort of… people with disabilities may have serious problems on the Web with rich media!)
Need for understanding Web info
• Tasks often require combining data on the Web:
    • hotel and travel information may come from different sites
    • searches in different digital libraries
    • etc.
• Again, humans combine this information easily, even if different terminologies are used!
However…
• However: machines are ignorant!
    • partial information is unusable
    • difficult to make sense of, e.g., an image
    • drawing analogies automatically is difficult
    • difficult to combine information
        • is <foo:creator> same as <bar:author>?
        • how to combine different XML hierarchies?
…
Example: Searching
• The best-known example…
    • Google et al. are great, but there are too many false hits
    • adding descriptions to resources should improve this
Where we are Today: the Syntactic Web
[Hendler & Miller 02]
The Syntactic Web is…
• A hypermedia, a digital library
    • A library of documents (called web pages) interconnected by a hypermedia of links
• A database, an application platform
    • A common portal to applications accessible through web pages, and presenting their results as web pages
• A platform for multimedia
    • BBC Radio 4 anywhere in the world!
    • Peer-to-peer sharing (BT, edonkey, PPLive, …)
• A naming scheme
    • Unique identity for those documents
• A place where computers do the presentation (easy) and people do the linking and interpreting (hard).
• Why not get computers to do more of the hard work?
Hard work using the Syntactic Web…
• Finding the image of something
    • Find pictures that contain red birds with blue background
• Complex queries involving background knowledge
    • Find information about “animals that use sonar but are not either bats or dolphins”
• Locating information in data repositories
    • Travel enquiries
    • Prices of goods and services
    • Results of human genome experiments
• Finding and using “web services”
    • Visualise surface interactions between two proteins
• Delegating complex tasks to web “agents”
    • Book me a holiday next weekend somewhere warm, not too far away, and where they speak French or English
What is the Problem?
• Consider a typical web page:
    • Markup consists of:
        • rendering information (e.g., font size and colour)
        • hyperlinks to related content
    • Semantic content is accessible to humans but not (easily) to computers…
What information can we see…
WWW2002
The Eleventh International World Wide Web Conference
Sheraton Waikiki Hotel
Honolulu, Hawaii, USA
7-11 May 2002
1 location 5 days learn interact
Registered participants coming from
Australia, Canada, Chile, Denmark, France, Germany, Ghana, Hong Kong, India, Ireland, Italy, Japan, Malta, New Zealand, the Netherlands, Norway, Singapore, Switzerland, the United Kingdom, the United States, Vietnam, Zaire
Register now
On the 7th May Honolulu will provide the backdrop of the eleventh international World Wide Web conference. This prestigious event …
Speakers confirmed
Tim Berners-Lee: Tim is the well known inventor of the Web, …
Ian Foster: Ian is the pioneer of the Grid, the next generation internet …
Information a machine may see…
…
…
…
Solution: XML markup with “meaningful” tags?
<name> </name>
<location> </location>
…
How about…
<conf> </conf>
<place> </place>
Then how about…
<会议> </会议>
<地点> </地点>
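A minimal sketch of why “meaningful” tag names alone do not help a machine, assuming Python's standard xml.etree.ElementTree parser. The <page> wrapper element, the sample text, and the helper names tags and shape are made up for illustration. To the parser, all three variants have exactly the same structure; the tag names are opaque labels in every case.

```python
import xml.etree.ElementTree as ET

# Three hypothetical documents: same structure, different tag vocabularies.
DOCS = [
    "<page><name>WWW2002</name><location>Honolulu</location></page>",
    "<page><conf>WWW2002</conf><place>Honolulu</place></page>",
    "<page><会议>WWW2002</会议><地点>Honolulu</地点></page>",
]


def tags(xml_text):
    """Return the tag names in document order."""
    return [elem.tag for elem in ET.fromstring(xml_text).iter()]


def shape(xml_text):
    """Return the nesting structure with every tag name erased."""
    def walk(elem):
        return [walk(child) for child in elem]
    return walk(ET.fromstring(xml_text))


for doc in DOCS:
    # The tag names differ, the shape does not: nothing tells the parser
    # that <name>, <conf> and <会议> are meant to denote the same thing.
    print(tags(doc), shape(doc))
```

A human reader recognises <name> or <conf> because of background knowledge; the parser has none, which is what the metadata and ontology layers are meant to supply.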
What Is Needed?
• A resource should provide information about itself
    • also called “metadata”
    • metadata should be in a machine processable format
    • agents should be able to “reason” about (meta)data
    • metadata vocabularies should be defined
What Is Needed (Technically)?
• To make metadata machine processable, we need:
    • unambiguous names for resources (URIs)
    • a common data model for expressing metadata (RDF)
    • and ways to access the metadata on the Web
    • common vocabularies (Ontologies)
• The “Semantic Web” is a metadata based infrastructure for reasoning on the Web
• It extends the current Web (and does not replace it)
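A rough sketch of the first two ingredients, written in plain Python rather than with any RDF library. It assumes the usual reading of RDF as a set of (subject, predicate, object) statements whose names are URIs; every URI under http://example.org/ and the helper name match are hypothetical.

```python
# Hypothetical URIs, used only for illustration.
EX = "http://example.org/"
CONF = EX + "conf/www2002"
HONOLULU = EX + "place/honolulu"
NAME = EX + "terms/name"
LOCATION = EX + "terms/location"

# Each statement is a (subject, predicate, object) triple.
triples = {
    (CONF, NAME, "WWW2002"),
    (CONF, LOCATION, HONOLULU),
    (HONOLULU, NAME, "Honolulu"),
}


def match(store, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Follow two statements: where is the conference, and what is that place called?
place = match(triples, s=CONF, p=LOCATION)[0][2]
place_name = match(triples, s=place, p=NAME)[0][2]
print(place_name)
```

Because the place is named by a URI and not by a string, statements about it from different sources can be combined without guessing whether two labels mean the same thing.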
Adding “Semantics”
• External agreement on meaning of annotations
    • E.g., Dublin Core (http://dublincore.org/)
    • Agree on the meaning of a set of annotation tags
    • Problems with this approach
        • Inflexible
        • Limited number of things can be expressed
• Use Ontologies to specify meaning of annotations
    • Ontologies provide a vocabulary of terms
    • New terms can be formed by combining existing ones
    • Meaning (semantics) of such terms is formally specified
    • Can also specify relationships between terms in multiple ontologies
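One way to picture “relationships between terms in multiple ontologies” is a stated equivalence between properties, which answers the earlier question of whether <foo:creator> is the same as <bar:author>. The sketch below is only an illustration: the foo and bar namespaces, the document URIs, the names, and the EQUIVALENT table are assumptions; only the Dublin Core creator URI is a real term.

```python
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"  # Dublin Core "creator"
FOO_CREATOR = "http://example.org/foo#creator"          # hypothetical vocabulary
BAR_AUTHOR = "http://example.org/bar#author"            # hypothetical vocabulary

# Stated equivalences between terms of different vocabularies
# (assumed for this sketch, not taken from any published mapping).
EQUIVALENT = {FOO_CREATOR: DC_CREATOR, BAR_AUTHOR: DC_CREATOR}


def canonical(term):
    """Map a term to its agreed equivalent, or leave it unchanged."""
    return EQUIVALENT.get(term, term)


def same_property(a, b):
    """True if both terms are declared to mean the same thing."""
    return canonical(a) == canonical(b)


# Annotations from two hypothetical sites, each using its own vocabulary.
annotations = [
    ("http://example.org/doc/1", FOO_CREATOR, "A. Author"),
    ("http://example.org/doc/2", BAR_AUTHOR, "B. Writer"),
]

# After normalisation both sites can be queried with a single property.
merged = [(s, canonical(p), o) for s, p, o in annotations]
print(same_property(FOO_CREATOR, BAR_AUTHOR))
```

A fixed tag set such as Dublin Core gives the agreement but not the flexibility; an ontology language lets such mappings be stated as data and extended later.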
History of the Semantic Web
• The Web was “invented” by Tim Berners-Lee (amongst others), a physicist working at CERN
• TBL’s original vision of the Web was much more ambitious than the reality of the existing (syntactic) Web:
    “... a goal of the Web was that, if the interaction between person and hypertext could be so intuitive that the machine-readable information space gave an accurate representation of the state of people's thoughts, interactions, and work patterns, then machine analysis could become a very powerful management tool, seeing patterns in our work and facilitating our working together through the typical problems which beset the management of large organizations.”
• TBL (and others) have since been working towards realising this vision, which has become known as the Semantic Web
    • E.g., article in May 2001 issue of Scientific American…
Berners-Lee’s Architecture
[Figure: layered Semantic Web architecture, annotated with “Data Exchange”, “Semantics+reasoning”, “Relational Data?” and question marks (“?”, “???”)]
• Relationship between layers is not clear
• OWL DL extends “DL subset” of RDF
A Spectrum of Ontology
[Figure: spectrum of ontology kinds, roughly from least to most formal: Catalog/ID; Terms/glossary; Thesauri, “narrower term” relation; Informal is-a; Formal is-a; Formal instance; Frames (properties); Value Restrs.; Disjointness, Inverse, part-of…; General Logical constraints]
Ontology: Origins and History
Ontology in Philosophy: a philosophical discipline, a branch of philosophy that deals with the nature and the organization of reality
• Science of Being (Aristotle, Metaphysics, IV, 1)
• studies being or existence as well as the basic categories thereof
• trying to find out what entities and what types of entities exist
• has strong implications for the conceptions of reality.
Ontology in Linguistics
[Figure: meaning triangle linking Form (“Tank”), Concept, and Referent. Form activates Concept; Concept relates to Referent; Form stands for Referent (?)]
[Ogden, Richards, 1923]
Ontology in Computer Science
• An ontology is an engineering artifact [Neches91]:
    • defines basic terms and relations comprising the vocabulary of a topic area
    • the rules for combining terms and relations to define extensions to the vocabulary
• “An explicit specification of a conceptualization” [Gruber93]
• Formal specification of a shared conceptualization (of a certain domain) [Borst 97]:
    • Shared understanding of a domain of interest
    • Formal and machine manipulable model of a domain of interest
Structure of an Ontology
Ontologies typically have two distinct components:
1. Names for important concepts in the domain
    • Elephant is a concept whose members are a kind of animal
    • Herbivore is a concept whose members are exactly those animals who eat only plants or parts of plants
    • Adult_Elephant is a concept whose members are exactly those elephants whose age is greater than 20 years
2. Background knowledge/constraints on the domain
    • Adult_Elephants weigh at least 2,000 kg
    • All Elephants are either African_Elephants or Indian_Elephants
    • No individual can be both a Herbivore and a Carnivore
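The two components can be sketched in a few lines of Python: concepts as membership tests, background knowledge as constraints checked against each individual. This is a closed-world simplification and not how an ontology reasoner works; the Individual record, the sample animals, and the choice to treat Carnivore as a directly asserted concept (the text does not define it) are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    name: str
    asserted: set                            # concepts stated directly
    age_years: float = 0.0
    weight_kg: float = 0.0
    diet: set = field(default_factory=set)   # what the animal eats


# 1. Concepts: a name plus a membership condition
def is_elephant(x):
    return "Elephant" in x.asserted


def is_herbivore(x):
    # members are exactly those animals that eat only plants
    return bool(x.diet) and x.diet <= {"plants"}


def is_carnivore(x):
    # assumption: an asserted concept, since the text gives no definition
    return "Carnivore" in x.asserted


def is_adult_elephant(x):
    return is_elephant(x) and x.age_years > 20


# 2. Background knowledge: constraints every individual must satisfy
CONSTRAINTS = {
    "Adult_Elephants weigh at least 2,000 kg":
        lambda x: not is_adult_elephant(x) or x.weight_kg >= 2000,
    "Elephants are African_Elephants or Indian_Elephants":
        lambda x: not is_elephant(x)
        or bool(x.asserted & {"African_Elephant", "Indian_Elephant"}),
    "Nothing is both a Herbivore and a Carnivore":
        lambda x: not (is_herbivore(x) and is_carnivore(x)),
}


def violations(x):
    """Return the descriptions of all constraints that x breaks."""
    return [text for text, holds in CONSTRAINTS.items() if not holds(x)]


clyde = Individual("Clyde", {"Elephant", "African_Elephant"}, 25, 4500, {"plants"})
dumbo = Individual("Dumbo", {"Elephant"}, 30, 900, {"plants"})
print(violations(clyde))
print(violations(dumbo))
```

The hypothetical individual dumbo breaks two constraints (too light for an adult elephant, and neither African nor Indian), which is the kind of inconsistency a formally specified ontology makes detectable.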
Ontology Elements
• Concepts (classes) + their hierarchy
• Concept properties (slots / attributes)
• Property restrictions (type, cardinality, domain, etc.)
• Relations between concepts (disjoint, equality, etc.)
• Instances
• E-R diagram / UML diagram ???
• Note: “Property”, “Slot”, “Relation”, “Relation type”, “Attribute”, “Semantic link type”
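The elements listed above map fairly directly onto small data structures. The following is a sketch under assumptions, not a standard ontology API: the class names Concept, Property, and Instance, their fields, and the sample data are invented for illustration, and only type, cardinality, and domain restrictions are covered.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    parents: list = field(default_factory=list)   # direct super-concepts


@dataclass
class Property:
    name: str
    domain: Concept        # concept the property applies to
    range_type: type       # type restriction on values
    max_cardinality: int   # maximum number of values per instance


@dataclass
class Instance:
    name: str
    concept: Concept
    values: dict = field(default_factory=dict)    # property name -> list of values


def is_a(concept, ancestor):
    """True if concept is ancestor or inherits from it (the hierarchy)."""
    return concept is ancestor or any(is_a(p, ancestor) for p in concept.parents)


def check(instance, prop):
    """Check one instance against one property's restrictions."""
    vals = instance.values.get(prop.name, [])
    return (
        (not vals or is_a(instance.concept, prop.domain))
        and len(vals) <= prop.max_cardinality
        and all(isinstance(v, prop.range_type) for v in vals)
    )


animal = Concept("Animal")
elephant = Concept("Elephant", [animal])
age = Property("age", domain=animal, range_type=int, max_cardinality=1)
clyde = Instance("Clyde", elephant, {"age": [25]})
print(is_a(elephant, animal), check(clyde, age))
```

The resemblance to an E-R or UML class model is visible here; the difference lies in the formally specified meaning and the reasoning that an ontology language adds on top.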
A Semantic Web — First Steps
• Extend existing rendering markup with semantic markup
    • Metadata annotations that describe content/function of web accessible resources
• Use Ontologies to provide vocabulary for annotations
    • “Formal specification” is accessible to machines
• A prerequisite is a standard web ontology language
    • Need to agree on a common syntax before we can share semantics
    • Syntactic web based on standards such as HTTP and HTML
• Make web resources more accessible to automated processes
More Examples: Automatic Assistant
• Your own personal (digital) automatic assistant
    • knows about your preferences
    • builds up a knowledge base using your past
    • can combine the local knowledge with remote services:
        • hotel reservations, airline preferences
        • dietary requirements
        • medical conditions
        • calendaring
        • etc.
• It communicates with remote information (i.e., on the Web!)
Example: Database Integration
• Databases are very different in structure, in content
• Lots of applications require managing several databases
    • after company mergers
    • combination of administrative data for e-Government
    • biochemical, genetic, pharmaceutical research
    • etc.
• Most of these data are now on the Web
• The semantics of the data(bases) should be known
    • how this semantics is mapped on internal structures is immaterial
Example: Digital Libraries
• It is a bit like the search example
• It means catalogs on the Web
    • librarians have known how to do that for centuries
    • goal is to have this on the Web, World-wide
    • extend it to multimedia data, too
• But it is more: software agents should also be librarians!
    • help you in finding the right publications
Example: Semantics of Web Services
• Web services technology is great
• But if services are ubiquitous, a search problem comes up, for example:
    • “find me the most elegant Schrödinger equation solver”
    • what does it mean to be
        • “elegant”?
        • “most elegant”?
    • mathematicians ask these questions all the time…
• It is necessary to characterize the service
    • not only in terms of input and output parameters…
    • …but also in terms of its semantics
How Simple Ontologies Help
• not as costly to build and, potentially more importantly, many are available
• provide a controlled vocabulary
• website organization and navigation support
• support expectation setting (e.g. user interface)
• “umbrella” structures from which to extend content (e.g., UNSPSC)
• searching support
• sense disambiguation support (e.g., terms belong to different categories)
Deborah McGuinness. Ontologies Come of Age. The Semantic Web: Why, What and How, MIT Press, 2001.
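As a small illustration of “searching support” from a simple ontology, the sketch below expands a query with narrower terms from a thesaurus before matching documents. The thesaurus entries, the three one-line documents, and the function names expand and search are all hypothetical.

```python
# Hypothetical thesaurus: each term maps to its narrower terms.
NARROWER = {
    "animal": ["bird", "mammal"],
    "mammal": ["bat", "dolphin", "elephant"],
    "bird": ["parrot"],
}

# Hypothetical document collection.
DOCS = {
    "d1": "sonar use in the dolphin",
    "d2": "parrot colour vision",
    "d3": "tank maintenance",
}


def expand(term):
    """Return the term plus all transitively narrower terms."""
    found = [term]
    for child in NARROWER.get(term, []):
        found.extend(expand(child))
    return found


def search(term):
    """Match documents that mention the term or any narrower term."""
    words = set(expand(term))
    return sorted(d for d, text in DOCS.items() if words & set(text.split()))


print(search("mammal"))
print(search("animal"))
```

A plain keyword search for “mammal” would miss the dolphin document entirely; even a controlled vocabulary with a “narrower term” relation is enough to recover it.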
How Structured Ontologies Help
• more structure => more power
• consistency checking
• completion (of unspecified attributes and relations)
• interoperability support
• validation and verification testing or even encode entire test suites
• structured, comparative, and customized search
• “intelligence” in application, e.g., system configuration support
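Two of these benefits, completion and consistency checking, can be sketched with a toy hierarchy. Everything here is an assumption made for illustration: the SUBCLASS_OF, DISJOINT, and INVERSE tables, the single-parent hierarchy, and the function names. A real reasoner handles far richer axioms.

```python
# Hypothetical mini-ontology.
SUBCLASS_OF = {
    "African_Elephant": "Elephant",
    "Elephant": "Herbivore",
    "Lion": "Carnivore",
}
DISJOINT = [("Herbivore", "Carnivore")]
INVERSE = {"partOf": "hasPart", "hasPart": "partOf"}


def all_types(direct_types):
    """Completion: add every super-concept implied by the hierarchy."""
    result = set()
    for t in direct_types:
        while t is not None and t not in result:
            result.add(t)
            t = SUBCLASS_OF.get(t)
    return result


def complete_relations(facts):
    """Completion: add the inverse of every stated relation."""
    inferred = {(o, INVERSE[p], s) for s, p, o in facts if p in INVERSE}
    return set(facts) | inferred


def is_consistent(direct_types):
    """Consistency: no individual may belong to two disjoint concepts."""
    types = all_types(direct_types)
    return not any(a in types and b in types for a, b in DISJOINT)


print(sorted(all_types({"African_Elephant"})))
print(is_consistent({"African_Elephant", "Lion"}))
print(sorted(complete_relations({("trunk", "partOf", "elephant")})))
```

Neither the type Herbivore nor the hasPart statement was written down explicitly; both follow from the structure, and the same structure exposes the contradiction in calling one individual both an African_Elephant and a Lion.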
Benefits of Semantic Web
• Communication between people
• Interoperability between software agents
• Reuse of domain knowledge
• Make domain knowledge explicit
• Analyze domain knowledge
The Semantic Web is Not
• “Artificial Intelligence on the Web”
    • although it uses elements of logic…
    • … it is much more down-to-Earth (we will see later)
    • it is all about properly representing and characterizing metadata
    • of course: AI systems may use the metadata of the SW
        • but it is a layer way above it
• “A purely academic research topic”
    • SW is out of the university labs now
    • lots of applications exist already (see examples later)
    • big players of the industry use it (Sun, Adobe, HP, IBM, …)
    • of course, much is still to be done!
• Building an ontology is not a goal in itself