1236369. the need “most of the web's content today is designed for humans to read, not for...
The Need
“Most of the Web's content today is designed for humans to read, not for computer programs to manipulate meaningfully.”
Berners-Lee, T, Hendler, J & Lassila, O ‘The semantic web’, Scientific American, May 2001
Semantic Processing
• Our goal: to be able to pose complex search tasks that use the semantics of pieces of information, e.g.,
− I want to purchase a DVD of “Dora the Explorer” at a price lower than $10. Is such a DVD available at amazon.com?
Current search agents are not suitable for such a task
To get a feeling for the problem, try to read and understand a page written in a foreign language
Current Solution
• Use “intelligent” agents
The Semantic-Web Approach
• Content is machine-understandable by being bound to some formal description of itself (i.e., metadata)
So what is the Semantic Web?
• Humans can easily “connect the dots” when browsing the Web…
− you disregard advertisements
− you “know” (from the context) that one phone number on my homepage is my office phone and the other one is my fax number
− etc.
• … but machines can’t!
• The goal is to have a Web of Data that machines can exploit
Some Needs
• Better search
• Semantics in Web Services
• Data integration
Need to Improve Web Search
Need for Semantics in Web Services
• If services are ubiquitous, the issue of searching comes up, for example: “find me the most elegant Schrödinger equation solver”
− but what does it mean to be “elegant”? “most elegant”?
− mathematicians ask these questions all the time…
• It is necessary to characterize the service not only in terms of input and output parameters…
• …but also in terms of its semantics
Need for Data Integration
• Databases are very different in structure and in content
• Lots of applications require managing several databases
− after company mergers
− combination of administrative data for e-Government
− biochemical, genetic, pharmaceutical research
− etc.
• Most of these data are accessible on the Web (though not necessarily public yet)
Example: Data Integration in Life Sciences
And the Problem is Real
Goals
• Web of data: provides a common data representation framework to facilitate integrating multiple sources to draw new conclusions
• Increase the utility of information by connecting it to its definitions and to its context
• More efficient information access and analysis
What Is Needed (Technically)?
• To make data machine-processable, we need:
− unambiguous names for resources that may also bind data to real-world objects: URI-s
− a common data model to access, connect, and describe the resources: RDF
− access to that data: SPARQL
− common vocabularies: RDFS, OWL, SKOS
− reasoning logics: OWL, Rules
• The “Semantic Web” is an infrastructure for interchanging and integrating data on the Web
• It extends the current Web (and does not replace it)
Required Applications
• Agents that search the Web and retrieve valuable information for the end user
• Web services that publish their information
• Programs that try to integrate data from different web services and to produce new results or draw new conclusions from the integrated data
Ontologies & Inference Engines
“For the semantic web to function, computers must have access to structured collections of information and sets of inference rules that they can use to conduct automated reasoning.”
Berners-Lee, T, Hendler, J & Lassila, O ‘The semantic web’, Scientific American, May 2001
The Four Building Blocks
1. XML
2. RDF
3. Ontologies
4. Agents
XML
“XML allows users to add arbitrary structure to their documents but says nothing about what the structures mean”
RDF – Resource Description Framework
• Meaning encoded in sets of ‘triples’: entities have properties which have values
• Entities, properties and values all have distinct URIs
“imagine that we have access to a variety of databases with information about people, including their addresses. If we want to find people living in a specific zip code, we need to know which fields in each database represent names and which represent zip codes. RDF can specify that "(field 5 in database A) (is a field of type) (zip code)," using URIs rather than phrases for each term.”
Berners-Lee, T, Hendler, J & Lassila, O ‘The semantic web’, Scientific American, May 2001
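The zip-code statement from the quote can be sketched as plain (subject, predicate, object) triples. This is only an illustration: every URI below (the `example.org` databases, fields and vocabulary terms) is a made-up assumption, not something defined by RDF or by the article.

```python
# Hypothetical URIs, chosen only for illustration.
IS_FIELD_OF_TYPE = "http://example.org/vocab#isFieldOfType"
ZIP_CODE = "http://example.org/vocab#ZipCode"
NAME = "http://example.org/vocab#Name"

# Each statement is a (subject, predicate, object) triple.
triples = {
    ("http://example.org/dbA#field5", IS_FIELD_OF_TYPE, ZIP_CODE),
    ("http://example.org/dbA#field1", IS_FIELD_OF_TYPE, NAME),
    ("http://example.org/dbB#postcode", IS_FIELD_OF_TYPE, ZIP_CODE),
}

def fields_of_type(graph, field_type):
    """Return the fields that are declared to hold values of the given type."""
    return sorted(s for (s, p, o) in graph
                  if p == IS_FIELD_OF_TYPE and o == field_type)

zip_fields = fields_of_type(triples, ZIP_CODE)
print(zip_fields)
```

Because both databases point at the same `ZIP_CODE` URI, a program can find the zip-code field of each one without knowing their local field names in advance.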
Ontologies
• Database A and Database B may use different fields to contain ‘zip code’
• Ontologies sort this out
• Ontology = ‘a document or file that formally defines the relations among terms’
• Ontologies for the web normally have
− A taxonomy
− A set of inference rules
Agents
“Agent based computing appears to be the appropriate paradigm to work in a complex world with multiple ontologies, fragments and multiple inferencing engines.”
Stork, Hans-Georg and Mastroddi, Franco, Semantic Web Technologies - a New Action Line in the European Commission’s IST Programme, 2001
The Power of Agents - Integration
“The real power of the Semantic Web will be realized when people create many programs that collect Web content from diverse sources, process the information and exchange the results with other programs. The effectiveness of such software agents will increase exponentially as more machine-readable Web content and automated services (including other agents) become available.”
Berners-Lee, T, Hendler, J & Lassila, O ‘The semantic web’, Scientific American, May 2001
‘Ambient Intelligence’
“In the next step, the Semantic Web will break out of the virtual realm and extend into our physical world. URIs can point to anything, including physical entities, which means we can use the RDF language to describe devices such as cell phones and TVs.”
Berners-Lee, T, Hendler, J & Lassila, O ‘The semantic web’, Scientific American, May 2001
Resource Description Framework
What is RDF?
• A part of the semantic-web activity
• RDF is a general-purpose language for representing information on the web
− Specifically, objects and relationships
• Designed to allow computer applications to process data based on its semantics
− Rather than displaying data to humans (as opposed to RSS)
• An RDF document is actually a labeled graph that is represented in XML
− The specific language is called RDF/XML
− W3C recommendation (Feb. 2004)
Resource Description Framework (RDF)
• The data model of the Semantic Web
• A schema-less data model that features unambiguous identifiers and named relations between pairs of resources
• A labeled, directed graph of relations between resources and literal values
RDF Data Consists of Triplets
• RDF data is a set of statements
• Each statement is a triplet (Resource, Property, Value)
• Sometimes we refer to a triplet using the terminology of (Subject, Predicate, Object)
The author of http://www.cs.technion.ac.il/kanza/myPage.html is Yaron Kanza
• Resource (Subject): myPage.html
• Property (Predicate): author
• Value (Object): Yaron Kanza
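RDF Data
• The basic element: Triple (labeled edge), a statement of the form Subject, Predicate, Object
• RDF document: edge-labeled graph
[Figure: edge-labeled graph. Person#845 has an address edge to node #1002; node #1002 has a postalCode edge to 6941, a city edge to Haifa, and a street edge to Herzel.]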
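A minimal sketch of the edge-labeled graph in the figure, stored as triples and indexed so that edges can be followed by label. The pairing of edge labels with values (postalCode with 6941, city with Haifa, street with Herzel) is read off the flattened figure and is an assumption; `follow` is a hypothetical helper name.

```python
from collections import defaultdict

# Triples read off the figure; node and edge names follow the slide.
triples = [
    ("Person#845", "address", "#1002"),
    ("#1002", "postalCode", "6941"),
    ("#1002", "city", "Haifa"),
    ("#1002", "street", "Herzel"),
]

# Index: subject -> predicate -> list of objects
graph = defaultdict(lambda: defaultdict(list))
for s, p, o in triples:
    graph[s][p].append(o)

def follow(graph, start, path):
    """Follow a sequence of edge labels from start; return the nodes reached."""
    nodes = [start]
    for label in path:
        nodes = [o for n in nodes for o in graph[n][label]]
    return nodes

city = follow(graph, "Person#845", ["address", "city"])
print(city)
```

The intermediate node #1002 carries no data of its own; it only groups the parts of the address, which is why the lookup needs two steps.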
URI-s Play a Fundamental Role
• Anybody can create (meta)data on any resource on the Web
− e.g., the same Person could be annotated through other terms
− semantics is added to existing Web resources via URI-s
• Data exist on the Web because they are accessible through standard Web means
• URI-s ground RDF into the Web
− information can be retrieved using existing tools
− this makes the “Semantic Web”, well… “Semantic Web”
Example: RDF Triples
<rdf:Description rdf:about="http://.../membership.svg#FullSlide">
  <axsvg:graphicsType>Chart</axsvg:graphicsType>
  <axsvg:labelledBy rdf:resource="http://...#BottomLegend"/>
  <axsvg:chartType>Line</axsvg:chartType>
</rdf:Description>
URI-s: Merging (data integration)
• It becomes easy to merge data
− Merge can be done because statements refer to the same URI-s
− nodes with identical URI-s are considered identical
• Merging is a very powerful feature of RDF
− metadata may be defined by several (independent) parties…
− …and combined by an application
− one of the areas where RDF is much handier than pure XML
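A minimal sketch of why merging is easy when graphs are sets of triples that share URIs. All URIs and values below are hypothetical; a real RDF merge must additionally keep blank nodes from different sources apart, which this sketch ignores.

```python
# Two graphs published independently; all URIs are hypothetical.
PERSON = "http://example.org/people#845"
NAME = "http://example.org/vocab#name"
CITY = "http://example.org/vocab#city"

graph_a = {
    (PERSON, NAME, "John Smith"),
}
graph_b = {
    (PERSON, CITY, "Haifa"),
    (PERSON, NAME, "John Smith"),  # same statement as in graph_a
}

# Merging is set union: identical URIs denote the same node,
# and duplicate statements collapse into one.
merged = graph_a | graph_b

def describe(graph, subject):
    """Collect the predicate/object pairs known about a subject."""
    return {p: o for (s, p, o) in graph if s == subject}

about_person = describe(merged, PERSON)
print(about_person)
```

No schema alignment step is needed here: the application learns both the name and the city of the same person simply because both sources used the same URI for that person.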
Data integration example
[Figure: page.html has a DC:Creator edge to “John Smith” and a DC:Title edge to “John’s Home Page”.]
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/TR/WD-rdf-syntax#"
    xmlns:dc="http://purl.org/metadata/dublin_core#">
  <rdf:Description about="page.html">
    <dc:Creator>John Smith</dc:Creator>
    <dc:Title>John’s Home Page</dc:Title>
  </rdf:Description>
</rdf:RDF>
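To see how the XML above corresponds to the two edges of the graph, here is a rough sketch that reads it with Python's standard `xml.etree.ElementTree` and lists the triples. It uses straight quotes so that the document parses, and it handles only this flat shape (one Description with literal-valued children), not RDF/XML in general; `description_triples` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/TR/WD-rdf-syntax#"  # namespace used on the slide

DOC = """<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/TR/WD-rdf-syntax#"
    xmlns:dc="http://purl.org/metadata/dublin_core#">
  <rdf:Description about="page.html">
    <dc:Creator>John Smith</dc:Creator>
    <dc:Title>John's Home Page</dc:Title>
  </rdf:Description>
</rdf:RDF>"""

def description_triples(xml_text):
    """Turn each child of a Description into a (subject, predicate, object) triple."""
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.iter("{%s}Description" % RDF_NS):
        subject = desc.get("about")
        for prop in desc:
            # ElementTree spells qualified names as "{namespace}local"
            namespace, local = prop.tag[1:].split("}")
            triples.append((subject, namespace + local, prop.text))
    return triples

triples = description_triples(DOC)
for t in triples:
    print(t)
```

Each child element becomes one labeled edge leaving page.html, with the element name expanded to a full URI as the edge label.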
[Figure: page.html has a dc:Title edge to “John’s Home Page” and a dc:Creator edge to an unnamed node; that node has a Name edge to “John Smith” and an Email edge to [email protected].]
. . .
<Description about="page.html">
  <dc:Creator>
    <Description>
      <corp:Name>John Smith</corp:Name>
      <corp:Email>[email protected]</corp:Email>
    </Description>
  </dc:Creator>
  <dc:Title>John’s Home Page</dc:Title>
</Description>
</RDF>
Containers
• Groups of things: <bag> <seq> <alt>
• <bag>: unordered list; duplicates allowed
• <seq>: ordered list; duplicates allowed
• <alt>: list of alternatives; one will be selected
DC – Dublin Core
• The mission: To make it easier to find resources using the Internet through the following activities
− Developing metadata standards for discovery across domains
− Defining frameworks for the interoperation of metadata sets
− Facilitating the development of community- or discipline-specific metadata sets that are consistent with the above items
DC - Core Metadata Element Set
• Title (dc:title)
• Subject (dc:subject)
• Description (dc:description)
• Creator
• Publisher (dc:publisher)
• Contributor
• Date
• Type
• Format
• Identifier (dc:identifier)
• Source
• Language
• Relation (dcterms:isPartOf)
• Coverage
• Rights
Dublin Core
• A set of fifteen basic properties for describing generalized Web resources
− “Title”: the name given to the resource
− “Creator”: the person or organization primarily responsible for the resource
− “Subject”: what the resource is about
− “Description”: a description of the content
− “Publisher”: the person or organization responsible for making the resource available
− “Contributor”: someone who has provided content to the resource other than the creator
− “Date”: date of creation or publication
Dublin Core (continued)
− “Type”: type of resource, such as home page, technical report, novel, photograph…
− “Format”: data format of the resource
− “Identifier”: URL, ISBN number, …
− “Source”: another resource that this resource is derived from
− “Language”: the language of the content
− “Relation”: another resource and its relationship to this one
− “Coverage”: the portion of time or space described by this resource (atlases, histories, etc.)
− “Rights”: the intellectual property rights adhering to this resource, or a pointer to them
Creating RDF documents
• Manually from HTML or “user domain XML”
• With special assisting tools, like Protégé, Reggie, DC-dot, RDF for XML
• Ideally, with some automated procedure from HTML/XML documents
− Can we use XSLT for that?
RDF Schema
• RDF Schema (RDFS) enriches the data model of RDF, adding vocabulary and associated semantics for
− Classes and subclasses
− Properties and sub-properties
− Typing of properties
• Support for describing simple ontologies
• Adds an object-oriented flavor
• But with a logic-oriented approach and using “open world” semantics
RDF Schema
• Not an XML Schema!
• A “companion” specification for the RDF spec
• Class, Type, subClassOf
• Properties: domain, range
• Misc: label, comment, isDefinedBy, etc.
Classes, Resources, …
• Basically, traditional ontologies:
− use the term “mammal”
− “every dolphin is a mammal”
− “Flipper is a dolphin”
− etc.
• RDFS defines resources and classes:
− everything in RDF is a “resource”
− “classes” are also resources, but…
− they are also a collection of possible resources (i.e., “individuals”)
− “mammal”, “dolphin”, …
• Relationships are defined among classes/resources:
− “typing”: an individual belongs to a specific class (“Flipper is a dolphin”)
− “subclassing”: an instance of one is also an instance of the other (“every dolphin is a mammal”)
• RDFS formalizes these notions in RDF
Classes, Resources in RDF(S)
RDFS defines rdfs:Resource, rdfs:Class as nodes; rdf:type, rdfs:subClassOf as properties
Typed Nodes
• A resource may belong to several classes
− rdf:type is just a property…
− “Flipper is a mammal, but Flipper is also a TV star…”
• i.e., it is not like a data type in this sense!
• The type information may be very important for applications
− e.g., it may be used for a categorization of possible nodes
Inferred Properties
• (#Flipper rdf:type #Mammal) is not in the original RDF data…
• …but can be inferred from the RDFS rules
• Better RDF environments will return that triplet, too
Inference: Formal
• The RDF Semantics document has a list of (44) entailment rules:
− “if such and such triplets are in the graph, add this and this triplet”
− do that recursively until the graph does not change
− this can be done in polynomial time for a specific graph
• Whether those extra triplets are physically added to the graph, or deduced when needed, is an implementation issue
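The "apply rules until the graph does not change" idea can be sketched as a fixed-point loop. The sketch below implements only two rules, modeled on the subclass rules of the RDF Semantics document (type propagation along rdfs:subClassOf, and transitivity of rdfs:subClassOf), not the full rule list. The prefixed names are plain strings, `#Animal` is an extra class added for illustration, and this version materializes the inferred triples rather than deducing them on demand.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

def rdfs_closure(graph):
    """Apply two entailment rules until no new triple appears (a fixed point)."""
    closed = set(graph)
    while True:
        new = set()
        for (s, p, o) in closed:
            for (s2, p2, o2) in closed:
                if o != s2 or p2 != SUBCLASS:
                    continue
                if p == RDF_TYPE:
                    # x type C, C subClassOf D  =>  x type D
                    new.add((s, RDF_TYPE, o2))
                elif p == SUBCLASS:
                    # C subClassOf D, D subClassOf E  =>  C subClassOf E
                    new.add((s, SUBCLASS, o2))
        if new <= closed:
            return closed  # nothing new was derived: the graph is stable
        closed |= new

data = {
    ("#Flipper", RDF_TYPE, "#Dolphin"),
    ("#Dolphin", SUBCLASS, "#Mammal"),
    ("#Mammal", SUBCLASS, "#Animal"),  # hypothetical extra class
}
inferred = rdfs_closure(data) - data
for triple in sorted(inferred):
    print(triple)
```

The triple (#Flipper rdf:type #Mammal) is not in `data` but appears in the closure, which is the behavior described for "better RDF environments".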
Properties (Predicates)
• Property is a special class (rdf:Property)
• Properties are constrained by their range and domain
• Properties are also resources… (have URI’s)
• For example, (P rdfs:range C) means:
1. P is a property
2. C is a class instance
3. when using P, the “object” must be an individual in C
* this is an RDF statement with subject P, object C, and property rdfs:range
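One way to read point 3 under RDFS's open-world semantics is as an inference rather than a validation check: if (P rdfs:range C) holds and some statement uses P, the object of that statement is concluded to be of type C. A minimal sketch of that reading follows; the property `#hasTrainer` and the resources `#Sandy` and `#Person` are hypothetical.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"

def apply_range(graph):
    """For every (P rdfs:range C) and (s P o), conclude (o rdf:type C)."""
    ranges = [(p, c) for (p, pred, c) in graph if pred == RDFS_RANGE]
    inferred = set()
    for (prop, cls) in ranges:
        for (s, p, o) in graph:
            if p == prop:
                inferred.add((o, RDF_TYPE, cls))
    return inferred

# Hypothetical vocabulary, for illustration only.
data = {
    ("#hasTrainer", RDFS_RANGE, "#Person"),
    ("#Flipper", "#hasTrainer", "#Sandy"),
}
types = apply_range(data)
print(types)
```

Nothing in `data` says that #Sandy is a person; the range declaration alone is enough to derive it. An application that wants rejection of ill-typed data has to add that check itself.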
Property Specification Example
Literals
• Literals may have a data type (floats, ints, etc.)
− defined in XML Schemas, including full XML fragments
• (Natural) language can also be specified (via xml:lang)
Example
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xml:base="http://www.animals.fake/animals#">
  <rdf:Description rdf:ID="animal">
    <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
  </rdf:Description>
  <rdf:Description rdf:ID="horse">
    <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
    <rdfs:subClassOf rdf:resource="#animal"/>
  </rdf:Description>
</rdf:RDF>
Horse is defined as a subclass of animal
Example
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xml:base="http://www.animals.fake/animals#">
  <rdfs:Class rdf:ID="animal"/>
  <rdfs:Class rdf:ID="horse">
    <rdfs:subClassOf rdf:resource="#animal"/>
  </rdfs:Class>
</rdf:RDF>
Abbreviated version. Works because an RDFS class is an RDF resource.
Use rdfs:Class instead of rdf:Description and drop the rdf:type information
RDF and RDF Schema
<rdf:RDF xmlns:g="http://schema.org/gen" xmlns:u="http://schema.org/univ">
  <u:Chair rdf:ID="john">
    <g:name>John Smith</g:name>
  </u:Chair>
</rdf:RDF>
<rdf:Property rdf:ID="name">
  <rdfs:domain rdf:resource="Person"/>
</rdf:Property>
<rdfs:Class rdf:ID="Chair">
  <rdfs:subClassOf rdf:resource="http://schema.org/gen#Person"/>
</rdfs:Class>
[Figure: the instance data and its schema drawn as one graph. The resource whose g:name is “John Smith” has rdf:type u:Chair; u:Chair is rdfs:subClassOf g:Person; g:name has rdfs:domain g:Person; u:Chair and g:Person have rdf:type rdfs:Class; g:name has rdf:type rdf:Property.]
SPARQL Protocol and RDF Query Language
SPARQL
SPARQL = Query Language + Protocol + XML Results Format
• Access and query RDF graphs
• Product of the RDF Data Access Working Group
SPARQL Query
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?title2
WHERE {
  ?doc dc:title "SPARQL at speed" .
  ?doc dc:creator ?c .
  ?docOther dc:creator ?c .
  ?docOther dc:title ?title2
}
• On an abstracts/papers database: “Find other papers by the authors of a given paper.”
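A query like the one above is a list of triple patterns whose variables must be bound consistently. The sketch below is a naive matcher over a set of triples, written only to illustrate that idea; it is not how a SPARQL engine is implemented, and the documents and authors in `graph` are made up.

```python
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

def is_var(term):
    return term.startswith("?")

def match(pattern, triple, binding):
    """Try to extend a binding so that the pattern equals the triple."""
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return result

def query(graph, patterns):
    """Evaluate triple patterns one by one; return all consistent bindings."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for b in bindings:
            for t in graph:
                b2 = match(pattern, t, b)
                if b2 is not None:
                    extended.append(b2)
        bindings = extended
    return bindings

# Hypothetical data set.
graph = {
    ("doc1", DC_TITLE, "SPARQL at speed"),
    ("doc1", DC_CREATOR, "alice"),
    ("doc2", DC_TITLE, "RDF in practice"),
    ("doc2", DC_CREATOR, "alice"),
    ("doc3", DC_TITLE, "Unrelated"),
    ("doc3", DC_CREATOR, "bob"),
}
patterns = [
    ("?doc", DC_TITLE, "SPARQL at speed"),
    ("?doc", DC_CREATOR, "?c"),
    ("?docOther", DC_CREATOR, "?c"),
    ("?docOther", DC_TITLE, "?title2"),
]
titles = sorted({b["?title2"] for b in query(graph, patterns)})
print(titles)
```

Note that the given paper itself is among the results, because nothing in the patterns requires ?docOther to differ from ?doc.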
SPARQL Query
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX shop: <http://example/shop#>
SELECT ?title ?name ?price
WHERE {
  ?doc dc:title ?title .
  FILTER regex(?title, "SPARQL") .
  ?doc dc:creator ?c .
  ?c foaf:name ?name .
  OPTIONAL { ?doc shop:price ?price }
}
• “Find books with ‘SPARQL’ in the title. Get the authors’ names and the price (if available).”
• Multiple vocabularies
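The effect of FILTER and OPTIONAL in this query can be sketched in a few lines of Python. The sketch assumes the required triple patterns have already been matched; the sample solutions, the price table, and the helper names `filter_regex` and `optional_price` are made up for illustration.

```python
import re

# Hypothetical solutions of the required patterns (?doc, ?title, ?name).
required = [
    {"?doc": "book1", "?title": "Learning SPARQL", "?name": "Alice"},
    {"?doc": "book2", "?title": "SPARQL by example", "?name": "Bob"},
    {"?doc": "book3", "?title": "Cooking for two", "?name": "Carol"},
]

# Hypothetical shop:price data; book2 has no price.
prices = {"book1": 30, "book3": 12}

def filter_regex(solutions, var, pattern):
    """FILTER regex(...): keep solutions whose value contains a match."""
    return [s for s in solutions if re.search(pattern, s[var])]

def optional_price(solutions, prices):
    """OPTIONAL: keep every solution, binding ?price only where one exists."""
    result = []
    for s in solutions:
        extended = dict(s)
        if s["?doc"] in prices:
            extended["?price"] = prices[s["?doc"]]
        result.append(extended)
    return result

rows = optional_price(filter_regex(required, "?title", "SPARQL"), prices)
for row in rows:
    print(row["?title"], row["?name"], row.get("?price", "(unbound)"))
```

The point of the sketch is that OPTIONAL behaves like a left join: the book without a price is still returned, with ?price left unbound, whereas a plain triple pattern for shop:price would have dropped it.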
Inference
• An RDF graph may be backed by inference
 − OWL, RDFS, application, rules
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?type
WHERE {
  ?x rdf:type ?type .
}
:x rdf:type :C .
:C rdfs:subClassOf :D .
--------
| type |
========
| :C   |
| :D   |
--------
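The result table lists :D although the data never states it directly. A minimal Python sketch of the RDFS subclass reasoning that could produce it is shown below; the function name `rdfs_closure` and the fixpoint loop are illustrative assumptions, not the way any particular reasoner works.

```python
# The two triples from the slide.
DATA = {
    (":x", "rdf:type", ":C"),
    (":C", "rdfs:subClassOf", ":D"),
}

def rdfs_closure(triples):
    """Repeat two subclass rules until no new triple is produced."""
    inferred = set(triples)
    while True:
        subclass_pairs = {(s, o) for (s, p, o) in inferred
                          if p == "rdfs:subClassOf"}
        new = set()
        for (c, d) in subclass_pairs:
            for (s, p, o) in inferred:
                # An instance of C is also an instance of every superclass D.
                if p == "rdf:type" and o == c:
                    new.add((s, "rdf:type", d))
                # rdfs:subClassOf is transitive.
                if p == "rdfs:subClassOf" and o == c:
                    new.add((s, "rdfs:subClassOf", d))
        if new <= inferred:
            return inferred
        inferred |= new

types = sorted(o for (s, p, o) in rdfs_closure(DATA) if p == "rdf:type")
print(types)
```

Querying the closed graph for rdf:type yields both :C and :D, matching the table.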
Another Example
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX ldap: <http://ldap.hp.com/people#>
PREFIX foaf:
SELECT ?name ?email
{
  ?doc dc:title ?title .
  FILTER regex(?title, "SPARQL") .
  ?doc dc:creator ?researcher .
  ?researcher ldap:email ?email .
  ?researcher ldap:name ?name
}
• “Find the names and email addresses of the authors of a paper about SPARQL”
Other SPARQL Features
• Limit the number of returned results; remove duplicates, sort them, …
• Return the full subgraph (instead of a list of bound variables)
• Construct a graph combining a separate pattern and the query results
• Use datatypes and/or language tags when matching a pattern
Programming Practice - Jena
• RDF toolkit in Java from HP’s Bristol lab; it has
 − a large number of classes/methods: listing and removing associated properties and objects, comparing full RDF graphs
 − support for managing typed literals and mapping Seq, Alt, etc. to Java constructs
 − an “RDFS Reasoner”
 − a full SPARQL implementation
 − a layer (Joseki) for remote access of triples (essentially, a triple database)
 − and more…
• Probably the most widely used RDF environment in Java today
RDF Schema is Limited
We cannot express facts such as
• Two classes are disjoint
• Build a class that is the union of two classes
• Cardinality restriction
• Scope of properties
• Provide relationships between properties, such as transitive, unique, inverse
Ontologies (OWL)
• RDFS is useful, but does not solve all the issues
• Can a program reason about some terms?
 − E.g.: “if «A» is left of «B» and «B» is left of «C», is «A» left of «C»?”
 − Programs should be able to deduce such statements
• If somebody else defines a set of terms: are they the same?
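The «left of» question is an instance of reasoning over a transitive property, which OWL can declare with owl:TransitiveProperty. A minimal Python sketch of the deduction follows; the function name `transitive_closure` and the representation of the relation as a set of pairs are assumptions made for illustration.

```python
def transitive_closure(pairs):
    """Add (a, d) whenever (a, b) and (b, d) are present, until nothing changes."""
    closure = set(pairs)
    while True:
        new_pairs = {(a, d)
                     for (a, b) in closure
                     for (c, d) in closure
                     if b == c}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs

# Stated facts: A is left of B, B is left of C.
left_of = {("A", "B"), ("B", "C")}
inferred = transitive_closure(left_of)
print(("A", "C") in inferred)
```

The pair ("A", "C") is never stated, yet it appears in the closure, which is exactly the kind of statement a program should be able to deduce once the property is known to be transitive.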
How to speak Ontology?
• We need a Web Ontology Language to define:
 − more on the terminology used in a specific context
 − more constraints on properties, logical characterization of properties, etc.
• W3C’s Web Ontology Language (OWL) - a layer on top of RDFS with additional possibilities
• “OWL is now the most used KR language in the history of AI…” (Jim Hendler)
OWL
• A Web ontology language that is more expressive than RDF and RDF Schema
• Written in XML on top of RDF
• Using OWL we want to provide exact descriptions of items and the relationships between them
Why “OWL” and not “WOL”?
Some urban legends…
• e.g., a reference to Owl from Winnie the Pooh, who misspelled his name as “WOL”
• A reference to an AI project at MIT in the mid-70s by Bill Martin, called “One World Language”…
 − an early attempt at a KR language and associated ontology, intended to be a universal language for encoding meaning for computers
Property Restrictions in OWL
Restriction may be by:
• value constraints (i.e., further restrictions on the range)
 − all values must be from a class
 − at least one value must be from a class
• cardinality constraints (i.e., how many times the property can be used on an instance?)
 − minimum cardinality
 − maximum cardinality
 − exact cardinality
Property Restriction Example
“A dolphin is a mammal living in the sea or in the Amazonas”
OWL (cont.)
• Property Characterization
 − In OWL, one can characterize the behavior of properties (symmetric, transitive, …)
• Term Equivalence/Relations
 − For classes (owl:equivalentClass, owl:disjointWith)
 − For properties (owl:equivalentProperty, owl:inverseOf)
 − For individuals (owl:sameAs, owl:differentFrom)
• Special class owl:Ontology with special properties:
 − owl:imports, owl:versionInfo, owl:priorVersion
 − owl:backwardCompatibleWith, owl:incompatibleWith
 − rdfs:label, rdfs:comment can also be used
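As a rough illustration of what property characterization buys, the Python sketch below derives new triples from a symmetric property and from an owl:inverseOf pair. The `ex:` property names, the sample facts, and the function `apply_property_axioms` are hypothetical; the sketch also assumes each property has at most one declared inverse.

```python
def apply_property_axioms(triples, symmetric, inverse_pairs):
    """Derive triples implied by symmetric and inverse property declarations."""
    # Record the inverse relation in both directions.
    inverse = {}
    for p, q in inverse_pairs:
        inverse[p] = q
        inverse[q] = p

    inferred = set(triples)
    while True:
        new = set()
        for (s, p, o) in inferred:
            if p in symmetric:
                new.add((o, p, s))            # symmetric: swap subject and object
            if p in inverse:
                new.add((o, inverse[p], s))   # inverse: swap and rename property
        if new <= inferred:
            return inferred
        inferred |= new

# Hypothetical facts and axioms.
facts = {
    ("alice", "ex:marriedTo", "bob"),
    ("alice", "ex:hasChild", "carol"),
}
inferred = apply_property_axioms(
    facts,
    symmetric={"ex:marriedTo"},
    inverse_pairs=[("ex:hasChild", "ex:hasParent")],
)
for triple in sorted(inferred):
    print(triple)
```

From two stated facts the sketch arrives at four triples, adding that bob is married to alice and that carol has alice as a parent.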
OWL and Logic
• OWL expresses a small subset of First Order Logic
 − it has a “structure” (class hierarchies, properties, datatypes…), and “axioms” can be stated within that structure only
• Inference based on OWL is within this framework only
However: Ontologies are Hard!
• A full ontology-based application is
 − a complex system
 − hard to implement
 − heavy to run
 − and not all applications may need it
• Three layers of OWL are defined: Lite, DL (Description Logic), Full
 − listed in increasing order of complexity and expressiveness
OWL Example: FOAF – The way to describe yourself
• Stands for "Friend Of A Friend"
• Provides structured links
• Information distributed & extensible
 − Name (foaf:name)
 − E-mail (foaf:mbox)
 − Representing picture (foaf:img)
 − Your publications (foaf:publications)
 − Your online account (foaf:holdsAccount)
Links
http://www.w3.org/RDF/s
SPARQL Links
• Jena: Java and .Net Semantic Web Framework: http://jena.sourceforge.net/
• SPARQL Query: http://jena.sourceforge.net/ARQ
• SPARQL Protocol: http://www.joseki.org
• SquirrelRDF: Access legacy SQL: http://jena.sourceforge.net/SquirrelRDF