RDF, OWL, SPARQL and the Semantic Web

ACCU 2009

Seb Rose

A Quick Recap

• The web has been around for almost 20 years

• In the beginning it was mostly static content

• Over the past 10 years there has been a move to more dynamic content

• Usage is getting faster and better, but is still often frustrating

Where Is The Meaning?

• Web technologies are biased toward presentation rather than semantic content

• Even when the data comes from structured sources we need to use custom presentation technologies to render it

• XML Schemas & XSLT can be used for data integration, but are fragile

Semantic Web

• Introduced by Tim Berners-Lee in 2001

• Supports distributed web at level of data (rather than presentation)

• Permits machine agents to reason about content

• Retains the open nature of the web

AAA

• Anyone can say Anything about Any topic.

• [coined by Allemang/Hendler in their book]

• No waiting for authorities to agree on schema, leading to …

• Network effect of gradual “semantification” of web, requiring …

• Ability to filter facts based on provenance

Where Will It Go?

• Semantic content may live:
  - at dedicated web addresses
  - interwoven in existing web pages

• Presentation may be generated from semantic content

• Existing browsers will either ignore the content or render it literally

RDF

• Resource Description Framework

• Introduced early in W3C process

• Expresses facts (assertions) as triples: Subject Predicate Object

• E.g. Tony WorksFor Oracle

Thinking About RDF

• Tabular Representation:
  Row – Subject
  Column – Predicate
  Cell Value – Object

• Directed Graph:
  Node – Subject
  Directed Arc – Predicate
  Node – Object
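The two views above can be sketched in a few lines of Python. This is only an illustration: triples are plain tuples of strings, `as_table` and `as_graph` are hypothetical helper names, and the `LocatedIn` triple is made up for the example (only `Tony WorksFor Oracle` comes from the slides).

```python
from collections import defaultdict

# Triples as (subject, predicate, object) tuples.
# The first is from the slides; the second is an invented example.
triples = [
    ("Tony", "WorksFor", "Oracle"),
    ("Oracle", "LocatedIn", "California"),
]

def as_table(triples):
    """Tabular view: row = subject, column = predicate, cell = set of objects."""
    table = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        table[s][p].add(o)
    return table

def as_graph(triples):
    """Graph view: each subject node maps to its outgoing (arc, object node) pairs."""
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append((p, o))
    return graph

table = as_table(triples)
graph = as_graph(triples)
print(table["Tony"]["WorksFor"])  # cell for row Tony, column WorksFor
print(graph["Oracle"])            # arcs leaving the Oracle node
```

A cell holds a set rather than a single value because one subject can have several objects for the same predicate.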

Resources

• Subjects, Predicates and Objects all name Resources

• Resources are identified by URIs

• If a URI is dereferenceable it is also a URL, but a URI doesn’t have to be a URL

• Objects can be literal (XML) values

What Does RDF Look Like?

• RDF has several standard serialisations

• Often stored as RDF-XML, but this is not very readable

• Most simply expressed as raw triples, but this is very verbose

• Most often consumed by practitioners as N3

Namespaces

• URIs tend to live in namespaces

• These get given symbolic names to make serialisations more compact

• There are various ‘standard’ namespaces: rdf, rdfs, owl, dc

• You can specify a default namespace for an RDF document

• Abbreviated URIs are called qnames
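As a rough sketch of how a qname relates to a full URI, the snippet below expands a prefix using a lookup table. The `expand` helper and the `PREFIXES` table are illustrative names, not part of any RDF library; the namespace URIs are the published ones for rdf, rdfs, owl and Dublin Core elements.

```python
# Prefix table for the 'standard' namespaces named on the slide.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def expand(qname, prefixes=PREFIXES):
    """Turn a qname such as 'rdf:type' into the full URI it abbreviates."""
    prefix, _, local = qname.partition(":")
    return prefixes[prefix] + local

print(expand("rdf:type"))
print(expand("dc:title"))
```

An unknown prefix raises `KeyError` here; a real parser would report it as a syntax error in the document.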

Ontology

• A semantic model – a schema

• Each namespace refers to an ontology

• Dublin Core (named after Dublin, Ohio, in the US) is one of the most common, with many useful annotations

• There are many domain specific ontologies (and there are search tools to help you find an applicable one)

N3 - Example

@prefix comp: <http://www.claysnow.co.uk/ACCU2009/Example.rdf#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

comp:Accu2009 rdf:type comp:ComputerConference .
comp:Accu2009 comp:startDate "April 22, 2009" .
comp:Accu2009 comp:hasKeynote comp:BobMartin .
comp:Accu2009 comp:hasKeynote comp:AllanKelly .

OR

comp:Accu2009 rdf:type comp:ComputerConference ;
    comp:startDate "April 22, 2009" ;
    comp:hasKeynote comp:BobMartin ,
                    comp:AllanKelly .

N3 – Example (2)

• Abbreviate rdf:type to “a”:

comp:Accu2009 a comp:ComputerConference ;
    comp:startDate "April 22, 2009" ;
    comp:hasKeynote comp:BobMartin ,
                    comp:AllanKelly .

Blank Nodes

• Also known as bnodes

• Useful when making statements about entities that don’t have an identifier

• Can be interpreted as “there exists”

• Shown in square brackets:
comp:BobMartin comp:presentedKeyNoteAt
    [ a comp:Conference ;
      comp:locatedAt comp:BarceloOxford ] .

Lists

• Verbose in RDF, but have a compact representation in N3:
comp:Accu2009 comp:scheduledBreaks
    ( comp:MorningCoffee comp:Lunch comp:AfternoonTea ) .

• Becomes:
comp:Accu2009 comp:scheduledBreaks _:a .
_:a rdf:first comp:MorningCoffee .
_:a rdf:rest _:b .
_:b rdf:first comp:Lunch .
_:b rdf:rest _:c .
_:c rdf:first comp:AfternoonTea .
_:c rdf:rest rdf:nil .
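The expansion from the compact list form into rdf:first / rdf:rest triples is mechanical. The sketch below shows one way to do it; `expand_list` is a hypothetical helper and the `_:b0`, `_:b1` blank node labels are arbitrary choices made for this example.

```python
def expand_list(items):
    """Return (head node, triples) for an RDF list holding the given items."""
    if not items:
        return "rdf:nil", []
    nodes = [f"_:b{i}" for i in range(len(items))]  # one blank node per cell
    triples = []
    for i, item in enumerate(items):
        rest = nodes[i + 1] if i + 1 < len(items) else "rdf:nil"
        triples.append((nodes[i], "rdf:first", item))
        triples.append((nodes[i], "rdf:rest", rest))
    return nodes[0], triples

head, cells = expand_list(
    ["comp:MorningCoffee", "comp:Lunch", "comp:AfternoonTea"]
)
graph = [("comp:Accu2009", "comp:scheduledBreaks", head)] + cells
for triple in graph:
    print(triple)
```

Three list items become seven triples, which is why the raw form is described as verbose.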

Explicit Reification

• Used for making statements about statements: “Giovanni told me ACCU2010 will be in Hawaii”

:r rdf:subject comp:Accu2010 ;
    rdf:predicate comp:locatedAt ;
    rdf:object geo:Hawaii .

comp:Giovanni :hasSaid :r .

• By asserting these reification triples, we have not asserted the triple itself.
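To make the last point concrete, here is a small sketch that builds the reification triples and then checks whether the reified claim itself is in the graph. The `reify` function is a hypothetical helper, and the graph is just a Python set of tuples.

```python
def reify(statement_id, triple):
    """Describe a triple with rdf:subject / rdf:predicate / rdf:object."""
    s, p, o = triple
    return {
        (statement_id, "rdf:subject", s),
        (statement_id, "rdf:predicate", p),
        (statement_id, "rdf:object", o),
    }

claim = ("comp:Accu2010", "comp:locatedAt", "geo:Hawaii")

# The graph holds the description of the claim plus who made it.
graph = reify(":r", claim) | {("comp:Giovanni", ":hasSaid", ":r")}

print(claim in graph)  # False: the claim is described, not asserted
```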

More RDF

• rdf:Property – the class of RDF properties

• rdf:Statement – the class of RDF statements

• rdf:resource – used in RDF-XML

• rdf:about – used in RDF-XML

• However, almost all uses of RDF also use RDFS (even the definition of the RDF vocabulary itself)

SPARQL

• W3C standard query language

• Based on triple patterns

• Variables denoted by prefixing with ‘?’

• Graph patterns are lists of triple patterns enclosed in curly braces {}

• Responses in tabular format: SELECT

• Responses in graph format: CONSTRUCT

SELECT

SELECT ?grandfather ?granddaughter
WHERE
{ ?grandfather :hasChild ?parent .
  ?parent :hasDaughter ?granddaughter . }

UNION

SELECT ?grandfather ?granddaughter
WHERE
{ { ?grandfather :hasDaughter ?mother .
    ?mother :hasDaughter ?granddaughter . }
  UNION
  { ?grandfather :hasSon ?father .
    ?father :hasDaughter ?granddaughter . } }
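To show what matching a graph pattern involves, the sketch below evaluates the two UNION branches over a tiny in-memory triple set. It is a toy matcher, not how a real SPARQL engine works; `match` and `solve` are hypothetical helpers and the family data is invented for the example.

```python
def match(pattern, triple, bindings):
    """Unify one triple pattern with one triple; return new bindings or None."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):                 # variable
            if result.setdefault(term, value) != value:
                return None                      # clashes with earlier binding
        elif term != value:                      # constant must match exactly
            return None
    return result

def solve(patterns, triples, bindings=None):
    """Yield every set of bindings that satisfies all patterns."""
    bindings = bindings or {}
    if not patterns:
        yield bindings
        return
    for triple in triples:
        b = match(patterns[0], triple, bindings)
        if b is not None:
            yield from solve(patterns[1:], triples, b)

# Invented family data.
family = [
    (":Albert", ":hasDaughter", ":Beth"),
    (":Beth", ":hasDaughter", ":Cara"),
    (":Albert", ":hasSon", ":Dan"),
    (":Dan", ":hasDaughter", ":Eve"),
]

via_mother = [("?grandfather", ":hasDaughter", "?mother"),
              ("?mother", ":hasDaughter", "?granddaughter")]
via_father = [("?grandfather", ":hasSon", "?father"),
              ("?father", ":hasDaughter", "?granddaughter")]

# UNION: collect the solutions of both branches, then project two variables.
rows = sorted({(b["?grandfather"], b["?granddaughter"])
               for branch in (via_mother, via_father)
               for b in solve(branch, family)})
print(rows)
```

The set comprehension drops duplicate rows, which is closer to SELECT DISTINCT than to plain SELECT.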

RDFS

• Resource Description Framework Schema

• Expressed in RDF

• Extends RDF by introducing a set of distinguished resources

• RDF creates a graph structure, while RDFS models sets

• RDFS expresses ‘meaning’ through the mechanism of inference

Inference

• Each RDFS construct is defined by the inferences that can be made when it is used

• Inferencing is done by some part of the system and results in inferred triples

• When this happens and what happens to the triples is not specified

• Typically asserted and inferred triples are indistinguishable

• Inferred triples are also used to make further inferences

Inference (2)

• Inferencing is the ‘glue’ that is used to join different schemas

• The RDFS (and OWL) patterns that follow enable us to specify how schemas should be joined (and the data merged)

• The resulting data can then be queried as if it were a single unified triple set

Inferencing Assumptions

• Non Unique Naming – a single, real-world entity may have multiple URIs assigned to it. Inferencing engines cannot assume that different URIs refer to different entities.

• Open World Assumption – inferencing engines cannot assume they have seen all relevant assertions. More can become available at any time.

Type Propagation

• We have already seen:
comp:Accu2009 rdf:type comp:ComputerConference .

• In the comp namespace we would find:
comp:ComputerConference rdf:type rdfs:Class .
comp:Conference rdf:type rdfs:Class .
comp:ComputerConference rdfs:subClassOf comp:Conference .

• Hence, we can infer:
comp:Accu2009 rdf:type comp:Conference .
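Type propagation can be sketched as a rule applied repeatedly until nothing new appears, which also shows inferred triples feeding further inferences. The `infer` and `type_propagation` names are hypothetical, and the `comp:Event` superclass is an assumption added only to show the chaining.

```python
def infer(triples, rule):
    """Apply a rule repeatedly until it produces no new triples."""
    known = set(triples)
    while True:
        new = rule(known) - known
        if not new:
            return known
        known |= new

def type_propagation(known):
    """If C rdfs:subClassOf D and x rdf:type C, then x rdf:type D."""
    subclass = {(c, d) for c, p, d in known if p == "rdfs:subClassOf"}
    return {(x, "rdf:type", d)
            for x, p, c in known if p == "rdf:type"
            for c2, d in subclass if c2 == c}

asserted = {
    ("comp:Accu2009", "rdf:type", "comp:ComputerConference"),
    ("comp:ComputerConference", "rdfs:subClassOf", "comp:Conference"),
    # Assumed extra level, not in the slides:
    ("comp:Conference", "rdfs:subClassOf", "comp:Event"),
}

inferred = infer(asserted, type_propagation) - asserted
for triple in sorted(inferred):
    print(triple)
```

The `comp:Event` typing only appears on the second pass, because it depends on the `comp:Conference` typing inferred in the first.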

Relationship Propagation

• We have already seen:
comp:Accu2009 comp:hasKeynote comp:BobMartin .

• In the comp namespace we would find:
comp:hasKeynote rdf:type rdf:Property .
comp:hasSpeaker rdf:type rdf:Property .
comp:hasKeynote rdfs:subPropertyOf comp:hasSpeaker .

• Hence, we can infer:
comp:Accu2009 comp:hasSpeaker comp:BobMartin .
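The same idea for properties, as a minimal sketch: every triple that uses a sub-property is repeated with the super-property. `relationship_propagation` is a hypothetical name, and the subject used is the `comp:Accu2009` resource from the earlier N3 example.

```python
def relationship_propagation(triples):
    """If P rdfs:subPropertyOf Q and s P o, then s Q o (run to a fixpoint)."""
    known = set(triples)
    while True:
        sub = {(p, q) for p, pred, q in known if pred == "rdfs:subPropertyOf"}
        new = {(s, q, o)
               for s, p, o in known
               for p2, q in sub if p2 == p} - known
        if not new:
            return known
        known |= new

asserted = {
    ("comp:Accu2009", "comp:hasKeynote", "comp:BobMartin"),
    ("comp:hasKeynote", "rdfs:subPropertyOf", "comp:hasSpeaker"),
}

inferred = relationship_propagation(asserted) - asserted
print(inferred)
```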

Typing a Property

• Given a triple, S P O, we can describe the usage of the predicate:
comp:hasSpeaker rdfs:domain comp:Conference .
comp:hasSpeaker rdfs:range comp:Person .

• There are no invalid assertions in RDFS, so:
comp:BobMartin comp:hasSpeaker comp:AccuConference .

• Would allow us to infer:
comp:BobMartin rdf:type comp:Conference .
comp:AccuConference rdf:type comp:Person .

Unexpected Interaction

• Recall:
comp:ComputerConference rdf:type rdfs:Class .
comp:Conference rdf:type rdfs:Class .
comp:ComputerConference rdfs:subClassOf comp:Conference .

• We now add:
comp:geekRatio rdf:type rdf:Property .
comp:geekRatio rdfs:domain comp:ComputerConference .

• Now whenever we see:
A comp:geekRatio B .

We can infer:
A rdf:type comp:Conference .
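The interaction comes from two rules firing one after the other: rdfs:domain types the subject, then rdfs:subClassOf widens that type. A minimal sketch, with `rdfs_rules` and `closure` as hypothetical names and `comp:A` / `comp:B` standing in for the A and B on the slide:

```python
def rdfs_rules(known):
    """One pass of the rdfs:domain rule and the rdfs:subClassOf rule."""
    domain = {(p, c) for p, pred, c in known if pred == "rdfs:domain"}
    subclass = {(c, d) for c, pred, d in known if pred == "rdfs:subClassOf"}
    new = set()
    for s, p, o in known:
        for prop, cls in domain:
            if p == prop:                      # s uses the property ...
                new.add((s, "rdf:type", cls))  # ... so s is in its domain
        if p == "rdf:type":
            for c, d in subclass:
                if o == c:
                    new.add((s, "rdf:type", d))
    return new

def closure(triples):
    """Repeat the rules until no new triples appear."""
    known = set(triples)
    while True:
        new = rdfs_rules(known) - known
        if not new:
            return known
        known |= new

asserted = {
    ("comp:ComputerConference", "rdfs:subClassOf", "comp:Conference"),
    ("comp:geekRatio", "rdfs:domain", "comp:ComputerConference"),
    ("comp:A", "comp:geekRatio", "comp:B"),
}

inferred = closure(asserted) - asserted
for triple in sorted(inferred):
    print(triple)
```

Nothing in the asserted data says what `comp:A` is; both of its types come purely from using the property.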

Unexpected Interaction (2)

• rdfs:domain is not the same as declaring a property on a class in OO modelling

• In RDFS, properties are defined independently of classes

• RDFS relations (domain, range) tell us what inferences can be made from asserted triples

Some Simple Patterns

• RDFS has a limited number of simple inference rules

• These can combine in subtle ways to provide useful inferences

• The following patterns simulate some aspects of modelling set manipulations

Set Intersection

• Given:
comp:Programmer rdf:type rdfs:Class .
comp:ContractStaff rdf:type rdfs:Class .

• We can model this in one direction:
comp:ContractProgrammer rdfs:subClassOf comp:Programmer .
comp:ContractProgrammer rdfs:subClassOf comp:ContractStaff .

• Now by asserting that:
comp:VerityStobb rdf:type comp:ContractProgrammer .

• We can infer that:
comp:VerityStobb rdf:type comp:Programmer .
comp:VerityStobb rdf:type comp:ContractStaff .

Set Union

• Given:
comp:Programmer rdf:type rdfs:Class .
comp:Tester rdf:type rdfs:Class .

• We can model this in one direction:
comp:Programmer rdfs:subClassOf comp:IT_Staff .
comp:Tester rdfs:subClassOf comp:IT_Staff .

• Now by asserting either:
comp:VerityStobb rdf:type comp:Programmer .
comp:VerityStobb rdf:type comp:Tester .

• We can infer that:
comp:VerityStobb rdf:type comp:IT_Staff .

Properties

• Similar constructs can be used to approximate intersection and union of properties

• Property Transfer can be used to join data from different schemas.

Given:
    mySchema:P1 rdfs:subPropertyOf yourSchema:P2 .

Then:
    mySchema:S mySchema:P1 mySchema:O .

Entails:
    mySchema:S yourSchema:P2 mySchema:O .

Non-modeling RDFS

• rdfs:label – provides a human readable label (for external display)

• rdfs:comment – inline human readable documentation

• rdfs:seeAlso – a URI for additional documentary resources

• rdfs:isDefinedBy – a URI for the resource that defines the subject

RDFS-Plus

• RDFS extended with a subset of OWL (specified by Allemang/Hendler)

• Inferencing is expensive

• RDFS is not quite expressive enough

• RDFS-Plus trades expressivity for performance and is useful in many real world situations

Inverse Properties

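• owl:inverseOf

Given:
    A P B .
    P rdf:type rdf:Property .
    Q rdf:type rdf:Property .
    P owl:inverseOf Q .

Infer:
    B Q A .

• But given only A Q B, this rule on its own does not yield B P A; that needs Q owl:inverseOf P as well (full OWL semantics treats owl:inverseOf as symmetric, so there it follows)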
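A sketch of the rule exactly as written, applied in one direction only. `apply_inverse` is a hypothetical helper, and the `:hasParent` / `:hasChild` pairing is an assumption chosen for illustration.

```python
def apply_inverse(triples):
    """One pass of: if P owl:inverseOf Q and a P b, then b Q a."""
    inverse = {(p, q) for p, pred, q in triples if pred == "owl:inverseOf"}
    derived = {(b, q, a)
               for a, p, b in triples
               for p2, q in inverse if p2 == p}
    return set(triples) | derived

declaration = (":hasParent", "owl:inverseOf", ":hasChild")

# Forward direction: a triple using P gives us the Q triple.
forward = apply_inverse({declaration, (":Willem", ":hasParent", ":Beatrix")})
print((":Beatrix", ":hasChild", ":Willem") in forward)

# Reverse direction: with only this one declaration the rule does not fire.
reverse = apply_inverse({declaration, (":Beatrix", ":hasChild", ":Willem")})
print((":Willem", ":hasParent", ":Beatrix") in reverse)

# Stating the inverse both ways round makes the reverse inference available.
both = apply_inverse({declaration,
                      (":hasChild", "owl:inverseOf", ":hasParent"),
                      (":Beatrix", ":hasChild", ":Willem")})
print((":Willem", ":hasParent", ":Beatrix") in both)
```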

Property Properties

• owl:SymmetricProperty

Given:
    A P B .
    P rdf:type owl:SymmetricProperty .

Infer:
    B P A .

• owl:TransitiveProperty

Given:
    A P B .
    B P C .
    P rdf:type owl:TransitiveProperty .

Infer:
    A P C .
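Both rules can be sketched with the same fixpoint loop. `property_closure` is a hypothetical name, and the `:marriedTo` and `:locatedIn` properties and their data are invented for the example.

```python
def property_closure(triples):
    """Apply the symmetric and transitive property rules to a fixpoint."""
    known = set(triples)
    while True:
        symmetric = {s for s, p, o in known
                     if p == "rdf:type" and o == "owl:SymmetricProperty"}
        transitive = {s for s, p, o in known
                      if p == "rdf:type" and o == "owl:TransitiveProperty"}
        new = set()
        for a, p, b in known:
            if p in symmetric:
                new.add((b, p, a))
            if p in transitive:
                # Chain a P b with every b P c.
                new |= {(a, p, c) for b2, p2, c in known
                        if p2 == p and b2 == b}
        new -= known
        if not new:
            return known
        known |= new

asserted = {
    (":marriedTo", "rdf:type", "owl:SymmetricProperty"),
    (":locatedIn", "rdf:type", "owl:TransitiveProperty"),
    (":Ann", ":marriedTo", ":Bob"),
    (":Oxford", ":locatedIn", ":England"),
    (":England", ":locatedIn", ":UK"),
    (":UK", ":locatedIn", ":Europe"),
}

inferred = property_closure(asserted) - asserted
for triple in sorted(inferred):
    print(triple)
```

The transitive rule has to run more than once: `:Oxford :locatedIn :Europe` depends on a triple that was itself inferred.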

Sameness of Types

• owl:equivalentClass

Given:
    A owl:equivalentClass B .
    x rdf:type A .

Infer:
    x rdf:type B .

• owl:equivalentProperty

Given:
    P owl:equivalentProperty Q .
    A P B .

Infer:
    A Q B .

Sameness of Individuals

• owl:sameAs

Given:
    A owl:sameAs B .
    A P C .
    D P A .

Infer:
    B P C .
    D P B .

• This identifies equivalent ‘individuals’, while equivalentClass identifies equivalent types.
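A sketch of the substitution that owl:sameAs licenses, in both subject and object position. `same_as_closure` is a hypothetical name; the sketch also treats owl:sameAs as symmetric, which is how OWL defines it.

```python
def same_as_closure(triples):
    """Copy every triple about A onto B whenever A owl:sameAs B."""
    known = set(triples)
    while True:
        same = {(a, b) for a, p, b in known if p == "owl:sameAs"}
        new = {(b, "owl:sameAs", a) for a, b in same}   # symmetry
        for a, b in same:
            for s, p, o in known:
                if s == a:
                    new.add((b, p, o))   # substitute in subject position
                if o == a:
                    new.add((s, p, b))   # substitute in object position
        new -= known
        if not new:
            return known
        known |= new

asserted = {
    ("A", "owl:sameAs", "B"),
    ("A", "P", "C"),
    ("D", "P", "A"),
}

result = same_as_closure(asserted)
print(("B", "P", "C") in result)
print(("D", "P", "B") in result)
```

The closure also contains trivial triples such as `B owl:sameAs B`; they are harmless and a real engine would usually hide them.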

Sameness of Individuals (2)

• owl:FunctionalProperty

Given:
    P rdf:type owl:FunctionalProperty .
    A P B .
    A P C .

Infer:
    B owl:sameAs C .

• This is an important class, since it allows sameness to be inferred

Sameness of Individuals (3)

• owl:InverseFunctionalProperty

Given:
    P rdf:type owl:InverseFunctionalProperty .
    A P B .
    C P B .

Infer:
    A owl:sameAs C .

• This resource can be thought of as analogous to a primary key in relational databases.
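The functional and inverse functional rules are mirror images: one merges objects that share a subject, the other merges subjects that share an object. A minimal sketch, with `infer_sameness` as a hypothetical name and placeholder resources P, Q, A, B, C, X, Y, Z:

```python
def infer_sameness(triples):
    """Derive owl:sameAs from functional and inverse functional properties."""
    functional = {s for s, p, o in triples
                  if p == "rdf:type" and o == "owl:FunctionalProperty"}
    inverse_functional = {s for s, p, o in triples
                          if p == "rdf:type"
                          and o == "owl:InverseFunctionalProperty"}
    same = set()
    for s1, p1, o1 in triples:
        for s2, p2, o2 in triples:
            if p1 != p2:
                continue
            # Functional: one subject, so the two objects must be the same.
            if p1 in functional and s1 == s2 and o1 != o2:
                same.add((o1, "owl:sameAs", o2))
            # Inverse functional: one object, so the subjects are the same.
            if p1 in inverse_functional and o1 == o2 and s1 != s2:
                same.add((s1, "owl:sameAs", s2))
    return same

asserted = {
    ("P", "rdf:type", "owl:FunctionalProperty"),
    ("A", "P", "B"),
    ("A", "P", "C"),
    ("Q", "rdf:type", "owl:InverseFunctionalProperty"),
    ("X", "Q", "Z"),
    ("Y", "Q", "Z"),
}

for triple in sorted(infer_sameness(asserted)):
    print(triple)
```

Each pair is reported in both directions, since the double loop visits the two triples in either order.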

Property Classification

• owl:DatatypeProperty
  For properties whose object is a literal value, e.g. comp:startDate rdf:type owl:DatatypeProperty

• owl:ObjectProperty
  For properties whose object is a resource, e.g. comp:hasSpeaker rdf:type owl:ObjectProperty

• Used to assist tool support, not semantics

Utility of RDFS-Plus

• Can model a wide variety of relationships

• Can infer sameness of individuals and types

• Is relatively cheap to implement

• However, further OWL constructs allow us greater possibilities:
  - Classification by restriction
  - Full set manipulation
  - Cardinality

Restriction

comp:OxfordConferences owl:equivalentClass [ a owl:Restriction; owl:onProperty comp:locatedAt ; owl:someValuesFrom comp:OxfordVenues ] .

• owl:Restriction rdfs:subClassOf owl:Class .

• Defined by a description of its members in terms of existing properties and classes.

On Property

comp:OxfordConferences owl:equivalentClass [ a owl:Restriction; owl:onProperty comp:locatedAt ; owl:someValuesFrom comp:OxfordVenues ] .

• Specifies which property is used to define the restriction class

Some Values From

comp:OxfordConferences owl:equivalentClass [ a owl:Restriction; owl:onProperty comp:locatedAt ; owl:someValuesFrom comp:OxfordVenues ] .

• At least one value of the property comes from the specified class
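Because the restriction is tied to the named class with owl:equivalentClass, membership can be inferred from the data. The sketch below shows that classification step only. `classify_some_values_from` is a hypothetical helper, `_:r` is an arbitrary label for the blank node, and the two triples about `comp:Accu2009` and `comp:BarceloOxford` are assumptions added for the example.

```python
def classify_some_values_from(triples):
    """Infer membership of classes equivalent to a someValuesFrom restriction."""
    known = set(triples)

    def objects(subject, predicate):
        return {o for s, p, o in known if s == subject and p == predicate}

    inferred = set()
    for cls, pred, r in known:
        if pred != "owl:equivalentClass":
            continue
        if "owl:Restriction" not in objects(r, "rdf:type"):
            continue
        for prop in objects(r, "owl:onProperty"):
            for filler in objects(r, "owl:someValuesFrom"):
                # x qualifies if at least one value of prop is in filler.
                for x, p, value in known:
                    if p == prop and filler in objects(value, "rdf:type"):
                        inferred.add((x, "rdf:type", cls))
    return inferred

asserted = {
    ("comp:OxfordConferences", "owl:equivalentClass", "_:r"),
    ("_:r", "rdf:type", "owl:Restriction"),
    ("_:r", "owl:onProperty", "comp:locatedAt"),
    ("_:r", "owl:someValuesFrom", "comp:OxfordVenues"),
    # Assumed instance data:
    ("comp:Accu2009", "comp:locatedAt", "comp:BarceloOxford"),
    ("comp:BarceloOxford", "rdf:type", "comp:OxfordVenues"),
}

print(classify_some_values_from(asserted))
```

This covers only one direction of the equivalence; the other direction (a member must have some such value) produces an existential fact rather than a concrete triple.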

All Values From

comp:OxfordConferences owl:equivalentClass [ a owl:Restriction; owl:onProperty comp:locatedAt ; owl:allValuesFrom comp:OxfordVenues ] .

• All values of the property come from the specified class

• This includes individuals that have no values for the property at all (the empty set)

Has Value

comp:BarceloOxfordConferences owl:equivalentClass [ a owl:Restriction; owl:onProperty comp:locatedAt ; owl:hasValue geo:BarceloOxford ] .

• The value of the property is as specified

• Special case of owl:someValuesFrom

comp:Programmer a owl:Class .
comp:DevelopmentTool a owl:Class .

comp:developsWith a rdf:Property ;
    rdfs:domain comp:Programmer ;
    rdfs:range comp:DevelopmentTool .

comp:OldSchoolTool rdfs:subClassOf comp:DevelopmentTool .

comp:OldSchoolCoder rdfs:subClassOf
    [ a owl:Restriction ;
      owl:onProperty comp:developsWith ;
      owl:allValuesFrom comp:OldSchoolTool ] .

Now if we assert:

comp:AlanLenton a comp:OldSchoolCoder ;
    comp:developsWith comp:Vim .

What inference can we make?
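One way to work through the question is to apply the allValuesFrom rule by hand: every member of a subclass of the restriction has all its values for the property in the target class. The sketch below does just that. `all_values_from` is a hypothetical helper and `_:r` is an arbitrary label for the restriction's blank node; the domain and range declarations are left out because they give separate, more routine inferences.

```python
def all_values_from(triples):
    """If x is in a subclass of an allValuesFrom restriction, type its values."""
    known = set(triples)

    def objects(subject, predicate):
        return {o for s, p, o in known if s == subject and p == predicate}

    inferred = set()
    for cls, pred, r in known:
        if pred != "rdfs:subClassOf":
            continue
        for prop in objects(r, "owl:onProperty"):
            for target in objects(r, "owl:allValuesFrom"):
                members = {s for s, p, o in known
                           if p == "rdf:type" and o == cls}
                for x in members:
                    for value in objects(x, prop):
                        inferred.add((value, "rdf:type", target))
    return inferred

asserted = {
    ("comp:OldSchoolTool", "rdfs:subClassOf", "comp:DevelopmentTool"),
    ("comp:OldSchoolCoder", "rdfs:subClassOf", "_:r"),
    ("_:r", "rdf:type", "owl:Restriction"),
    ("_:r", "owl:onProperty", "comp:developsWith"),
    ("_:r", "owl:allValuesFrom", "comp:OldSchoolTool"),
    ("comp:AlanLenton", "rdf:type", "comp:OldSchoolCoder"),
    ("comp:AlanLenton", "comp:developsWith", "comp:Vim"),
}

print(all_values_from(asserted))
```

Under this reading, the restriction does not check that Vim is an old school tool; it concludes that it is one.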

Protégé

• Open source ontology editing tool

• Version 3.4 supports OWL 1.0, SPARQL and integrates with reasoners

• Uses slightly archaic terminology for restriction description:
  - necessary (or partial definition)
  - necessary and sufficient (or complete definition)

Set Operations

• These use the same syntax we saw for list constructs:

C a owl:Class ;
    owl:unionOf ( A B ) .

C a owl:Class ;
    owl:intersectionOf ( A B C ) .

Assumptions Revisited

• With set operations we would also like to express cardinality constraints

• Open World and Non Unique Name Assumptions make this impossible

• We can expressly turn off these assumptions where necessary

Explicit Set Membership

• This ‘Closes the World’:
ed:OxbridgeUniversities a owl:Class ;
    owl:oneOf ( ed:OxfordUni ed:CambridgeUni ) .

• But we also need to limit the Non Unique Name Assumption:
ed:OxfordUni owl:differentFrom ed:CambridgeUni .

And for larger sets …

[ a owl:AllDifferent ;
  owl:distinctMembers ( :BlackBall :PinkBall :BlueBall :GreenBall
                        :BrownBall :YellowBall :RedBall ) ] .

Cardinality Restrictions

• [ a owl:Restriction ; owl:onProperty cards:bridgePlayer ; owl:cardinality 4 ]

• [ a owl:Restriction ; owl:onProperty cards:monopolyPlayer ; owl:minCardinality 2 ; owl:maxCardinality 6 ]

Set Complement

• A owl:complementOf B

• Very dangerous, because A now contains everything (in the universe) not in B

• Usually used with intersections:
comp:NonGroovyProgrammers owl:intersectionOf
    ( [ a owl:Class ;
        owl:complementOf comp:GroovyProgrammers ]
      comp:Programmers ) .

Disjoint Sets

• A owl:disjointWith B

• No member of A can be a member of B

• Used to infer difference

• There is no AllDisjoint construct for classes that is analogous to the owl:AllDifferent construct for individuals

Contradictions

• In RDFS there could be no contradictions

• OWL constructs allow us to make contradictory statements:
:WayneCounty a :Man .
:JayneCounty a :Woman .
:WayneCounty owl:sameAs :JayneCounty .
:Man owl:disjointWith :Woman .

• OWL can tell us there is a contradiction, but cannot tell us which assertion is ‘wrong’
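Detecting this particular contradiction can be sketched as: merge the types of individuals linked by owl:sameAs, then look for an individual that ends up in two disjoint classes. `find_contradictions` is a hypothetical helper; it follows sameAs links one step only, which is enough for this example.

```python
def find_contradictions(triples):
    """Report individuals that fall into two classes declared disjoint."""
    known = set(triples)

    # Treat owl:sameAs as symmetric.
    same = {(a, b) for a, p, b in known if p == "owl:sameAs"}
    same |= {(b, a) for a, b in same}

    # Collect the asserted types of each individual.
    types = {}
    for s, p, o in known:
        if p == "rdf:type":
            types.setdefault(s, set()).add(o)

    # Copy types across sameAs links (one step).
    merged = {s: set(ts) for s, ts in types.items()}
    for a, b in same:
        merged.setdefault(a, set()).update(types.get(b, set()))

    clashes = set()
    for c, p, d in known:
        if p == "owl:disjointWith":
            for individual, ts in merged.items():
                if c in ts and d in ts:
                    clashes.add((individual, c, d))
    return clashes

asserted = {
    (":WayneCounty", "rdf:type", ":Man"),
    (":JayneCounty", "rdf:type", ":Woman"),
    (":WayneCounty", "owl:sameAs", ":JayneCounty"),
    (":Man", "owl:disjointWith", ":Woman"),
}

for clash in sorted(find_contradictions(asserted)):
    print(clash)
```

The result names the individuals and the clashing classes, but nothing in it says which of the four assertions should be withdrawn.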

Unsatisfiable Classes

• Since we can make contradictory statements, it follows that we can define classes that can’t have any members

• Unsatisfiability can be propagated through subClassOf, someValuesFrom, intersection, domain and range constructs

Classes and Individuals

• Reasoning about classes and individuals is implemented identically

• Reasoning about individuals can be thought of as processing data and generating output

• Reasoning about classes can be thought of as compilation of the model

• Reasoning about classes can be done in the absence of any individuals

Class or Individual

• We would like to model classes and individuals separately

• Is a Bird an individual in the class of Animals, or a class containing individuals?

• Use the Class-Individual Mirror Pattern:
:Bird a :Animal .
:Birds owl:equivalentClass
    [ a owl:Restriction ;
      owl:onProperty :comprises ;
      owl:hasValue :Bird ] .

Antipatterns

• Allemang and Hendler identify 5 antipatterns:
  - Rampant Classism
  - Exclusivity
  - Objectification
  - Managing Identifiers for Classes
  - Creeping Conceptualization

Objectification

• Model a person with parents:
:Person a owl:Class .
:hasParent rdfs:domain :Person .
:hasParent rdfs:range :Person .
[ a owl:Restriction ;
  owl:onProperty :hasParent ;
  owl:cardinality 2 ] .

• What happens if we assert:
:Willem :hasParent :Beatrix .

Inferences

• It’s not a contradiction that we haven’t asserted that Willem and Beatrix are of type Person – it will be inferred

• It’s not a contradiction that there is only one hasParent assertion – Open World

• It’s not a contradiction if there were more than 2 hasParent assertions – Non Unique Naming

Reuse Revisited

• owl:Ontology – specifies URI of ontology

• owl:imports – includes specified ontology

• owl:versionInfo – human/tool readable

• owl:priorVersion

• owl:backwardCompatibleWith

• owl:incompatibleWith

• owl:DeprecatedClass, owl:DeprecatedProperty

Modeling Philosophy

• Provable – a model should be decidable (from Description Logic). This is important in formal systems where every possible inference must be guaranteed.

• Executable – inferences must be correct, but not necessarily complete. Like most programming languages, these models are not decidable.

OWL Dialects

• OWL Full – unconstrained use of OWL constructs. Not decidable.

• OWL DL – constrained use of OWL constructs. Decidable (in finite time)

• OWL Lite – a cut-down version of OWL intended to ease implementation

• These are defined by the standard, but there are numerous other proprietary subsets.

owl:Class vs. rdfs:Class

• owl:Class is defined as a subclass of rdfs:Class

• In OWL Lite and OWL DL, owl:Class must be used for all class descriptions

• Not all RDFS classes are legal OWL Lite or OWL DL classes

• In OWL Full owl:Class and rdfs:Class are equivalent

Usability

• There are plenty of Semantic Web systems out there, but

• Inferencing can be very slow

• Especially in the presence of data updates

• Received wisdom is that for large (i.e. useful) datasets, inferencing should be done over the T-Boxes (the schema-level statements) only

OWL 2.0 and beyond

• This presentation has been based on OWL 1.0

• OWL 2.0 is in development and there are already tools that support the emerging standard

• SPARQL-DL is being proposed as a successor to SPARQL

Conclusion

• This technology is still developing

• There is lots of academic interest

• Tooling is becoming usable

• Business is beginning to invest

References

• Semantic Web for the Working Ontologist - Allemang/Hendler - 978-0-12-373556-0

• The Semantic Web – Tim Berners-Lee, Jim Hendler, Ora Lassila – Scientific American, May 2001: http://www.sciam.com/article.cfm?id=the-semantic-web

• www.w3.org

• Protégé OWL Tutorial – Horridge 2004: www.co-ode.org/resources/tutorials/ProtegeOWLTutorial.pdf
