Page 1: RDF Query Languages

ISA

/department of mathematics and computer science

TU/e eindhoven university of technology

April 17, 2003

RDF Query Languages

Flavius Frasincar [email protected]

Page 2: RDF Query Languages

Contents

• Why RDF Query Languages?
• RDF Features (Recap)
• RDF Query Language Requirements
• RDF Query Languages
• RQL (RDF Query Language):
  – Select: variables
  – From: path expressions
  – Where: condition
• Summary

Page 3: RDF Query Languages

Why RDF QLs?

• RDF is the standard representation language for Web metadata (foundation of the Semantic Web)

• RDF is already used in:
  – Large description schemas: ODP (Open Directory Project), a web site classification with 385,965 topics; UNSPSC (United Nations Standard Products and Services Code), a product classification with 16,506 classes
  – Large description bases: ODP classifies 3,339,355 sites

• RDF QLs are needed in order to access data from (large) RDF representations

Page 4: RDF Query Languages

RDF

Primitive Semantics: Subject Predicate Object (one statement)

Three alternative notations:

• Graph
    http://example.com/sb.jpg --painted_by--> "Rembrandt"

• Triple
    (http://example.com/sb.jpg, painted_by, "Rembrandt")

• RDF/XML
    <rdf:Description rdf:about="http://example.com/sb.jpg">
      <painted_by>Rembrandt</painted_by>
    </rdf:Description>

Page 5: RDF Query Languages

RDF Features

• RDF:
  – Data Model: Directed Labeled Graph
    • Nodes: Resources (with or without URIs) or Literals
    • Edges: Properties (attributes or relationships)
    • Labels: Nodes (URI) or Edges (Property URI)
• RDF Schema:
  – Multiple classification of resources
  – Specialization of both classes/properties (simple and multiple)
  – Unordered, optional, and multivalued properties
  – Domain and range polymorphism of properties

Page 6: RDF Query Languages

RDF vs. XML

• Different Data Models:
  – RDF data model: a directed graph with labels on both edges and nodes
  – XML data model: a tree with labels on edges or nodes
• Different Semantics:
  – RDF is able to model complex semantic relations (e.g. class/property hierarchies based on specialization)
  – XML has only one type of semantics: inclusion semantics (an element contains another element)
• RDF has an XML syntax (RDF/XML), but XML QLs do not support RDF semantics: we need an RDF QL

Page 7: RDF Query Languages

Requirements for an RDF QL

• Understand RDF Data Model (RDF graph or RDF triples)

• Path expressions can use labels from both nodes and edges

• Compose queries: the output of one query can be used as input for the next query

• Declarative: not bound to any implementation (closer to human language!)

• Support RDF Schema

Page 8: RDF Query Languages

RDF Query Languages

• Triple-based: querying the structure
  – RDQL
  – Triple [successor of SiLRI] (Horn logic)
  e.g. Find statements whose subject is … and object is …
• XML-based: querying the syntax
  – RDF Query
  – RQuery (XQuery)
  e.g. Find description elements whose attribute value contains …
• Graph-based (but not graphical): querying the semantics
  – RQL (OQL)
  e.g. Find resources classified under … whose property value is …

Page 9: RDF Query Languages

RDF Query Language (RQL)

• Declarative query language for RDF
• Language proposal (not yet a standard)
• Based on the RDF-graph representation
• Supports RDF Schema (only a few of the existing RDF QLs do)
• References (small differences between them):
  – RQL from ICS-FORTH (Greece) (http://139.91.183.30:9090/RDF/RQL/)
  – Sesame from Aidministrator (Holland) (http://sesame.aidministrator.nl/)
• The rest of the presentation refers to the Sesame implementation

Page 10: RDF Query Languages

RQL Input

• The input to an RQL query is a complete RDF model, i.e. a model that contains its RDFS-closure (defined in RDF Semantics).
• Note that the RDFS-closure includes the RDF-closure
  – [RDF-closure] e.g. rdf1: if (xxx aaa yyy) then add (aaa rdf:type rdf:Property)
  – [RDFS-closure] e.g. rdfs9: if (xxx rdfs:subClassOf yyy) and (aaa rdf:type xxx) then add (aaa rdf:type yyy)
• There are operator variants (with ^ appended) that discard this inferred data (intensional data) and consider only the given statements (extensional data) of an RDF model
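The two closure rules quoted on this slide can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: triples are plain tuples of strings, only rdf1 and rdfs9 are applied (the full RDFS-closure has more rules), and the function name `closure` and the sample data are hypothetical.

```python
RDF_TYPE = "rdf:type"
RDF_PROPERTY = "rdf:Property"
RDFS_SUBCLASS = "rdfs:subClassOf"

def closure(triples):
    """Apply rules rdf1 and rdfs9 repeatedly until no new triple appears."""
    closed = set(triples)
    while True:
        new = set()
        for (s, p, o) in closed:
            # rdf1: every predicate that is used is an rdf:Property
            new.add((p, RDF_TYPE, RDF_PROPERTY))
            # rdfs9: an instance of a subclass is an instance of the superclass
            if p == RDFS_SUBCLASS:
                for (a, p2, c) in closed:
                    if p2 == RDF_TYPE and c == s:
                        new.add((a, RDF_TYPE, o))
        if new <= closed:
            return closed
        closed |= new

# Extensional data (given statements); the sample triples are assumptions.
given = {
    ("cult:Painter", RDFS_SUBCLASS, "cult:Artist"),
    ("&r1", RDF_TYPE, "cult:Painter"),
}
# Intensional data: what the two rules add on top of the given statements.
inferred = closure(given) - given
```

In this sketch, `given` plays the role of what the `^` operator variants look at, while `closure(given)` corresponds to the complete model that ordinary RQL operators see.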

Page 11: RDF Query Languages

Example: RDF Input

[Figure: RDF graph of the museum example, split into a Schema layer and an Instance layer. Legend: rdfs:subClassOf, rdfs:subPropertyOf, rdf:type, <property>.]

Schema labels: classes Artist, Sculptor, Painter, Cubist, Flemish, Artifact, Sculpture, Painting, ExtResource; properties creates, paints, sculpts, technique, first_name, last_name, title, file_size, mime_type, last_modified; Literal value nodes.

Instance labels: resources &r1 and &r2; paints; first_name "Rembrandt"; last_name "van Rijn"; technique "oil on canvas"; file_size 17; title "Abraham and Isaac".

Namespaces: cult=http://www.icom.com/schema.rdf# , adm=http://www.oclc.org/schema.rdf#

Page 12: RDF Query Languages

Example: Web Resources

• &r1 = http://www.european-history.com/rembrandt.html

• &r2 = http://www.artchive.com/rembrandt/abraham.jpg

Page 13: RDF Query Languages

Select-From-Where

select X, Y                                           (list of variables)
from {X}cult:paints{Y}, {X}cult:first_name{Xfname}    (list of path expressions)
where Xfname like "Rembrandt"                         (condition, optional)
using namespace cult=http://www.icom.com/schema.rdf#

• Variables on graph labels
• Path expressions/conditions use variables and constants
• The RQL result is a table of tuples (a relation) with one column per variable and one row per binding of values to those variables
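How the three clauses of this query interact can be sketched in Python. This is a rough illustration, not how Sesame evaluates RQL: triples are plain tuples, the helper names `path` and `query` are hypothetical, and the instance data is assumed from the museum example in these slides.

```python
# Instance data assumed from the slides' museum example.
R1 = "http://www.european-history.com/rembrandt.html"
TRIPLES = [
    (R1, "cult:paints", "http://www.artchive.com/rembrandt/artist_at_his_easel.jpg"),
    (R1, "cult:paints", "http://www.artchive.com/rembrandt/abraham.jpg"),
    (R1, "cult:first_name", "Rembrandt"),
]

def path(triples, prop):
    """Bindings for one path expression {S}prop{O}: all (subject, object) pairs."""
    return [(s, o) for (s, p, o) in triples if p == prop]

def query(triples):
    rows = []
    for (x, y) in path(triples, "cult:paints"):                # {X}cult:paints{Y}
        for (x2, xfname) in path(triples, "cult:first_name"):  # {X}cult:first_name{Xfname}
            # Both path expressions share X; the where clause filters Xfname.
            if x == x2 and xfname == "Rembrandt":
                rows.append((x, y))                            # select X, Y
    return rows

result = query(TRIPLES)  # one tuple (X, Y) per binding
```

The shared variable X acts as a join between the two path expressions, and each surviving binding becomes one row of the result table.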

Page 14: RDF Query Languages

RQL Result

X                                                | Y
http://www.european-history.com/rembrandt.html   | http://www.artchive.com/rembrandt/artist_at_his_easel.jpg
http://www.european-history.com/rembrandt.html   | http://www.artchive.com/rembrandt/abraham.jpg

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Bag rdf:ID="query_result">
    <rdf:li>
      <rdf:Seq>
        <rdf:li rdf:resource="http://www.european-history.com/rembrandt.html"/>
        <rdf:li rdf:resource="http://www.artchive.com/rembrandt/artist_at_his_easel.jpg"/>
      </rdf:Seq>
    </rdf:li>
    <rdf:li>…abraham.jpg …</rdf:li>
  </rdf:Bag>
</rdf:RDF>

Page 15: RDF Query Languages

Why Is the RQL Result a Bag?

select X
from {X}cult:paints{Y}, {X}cult:first_name{Xfname}
where Xfname like "Rembrandt"
using namespace cult=http://www.icom.com/schema.rdf#

X
http://www.european-history.com/rembrandt.html
http://www.european-history.com/rembrandt.html

• e.g. if only one variable is returned, there might be multiple bindings of this variable with the same value (we need a Bag)
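The effect of projecting onto a single variable can be sketched in Python. The rows below are assumed to be the (X, Y) bindings of the query above for the museum example; `Counter` stands in for a bag and `set` shows what would be lost with set semantics.

```python
from collections import Counter

R1 = "http://www.european-history.com/rembrandt.html"
# Assumed (X, Y) bindings: one painter resource with two paintings.
rows = [
    (R1, "http://www.artchive.com/rembrandt/artist_at_his_easel.jpg"),
    (R1, "http://www.artchive.com/rembrandt/abraham.jpg"),
]

projected = [x for (x, _y) in rows]  # "select X": one entry per binding
as_bag = Counter(projected)          # bag: keeps the multiplicity of each value
as_set = set(projected)              # set: would collapse the duplicates
```

Here the bag records that the same resource was bound twice, once per painting, which a set cannot express.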

Page 16: RDF Query Languages

Namespaces

• All the labels for nodes and edges are associated with a certain namespace

using namespace cult=http://www.icom.com/schema.rdf# , adm=http://www.oclc.org/schema.rdf#

• cult contains information intended for museum specialists (e.g. artists, artifacts, museum descriptions)
• adm contains information for portal administrators (e.g. title, file_size, mime_type of a certain external resource)
• (Web) Resources are orthogonally classified using the two schemas above

Page 17: RDF Query Languages

Select: Variables

• There are three kinds of variables:
  – Instance: e.g. X
  – Class: e.g. $C
  – Property: e.g. @P

• “Find all resources together with their associated classes, properties, and property values”:

select X, $C, @P, Y
from {X : $C}@P{Y}

is equivalent to

select *                (* = all variables)
from {X : $C}@P{Y}

• “A resource X has type C” has two syntaxes:
  – X : C (not standalone)
  – C{X} (a path expression that limits a node)

Page 18: RDF Query Languages

From: Path Expressions

• Path expressions specify a linear path through the RDF data model

• Each variable used in a path expression is bound to labels from the model

• “Find all painters and their associated paintings”

select Painter, Painting
from {Painter}cult:paints{Painting}
using namespace cult=http://www.icom.com/schema.rdf#

Page 19: RDF Query Languages

The ‘.’ in Path Expressions

• Path expressions can be arbitrarily long
• The ‘.’ is used to specify a join condition between the object and the subject of two consecutive properties

select Painter, Painting, Technique
from {Painter}cult:paints{Painting}. cult:technique{Technique}
using namespace cult=http://www.icom.com/schema.rdf#

• In the above example Painting is the object of cult:paints and the subject of cult:technique
• If Painting is not of interest, it can be omitted:

from {Painter}cult:paints. cult:technique{Technique}
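The join expressed by ‘.’ can be sketched in Python as two chained lookups. This is an illustration only: the helper names `follow` and `two_step` are hypothetical, and the sample triples are assumed from the museum example.

```python
# Assumed instance data: &r1 paints &r2, and &r2 has a technique.
TRIPLES = [
    ("&r1", "cult:paints", "&r2"),
    ("&r2", "cult:technique", "oil on canvas"),
    ("&r1", "cult:last_name", "van Rijn"),
]

def follow(triples, start_pairs, prop):
    """Extend each (origin, node) pair by one property step starting at node."""
    return [(origin, o)
            for (origin, node) in start_pairs
            for (s, p, o) in triples
            if p == prop and s == node]

def two_step(triples, prop1, prop2):
    """Evaluate {A}prop1.prop2{C}: the object of prop1 is the subject of prop2."""
    first = [(s, o) for (s, p, o) in triples if p == prop1]
    return follow(triples, first, prop2)

# Corresponds to: from {Painter}cult:paints. cult:technique{Technique}
result = two_step(TRIPLES, "cult:paints", "cult:technique")
```

The intermediate node (the painting) is used for the join but does not appear in the result, which mirrors omitting {Painting} from the path expression.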

Page 20: RDF Query Languages

List of Path Expressions

• Since path expressions are linear, it is not possible to express two paths with the same origin in one path expression

• Instead, use a list of path expressions that share variables

select Painter, Painting, Painter_lname

from {Painter}cult:paints{Painting},

{Painter}cult:last_name{Painter_lname}

using namespace

cult=http://www.icom.com/schema.rdf#

Page 21: RDF Query Languages

Class of a Resource

Q1 (better):

select Painter, $Painter, Painting
from {Painter : $Painter}cult:paints{Painting}
using namespace cult=http://www.icom.com/schema.rdf#

Q2:

select Painter, Painter_type, Painting
from {Painter}rdf:type{Painter_type}, {Painter}cult:paints{Painting}
using namespace rdf = http://www.w3.org/1999/02/22-rdf-syntax-ns# ,
                cult = http://www.icom.com/schema.rdf#

• Q1 returns the most specific type (class) for a resource while Q2 returns all types of this resource
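The difference between "all types" and "most specific type" can be sketched in Python. This is an illustration under assumptions: the class hierarchy below is assumed from the museum example, each class has at most one direct superclass (RDFS allows several), and the function names are hypothetical.

```python
# Direct rdfs:subClassOf edges (assumed hierarchy, single inheritance only).
SUBCLASS_OF = {
    "cult:Flemish": "cult:Painter",
    "cult:Painter": "cult:Artist",
}

def superclasses(cls):
    """All proper superclasses of cls, following direct edges upward."""
    found = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.append(cls)
    return found

def all_types(direct_types):
    """Types of a resource in the closed model: given types plus superclasses."""
    types = set(direct_types)
    for c in direct_types:
        types.update(superclasses(c))
    return types

def most_specific(types):
    """Drop every class that is a superclass of another class in the set."""
    return {c for c in types
            if not any(c in superclasses(other) for other in types)}

q2_like = all_types(["cult:Flemish"])  # every type, as Q2 would list them
q1_like = most_specific(q2_like)       # only the most specific type, as in Q1
```

With this data, the Q2-style answer contains three classes for the same resource, while the Q1-style answer keeps only the lowest one in the hierarchy.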

Page 22: RDF Query Languages

Class Restriction for Resources

Q1:

select Painter
from {Painter : cult:Flemish}cult:paints{Painting}
using namespace cult=http://www.icom.com/schema.rdf#

• Note that cult:Flemish must be part of the domain of cult:paints, otherwise the query returns 0 results.

Q2 (better):

select Painter
from cult:Flemish{Painter}
using namespace cult=http://www.icom.com/schema.rdf#

• Q1 returns a Flemish painter multiple times if that painter has more than one painting, while Q2 does not.

Page 23: RDF Query Languages

Domain and Range

Q1 (better):

select $Domain, $Range
from {:$Domain}cult:has_style{:$Range}
using namespace cult=http://www.icom.com/schema.rdf#

Q2:

select domain(@P), @P, range(@P)
from {}@P{}
where @P = cult:has_style
using namespace cult=http://www.icom.com/schema.rdf#

• Q1 returns data from the schema with RDFS-closure, while Q2 returns data present in the schema without RDFS-closure (both are independent of the model instance)

Page 24: RDF Query Languages

Where: Condition

• The where clause is optional
• The condition constrains the value of variables bound in the from clause. It uses two kinds of operators:
  – Comparison: <, <=, =, >, >=, !=, like (with *) [lexical], in [set]
  – Logical: and, or, not
• The first five comparison operators (<, <=, =, >, >=) are overloaded. On sets they perform set comparison; on single values the comparison depends on the operand type: subClassOf for classes, subPropertyOf for properties, numeric comparison for reals and integers, and lexical comparison for literals/resources

Page 25: RDF Query Languages

Comparison Operators

• “Select all artists, together with their type and first name, who have a painting resource containing the string ‘abraham’”

select Artist, $Artist, ArtistFName
from {Artist : $Artist} cult:first_name {ArtistFName}
where Artist in select Painter
                from {Painter} cult:paints {Painting}
                where Painting like "*abraham*"
using namespace cult = http://www.icom.com/schema.rdf#

Page 26: RDF Query Languages

Logical Operators

• “Select all painters with a first name that starts with R and all sculptors with a first name that does not start with M”

select Artist, ArtistFName

from {Artist :$Artist} cult:first_name {ArtistFName}

where ($Artist <= cult:Painter and ArtistFName like "R*")

or

($Artist <= cult:Sculptor and not (ArtistFName like "M*"))

using namespace

cult = http://www.icom.com/schema.rdf#
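The where condition of this query can be sketched as a Python predicate. Several things here are assumptions for illustration: the class hierarchy, the helper names, the use of `fnmatchcase` to stand in for `like` (it is case-sensitive and also treats `?` and `[...]` as wildcards, which RQL's `like` may not), and reading `<=` on classes as "is a subclass of, or equal to".

```python
from fnmatch import fnmatchcase

# Direct rdfs:subClassOf edges (assumed hierarchy).
SUBCLASS_OF = {
    "cult:Flemish": "cult:Painter",
    "cult:Cubist": "cult:Painter",
    "cult:Painter": "cult:Artist",
    "cult:Sculptor": "cult:Artist",
}

def is_subclass_or_equal(cls, ancestor):
    """Stand-in for `$C <= ancestor`: walk up the hierarchy until a match."""
    while cls != ancestor and cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
    return cls == ancestor

def like(value, pattern):
    """Stand-in for `like` with * as wildcard."""
    return fnmatchcase(value, pattern)

def keep(artist_class, first_name):
    """The where clause: (painter and R*) or (sculptor and not M*)."""
    return ((is_subclass_or_equal(artist_class, "cult:Painter")
             and like(first_name, "R*"))
            or (is_subclass_or_equal(artist_class, "cult:Sculptor")
                and not like(first_name, "M*")))
```

For example, `keep("cult:Flemish", "Rembrandt")` is true because Flemish is below Painter in the assumed hierarchy and the name starts with R.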

Page 27: RDF Query Languages

Standard Functions

• Standard functions are used to retrieve standard RDFS relationships

• We have already seen: domain() and range()

• Other examples: Class, Property, subClassOf(), subPropertyOf(), typeOf() etc.

• The standard functions can also be used as standalone queries

Class

subClassOf ( http://www.icom.com/schema.rdf#Artist )

typeOf( http://www.european-history.com/rembrandt.html ) etc.

Page 28: RDF Query Languages

Strict Interpretation with ‘^’

• “Retrieve the direct subclasses of Artist”

subClassOf^ ( http://www.icom.com/schema.rdf#Artist )

• “Retrieve all subclasses of Artist”

subClassOf ( http://www.icom.com/schema.rdf#Artist )

• “Retrieve the most specific classes to which the resource http://www.european-history.com/rembrandt.html belongs”

typeOf^ ( http://www.european-history.com/rembrandt.html )

• “Retrieve the classes to which the resource http://www.european-history.com/rembrandt.html belongs”

typeOf ( http://www.european-history.com/rembrandt.html )
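The difference between subClassOf^ and subClassOf can be sketched in Python as direct lookup versus transitive traversal. The hierarchy is assumed from the museum example and the function names are hypothetical stand-ins, not part of RQL or Sesame.

```python
# Direct subclasses per class (assumed hierarchy).
DIRECT_SUBCLASSES = {
    "cult:Artist": ["cult:Painter", "cult:Sculptor"],
    "cult:Painter": ["cult:Cubist", "cult:Flemish"],
}

def sub_class_of_strict(cls):
    """Stand-in for subClassOf^: only the direct subclasses."""
    return list(DIRECT_SUBCLASSES.get(cls, []))

def sub_class_of(cls):
    """Stand-in for subClassOf: all subclasses, direct and indirect."""
    result = []
    stack = list(DIRECT_SUBCLASSES.get(cls, []))
    while stack:
        c = stack.pop()
        if c not in result:
            result.append(c)
            stack.extend(DIRECT_SUBCLASSES.get(c, []))
    return result

direct = sub_class_of_strict("cult:Artist")
everything = sub_class_of("cult:Artist")
```

The strict variant reads only the given statements, while the non-strict one also returns what the subclass closure would infer.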

Page 29: RDF Query Languages

Standalone Queries

• The standard functions: Class, subClassOf, Property, subPropertyOf etc.

• Any class (resource of type rdf:Class): returns the extension (resources) of this class

http://www.icom.com/schema.rdf#Artist

• Any property (resource of type rdf:Property): returns the extension (pairs subject-object) of this property

http://www.icom.com/schema.rdf#creates

Page 30: RDF Query Languages

Set Operations

• The query results can be combined using the following operators: union, intersect, and minus
• “Retrieve the first name and the last name of all painters”

(select PainterR, PainterLName, PainterFName
 from cult:Painter{PainterR}. cult:last_name{PainterLName},
      {PainterR}cult:first_name{PainterFName})
union
(select PainterR, PainterLName, NULL
 from cult:Painter{PainterR}. cult:last_name{PainterLName}
 where not (PainterR in select PainterR from {PainterR}cult:first_name ))
using namespace cult = http://www.icom.com/schema.rdf#

• Note that not all painters have a first name in the input model (outer join operation)
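The outer-join pattern of this query (matching rows, united with unmatched rows padded with NULL) can be sketched in Python. The painter identifiers and names below are hypothetical sample data, and `None` stands in for NULL.

```python
# Hypothetical data: &p2 has a last name but no first name.
LAST_NAME = {"&p1": "van Rijn", "&p2": "Picasso"}
FIRST_NAME = {"&p1": "Rembrandt"}

# First branch: painters that have both a last name and a first name.
with_first = [(p, LAST_NAME[p], FIRST_NAME[p])
              for p in LAST_NAME if p in FIRST_NAME]

# Second branch: painters with no first name, padded with None (NULL).
without_first = [(p, LAST_NAME[p], None)
                 for p in LAST_NAME if p not in FIRST_NAME]

# union of the two branches
result = with_first + without_first
```

Because the two branches are disjoint by construction, concatenating them is enough here; every painter with a last name appears exactly once.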

Page 31: RDF Query Languages

Summary

• There is a need for RDF query languages (XML query languages cannot handle RDF semantics)
• RQL: declarative query language for uniformly querying RDF schemas and RDF descriptions
    Select list of variables (variables to be returned)
    From list of path expressions (variables are bound)
    Where condition (constrains the value of variables)
  – Compositional (in and set operations)
  – Very expressive
  – Well-defined semantics, syntax can be improved …
  … but not yet a standard!

Page 32: RDF Query Languages

Appendix

• Try your own queries at: http://sesame.aidministrator.nl/sesame/actionFrameset.jsp?repository=museum
• The result of the query:
  – HTML Table
  – RDF-Bag
  – XML
• Explore the Museum example (with or without inferred statements):
  – Schema (ontology)
  – Instance (data statements)

Page 33: RDF Query Languages

Exercise 1

• “Find the first name of painters that have paintings using the ‘oil on canvas’ technique and return also these paintings”

select Painter_fname, Painting
from {Painter}cult:paints{Painting}. cult:technique{Painting_technique},
     {Painter}cult:first_name{Painter_fname}
where Painting_technique like "oil on canvas"
using namespace cult=http://www.icom.com/schema.rdf# ,
                adm=http://www.oclc.org/schema.rdf#

Page 34: RDF Query Languages

Exercise 2

• “Find the first name of the painters that have a painting stored in a file with size greater than 15”

select Painter_fname
from {Painter}cult:paints{Painting}. adm:file_size{Painting_fsize},
     {Painter}cult:first_name{Painter_fname}
where Painting_fsize > 15
using namespace cult=http://www.icom.com/schema.rdf# ,
                adm=http://www.oclc.org/schema.rdf#

Page 35: RDF Query Languages

Exercise 3

• “Find the resources which are not of type ExtResource”

• First Solution:

select R

from rdfs:Resource{R}

where not (R in select R

from adm:ExtResource{R})

using namespace

rdfs=http://www.w3.org/2000/01/rdf-schema# ,

adm=http://www.oclc.org/schema.rdf#

Page 36: RDF Query Languages

Exercise 3 (cont’d)

• Second solution:

(select R

from rdfs:Resource{R})

minus

(select R

from adm:ExtResource{R})

using namespace

rdfs=http://www.w3.org/2000/01/rdf-schema# ,

adm=http://www.oclc.org/schema.rdf#