Dr. Alexandra I. Cristea

http://www.dcs.warwick.ac.uk/~acristea/

CS 253: Topics in Database Systems: C4

• Previously we looked at:
  – XML, and its query language(s)
  – RDF

• Next:
  – RDF query languages

RDF query languages

Proposals

• SPARQL
  – http://www.w3.org/TR/rdf-sparql-query/
• RDQL
  – http://www.w3.org/Submission/RDQL/
• RQL
  – http://139.91.183.30:9090/RDF/RQL/
• SeRQL
  – http://www.openrdf.org/doc/sesame/users/ch06.html
• Triple
  – http://triple.semanticweb.org/
• N3
  – http://www.w3.org/DesignIssues/Notation3
• Comparison of languages:
  – http://www.aifb.uni-karlsruhe.de/WBS/pha/rdf-query/rdfquery.pdf

SeRQL

Introduction SeRQL

• "Sesame RDF Query Language", pronounced "circle"

• new RDF/RDFS query language

• currently being developed by Aduna as part of Sesame. http://www.openrdf.org/

• It combines (best?) features of other (query) languages (RQL, RDQL, N-Triples, N3) and adds some of its own.

Sesame

• Open source RDF framework with support for RDF Schema inferencing and querying.
• Originally developed by Aduna (then known as Aidministrator) as a research prototype for the EU research project On-To-Knowledge.
• Further developed and maintained by Aduna in cooperation with the NLnet Foundation, developers from Ontotext, and a number of volunteer developers.

SeRQL's features

• Graph transformation.

• RDF Schema support.

• XML Schema datatype support.

• Expressive path expression syntax.

• Optional path matching.

SeRQL basic building blocks

• RDF:
  – URIs,
  – literals and
  – variables

Variables

• Identified by names.
• Names must start with a letter or an underscore ('_') and can be followed by zero or more letters, numbers, underscores, dashes ('-') or dots ('.').
• Examples:
  Var1
  _var2
  unwise.var-name_isnt-it
• SeRQL keywords are not allowed to be used as variable names.

(reserved) Keywords

• Currently: select, construct, from, where, using, namespace, true, false, not, and, or, like, label, lang, datatype, null, isresource, isliteral, sort, in, union, intersect, minus, exists, forall, distinct, limit, offset.

• Keywords are case-insensitive (unlike variable names).
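The naming rules for variables and the reserved-keyword restriction can be checked mechanically. The following is a minimal Python sketch, not part of Sesame: the function name `is_valid_variable` is hypothetical, and "letter" is assumed to mean an ASCII letter for simplicity.

```python
import re

# Keyword list copied from the slide; keywords are case-insensitive.
SERQL_KEYWORDS = {
    "select", "construct", "from", "where", "using", "namespace",
    "true", "false", "not", "and", "or", "like", "label", "lang",
    "datatype", "null", "isresource", "isliteral", "sort", "in",
    "union", "intersect", "minus", "exists", "forall", "distinct",
    "limit", "offset",
}

# A letter or underscore, then zero or more letters, digits, '_', '-' or '.'.
# Assumption: only ASCII letters are accepted in this sketch.
_VAR_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_valid_variable(name: str) -> bool:
    """Return True if `name` obeys the variable-name rules listed above."""
    if _VAR_NAME.fullmatch(name) is None:
        return False
    # Keywords are matched case-insensitively, so compare in lower case.
    return name.lower() not in SERQL_KEYWORDS
```

For instance, `is_valid_variable("unwise.var-name_isnt-it")` accepts the slide's example, while `is_valid_variable("Select")` rejects a keyword regardless of its case.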

URIs

• full URIs

• abbreviated URIs (QNames)

Full URIs

• must be surrounded with "<" and ">".

• Tend to be long (!!)

• Examples: <http://www.openrdf.org/index.html>

<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>

<mailto:[email protected]>

<file:///C:\rdffiles\test.rdf>

Abbreviated URIs (QNames)

• Components: a defined prefix (standing for the namespace), a colon (":"), then the part of the URI that follows the namespace (the local name)

• Examples:

sesame:index.html

rdf:type

foaf:Person
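Expanding a QName back into a full URI is simple string concatenation of namespace and local name. A minimal Python sketch follows; the helper `expand_qname` and the prefix table are illustrative assumptions (in SeRQL itself the prefixes come from the USING NAMESPACE clause), and the `sesame` namespace is chosen only so that the example matches the full URI shown earlier.

```python
# Hypothetical prefix table for the example.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sesame": "http://www.openrdf.org/",  # assumed for illustration
}


def expand_qname(qname: str, namespaces: dict = NAMESPACES) -> str:
    """Turn 'prefix:local' into '<namespace + local>'."""
    prefix, colon, local = qname.partition(":")
    if not colon:
        raise ValueError(f"not a QName (no colon): {qname!r}")
    if prefix not in namespaces:
        raise KeyError(f"undefined prefix: {prefix!r}")
    # Full URIs are written between '<' and '>'.
    return "<" + namespaces[prefix] + local + ">"
```

With this table, `expand_qname("rdf:type")` gives `<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>`.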

Literals

• Parts:
  – label
  – language tag (optional)
  – datatype (optional)

• The language tag and the datatype are mutually exclusive.

• Examples:
  "foo"   (label only)
  "foo"@en   (label + language tag)
  "<foo/>"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>   (label + datatype, full URI)
  "<foo/>"^^rdf:XMLLiteral   (label + datatype, QName)

Blank nodes

• RDF has a notion of blank nodes:
  – nodes that are not labelled with a URI or literal.
  – Interpretation (∃): "there exists a node such that..."
  – Blank nodes have internal identifiers.

• Shortcut in SeRQL: _:bnode1

• Attention: these internal identifiers are not portable!

Path expressions

• expressions that match specific paths through an RDF graph

• usually, triples = path expressions of length 1

• in SeRQL: arbitrary length

Basic path expressions

• Query:
  – persons who work for (companies that are) IT companies.

Original (possible) RDF:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foo="http://www.mycompany.smthg/company#">
  <rdf:Description rdf:about="http://www.mycompany.smthg/company/Person">
    <foo:worksFor rdf:resource="http://www.mycompany.smthg/company/Company"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.mycompany.smthg/company/Company">
    <rdf:type rdf:resource="http://www.mycompany.smthg/company/CompanySchema#ITCompany"/>
  </rdf:Description>
</rdf:RDF>

Basic path expressions

• Query:
  – persons who work for (companies that are) IT companies.

{Person} foo:worksFor {Company} rdf:type {foo:ITCompany}

• As a graph: Person --foo:worksFor--> Company --rdf:type--> foo:ITCompany

• Each edge is a triple (length = 1).

Multiple Path Expressions

• Separated with commas

• Example:

{Person} ex:worksFor {Company},
{Company} rdf:type {ex:ITCompany}
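To see how comma-separated path expressions are joined through a shared variable, here is a small Python sketch of triple-pattern matching. It is an illustration only, not how Sesame evaluates queries: variables are written with a leading "?" (a convention assumed for this sketch, not SeRQL syntax), and the data in `TRIPLES` is made up.

```python
def is_var(term: str) -> bool:
    # Sketch convention: variables start with '?'.
    return term.startswith("?")


def match_triple(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # A variable must keep the value it was bound to earlier.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def match_path(patterns, triples):
    """Return every variable binding that satisfies all patterns."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for b in bindings
            for t in triples
            if (extended := match_triple(pattern, t, b)) is not None
        ]
    return bindings


# Made-up example data.
TRIPLES = [
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "ex:worksFor", "ex:bakery"),
    ("ex:acme", "rdf:type", "ex:ITCompany"),
]

# {Person} ex:worksFor {Company}, {Company} rdf:type {ex:ITCompany}
PATH = [
    ("?Person", "ex:worksFor", "?Company"),
    ("?Company", "rdf:type", "ex:ITCompany"),
]

result = match_path(PATH, TRIPLES)
```

Only `ex:alice` survives the second pattern, because `?Company` must refer to the same node in both triples.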

Non-interesting nodes

• Can be left empty

• Examples: {Person} ex:worksFor {} rdf:type {ex:ITCompany}

{Painting} ex:painted_by {} ex:name {"Picasso"}

Path expression short cuts

• Multi-valued nodes

• Branches

• Reified statements

Multi-valued nodes

• Multiple objects:

{subj1} pred1 {obj1, obj2, obj3}

• Multiple subjects:

{subj1, subj2, subj3} pred1 {obj1}

• Condition: the values listed in a multi-valued node must be disjoint!
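A multi-valued node is shorthand for one basic path expression per listed value. The Python sketch below spells out that expansion; the function name `expand_multi_valued` is hypothetical, and the disjointness condition is not checked here.

```python
from itertools import product


def expand_multi_valued(subjects, predicate, objects):
    """Expand {s1, s2, ...} pred {o1, o2, ...} into basic triples."""
    return [(s, predicate, o) for s, o in product(subjects, objects)]
```

For example, `expand_multi_valued(["subj1"], "pred1", ["obj1", "obj2", "obj3"])` yields the three triples that `{subj1} pred1 {obj1, obj2, obj3}` abbreviates.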

Branches

{subj1} pred1 {obj1};

pred2 {obj2}

Equivalent to:

{subj1} pred1 {obj1},

{subj1} pred2 {obj2}

Reified statements

{ {reifSubj} reifPred {reifObj} } pred {obj}

• Equivalent to:

{_Statement} rdf:type {rdf:Statement},
{_Statement} rdf:subject {reifSubj},
{_Statement} rdf:predicate {reifPred},
{_Statement} rdf:object {reifObj},
{_Statement} pred {obj}

Optional Path Expressions

{Person} ex:name {Name};

ex:age {Age};

[ex:email {EmailAddress}]
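The square brackets make the email part optional: a person without an email address still appears in the result, with the variable left unbound. A rough Python sketch of that behaviour follows, using made-up data and assuming (for simplicity) at most one value per property.

```python
# Made-up data: property maps per person; ex:p2 has no email address.
PEOPLE = {
    "ex:p1": {"ex:name": "Ann", "ex:age": 31, "ex:email": "ann@example.org"},
    "ex:p2": {"ex:name": "Ben", "ex:age": 45},
    "ex:p3": {"ex:name": "Cat"},  # no age: the mandatory part fails
}


def select_name_age_email(people):
    """Mimic {Person} ex:name {Name}; ex:age {Age}; [ex:email {EmailAddress}]."""
    rows = []
    for props in people.values():
        # Name and age are mandatory parts of the path expression.
        if "ex:name" not in props or "ex:age" not in props:
            continue
        # The optional part contributes a value or leaves it unbound (None).
        rows.append((props["ex:name"], props["ex:age"], props.get("ex:email")))
    return rows
```

Here `Ben` is returned with `None` as email, while `Cat` is dropped because the mandatory `ex:age` is missing.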

Queries in SeRQL

• Select queries:
  – return a table of values, or a set of variable-value bindings.
  – clauses: SELECT, FROM, WHERE, LIMIT, OFFSET and USING NAMESPACE

• Construct queries:
  – return a true RDF graph.
  – clauses: CONSTRUCT, FROM, WHERE, LIMIT, OFFSET and USING NAMESPACE

Select queries

• SELECT C
  FROM {C} rdf:type {rdfs:Class}
  – returns all URIs of classes

• SELECT DISTINCT *
  FROM {Country1} ex:borders {} ex:borders {Country2}
  USING NAMESPACE ex = <http://example.org/things#>

Construct queries

• CONSTRUCT {Parent} ex:hasChild {Child}
  FROM {Child} ex:hasParent {Parent}
  USING NAMESPACE ex = <http://example.org/things#>

• CONSTRUCT *
  FROM {SUB} rdfs:subClassOf {SUPER}
  – This query extracts all rdfs:subClassOf relations from an RDF graph.
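The first construct query is a graph transformation: every ex:hasParent triple in the input produces an inverted ex:hasChild triple in the output. A minimal Python sketch of that idea, with a hypothetical function name and made-up data:

```python
def construct_has_child(triples):
    """Build the graph {Parent} ex:hasChild {Child} from ex:hasParent triples."""
    return {
        (parent, "ex:hasChild", child)
        for child, predicate, parent in triples
        if predicate == "ex:hasParent"
    }


# Made-up input graph.
GRAPH = {
    ("ex:tom", "ex:hasParent", "ex:ann"),
    ("ex:tom", "ex:name", "Tom"),
}

new_graph = construct_has_child(GRAPH)
```

The result is itself a set of triples, which is the sense in which a construct query returns an RDF graph rather than a table.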

WHERE clause

• Optional;

• Specifies Boolean constraints

SELECT Country

FROM {Country} ex:population {Population}

WHERE Population < "1000000"^^xsd:positiveInteger

USING NAMESPACE ex = <http://example.org/things#>

Nested WHERE clauses

• Query 1 (normal WHERE-clause):

  SELECT Name, EmailAddress
  FROM {Person} foaf:name {Name};
                [ex:email {EmailAddress}]
  WHERE EmailAddress LIKE "g*"

• Query 2 (nested WHERE-clause):

  SELECT Name, EmailAddress
  FROM {Person} foaf:name {Name};
                [ex:email {EmailAddress} WHERE EmailAddress LIKE "g*"]

• At most one nested WHERE-clause per optional path expression, and at most one 'normal' WHERE-clause.

Results WHERE queries

• Query 1

  Name        EmailAddress
  Giancarlo   "[email protected]"

• Query 2 (nested WHERE)

  Name        EmailAddress
  Michael
  Rubens
  Giancarlo   "[email protected]"
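The difference between the two result tables can be reproduced with a small Python sketch. The data is hypothetical (the addresses are placeholders, since the originals are not legible in this transcript), and `LIKE "g*"` is approximated with `str.startswith("g")`.

```python
# Hypothetical data: Rubens has no email address at all.
PEOPLE = [
    {"name": "Michael", "email": "michael@example.org"},
    {"name": "Rubens"},
    {"name": "Giancarlo", "email": "giancarlo@example.org"},
]


def query_normal_where(people):
    """Query 1: the constraint applies to the whole row."""
    rows = []
    for person in people:
        email = person.get("email")
        # An unbound or non-matching email removes the person entirely.
        if email is not None and email.startswith("g"):
            rows.append((person["name"], email))
    return rows


def query_nested_where(people):
    """Query 2: the constraint applies only to the optional part."""
    rows = []
    for person in people:
        email = person.get("email")
        # A failing constraint only discards the optional match, not the row.
        if email is None or not email.startswith("g"):
            email = None
        rows.append((person["name"], email))
    return rows
```

Query 1 keeps only Giancarlo, while Query 2 keeps all three names and leaves the email unbound for Michael and Rubens.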

LIKE operator

SELECT Country

FROM {Country} ex:name {Name}

WHERE Name LIKE "netherlands" IGNORE CASE

USING NAMESPACE ex = <http://example.org/things#>
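As a rough model of the LIKE operator, the Python sketch below treats "*" in the pattern as a wildcard for any run of characters (as in the earlier `LIKE "g*"` example) and handles IGNORE CASE with a flag. The function `serql_like` is hypothetical and assumes "*" is the only special character.

```python
import re


def serql_like(value: str, pattern: str, ignore_case: bool = False) -> bool:
    """Approximate SeRQL's LIKE: '*' matches zero or more characters."""
    # Escape the literal parts and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    flags = re.DOTALL | (re.IGNORECASE if ignore_case else 0)
    # The whole value must match, not just a substring.
    return re.fullmatch(regex, value, flags) is not None
```

Under this model `serql_like("Netherlands", "netherlands", ignore_case=True)` holds, whereas the same comparison without the flag fails on the capital "N".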

Built-in predicates

• {X} serql:directSubClassOf {Y}

• {X} serql:directSubPropertyOf {Y}

• {X} serql:directType {Y}

Set combinatory operations

• Union

• Intersect

• Minus

Union

SELECT title

FROM {book} dc10:title {title}

UNION

SELECT title

FROM {book} dc11:title {title}

USING NAMESPACE

dc10 = <http://purl.org/dc/elements/1.0/>,

dc11 = <http://purl.org/dc/elements/1.1/>

Intersect

SELECT creator

FROM {album} dc10:creator {creator}

INTERSECT

SELECT creator

FROM {album} dc11:creator {creator}

USING NAMESPACE

dc10 = <http://purl.org/dc/elements/1.0/>,

dc11 = <http://purl.org/dc/elements/1.1/>

Minus (difference)

SELECT title

FROM {album} dc10:title {title}

MINUS

SELECT title

FROM {album} dc10:title {title}; dc10:creator {creator}

WHERE creator like "Paul"

USING NAMESPACE

dc10 = <http://purl.org/dc/elements/1.0/>,

dc11 = <http://purl.org/dc/elements/1.1/>
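The three combinatory operations behave like the usual set operations on query results. A short Python sketch with made-up result sets (each standing for the rows returned by one sub-query) follows; modelling the results as sets, so that duplicates disappear, is an assumption of the sketch.

```python
# Made-up results of two sub-queries.
dc10_titles = {"Title A", "Title B"}
dc11_titles = {"Title B", "Title C"}

# UNION: rows returned by either sub-query.
union_result = dc10_titles | dc11_titles

# INTERSECT: rows returned by both sub-queries.
intersect_result = dc10_titles & dc11_titles

# MINUS: rows of the first sub-query that the second does not return.
minus_result = dc10_titles - dc11_titles
```

As in the SeRQL examples, the operands must be comparable: both sub-queries select the same kind of value.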

NULL values

SELECT *
FROM {X} Y {Z}
WHERE isLiteral(Z) AND datatype(Z) = NULL

– checks that a literal doesn't have a datatype

Query Nesting

• IN

• ANY, ALL

• EXISTS

IN

SELECT name
FROM {} rdf:type {ex:Person};
        ex:name {name}
WHERE name IN ( SELECT n
                FROM {} rdf:type {ex:Author};
                        ex:name {n} )
USING NAMESPACE ex = <http://example.org/things#>

• Retrieves all names of Persons, but only those names that also appear as names of Authors.

ANY, ALL

SELECT highestValue
FROM {node} ex:value {highestValue}
WHERE highestValue >= ALL ( SELECT value
                            FROM {} ex:value {value} )
USING NAMESPACE ex = <http://example.org/things#>
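The quantifiers map directly onto Python's `all` and `any`. In the sketch below, `values` stands for the result of the nested query; the function names are hypothetical.

```python
def select_ge_all(values):
    """Keep v where v >= ALL (values): only the maximum qualifies."""
    return [v for v in values if all(v >= other for other in values)]


def select_ge_any(values):
    """Keep v where v >= ANY (values): one smaller-or-equal value suffices."""
    return [v for v in values if any(v >= other for other in values)]
```

With `>= ALL` the query returns the highest value, whereas `>= ANY` is satisfied by every value, since each one is at least equal to itself.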

EXISTS

SELECT name, hobby
FROM {} rdf:type {ex:Person};
        ex:name {name};
        ex:hobby {hobby}
WHERE EXISTS ( SELECT n
               FROM {} rdf:type {ex:Author};
                       ex:name {n};
                       ex:authorOf {}
               WHERE n = name )
USING NAMESPACE ex = <http://example.org/things#>

RDF Query Languages Conclusion

• We have learned:
  – There is strong competition to provide the definitive RDF query language.
  – No standards as yet.
  – We have looked in more detail at one of them, SeRQL, as it is an implementers' language paired with an existing RDF repository tool, Sesame.
  – Many features in SeRQL remind us of SQL, so the learning threshold should be low.

• Next:
  – OWL