The SPARQL Query Graph Model for Query Optimization Olaf Hartig and Ralf Heese

Upload: olaf-hartig

Post on 11-May-2015

DESCRIPTION

The slides with which I presented our paper at the 2007 European Semantic Web Conference (ESWC) in Innsbruck, Austria.

TRANSCRIPT

Page 1: The SPARQL Query Graph Model for Query Optimization

The SPARQL Query Graph Model for Query Optimization

Olaf Hartig and Ralf Heese

Page 2: The SPARQL Query Graph Model for Query Optimization

Olaf Hartig and Ralf Heese - The SPARQL Query Graph Model for Query Optimization 2

Postings on the Jena Mailing List

Question: http://groups.yahoo.com/group/jena-dev/message/21436

Date: Mar 8, 2006

My queries run very slowly

Simple queries on a database of 10 000 trees describing families

A series of SPARQL queries of the form:

... WHERE { { ?family ex:dad ?d . ?d ex:name "Peter" . } { ?family ex:mom ?m . ?m ex:name "Robin" . } ...

Answer:

... Put the more specific part of the query first; it makes a significant difference. ...

Reply:

... My time went from 33 000 ms to 150 ms ...

Page 3: The SPARQL Query Graph Model for Query Optimization

One Query, many ways to execute

{ ?family ex:dad ?d . ?d ex:name "Peter" . }
{ ?family ex:mom ?m . ?m ex:name "Robin" . }
{ ?family ex:pet ?p . ?p ex:name "Toller" . }

{ ?family ex:pet ?p . ?p ex:name "Toller" . }
{ ?family ex:dad ?d . ?d ex:name "Peter" . }
{ ?family ex:mom ?m . ?m ex:name "Robin" . }

{ ?family ex:mom ?m . ?m ex:name "Robin" . }
{ ?family ex:dad ?d . ?d ex:name "Peter" . }
{ ?family ex:pet ?p . ?p ex:name "Toller" . }
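
Why the order matters can be shown in miniature. The following Python sketch is an illustration only and not how Jena evaluates queries: it assumes a naive left-to-right, nested-loop evaluation over a small synthetic dataset, and the helper names (`match`, `evaluate`) and all data are made up for this example. It counts the intermediate bindings produced by two orderings of the same two triple patterns.

```python
# Toy triple store (assumed data): 1000 families, only one dad is named "Peter".
triples = []
for i in range(1000):
    triples.append((f"family{i}", "ex:dad", f"dad{i}"))
    triples.append((f"dad{i}", "ex:name", "Peter" if i == 0 else f"Name{i}"))


def is_var(term):
    return term.startswith("?")


def match(pattern, binding):
    """Yield every extension of `binding` that makes `pattern` match a triple."""
    for triple in triples:
        new = dict(binding)
        ok = True
        for p, t in zip(pattern, triple):
            if is_var(p):
                # Bind the variable, or check an existing binding for agreement.
                if new.setdefault(p, t) != t:
                    ok = False
                    break
            elif p != t:
                ok = False
                break
        if ok:
            yield new


def evaluate(patterns):
    """Evaluate patterns left to right; return (solutions, bindings produced)."""
    bindings = [{}]
    produced = 0
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match(pattern, b)]
        produced += len(bindings)
    return bindings, produced


unspecific_first = [("?family", "ex:dad", "?d"), ("?d", "ex:name", "Peter")]
specific_first = [("?d", "ex:name", "Peter"), ("?family", "ex:dad", "?d")]

slow_result, slow_count = evaluate(unspecific_first)
fast_result, fast_count = evaluate(specific_first)
print(slow_count, fast_count)
```

Both orderings return the same solution, but the ordering that starts with the specific pattern produces far fewer intermediate bindings in this toy setting.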

Page 5: The SPARQL Query Graph Model for Query Optimization

Outline

Query processing in databases

SPARQL query graph model (SQGM)

Rewriting SQGMs

Evaluation

Conclusion

Page 7: The SPARQL Query Graph Model for Query Optimization

Tasks of the Query Engine

Query Parsing ➔ Query Rewriting ➔ QEP Generation ➔ QEP Execution

QEP: Query Execution Plan

SPARQL Query Graph Model: internal representation of the query

Page 9: The SPARQL Query Graph Model for Query Optimization

Advantages of the SPARQL Query Graph Model

Supports all phases of query processing

Extensible to new concepts of the query language

Stores additional information needed for query processing

Adaptable to changes of the query language

Page 10: The SPARQL Query Graph Model for Query Optimization

Basic Structures

Operators

Process data (sets of variable bindings, RDF graphs)

Head: provided variables

Body: operator details

Dataflows

Connect the output of one operator to the input of another

Operators and dataflows form a directed acyclic graph
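
These structures can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the authors' implementation: the class and function names (`Operator`, `Dataflow`, `connect`, `is_acyclic`) are hypothetical, and the head and body are kept as plain tuples and strings.

```python
from dataclasses import dataclass, field


@dataclass
class Operator:
    name: str                                     # e.g. "GPO", "JOIN", "SELECT"
    head: tuple                                   # variables the operator provides
    body: str = ""                                # operator details
    inputs: list = field(default_factory=list)    # incoming dataflows


@dataclass
class Dataflow:
    source: Operator          # operator whose output is consumed
    variables: tuple          # variables carried by the dataflow
    optional: bool = False


def connect(source, target, variables, optional=False):
    """Add a dataflow from the output of `source` to the input of `target`."""
    flow = Dataflow(source, tuple(variables), optional)
    target.inputs.append(flow)
    return flow


def is_acyclic(root):
    """Depth-first check that no operator feeds, directly or not, into itself."""
    visiting, done = set(), set()

    def visit(op):
        if id(op) in done:
            return True
        if id(op) in visiting:
            return False          # back edge: a cycle
        visiting.add(id(op))
        ok = all(visit(flow.source) for flow in op.inputs)
        visiting.discard(id(op))
        done.add(id(op))
        return ok

    return visit(root)


# Tiny example: one graph pattern operator feeding a select operator.
gpo = Operator("GPO", ("?s", "?n"), "?s ub:name ?n")
select = Operator("SELECT", ("?n",), "?n")
connect(gpo, select, ["?n"])
acyclic_before = is_acyclic(select)

# Feeding the select operator back into the GPO would break the DAG property.
connect(select, gpo, ["?n"])
acyclic_after = is_acyclic(select)
```

Storing dataflows on the consuming operator keeps a traversal from the result operator down to the graph access operators simple; other layouts would work equally well.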

Page 11: The SPARQL Query Graph Model for Query Optimization

Constructing an SQGM

SELECT ?n ?c
FROM <http://example.org/university.rdf>
WHERE { ?s rdf:type ub:GraduateStudent . OPTIONAL { ?s ub:takesCourse ?c } ?s ub:name ?n . }

Graph access operator: http://example.org/university.rdf

Graph pattern operators:
?s rdf:type ub:GraduateStudent (head ?s)
?s ub:takesCourse ?c (head ?s ?c)
?s ub:name ?n (head ?s ?n)

Join operators:
JOIN (head ?s ?c): inputs ?s and ?s,?c (optional)
JOIN (head ?s ?c ?n): inputs ?s,?c and ?s,?n

Select result operator: SELECT (head ?c ?n), input ?n,?c
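
How the joins in this SQGM combine sets of variable bindings can be sketched as follows. The sketch assumes the usual reading of a join over compatible bindings and of an optional input as a left join; the function names and the three input binding sets are invented for illustration and do not come from the slides.

```python
def compatible(a, b):
    """Two bindings are compatible if they agree on all shared variables."""
    return all(a[v] == b[v] for v in a.keys() & b.keys())


def join(left, right):
    return [{**a, **b} for a in left for b in right if compatible(a, b)]


def left_join(left, right):
    """Join with an optional right input: unmatched left bindings are kept."""
    result = []
    for a in left:
        matches = [{**a, **b} for b in right if compatible(a, b)]
        result.extend(matches if matches else [a])
    return result


def select(bindings, variables):
    return [{v: b.get(v) for v in variables} for b in bindings]


# Assumed outputs of the three graph pattern operators.
students = [{"?s": "ex:alice"}, {"?s": "ex:bob"}]             # ?s rdf:type ub:GraduateStudent
courses = [{"?s": "ex:alice", "?c": "ex:db"}]                 # ?s ub:takesCourse ?c
names = [{"?s": "ex:alice", "?n": "Alice"},
         {"?s": "ex:bob", "?n": "Bob"}]                       # ?s ub:name ?n

# First JOIN has the optional input, second JOIN adds the names.
first = left_join(students, courses)
second = join(first, names)
result = select(second, ["?n", "?c"])
print(result)
```

Bob takes no course in the assumed data, so his solution carries no value for ?c, which is what the optional dataflow is meant to allow.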

Page 12: The SPARQL Query Graph Model for Query Optimization

Operator Types

Graph selection operators

Graph merge operators

Union operators

Solution modifier operators

Construct result operators

...

Page 14: The SPARQL Query Graph Model for Query Optimization

Query Rewriting

Goals:

Faster evaluation of a query

Provide more options for the generation of query plans, e.g.:

Data access strategy

Join order

Selection of indexes

Page 15: The SPARQL Query Graph Model for Query Optimization

Transformation Rules

Currently 26 transformation rules, e.g.:

MergeJoinedGPOs

SwitchJoinedJoinRightInputs

FindContradiction

Figure: a JOIN (head ?s ?n ?c) of the graph pattern operators "?s ub:takesCourse ?c" (head ?s ?c) and "?s ub:name ?n, regex(?n, "^S")" (head ?s ?n) is replaced by a single graph pattern operator "?s ub:takesCourse ?c, ?s ub:name ?n, regex(?n, "^S")" (head ?s ?n ?c)
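
A rule in the spirit of MergeJoinedGPOs can be sketched as a small tree rewrite. The slides do not spell out the rule's exact preconditions, so this Python sketch assumes the simplest reading: a join without an optional input whose two inputs are both graph pattern operators is replaced by one merged operator. The `GPO` and `Join` classes are hypothetical stand-ins for SQGM operators.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GPO:
    patterns: tuple   # triple patterns and filter expressions, as strings
    head: tuple       # provided variables


@dataclass(frozen=True)
class Join:
    left: object
    right: object
    optional: bool = False   # True if the right input is an optional dataflow


def merge_joined_gpos(node):
    """Recursively replace non-optional joins of two GPOs by one merged GPO."""
    if not isinstance(node, Join):
        return node
    left = merge_joined_gpos(node.left)
    right = merge_joined_gpos(node.right)
    if not node.optional and isinstance(left, GPO) and isinstance(right, GPO):
        # The merged head provides the variables of both inputs, without duplicates.
        head = left.head + tuple(v for v in right.head if v not in left.head)
        return GPO(left.patterns + right.patterns, head)
    return Join(left, right, node.optional)


courses = GPO(("?s ub:takesCourse ?c",), ("?s", "?c"))
names = GPO(("?s ub:name ?n", 'regex(?n, "^S")'), ("?s", "?n"))

merged = merge_joined_gpos(Join(courses, names))
kept = merge_joined_gpos(Join(courses, names, optional=True))
```

With these inputs `merged` is a single operator providing ?s, ?c and ?n, while the join with an optional input is left untouched, since merging it would turn optional patterns into mandatory ones.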

Page 16: The SPARQL Query Graph Model for Query Optimization

Heuristic: Merge Graph Pattern Operators

Heuristic: rewrite strategy based on a set of rules

Figure: SQGM of the example query, as constructed on page 11

Graph pattern operators cannot be merged

Page 17: The SPARQL Query Graph Model for Query Optimization

Heuristic: Merge Graph Pattern Operators

But the graph pattern operators "?s rdf:type ub:GraduateStudent" and "?s ub:name ?n" could be merged if they were operands of the same join operation.

➔ Apply transformation rules to restructure the SQGM (SwitchJoinedJoinRightInputs)

Page 18: The SPARQL Query Graph Model for Query Optimization

Heuristic: Merge Graph Pattern Operators

Figure: SQGM after applying SwitchJoinedJoinRightInputs

JOIN (head ?s ?n): inputs ?s from "?s rdf:type ub:GraduateStudent" and ?s,?n from "?s ub:name ?n"

JOIN (head ?s ?c ?n): inputs ?s,?n from the first JOIN and ?s,?c (optional) from "?s ub:takesCourse ?c"

SELECT (head ?c ?n): input ?n,?c

Page 19: The SPARQL Query Graph Model for Query Optimization

Heuristic: Merge Graph Pattern Operators

Apply transformation rule to merge (MergeJoinedGPOs)

Page 20: The SPARQL Query Graph Model for Query Optimization

Heuristic: Merge Graph Pattern Operators

Figure: SQGM after applying MergeJoinedGPOs

Graph pattern operator "?s rdf:type ub:GraduateStudent, ?s ub:name ?n" (head ?s ?n)

JOIN (head ?s ?c ?n): inputs ?s,?n from the merged operator and ?s,?c (optional) from "?s ub:takesCourse ?c"

SELECT (head ?c ?n): input ?n,?c

Page 21: The SPARQL Query Graph Model for Query Optimization

Evaluation Results

Measured query execution time of a selected query:

SELECT ?n ?c
FROM <http://example.org/university.rdf>
WHERE { ?s rdf:type ub:GraduateStudent . OPTIONAL { ?s ub:takesCourse ?c } ?s ub:name ?n . }

Dataset            original query   rewritten query
UnivBench(1.0)     5.8 s            2.5 s
UnivBench(5.0)     39.4 s           16.4 s
UnivBench(10.0)    77.9 s           32.3 s

Factor ≈ 2.4
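
The stated factor can be checked directly from the measured times. The short Python calculation below uses only the six numbers reported for this query; the variable names are arbitrary.

```python
# Measured execution times in seconds: (original query, rewritten query).
times = {
    "UnivBench(1.0)": (5.8, 2.5),
    "UnivBench(5.0)": (39.4, 16.4),
    "UnivBench(10.0)": (77.9, 32.3),
}

# Speedup factor per dataset: original time divided by rewritten time.
factors = {name: original / rewritten
           for name, (original, rewritten) in times.items()}

for name, factor in factors.items():
    print(f"{name}: factor {factor:.2f}")
```

The three factors lie between roughly 2.3 and 2.4, so the speedup stays nearly constant as the dataset grows.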

Page 22: The SPARQL Query Graph Model for Query Optimization

Evaluation Results ctd.

Time for transformation between models: < 1 ms

Query with contradiction: nearly 100% savings

Approx. time savings:

Average: 45%

Best cases: 95%

Page 23: The SPARQL Query Graph Model for Query Optimization

Explanation of the Results

Reason: fast path algorithm of Jena

Performs pattern matching within the underlying relational database

Combined match of multiple basic graph patterns possible

Original query: multiple SQL queries, one for every basic graph pattern

... WHERE { ?s rdf:type ub:GraduateStudent . OPTIONAL { ?s ub:takesCourse ?c } ?s ub:name ?n . }

Rewritten query: one SQL query combining the marked basic graph patterns

... WHERE { ?s rdf:type ub:GraduateStudent . ?s ub:name ?n . OPTIONAL { ?s ub:takesCourse ?c } }
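
The difference between the two forms can be made concrete by grouping adjacent triple patterns. The Python sketch below is only an illustration of that grouping and not a model of Jena's fast path: it assumes that a maximal run of adjacent triple patterns forms one basic graph pattern, and it represents a group as a hypothetical list of strings (triple patterns) and `("OPTIONAL", group)` tuples.

```python
def basic_graph_patterns(group):
    """Split a group into maximal runs of adjacent triple patterns."""
    bgps, current = [], []
    for element in group:
        if isinstance(element, str):          # a triple pattern
            current.append(element)
        else:                                 # ("OPTIONAL", nested group)
            if current:
                bgps.append(current)
                current = []
            bgps.extend(basic_graph_patterns(element[1]))
    if current:
        bgps.append(current)
    return bgps


original = [
    "?s rdf:type ub:GraduateStudent",
    ("OPTIONAL", ["?s ub:takesCourse ?c"]),
    "?s ub:name ?n",
]
rewritten = [
    "?s rdf:type ub:GraduateStudent",
    "?s ub:name ?n",
    ("OPTIONAL", ["?s ub:takesCourse ?c"]),
]

original_bgps = basic_graph_patterns(original)
rewritten_bgps = basic_graph_patterns(rewritten)
print(len(original_bgps), len(rewritten_bgps))
```

In the original form the OPTIONAL part separates the two mandatory triple patterns, so they fall into different groups; after the rewrite they are adjacent and can be matched together.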

Page 25: The SPARQL Query Graph Model for Query Optimization

Conclusion and Future Work

SQGM: a query model for SPARQL

Supports all phases of query processing

Easy to extend

Transformation rules and heuristics for SQGMs

Implementation illustrated the potential of SQGMs

Outlook

Develop further heuristics to rewrite SQGMs

Consider other phases of query processing

Integrate index selection into the query optimization

Page 26: The SPARQL Query Graph Model for Query Optimization

The SPARQL Query Graph Model for Query Optimization

Olaf Hartig and Ralf Heese

Thank you!