Beyond the Rainbow: A Pot of Gold ala XML Database Projects


Page 1: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

2003. DSRG, Worcester Polytechnic Institute

Beyond the Rainbow: A Pot of Gold ala XML Database Projects

WPI DSRG GROUP

Page 2: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Motivation

- XML is new, and here to stay …
  - Universal flexible representation of data
  - De facto standard for information exchange
- XQuery is useful, and here to stay …
  - Powerful query language for XML
  - De facto standard for XML querying
- Plentitude of relevant new issues …

Page 3: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

XML Paradigm

[Figure: data sources XML1, XML2, XML3, RDB4, XML5, RDB6, ..., XMLn connected through the Internet]

WWW: global scale distributed information system for sharing data

XML queries and updates:
- searching
- querying
- integrating
- restructuring
- updating

Page 4: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

What We Aim For…

[Figure: data sources XML1, XML2, RDB3, XML4, RDB5, XML6, ..., XMLn connected through the Internet]

XML data management middleware technology:
- efficient
- flexible
- scalable
- lightweight
- resource-sensitive
- adaptive

Page 5: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects


WPI Project Directions

RAINBOW: Exploiting RDB for XML management: Algebraic-XQuery processing

XCube: Flexible XML Mapping Tool: Flexible loading/extracting XML to RDB via XQuery

Updating Virtual XML Views: Update decomposition and trigger-propagation

MASS: Native XML Query Engine: Multi-axis compressed order-preserving XML storage

Page 6: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects


WPI Project Directions

XCache: XML Query Caching: Cache containment and query rewriting

Materialized XML View Maintenance: Incremental algebraic maintenance strategy

SAXE: XML Incremental Updating & Evolution: Lightweight updating by update query rewriting

RAINDROP: XQuery-based Stream Processing: Adaptive on-fly multi-subscription optimization

Page 7: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects


THE RAINBOW PROJECT

Page 8: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

XML meets Relational DBs

XML
1) Emerging web standard
2) Flexible data representation
3) Powerful query language

+

Relational Database
1) Widely used to store business data
2) Efficient, reliable, secure DBMS
3) Mature query processing techniques

The look and feel of an XML query system with the maturity and technology support of an RDB

Page 9: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Running Example

Relational tables:

book
Bid | Title
001 | TCP/IP Illustrated
002 | Data on the Web

prices
Bid | Price
001 | 65.95
002 | 34.95

Default XML view of the database (dxv.xml):

<dxv>
  <book>
    <row><bid>001</bid><title>TCP/IP Illustrated</title></row>
    <row><bid>002</bid><title>Data on the Web</title></row>
  </book>
  <prices>
    <row><bid>001</bid><price>65.95</price></row>
    <row><bid>002</bid><price>34.95</price></row>
  </prices>
</dxv>

View query over dxv.xml:

<prices>
FOR $book IN document("dxv.xml")/book/row,
    $prices IN document("dxv.xml")/prices/row
WHERE $book/bid = $prices/bid
RETURN
  <book>
    $book/title,
    $prices/price
  </book>
</prices>

Virtual view (prices.xml):

<prices>
  <book><title>TCP/IP Illustrated</title><price>65.95</price></book>
  <book><title>Data on the Web</title><price>34.95</price></book>
</prices>

User query over prices.xml:

<result>
FOR $t IN document("prices.xml")/book/title
RETURN $t
</result>

Result:

<results>
  <title>TCP/IP Illustrated</title>
  <title>Data on the Web</title>
</results>
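The running example can be traced by hand with a small script. The following is only an illustrative sketch in Python, not part of Rainbow: it evaluates the view query as a nested-loop join over the two tables and then applies the user query to the resulting view. The function names build_view and user_query are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Rows of the two relational tables from the running example.
BOOK = [("001", "TCP/IP Illustrated"), ("002", "Data on the Web")]
PRICES = [("001", "65.95"), ("002", "34.95")]


def build_view(book_rows, price_rows):
    """Mimic the view query: join on bid, wrap each match in <book>."""
    view = ET.Element("prices")
    for bid, title in book_rows:
        for price_bid, price in price_rows:
            if bid == price_bid:
                entry = ET.SubElement(view, "book")
                ET.SubElement(entry, "title").text = title
                ET.SubElement(entry, "price").text = price
    return view


def user_query(view):
    """Mimic the user query: collect every book/title of the view."""
    results = ET.Element("results")
    for title in view.findall("./book/title"):
        ET.SubElement(results, "title").text = title.text
    return results


if __name__ == "__main__":
    answer = user_query(build_view(BOOK, PRICES))
    print(ET.tostring(answer, encoding="unicode"))
```

Note that this naive evaluation materializes the whole view, including the join and the price column, although the user query only needs titles. Avoiding that wasted work is what the optimization slides address.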

Page 10: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

XML Default View

Fixed and straightforward mapping scheme.

Relational table Books:

Cover     | Title         | Authors
Paperback | Texas Holdem' | David Sklansky, Straight Flush
Paperback | Dracula       | Bram Stoker

XML Default View:

<DB>
  <BOOKS>
    <ROW>
      <Cover>Paperback</Cover>
      <TITLE>Texas Holdem'</TITLE>
      <AUTHORS>David Sklansky, Straight Flush</AUTHORS>
    </ROW>
    <ROW>
      <Cover>Paperback</Cover>
      <TITLE>Dracula</TITLE>
      <AUTHORS>Bram Stoker</AUTHORS>
    </ROW>
  </BOOKS>
  <…> …
</DB>
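The fixed mapping can be sketched in a few lines. This is an assumption-laden illustration in Python, not the Rainbow implementation: one element per database, one per table, one ROW per tuple, and one child per column. The function name default_view and the dictionary layout of its input are hypothetical.

```python
import xml.etree.ElementTree as ET


def default_view(db_name, tables):
    """Build a default XML view.

    tables maps a table name to (column_names, rows); this input shape
    is an assumption made for the sketch.
    """
    root = ET.Element(db_name)
    for table_name, (columns, rows) in tables.items():
        table_el = ET.SubElement(root, table_name)
        for row in rows:
            row_el = ET.SubElement(table_el, "ROW")
            for column, value in zip(columns, row):
                ET.SubElement(row_el, column).text = str(value)
    return root


if __name__ == "__main__":
    tables = {
        "BOOKS": (
            ["Cover", "TITLE", "AUTHORS"],
            [("Paperback", "Dracula", "Bram Stoker")],
        )
    }
    print(ET.tostring(default_view("DB", tables), encoding="unicode"))
```

Because the mapping never looks at the data values, the same function works for any table, which is what makes the default view "fixed".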

Page 11: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Generic Loading

FUNCTION Q1($root)
{
  LET $maintag := gettag($root)
  RETURN
    <$maintag $root/@*>
      FOR $actual IN $root/*
      LET $innertag := gettag($actual)
      RETURN
        IF ($actual/element())
        THEN
          Q1($actual)
        ELSE
          <$innertag $actual/@*>
            IF ($actual/text())
            THEN
              <PCDATA value=$actual/text()/>
            ELSE
              ""
          </$innertag>
    </$maintag>
}

Knowledge of the schema of the XML document to be loaded helps to remove unnecessary parts.
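The recursion in Q1 can be mirrored outside XQuery. The sketch below is a Python rendering under the assumption that Q1 copies tags and attributes, recurses into children that contain elements, and wraps leaf text in a PCDATA element; generic_load is a hypothetical name and this is not the XCube code.

```python
import xml.etree.ElementTree as ET


def generic_load(root):
    """Schema-independent recursive copy, following the shape of Q1."""
    out = ET.Element(root.tag, dict(root.attrib))
    for actual in root:
        if len(actual) > 0:
            # The child has element content: recurse, like Q1($actual).
            out.append(generic_load(actual))
        else:
            inner = ET.SubElement(out, actual.tag, dict(actual.attrib))
            text = (actual.text or "").strip()
            if text:
                # Leaf text becomes a PCDATA element with a value attribute.
                ET.SubElement(inner, "PCDATA", {"value": text})
    return out


if __name__ == "__main__":
    doc = ET.fromstring(
        '<BOOKLIST><BOOK cover="Paperback">'
        "<TITLE>Dracula</TITLE></BOOK></BOOKLIST>"
    )
    print(ET.tostring(generic_load(doc), encoding="unicode"))
```

Every element costs one tag lookup and one test for element content, whether or not the schema would already answer those questions. That overhead is what schema knowledge lets the loader skip.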

Page 12: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Instantiation

[Figure: the Instantiator takes the XML Schema and the recursive XQuery expression and produces a flat XQuery expression]

The generic loading XQuery expression is recursive.
+ It works for every XML document.
- Many recursive calls return no value.
- Unnecessary FOR-loops, IF-clauses, and getName() function calls.

Page 13: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Instantiation (Example)

Instantiated Loading Query (first step of the CLOCK mapping scheme)

FUNCTION Q1($root)
{
  <BOOKLIST>
    FOR $book IN $root/BOOK
    RETURN
      <BOOK $book/@cover>
        <TITLE>
          <PCDATA value=$book/TITLE/text()/>
        </TITLE>
        <AUTHOR>
          FOR $name IN $book/AUTHOR/NAME
          RETURN
            <NAME>
              <PCDATA value=$book/A…/>
            </NAME>
        </AUTHOR>
      </BOOK>
  </BOOKLIST>
}

Short, non-recursive, more efficient … But: XML schema dependent!

Page 14: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Flexible Mapping Management

[Figure: mapping diagram. Components: RDB, Default View, Reverser, XQuery (Load), XQuery (Extract). Data: XML, XML', Relation, Relation'. Mapping labels: F, f, G, g, H. Step labels: 1, 2.]

Page 15: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

XCube in a Nutshell

- Easy to use (no new transformation language).
- Flexible (interchangeable XQuery expressions).
- Adaptable (to workload, data specifics, …).
- General (schema independent).
- Extendable (with new mapping schemes).
- Tunable (loading manager).

1. Generic XQuery loading expressions
2. XQuery load expression instantiation

Page 16: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Architecture

[Figure: Rainbow architecture. Components: XAT Generator, XAT Decorrelator, XAT Merger, XAT Optimizer, SQL Generator, XAT Executor, RDBMS. Labels on the data flow: User XQuery, View XQuery, User XAT, View XAT, XAT, SQL, Tuples, Virtual XML Document, XML Document, User Query Results in XML.]

XAT: XML Algebra Tree

Page 17: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

XQuery-Level Optimization

- XAT: XML Algebra Tree model
- XAT algebraic query plan optimization
- XAT query plan reduction

Page 18: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

User Query

<result>
FOR $t IN document("prices.xml")/book/title
RETURN $t
</result>

User XML Algebra Tree (XAT)

[Figure: XAT operators by node number; operator, operands -> output column]
1: [root] col3
2: T <results>$t</result> -> col3
3: Agg
6: Navigate R0, book/title -> $t
7: S "prices.xml" -> R0

Page 19: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

View Query

<prices>
FOR $book IN document("dxv.xml")/book/row,
    $prices IN document("dxv.xml")/prices/row
WHERE $book/bid = $prices/bid
RETURN
  <book>
    $book/title,
    $prices/price
  </book>
</prices>

View XML Algebra Tree (XAT)

[Figure: XAT operators by node number; operator, operands -> output column]
11: T <prices>col5</prices> -> col4
12: Agg
22: T <book>[col10][col12]</book> -> col5
23: Navigate $book, title -> col10
25: Navigate $prices, price -> col12
26: [condition] col6=col7
27: Navigate $book, bid -> col6
28: Navigate $prices, bid -> col7
31: [operator symbol not preserved]
14: Navigate R1, /book/row -> $book
15: S "dxv.xml" -> R1
20: Navigate R3, /prices/row -> $prices
21: S "dxv.xml" -> R3

Page 20: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Merged XML Algebra Tree (XAT)

[Figure: the user XAT placed on top of the view XAT; operator, operands -> output column]

User Query part:
1: [root] col3
2: T <results>$t</result> -> col3
3: Agg
6: Navigate R0, book/title -> $t
7: [operator symbol not preserved] col4 -> R0

View Query part:
11: T <prices>col5</prices> -> col4
12: Agg
22: T <book>[col10][col12]</book> -> col5
23: Navigate $book, title -> col10
25: Navigate $prices, price -> col12
26: [condition] col6=col7
27: Navigate $book, bid -> col6
28: Navigate $prices, bid -> col7
31: [operator symbol not preserved]
14: Navigate R1, /book/row -> $book
15: S "dxv.xml" -> R1
20: Navigate R3, /prices/row -> $prices
21: S "dxv.xml" -> R3

Page 21: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

XQuery-Level Optimization

- XML algebra representation: XAT
- XAT query plan rewriting
- XAT query plan reduction

Page 22: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

XAT Rewrite

Query optimization at the logical algebra level.

Goals:
- Redundancy elimination.
- Computation pushdown.

Technique: equivalence rewrite rules.

Heuristics:
- Push down Navigates.
- Remove construction of intermediate results.
- Combine multiple operators.

Page 23: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Before Navigation Pushdown

[Figure: the merged XAT, unchanged. User Query nodes 1, 2, 3, 6, 7 sit above View Query nodes 11, 12, 22, 23, 25, 26, 27, 28, 31, 14, 15, 20, 21. Node 6 (Navigate R0, book/title -> $t) still reads the output of the view's top Tagger, node 11 (T <prices>col5</prices> -> col4).]

Page 24: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

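After Navigation Pushdown

[Figure: XAT operators by node number; operator, operands -> output column]

User Query part:
1: [root] col3
2: T <results>$t</result> -> col3
3: Agg
6: Navigate R0, book/title -> $t

View Query part:
11: T <prices>col5</prices> -> R0
12: Agg
22: T <book>[col10][col12]</book> -> col5
26: [condition] col6=col7
31: [operator symbol not preserved]
27: Navigate $book, bid -> col6
14: Navigate R1, /book/row -> $book
15: S "dxv.xml" -> R1
23: Navigate $book, title -> col10
28: Navigate $prices, bid -> col7
20: Navigate R3, /prices/row -> $prices
21: S "dxv.xml" -> R3
25: Navigate $prices, price -> col12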

Page 25: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Remove any Taggers?

[Figure: the XAT after navigation pushdown, repeated. Tagger nodes present: 2 (T <results>$t</result> -> col3), 11 (T <prices>col5</prices> -> R0), 22 (T <book>[col10][col12]</book> -> col5). Node 6 navigates book/title over R0, the output of Tagger 11.]

Page 26: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

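After Tagger Cancel Out

[Figure: XAT operators by node number; operator, operands -> output column]

User Query part:
1: [root] col3
2: T <results>$t</result> -> col3
3: Agg

View Query part:
26: [condition] col6=col7
31: [operator symbol not preserved]
27: Navigate $book, bid -> col6
14: Navigate R1, /book/row -> $book
15: S "dxv.xml" -> R1
23: Navigate $book, title -> $t
28: Navigate $prices, bid -> col7
20: Navigate R3, /prices/row -> $prices
21: S "dxv.xml" -> R3
25: Navigate $prices, price -> col12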

Page 27: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

After Making Join

[Figure: XAT operators by node number; operator, operands -> output column]

User Query part:
1: [root] col3
2: T <results>$t</result> -> col3
3: Agg

View Query part:
31: JOIN col6=col7
27: Navigate $book, bid -> col6
14: Navigate R1, /book/row -> $book
15: S "dxv.xml" -> R1
23: Navigate $book, title -> $t
28: Navigate $prices, bid -> col7
20: Navigate R3, /prices/row -> $prices
21: S "dxv.xml" -> R3
25: Navigate $prices, price -> col12

Page 28: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

XQuery-Level Optimization

- XML algebra representation: XAT
- XAT query plan rewriting
- XAT query plan reduction

Page 29: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

XAT Cleanup

Why: the SQL engine cannot reduce redundancy in XQuery.

How:
- Data redundancy: removed by schema cleanup.
  - Each operator produces, consumes, and modifies some columns.
  - The minimum schema is then computed.
- Tree redundancy: removed by unused operator cutting.
  - Cutting matrix generation.
  - Required columns analysis.
  - Operator cutting.

Page 30: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

XAT Operator Properties

- Produced: new columns generated by the operator. Example: [?], S, T
- Consumed: columns required by the operator. Example: [?], [?]
- Modified: columns modified by the operator. Example: [?], [?], [?]

Page 31: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Schema Computation

Node | Parent | Produced  | Consumed     | Old Schema
1    |        | {}        | {col3}       | {col3}
2    | 1      | {col3}    | {$t}         | {col3, R1, $book, col6, $t, R3, $prices, col7, col12}
3    | 2      | {}        | {}           | {R1, $book, col6, $t, R3, $prices, col7, col12}
31   | 3      | {}        | {col6, col7} | {R1, $book, col6, $t, R3, $prices, col7, col12}
23   | 31     | {$t}      | {$book}      | {R1, $book, col6, $t}
27   | 23     | {col6}    | {$book}      | {R1, $book, col6}
14   | 27     | {$book}   | {R1}         | {R1, $book}
15   | 14     | {R1}      | {}           | {R1}
25   | 31     | {col12}   | {$prices}    | {R3, $prices, col7, col12}
28   | 25     | {col7}    | {$prices}    | {R3, $prices, col7}
20   | 28     | {$prices} | {R3}         | {R3, $prices}
21   | 20     | {R3}      | {}           | {R3}

[Figure: the XAT being analyzed; operator, operands -> output column]
1: [root] col3
2: T <results>$t</result> -> col3
3: Agg
31: col6=col7
23: Navigate $book, title -> $t
27: Navigate $book, bid -> col6
14: Navigate R1, /book/row -> $book
15: S "dxv.xml" -> R1
25: Navigate $prices, price -> col12
28: Navigate $prices, bid -> col7
20: Navigate R3, /prices/row -> $prices
21: S "dxv.xml" -> R3

Page 32: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Schema Computation

P = produced, C = consumed.

#   | Parent | col3 | $t | col6 | col7 | $book | R1 | col12 | $prices | R3 | New Schema
1   |        | C    |    |      |      |       |    |       |         |    | {col3}
2   | 1      | P    | C  |      |      |       |    |       |         |    | {col3}
3   | 2      |      |    |      |      |       |    |       |         |    | {$t}
31* | 3      |      |    | C    | C    |       |    |       |         |    | {$t}
23  | 31     |      | P  |      |      | C     |    |       |         |    | {col6, $t}
27  | 23     |      |    | P    |      | C     |    |       |         |    | {$book, col6}
14  | 27     |      |    |      |      | P     | C  |       |         |    | {$book}
15  | 14     |      |    |      |      |       | P  |       |         |    | {R1}
25  | 31     |      |    |      |      |       |    | P     | C       |    | {col7, col12}
28  | 25     |      |    |      | P    |       |    |       | C       |    | {$prices, col7}
20  | 28     |      |    |      |      |       |    |       | P       | C  | {$prices}
21  | 20     |      |    |      |      |       |    |       |         | P  | {R3}

*We assume the Join does not modify $t. Otherwise, only node 25 will be deleted.

Intuition: don't keep anything that's not used later.

[Figure: the XAT after making the join (nodes 1, 2, 3, 31, 23, 27, 14, 15, 25, 28, 20, 21)]

Page 33: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Schema Cleanup Result

Node | Original Schema                                       | Minimum Schema
1    | {col3, R1, $book, col6, $t, R3, $prices, col7, col12} | {col3}
2    | {col3, R1, $book, col6, $t, R3, $prices, col7, col12} | {col3}
3    | {R1, $book, col6, $t, R3, $prices, col7, col12}       | {$t}
31   | {R1, $book, col6, $t, R3, $prices, col7, col12}       | {$t}
23   | {R1, $book, col6, $t}                                 | {col6, $t}
27   | {R1, $book, col6}                                     | {$book, col6}
14   | {R1, $book}                                           | {$book}
15   | {R1}                                                  | {R1}
25   | {R3, $prices, col7, col12}                            | {col7, col12}
28   | {R3, $prices, col7}                                   | {$prices, col7}
20   | {R3, $prices}                                         | {$prices}
21   | {R3}                                                  | {R3}
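The minimum schemas in this table can be reproduced with a short top-down pass. The rule used below is inferred from the example and is not taken from the authors' equations: a node keeps a column only if the node itself produces it or an operator above still requires it. The PLAN dictionary transcribes the Parent, Produced, and Consumed columns of the Schema Computation slide; all function names are hypothetical.

```python
# node -> (parent, produced columns, consumed columns)
PLAN = {
    1: (None, set(), {"col3"}),
    2: (1, {"col3"}, {"$t"}),
    3: (2, set(), set()),
    31: (3, set(), {"col6", "col7"}),
    23: (31, {"$t"}, {"$book"}),
    27: (23, {"col6"}, {"$book"}),
    14: (27, {"$book"}, {"R1"}),
    15: (14, {"R1"}, set()),
    25: (31, {"col12"}, {"$prices"}),
    28: (25, {"col7"}, {"$prices"}),
    20: (28, {"$prices"}, {"R3"}),
    21: (20, {"R3"}, set()),
}


def children(plan, node):
    return [n for n, (parent, _, _) in plan.items() if parent == node]


def original_schema(plan, node):
    """Bottom-up: everything produced at or below this node."""
    schema = set(plan[node][1])
    for child in children(plan, node):
        schema |= original_schema(plan, child)
    return schema


def minimum_schemas(plan):
    """Top-down: keep only produced or still-required columns."""
    root = next(n for n, (parent, _, _) in plan.items() if parent is None)
    result = {}

    def visit(node, required):
        _, produced, consumed = plan[node]
        result[node] = original_schema(plan, node) & (produced | required)
        below = (required - produced) | consumed
        for child in children(plan, node):
            visit(child, below)

    # The root's consumed columns are the query result, so they are required.
    visit(root, set(plan[root][2]))
    return result


if __name__ == "__main__":
    for node, schema in minimum_schemas(PLAN).items():
        print(node, sorted(schema))
```

Under this rule node 25 keeps col12 only because it produces it; no operator above consumes col12, which is the hint that node 25 itself is redundant.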

Page 34: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

XAT Cleanup

- Schema cleanup
  - Each operator produces, consumes, and modifies some columns.
  - The minimum schema is then computed.
- Unused operator cutting
  - Cutting matrix generation.
  - Required columns analysis.
  - Operator cutting.

Page 35: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Cutting Matrix

Purpose: get rid of unused operators.

Equations:
- Propagation of modified
- Propagation of required
- Identify cuttable node

Page 36: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Matrix Computation

P = produced, C = consumed.

#   | Parent | col3 | $t | col6 | col7 | $book | R1 | col12 | $prices | R3 | Cut?
1   |        | C    |    |      |      |       |    |       |         |    |
2   | 1      | P    | C  |      |      |       |    |       |         |    |
3   | 2      | -    | -  | -    | -    | -     | -  | -     | -       | -  |
31* | 3      |      |    | C    | C    |       |    |       |         |    |
23  | 31     |      | P  |      |      | C     |    |       |         |    |
27  | 23     |      |    | P    |      | C     |    |       |         |    |
14  | 27     |      |    |      |      | P     | C  |       |         |    |
15  | 14     |      |    |      |      |       | P  |       |         |    |
25  | 31     |      |    |      |      |       |    | P     | C       |    |
28  | 25     |      |    |      | P    |       |    |       | C       |    |
20  | 28     |      |    |      |      |       |    |       | P       | C  |
21  | 20     |      |    |      |      |       |    |       |         | P  |

*We assume the Join does not modify $t. Otherwise, only node 25 will be deleted.

[Figure: the XAT after making the join (nodes 1, 2, 3, 31, 23, 27, 14, 15, 25, 28, 20, 21); node 31 is JOIN col6=col7]

Page 37: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Matrix Computation (Cont. 1)

P = produced, C = consumed, M = modified, R = required.

#   | Parent | col3 | $t | col6 | col7 | $book | R1 | col12 | $prices | R3 | Cut?
1   |        | R R R R (column positions not preserved)
2   | 1      | P    | C  |      |      |       |    |       |         |    |
3   | 2      | -    | M  | -    | -    | -     | -  | -     | -       | -  |
31* | 3      |      |    | C    | C    |       |    |       |         |    |
23  | 31     |      | P  |      |      | C     |    |       |         |    |
27  | 23     |      |    | P    |      | C     |    |       |         |    |
14  | 27     |      |    |      |      | P     | C  |       |         |    |
15  | 14     |      |    |      |      |       | P  |       |         |    |
25  | 31     |      |    |      |      |       |    | P     | C       |    |
28  | 25     |      |    |      | P    |       |    |       | C       |    |
20  | 28     |      |    |      |      |       |    |       | P       | C  |
21  | 20     |      |    |      |      |       |    |       |         | P  |

*We assume the Join does not modify $t. Otherwise, only node 25 will be deleted.

Intuition: give me only the columns required to get the final result.

[Figure: the XAT after making the join (nodes 1, 2, 3, 31, 23, 27, 14, 15, 25, 28, 20, 21); node 31 is JOIN col6=col7]

Page 38: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Matrix Computation (Cont. 2)

P = produced, C = consumed, M = modified, R = required, X = cut.

#   | Parent | col3 | $t | col6 | col7 | $book | R1 | col12 | $prices | R3 | Cut?
1   |        | R R R R (column positions not preserved)
2   | 1      | P    | C  |      |      |       |    |       |         |    |
3   | 2      | -    | M  | -    | -    | -     | -  | -     | -       | -  |
31* | 3      |      |    | C    | C    |       |    |       |         |    | X
23  | 31     |      | P  |      |      | C     |    |       |         |    |
27  | 23     |      |    | P    |      | C     |    |       |         |    | X
14  | 27     |      |    |      |      | P     | C  |       |         |    |
15  | 14     |      |    |      |      |       | P  |       |         |    |
25  | 31     |      |    |      |      |       |    | P     | C       |    | X
28  | 25     |      |    |      | P    |       |    |       | C       |    | X
20  | 28     |      |    |      |      |       |    |       | P       | C  | X
21  | 20     |      |    |      |      |       |    |       |         | P  | X

*We assume the Join does not modify $t. Otherwise, only node 25 will be deleted.

[Figure: the XAT after making the join (nodes 1, 2, 3, 31, 23, 27, 14, 15, 25, 28, 20, 21); node 31 is JOIN col6=col7]
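The Cut? column can be reproduced with a small required-columns pass. The cutting rule in this Python sketch is an assumption reverse-engineered from the matrix and its footnote, not the authors' published equations: a non-root operator is cut when it neither produces nor modifies a column that is still required above it, and a cut operator's consumed columns are not propagated. The names PLAN, kept_operators, and the dictionary layout are hypothetical.

```python
# node -> (parent, produced, consumed, modified)
PLAN = {
    1: (None, set(), {"col3"}, set()),
    2: (1, {"col3"}, {"$t"}, set()),
    3: (2, set(), set(), {"$t"}),  # Agg: marked M on $t in the matrix
    31: (3, set(), {"col6", "col7"}, set()),  # Join: assumed not to modify $t
    23: (31, {"$t"}, {"$book"}, set()),
    27: (23, {"col6"}, {"$book"}, set()),
    14: (27, {"$book"}, {"R1"}, set()),
    15: (14, {"R1"}, set(), set()),
    25: (31, {"col12"}, {"$prices"}, set()),
    28: (25, {"col7"}, {"$prices"}, set()),
    20: (28, {"$prices"}, {"R3"}, set()),
    21: (20, {"R3"}, set(), set()),
}


def kept_operators(plan):
    """Return the set of nodes that survive operator cutting."""
    root = next(n for n, entry in plan.items() if entry[0] is None)
    kept = set()

    def visit(node, required):
        _, produced, consumed, modified = plan[node]
        keep = node == root or bool((produced | modified) & required)
        if keep:
            kept.add(node)
            below = (required - produced) | consumed
        else:
            # A cut operator asks nothing of its inputs.
            below = required
        for child, entry in plan.items():
            if entry[0] == node:
                visit(child, below)

    visit(root, set())
    return kept


if __name__ == "__main__":
    kept = kept_operators(PLAN)
    print("kept:", sorted(kept))
    print("cut:", sorted(set(PLAN) - kept))
```

With the Join assumed not to modify $t, the sketch cuts nodes 31, 27, 25, 28, 20, and 21, matching the X marks. If node 31 is instead recorded as modifying $t, only node 25 is cut, which agrees with the footnote.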

Page 39: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

XAT after Cutting

The XAT after making the join (nodes 1, 2, 3, 31, 23, 27, 14, 15, 25, 28, 20, 21) is reduced to:

1: [root] col3
2: T <results>$t</result> -> col3
3: Agg
23: Navigate $book, title -> $t
14: Navigate R1, /book/row -> $book
15: S "dxv.xml" -> R1

Page 40: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

SQL Generated

SQL generated from the XAT before cutting (nodes 1, 2, 3, 31, 23, 27, 14, 15, 25, 28, 20, 21):

SELECT "$book".title AS "$t", "$book".bid AS "col6",
       "$prices".price AS "col12", "$prices".bid AS "col7"
FROM book "$book", prices "$prices"
WHERE "col6" = "col7"

SQL generated from the XAT after cutting (nodes 1, 2, 3, 23, 14, 15):

SELECT "$book".title AS "$t"
FROM book "$book"

Page 41: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

XQuery-Level Optimization

- XML algebra representation: XAT
- XAT query plan rewriting
- XAT query plan reduction

Page 42: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Performance Gain in Execution

[Figure: execution time in ms (axis 0 to 50000) against the number of elements in the XML dataset (axis 10 to 2510) for four configurations: None, Rewrite, Cleanup, Rewrite+Cleanup]

Page 43: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Rainbow Engine Overhead

[Figure: chart of engine overhead in ms split across Generation, Rewrite, Decorrelation, and Cleanup; slice labels 1%, 42%, 2%, 55%]

Total: 32,522 ms

Ack.: XQuery using Kweelt Parser

Page 44: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

http://davis.wpi.edu/dsrg/rainbow

https://sourceforge.net/projects/rainbow-engine/

Page 45: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Related Work

- XPERANTO [VLDBJ2000]: XQGM vs. XAT. XQuery views over RDB; extension by UDFs for XML features.
- SilkRoute [IEEE2001(24:2)]: XQuery views over RDB; generates SQL efficiently.
- AGORA [VLDB2000]: syntax-level rewriting.

Page 46: Beyond the Rainbow: —— A Pot of Gold ala XML Database Projects

Summary

- Efficient XQuery processing
- XML Algebra Tree (XAT)
- XAT optimization:
  - Rewrite by using equivalence rules
  - Cleanup
    - Schema cleanup
    - Operator cutting
- Prototype system implementation.