Query Compiler

Post on 18-Dec-2015


The Query Compiler

• Parses SQL query into parse tree

• Transforms parse tree into expression tree (logical query plan)

• Transforms logical query plan into physical query plan

[Figure: query compilation pipeline] SQL query → parse → parse tree → convert → logical query plan → apply laws → “improved” l.q.p. → estimate result sizes (using statistics) → l.q.p. + sizes → consider physical plans → {P1, P2, …} → estimate costs → {(P1,C1), …} → pick best → Pi → execute → answer

Grammar for simple SQL

<Query> ::= <SFW>
<Query> ::= (<Query>)

<SFW> ::= SELECT <SelList> FROM <FromList> WHERE <Cond>

<SelList> ::= <Attr>, <SelList>
<SelList> ::= <Attr>

<FromList> ::= <Relation>, <FromList>
<FromList> ::= <Relation>

<Cond> ::= <Cond> AND <Cond>
<Cond> ::= <Tuple> IN <Query>
<Cond> ::= <Attr> = <Attr>
<Cond> ::= <Attr> LIKE <Pattern>

<Tuple> ::= <Attr>

Notation: atoms (constants), <syntactic categories> (variables), ::= (can be expressed/defined as)
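To make the grammar concrete, here is a minimal recursive-descent parser sketch for it. The names `tokenize`, `Parser` and `parse`, and the nested-tuple tree representation, are illustrative assumptions and not part of the slides. The left-recursive rule for <Cond> is handled with a loop, which is a common substitute technique rather than a literal transcription of the grammar.

```python
import re

KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "IN", "LIKE"}
# Tokens: quoted patterns, identifiers/keywords, and the grammar's punctuation.
TOKEN = re.compile(r"\s*('[^']*'|[A-Za-z_]\w*|[(),=;])")

def tokenize(sql):
    sql = sql.strip()
    tokens, pos = [], 0
    while pos < len(sql):
        m = TOKEN.match(sql, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def eat(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok.upper() != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.i += 1
        return tok

    def name(self):
        # <Attr> and <Relation> are plain identifiers that are not keywords.
        tok = self.eat()
        is_ident = tok[0].isalpha() or tok[0] == "_"
        if not is_ident or tok.upper() in KEYWORDS:
            raise SyntaxError(f"expected a name, got {tok!r}")
        return tok

    def query(self):
        # <Query> ::= ( <Query> ) | <SFW>
        if self.peek() == "(":
            self.eat("(")
            inner = self.query()
            self.eat(")")
            return ("Query", inner)
        return ("Query", self.sfw())

    def sfw(self):
        self.eat("SELECT")
        sel = self.name_list("SelList")
        self.eat("FROM")
        frm = self.name_list("FromList")
        self.eat("WHERE")
        return ("SFW", sel, frm, self.cond())

    def name_list(self, label):
        # <SelList> and <FromList> are both comma-separated lists of names.
        items = [self.name()]
        while self.peek() == ",":
            self.eat(",")
            items.append(self.name())
        return (label, items)

    def cond(self):
        # <Cond> ::= <Cond> AND <Cond>, parsed iteratively to avoid left recursion.
        left = self.simple_cond()
        while self.peek() is not None and self.peek().upper() == "AND":
            self.eat("AND")
            left = ("AND", left, self.simple_cond())
        return left

    def simple_cond(self):
        attr = self.name()
        op = self.eat().upper()
        if op == "=":
            return ("=", attr, self.name())
        if op == "LIKE":
            pattern = self.eat()
            if not pattern.startswith("'"):
                raise SyntaxError(f"expected a quoted pattern, got {pattern!r}")
            return ("LIKE", attr, pattern)
        if op == "IN":
            return ("IN", attr, self.query())
        raise SyntaxError(f"unexpected operator {op!r}")

def parse(sql):
    parser = Parser(tokenize(sql))
    tree = parser.query()
    if parser.peek() == ";":
        parser.eat(";")
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input: {parser.peek()!r}")
    return tree
```

Calling `parse` on a SELECT-FROM-WHERE string returns a nested tuple whose shape follows the grammar's syntactic categories; anything outside the grammar raises `SyntaxError`.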

Query

StarsIn(title, year, starName)

MovieStar(name,address,gender,birthdate)

Query:

Give titles of movies that have at least one star born in 1960

SELECT title
FROM StarsIn, MovieStar
WHERE starName = name AND birthdate LIKE '%1960%';

Parse Tree

<Query>
  <SFW>
    SELECT
    <SelList>
      <Attribute>: title
    FROM
    <FromList>
      <RelName>: StarsIn
      ,
      <FromList>
        <RelName>: MovieStar
    WHERE
    <Condition>
      <Condition>
        <Attribute>: starName
        =
        <Attribute>: name
      AND
      <Condition>
        <Attribute>: birthdate
        LIKE
        <Pattern>: '%1960'

Another, equivalent query

SELECT title
FROM StarsIn
WHERE starName IN (
    SELECT name
    FROM MovieStar
    WHERE birthdate LIKE '%1960%'
);
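As a quick check that the join form and the subquery form return the same titles, the sketch below runs both in an in-memory SQLite database. The sample rows, the birthdate format, and the PRIMARY KEY on name are made-up assumptions for illustration; with name as a key, each StarsIn row matches at most one star, so the two queries agree even as bags.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StarsIn(title TEXT, year INTEGER, starName TEXT);
    CREATE TABLE MovieStar(name TEXT PRIMARY KEY, address TEXT,
                           gender TEXT, birthdate TEXT);
    -- Hypothetical sample data, not from the slides.
    INSERT INTO MovieStar VALUES
        ('Star A', 'Addr 1', 'F', '14-03-1960'),
        ('Star B', 'Addr 2', 'M', '02-11-1960'),
        ('Star C', 'Addr 3', 'F', '21-07-1975');
    INSERT INTO StarsIn VALUES
        ('Movie X', 1990, 'Star A'),
        ('Movie X', 1990, 'Star B'),
        ('Movie Y', 1995, 'Star C'),
        ('Movie Z', 1999, 'Star B');
""")

join_query = """
    SELECT title FROM StarsIn, MovieStar
    WHERE starName = name AND birthdate LIKE '%1960%'
"""
subquery_query = """
    SELECT title FROM StarsIn
    WHERE starName IN (SELECT name FROM MovieStar
                       WHERE birthdate LIKE '%1960%')
"""

# Sort so the comparison does not depend on the order SQLite returns rows in.
join_titles = sorted(row[0] for row in conn.execute(join_query))
subquery_titles = sorted(row[0] for row in conn.execute(subquery_query))
print(join_titles == subquery_titles)
```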

The Preprocessor (expand query & semantic checking)

• Checks against the schema definition:
– Relation uses
– Attribute uses, resolve names (A to R.A)
– Use of types (strings, integers, dates, etc.) and the type/arity of operators’ arguments

These preprocessing functions are called semantic checking.

• If all tests are passed, then the parse tree is said to be valid.
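The relation and attribute checks can be sketched as a lookup against a schema dictionary. The `SCHEMA` layout and the `resolve` helper are hypothetical names chosen for this example; a real preprocessor would also check types and operator arity, which this sketch leaves out.

```python
# Assumed in-memory schema: relation name -> list of attribute names.
SCHEMA = {
    "StarsIn": ["title", "year", "starName"],
    "MovieStar": ["name", "address", "gender", "birthdate"],
}

def resolve(attr, from_list, schema=SCHEMA):
    """Resolve attribute A to R.A, or raise if it is unknown or ambiguous."""
    for rel in from_list:
        if rel not in schema:
            raise ValueError(f"unknown relation {rel}")
    owners = [rel for rel in from_list if attr in schema[rel]]
    if not owners:
        raise ValueError(f"attribute {attr} not found in {from_list}")
    if len(owners) > 1:
        raise ValueError(f"attribute {attr} is ambiguous: {owners}")
    return f"{owners[0]}.{attr}"

# Resolving the attributes used by the example query.
qualified = [resolve(a, ["StarsIn", "MovieStar"])
             for a in ("title", "starName", "name", "birthdate")]
print(qualified)
```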

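Algebraic laws for transforming logical query plans

• Commutative and associative laws:

$(R \times S) \times T = R \times (S \times T)$

$R \times S = S \times R$

$(R \bowtie S) \bowtie T = R \bowtie (S \bowtie T)$

$R \bowtie S = S \bowtie R$

The above laws apply to both sets and bags.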

Theta-join

• Commutative:

$R \bowtie_C S = S \bowtie_C R$

• Not always associative:
– On schema R(a,b), S(b,c), T(c,d) the first expression cannot be transformed into the second:

$(R \bowtie_{R.b > S.b} S) \bowtie_{a < d} T$

$R \bowtie_{R.b > S.b} (S \bowtie_{a < d} T)$

Why? Because we cannot join S and T using the condition a<d, since a is an attribute of neither S nor T.

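Laws Involving Selection ($\sigma$)

Splitting laws:

$\sigma_{C \text{ AND } D}(R) = \sigma_C(\sigma_D(R))$

$\sigma_{C \text{ OR } D}(R) = \sigma_C(R) \cup_S \sigma_D(R)$ (only if R is a set; the union is “set union”)

Order is flexible:

$\sigma_C(\sigma_D(R)) = \sigma_D(\sigma_C(R))$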
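A small sketch of why the OR-splitting law needs set union: a tuple that satisfies both C and D would be counted twice under bag union. Bags are modelled here with `collections.Counter` (an illustrative choice), and the relation and the two conditions are made up for the example.

```python
from collections import Counter

def select(pred, bag):
    """Bag selection: keep each tuple that satisfies pred, with its multiplicity."""
    return Counter({t: n for t, n in bag.items() if pred(t)})

def cond_c(t):
    return t[0] == 1      # C: first attribute equals 1

def cond_d(t):
    return t[1] == 2      # D: second attribute equals 2

R = Counter([(1, 2), (1, 5), (7, 2)])   # a set: every tuple occurs once

or_selected = select(lambda t: cond_c(t) or cond_d(t), R)
bag_union = select(cond_c, R) + select(cond_d, R)   # Counter + adds multiplicities
set_union = select(cond_c, R) | select(cond_d, R)   # Counter | takes the maximum

print(or_selected == set_union)   # the splitting law holds with set union
print(or_selected == bag_union)   # (1, 2) satisfies both C and D, so it is doubled

# The AND-splitting and reordering laws hold regardless:
and_selected = select(lambda t: cond_c(t) and cond_d(t), R)
cascade_cd = select(cond_c, select(cond_d, R))
cascade_dc = select(cond_d, select(cond_c, R))
```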

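Laws Involving Selection ($\sigma$)

$\sigma_C(R \cup S) = \sigma_C(R) \cup \sigma_C(S)$

$\sigma_C(R - S) = \sigma_C(R) - \sigma_C(S) = \sigma_C(R) - S$

What about intersection?

$\sigma_C(R \cap S) = \sigma_C(R) \cap S$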

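Algebraic Laws involving selection

$\sigma_{C \text{ AND } D}(R) = \sigma_C(\sigma_D(R))$

$\sigma_C(R \times S) = R \bowtie_C S$

$\sigma_C(R \bowtie S) = \sigma_C(R) \bowtie S$

$\sigma_C(R \bowtie_D S) = \sigma_C(R) \bowtie_D S$

For the binary operators, we push the selection only if all attributes in the condition C are in R.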

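Example:

• Consider relation schemas R(A,B) and S(B,C) and the expression below:

$\sigma_{(A=1 \text{ OR } A=3) \text{ AND } B<C}(R \bowtie S)$

1. Splitting AND: $\sigma_{A=1 \text{ OR } A=3}(\sigma_{B<C}(R \bowtie S))$

2. Push to S: $\sigma_{A=1 \text{ OR } A=3}(R \bowtie \sigma_{B<C}(S))$

3. Push to R: $\sigma_{A=1 \text{ OR } A=3}(R) \bowtie \sigma_{B<C}(S)$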
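The three rewriting steps can be checked on a small instance. The sketch below models relations as lists of dicts and implements a naive natural join; the helper names and the sample tuples are assumptions made for this illustration.

```python
def natural_join(r, s):
    """Naive natural join: pair tuples that agree on all shared attribute names."""
    out = []
    for t in r:
        for u in s:
            if all(t[k] == u[k] for k in t.keys() & u.keys()):
                out.append({**t, **u})
    return out

def select(pred, rel):
    return [t for t in rel if pred(t)]

def as_bag(rel):
    """Order-independent form of a relation, for comparing results."""
    return sorted(tuple(sorted(t.items())) for t in rel)

def a_cond(t):
    return t["A"] == 1 or t["A"] == 3

def bc_cond(t):
    return t["B"] < t["C"]

# Made-up instances of R(A,B) and S(B,C).
R = [{"A": 1, "B": 2}, {"A": 2, "B": 2}, {"A": 3, "B": 5}, {"A": 3, "B": 9}]
S = [{"B": 2, "C": 4}, {"B": 5, "C": 1}, {"B": 9, "C": 10}]

original = select(lambda t: a_cond(t) and bc_cond(t), natural_join(R, S))
step1 = select(a_cond, select(bc_cond, natural_join(R, S)))     # split AND
step2 = select(a_cond, natural_join(R, select(bc_cond, S)))     # push B<C to S
step3 = natural_join(select(a_cond, R), select(bc_cond, S))     # push A-condition to R

print(as_bag(original) == as_bag(step1) == as_bag(step2) == as_bag(step3))
```

In step 3 the join sees only three tuples of R and two of S, instead of the full inputs.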

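Pushing selections

• Usually selections are pushed down the expression tree.

• The following example shows that it is sometimes useful to pull a selection up in the tree.

StarsIn(title, year, starName)
Movie(title, year, length, studioName)

CREATE VIEW MoviesOf1996 AS
SELECT * FROM Movie WHERE year = 1996;

Query: Which stars worked for which studios in 1996?

SELECT starName, studioName
FROM MoviesOf1996 NATURAL JOIN StarsIn;

Pull the selection up, then push it down:

$\pi_{starName, studioName}(\sigma_{year=1996}(Movie) \bowtie StarsIn)$

$\pi_{starName, studioName}(\sigma_{year=1996}(Movie \bowtie StarsIn))$

$\pi_{starName, studioName}(\sigma_{year=1996}(Movie) \bowtie \sigma_{year=1996}(StarsIn))$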

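Laws for (bag) Projection

• A simple law: project out attributes that are not needed later.
– i.e. keep only the input attributes of L and any join attributes (the lists M and N in the laws that follow).

$\pi_L(R \bowtie S) = \pi_L(\pi_M(R) \bowtie \pi_N(S))$

$\pi_L(R \bowtie_C S) = \pi_L(\pi_M(R) \bowtie_C \pi_N(S))$

$\pi_L(R \times S) = \pi_L(\pi_M(R) \times \pi_N(S))$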

$\pi_L(\sigma_C(R)) = \pi_L(\sigma_C(\pi_M(R)))$

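Examples for pushing projection

Schema R(a,b,c), S(c,d,e)

$\pi_{a+e \to x}(R \bowtie S) = \pi_{a+e \to x}(R \bowtie \pi_{c,e}(S))$

$\pi_{a+b \to x,\, d+e \to y}(R \bowtie S) = \pi_{x,y}(\pi_{a+b \to x,\, c}(R) \bowtie \pi_{d+e \to y,\, c}(S))$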
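The law with a+b→x and d+e→y can be checked on a small instance under bag semantics. Relations are plain lists of tuples and results are compared as `Counter` bags; the sample tuples are made up for illustration.

```python
from collections import Counter

R = [(1, 2, 10), (2, 1, 10), (4, 4, 20)]   # R(a, b, c)
S = [(10, 5, 5), (10, 7, 3), (30, 0, 0)]   # S(c, d, e)

# Left side: natural join on c first, then the projection a+b -> x, d+e -> y.
lhs = Counter((a + b, d + e)
              for (a, b, c) in R
              for (c2, d, e) in S
              if c == c2)

# Right side: project each input early, keeping the join attribute c.
R_proj = [(a + b, c) for (a, b, c) in R]   # pi_{a+b->x, c}(R)
S_proj = [(d + e, c) for (c, d, e) in S]   # pi_{d+e->y, c}(S)
rhs = Counter((x, y)
              for (x, c) in R_proj
              for (y, c2) in S_proj
              if c == c2)

print(lhs == rhs)
```

Because the projection is a bag projection, the duplicates in `R_proj` and `S_proj` are kept, which is what makes the multiplicities on both sides agree.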

Example: Pushing Projection

• Schema: StarsIn(title, year, starName)
• Query: SELECT starName FROM StarsIn WHERE year = 1996;

$\pi_{starName}(\sigma_{year=1996}(StarsIn))$

Should we transform to

$\pi_{starName}(\sigma_{year=1996}(\pi_{starName, year}(StarsIn)))$ ?

Depends! Is StarsIn stored or computed?

Reasons for not pushing the projection

• If StarsIn is stored, then for the projection we have to scan the relation.

• If the relation is pipelined from some previous computation, then yes, we should do the projection (on the fly).

• Also, if for example there is an index on year for StarsIn, that index is useless on the projected relation $\pi_{starName, year}(StarsIn)$
– while such an index is very useful for the selection on “year=1996”.

Laws for duplicate elimination and grouping

Try to move $\delta$ to a position where it can be eliminated altogether, e.g. when $\delta$ is applied to
• A stored relation with a declared primary key
• A relation that is the result of a $\gamma$ operation, since grouping creates a relation with no duplicates.

$\delta(\gamma_L(R)) = \gamma_L(R)$ ($\gamma$ absorbs $\delta$)
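A tiny sketch of why duplicate elimination after grouping is a no-op: grouping emits exactly one tuple per group key. The helpers `gamma_sum` and `delta` are hypothetical stand-ins for the two operators, restricted to a two-attribute relation with a SUM aggregate.

```python
from collections import defaultdict

def gamma_sum(rel):
    """Group (k, v) pairs by k and sum v: one output tuple per distinct k."""
    totals = defaultdict(int)
    for k, v in rel:
        totals[k] += v
    return list(totals.items())

def delta(rel):
    """Duplicate elimination that keeps the first occurrence of each tuple."""
    return list(dict.fromkeys(rel))

R = [("a", 1), ("a", 1), ("b", 2)]   # a bag with a duplicate tuple

grouped = gamma_sum(R)
print(delta(grouped) == grouped)     # delta changes nothing after grouping
print(delta(R) == R)                 # but it does change the raw bag
```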

Improving logical query plans

• Push $\sigma$ as far down as possible (sometimes pull them up first).

• Split complex conditions in $\sigma$ in order to push $\sigma$ even further.

• Push $\pi$ as far down as possible, and introduce new $\pi$ early (but take care for exceptions).

• Combine $\sigma$ with $\times$ to produce $\theta$-joins or equi-joins.

• Choose an order for joins.

Example of improvement

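SELECT title
FROM StarsIn, MovieStar
WHERE starName = name AND birthdate LIKE '%1960';

Initial plan:

$\pi_{title}(\sigma_{starName=name \text{ AND } birthdate \text{ LIKE '\%1960'}}(StarsIn \times MovieStar))$

Improved plan, with the selection pushed down and the product turned into a join:

$\pi_{title}(StarsIn \bowtie_{starName=name} \sigma_{birthdate \text{ LIKE '\%1960'}}(MovieStar))$

And a better plan introducing a projection to filter out useless attributes:

$\pi_{title}(StarsIn \bowtie_{starName=name} \pi_{name}(\sigma_{birthdate \text{ LIKE '\%1960'}}(MovieStar)))$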
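To see what the rewriting buys, the sketch below evaluates the three plans for this query on made-up data and records the size of the largest intermediate result. The tuples, the birthdate format, and the `born_1960` helper (a stand-in for LIKE '%1960') are assumptions for illustration only.

```python
# Hypothetical instances: StarsIn(title, year, starName),
# MovieStar(name, address, gender, birthdate).
stars_in = [
    ("Movie X", 1990, "Star A"),
    ("Movie X", 1990, "Star B"),
    ("Movie Y", 1995, "Star C"),
    ("Movie Z", 1999, "Star B"),
]
movie_star = [
    ("Star A", "Addr 1", "F", "14-03-1960"),
    ("Star B", "Addr 2", "M", "02-11-1960"),
    ("Star C", "Addr 3", "F", "21-07-1975"),
]

def born_1960(birthdate):
    # Stand-in for: birthdate LIKE '%1960'
    return birthdate.endswith("1960")

# Plan 1: full Cartesian product, then one big selection, then projection.
product = [(s, m) for s in stars_in for m in movie_star]
plan1 = [s[0] for s, m in product if s[2] == m[0] and born_1960(m[3])]

# Plan 2: select on MovieStar first, then join on starName = name.
selected = [m for m in movie_star if born_1960(m[3])]
plan2 = [s[0] for s in stars_in for m in selected if s[2] == m[0]]

# Plan 3: additionally project the selected stars down to name before joining.
names = [m[0] for m in selected]
plan3 = [s[0] for s in stars_in for n in names if s[2] == n]

print(len(product), len(selected), len(names))
print(plan1 == plan2 == plan3)
```

All three plans produce the same titles, but plan 1 materializes every pair of tuples, while plans 2 and 3 join against only the selected stars, and plan 3 carries a single attribute per star into the join.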