CS 4432 1
CS4432: Database Systems II
Logical Plan Rewriting
Query processing
SQL query
→ parse → parse tree
→ convert → logical query plan
→ apply laws → “improved” l.q.p.
→ estimate result sizes (using statistics) → l.q.p. + sizes
→ consider physical plans → {P1, P2, …}
→ estimate costs → {(P1,C1), (P2,C2), …}
→ pick best → Pi
→ execute → answer
Query in SQL → Query Plan in Algebra (logical) → Other Query Plan in Algebra (logical)
Query plan 1 (in relational algebra)
π_{B,D} [ σ_{R.A=“c” ∧ S.E=2 ∧ R.C=S.C} (R × S) ]
Query plan 2 (in relational algebra)
π_{B,D} [ σ_{R.A=“c”}(R) ⋈ σ_{S.E=2}(S) ]
(natural join on R.C = S.C)
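The two plans can be compared on a small instance. The sketch below is only an illustration: the contents of R(A,B,C) and S(C,D,E) are made up (the slides give no data), and the function names plan1 and plan2 are hypothetical.

```python
# Hypothetical sample data; the slides do not give instances of R and S.
R = [  # tuples (A, B, C)
    ("a", 1, 10),
    ("c", 2, 10),
    ("c", 5, 20),
    ("d", 3, 30),
]
S = [  # tuples (C, D, E)
    (10, "x", 2),
    (20, "y", 2),
    (20, "z", 3),
    (30, "w", 2),
]

def plan1(R, S):
    # Cross product, then one combined selection, then projection on B, D.
    product = [(r, s) for r in R for s in S]
    selected = [(r, s) for r, s in product
                if r[0] == "c" and s[2] == 2 and r[2] == s[0]]
    return sorted((r[1], s[1]) for r, s in selected), len(product)

def plan2(R, S):
    # Select on each input first, then join on C, then project on B, D.
    r_sel = [r for r in R if r[0] == "c"]
    s_sel = [s for s in S if s[2] == 2]
    joined = [(r, s) for r in r_sel for s in s_sel if r[2] == s[0]]
    return sorted((r[1], s[1]) for r, s in joined), len(r_sel) * len(s_sel)

answer1, pairs1 = plan1(R, S)   # pairs examined by plan 1
answer2, pairs2 = plan2(R, S)   # pairs examined by plan 2
```

Both plans return the same answer, but plan 2 compares fewer tuple pairs on this instance because the selections shrink the join inputs.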
Relational algebra optimization
• What are transformation rules?
  – They preserve equivalence
• What are good transformations?
  – They reduce query execution costs
Rules: Natural join rewriting
R ⋈ S = S ⋈ R
(R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
Can also write as trees, e.g.: a ⋈ node over (R ⋈ S) and T is equivalent to a ⋈ node over R and (S ⋈ T).
Rules: Other binary operators?
R ⋈ S = S ⋈ R
(R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
What about:
• Cross product?
• Condition join?
• Union?
• Intersection?
• Difference?
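One way to explore these questions is to test the laws on small bags. The sketch below uses Python's collections.Counter as a bag, where `+` adds multiplicities, `&` takes the minimum, and `-` subtracts them. The bags R, S, T and the helper names are hypothetical, and cross product and condition join are left out to keep it short. A failed check disproves a law; a passed check is only evidence, not a proof.

```python
from collections import Counter

# Hypothetical bags used only for this check.
R = Counter("aabbbc")
S = Counter("bbccd")
T = Counter("bcd")

def commutes(op, x, y):
    # Does x op y equal y op x on this input?
    return op(x, y) == op(y, x)

def associates(op, x, y, z):
    # Does (x op y) op z equal x op (y op z) on this input?
    return op(op(x, y), z) == op(x, op(y, z))

operators = {
    "union": lambda x, y: x + y,         # bag union (multiplicities add)
    "intersection": lambda x, y: x & y,  # bag intersection (minimum)
    "difference": lambda x, y: x - y,    # bag difference
}

checks = {
    name: (commutes(op, R, S), associates(op, R, S, T))
    for name, op in operators.items()
}
```

On this input, union and intersection pass both checks, while difference fails both, which is enough to show that difference is neither commutative nor associative.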
Rules: Selects
σ_{p1 ∧ p2}(R) = σ_{p1}[σ_{p2}(R)]
σ_{p1 ∨ p2}(R) = [σ_{p1}(R)] ∪ [σ_{p2}(R)]
Bags vs. Sets
R = {a,a,b,b,b,c}
S = {b,b,c,c,d}
What about union R ∪ S = ?
• Option 1, SUM: R ∪ S = {a,a,b,b,b,b,b,c,c,c,d}
• Option 2, MAX: R ∪ S = {a,a,b,b,b,c,c,d}
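The two options can be reproduced with collections.Counter, whose `+` operator adds multiplicities (SUM) and whose `|` operator takes the larger multiplicity (MAX). This is a minimal sketch; the helper as_bag is a hypothetical name used only to print the result in the slide's notation.

```python
from collections import Counter

R = Counter("aabbbc")   # {a,a,b,b,b,c}
S = Counter("bbccd")    # {b,b,c,c,d}

sum_union = R + S       # Option 1: multiplicities are added
max_union = R | S       # Option 2: the larger multiplicity wins

def as_bag(counter):
    # Expand a Counter back into a sorted string of elements.
    return "".join(sorted(counter.elements()))

sum_result = as_bag(sum_union)
max_result = as_bag(max_union)
```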
Which option makes this rule work?
σ_{p1 ∨ p2}(R) = σ_{p1}(R) ∪ σ_{p2}(R)
Example: R = {a,a,b,b,b,c}; P1 satisfied by a, b; P2 satisfied by b, c
Let us try MAX():
σ_{p1 ∨ p2}(R) = {a,a,b,b,b,c}
σ_{p1}(R) = {a,a,b,b,b}
σ_{p2}(R) = {b,b,b,c}
σ_{p1}(R) ∪ σ_{p2}(R) = {a,a,b,b,b,c}
Both sides agree, so the rule works with MAX.
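Which option makes this rule work?
σ_{p1 ∨ p2}(R) = σ_{p1}(R) ∪ σ_{p2}(R)
Example: R = {a,a,b,b,b,c}; P1 satisfied by a, b; P2 satisfied by b, c
What about SUM()?
σ_{p1 ∨ p2}(R) = {a,a,b,b,b,c}
σ_{p1}(R) = {a,a,b,b,b}
σ_{p2}(R) = {b,b,b,c}
σ_{p1}(R) ∪ σ_{p2}(R) = {a,a,b,b,b,b,b,b,c}
The b tuples satisfy both predicates and are counted twice, so the rule fails with SUM.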
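The comparison of the two union options on this rule can be checked mechanically. In the sketch below, the helper select and the predicate names p1 and p2 are hypothetical stand-ins for σ and the slide's P1, P2; bags are modeled with collections.Counter.

```python
from collections import Counter

R = Counter("aabbbc")                # {a,a,b,b,b,c}
p1 = lambda t: t in ("a", "b")       # P1 satisfied by a, b
p2 = lambda t: t in ("b", "c")       # P2 satisfied by b, c

def select(pred, bag):
    # Bag selection: keep each qualifying element with its full multiplicity.
    return Counter({t: n for t, n in bag.items() if pred(t)})

lhs = select(lambda t: p1(t) or p2(t), R)     # left side of the rule
rhs_max = select(p1, R) | select(p2, R)       # right side, MAX union
rhs_sum = select(p1, R) + select(p2, R)       # right side, SUM union

holds_with_max = (lhs == rhs_max)
holds_with_sum = (lhs == rhs_sum)
```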
Which option makes this rule work?
σ_{p1 ∧ p2}(R) = σ_{p1}[σ_{p2}(R)]
Example: R = {a,a,b,b,b,c}; P1 satisfied by a, b; P2 satisfied by b, c
What about MAX versus SUM?
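A quick check of the cascade rule on the same example, as a sketch with a hypothetical select helper. Note that no union appears on either side of this rule, so the choice between MAX and SUM does not enter the computation.

```python
from collections import Counter

R = Counter("aabbbc")                # {a,a,b,b,b,c}
p1 = lambda t: t in ("a", "b")       # P1 satisfied by a, b
p2 = lambda t: t in ("b", "c")       # P2 satisfied by b, c

def select(pred, bag):
    # Bag selection: keep each qualifying element with its full multiplicity.
    return Counter({t: n for t, n in bag.items() if pred(t)})

lhs = select(lambda t: p1(t) and p2(t), R)   # one selection with p1 AND p2
rhs = select(p1, select(p2, R))              # two cascaded selections
```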
Yet another example!
Senators (……)   Reps (……)
T1 = π_{yr,state} Senators;  T2 = π_{yr,state} Reps

T1: Yr  State      T2: Yr  State
    97  CA             99  CA
    99  CA             99  CA
    98  AZ             98  CA

Union?
“Sum” option makes more sense!
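The two tables can be unioned both ways to see the difference. This sketch uses collections.Counter over (yr, state) tuples taken from the tables above; the variable names are hypothetical.

```python
from collections import Counter

T1 = Counter([(97, "CA"), (99, "CA"), (98, "AZ")])   # from Senators
T2 = Counter([(99, "CA"), (99, "CA"), (98, "CA")])   # from Reps

sum_union = T1 + T2   # SUM: every input row survives
max_union = T1 | T2   # MAX: one (99, CA) row is lost

sum_total = sum(sum_union.values())   # rows in the SUM union
max_total = sum(max_union.values())   # rows in the MAX union
```

With SUM the result has as many rows as the two inputs together, so each row still stands for one person; with MAX one of the three (99, CA) rows disappears.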
Executive Decision
-> Use “SUM” option for bag unions
-> CAREFUL! Some rules cannot be used for bags
Rules: Project
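Let: X = set of attributes
     Y = set of attributes
     XY = X ∪ Y
π_{XY}(R) = π_X[π_Y(R)] ?
Not valid in general: π_X[π_Y(R)] is defined only when X ⊆ Y, and then it equals π_X(R).
A valid form: π_X(R) = π_X[π_{XY}(R)]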
Rules: σ + ⋈ combined
Let p = predicate with only R attributes
    q = predicate with only S attributes
    m = predicate with both R and S attributes
σ_p(R ⋈ S) = [σ_p(R)] ⋈ S
σ_q(R ⋈ S) = R ⋈ [σ_q(S)]
Rules: σ + ⋈ combined
σ_{p ∧ q}(R ⋈ S) = ?
Rule can be derived!
Derivation for rule:
σ_{p ∧ q}(R ⋈ S)
  = σ_p[σ_q(R ⋈ S)]
  = σ_p[R ⋈ σ_q(S)]
  = [σ_p(R)] ⋈ [σ_q(S)]
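Each step of the derivation can be evaluated on a small instance to confirm that the intermediate expressions agree. This is a minimal sketch under assumptions: the relation contents, the predicates p (A = 3) and q (E = 2), and the helper names select, natural_join, and bag are all hypothetical.

```python
from collections import Counter

# Hypothetical instances of R(A, C) and S(C, E); C is the shared attribute.
R = [{"A": 1, "C": 10}, {"A": 3, "C": 10}, {"A": 3, "C": 20}]
S = [{"C": 10, "E": 2}, {"C": 20, "E": 2}, {"C": 20, "E": 5}]

def select(pred, rel):
    return [t for t in rel if pred(t)]

def natural_join(left, right):
    # Pair up tuples that agree on every attribute name they share.
    out = []
    for l in left:
        for r in right:
            if all(l[k] == r[k] for k in l.keys() & r.keys()):
                out.append({**l, **r})
    return out

def bag(rel):
    # Order-insensitive, duplicate-preserving view of a relation.
    return Counter(tuple(sorted(t.items())) for t in rel)

p = lambda t: t["A"] == 3   # uses only R attributes
q = lambda t: t["E"] == 2   # uses only S attributes

step0 = select(lambda t: p(t) and q(t), natural_join(R, S))
step1 = select(p, select(q, natural_join(R, S)))
step2 = select(p, natural_join(R, select(q, S)))
step3 = natural_join(select(p, R), select(q, S))
```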
Rules: σ + ⋈ combined (continued)
More rules can be derived:
σ_{p ∧ q}(R ⋈ S) =
σ_{p ∧ q ∧ m}(R ⋈ S) =
σ_{p ∨ q}(R ⋈ S) =
We did one, do others on your own:
σ_{p ∧ q}(R ⋈ S) = [σ_p(R)] ⋈ [σ_q(S)]
σ_{p ∧ q ∧ m}(R ⋈ S) = σ_m[(σ_p R) ⋈ (σ_q S)]
σ_{p ∨ q}(R ⋈ S) = [(σ_p R) ⋈ S] ∪ [R ⋈ (σ_q S)]
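The last two rules can be checked on a small instance. The sketch below uses Python sets, so it tests the rules under set semantics only; the relation contents and the predicates p, q, m are hypothetical. In the ∨ rule, a tuple that satisfies both p and q is produced by both branches of the union, so with the SUM bag union it would be counted twice.

```python
# Hypothetical instances: R holds (A, C) tuples, S holds (C, E) tuples.
R = {(1, 10), (3, 10), (3, 20), (7, 30)}
S = {(10, 2), (20, 2), (20, 5), (30, 9)}

def join(r_rel, s_rel):
    # Natural join on the shared attribute C, producing (A, C, E) tuples.
    return {(a, c, e) for (a, c) in r_rel for (c2, e) in s_rel if c == c2}

p = lambda a, c: a == 3               # only R attributes
q = lambda c, e: e == 2               # only S attributes
m = lambda a, c, e: c > 2 * a * e     # mixes R and S attributes

sel_p = {r for r in R if p(*r)}       # σ_p(R)
sel_q = {s for s in S if q(*s)}       # σ_q(S)

# σ_{p ∧ q ∧ m}(R ⋈ S) versus σ_m[(σ_p R) ⋈ (σ_q S)]
lhs_and = {t for t in join(R, S)
           if p(t[0], t[1]) and q(t[1], t[2]) and m(*t)}
rhs_and = {t for t in join(sel_p, sel_q) if m(*t)}

# σ_{p ∨ q}(R ⋈ S) versus [(σ_p R) ⋈ S] ∪ [R ⋈ (σ_q S)]
lhs_or = {t for t in join(R, S) if p(t[0], t[1]) or q(t[1], t[2])}
rhs_or = join(sel_p, S) | join(R, sel_q)
```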
Rules: π, σ combined
Let x = subset of R attributes
    z = attributes in predicate P (subset of R attributes)
π_x[σ_p(R)] = π_x{σ_p[π_{xz}(R)]}
Rules: π, ⋈ combined
Let x = subset of R attributes
    y = subset of S attributes
    z = intersection of R, S attributes
π_{xy}(R ⋈ S) = π_{xy}{[π_{xz}(R)] ⋈ [π_{yz}(S)]}
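The rule for pushing projections below a join can be tried on a small instance. This is a sketch under assumptions: R(A,B,C) and S(C,D,E) with made-up contents, x = {B}, y = {D}, z = {C}, and hypothetical helpers project, natural_join, and bag. The projection here keeps duplicates (bag semantics).

```python
from collections import Counter

# Hypothetical instances; C is the only attribute shared by R and S.
R = [{"A": 1, "B": "b1", "C": 10},
     {"A": 2, "B": "b1", "C": 10},
     {"A": 3, "B": "b2", "C": 20}]
S = [{"C": 10, "D": "d1", "E": 5},
     {"C": 20, "D": "d2", "E": 6},
     {"C": 20, "D": "d2", "E": 7}]

def project(attrs, rel):
    # Bag projection: duplicates are kept.
    return [{a: t[a] for a in attrs} for t in rel]

def natural_join(left, right):
    # Pair up tuples that agree on every attribute name they share.
    out = []
    for l in left:
        for r in right:
            if all(l[k] == r[k] for k in l.keys() & r.keys()):
                out.append({**l, **r})
    return out

def bag(rel):
    # Order-insensitive, duplicate-preserving view of a relation.
    return Counter(tuple(sorted(t.items())) for t in rel)

x, y, z = ["B"], ["D"], ["C"]

lhs = project(x + y, natural_join(R, S))
rhs = project(x + y, natural_join(project(x + z, R), project(y + z, S)))
```

Keeping z in the inner projections is what makes the join still possible after the inputs have been narrowed.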
Which are “good” transformations?
Conventional wisdom: do projects early
Example: relation R(A,B,C,D,E)
         predicate P: (A=3) ∧ (B=“cat”)
π_E{σ_p(R)}  vs.  π_E{σ_p{π_{ABE}(R)}}
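The two expressions in the example can be evaluated side by side. In this sketch the contents of R are made up, and project and select are hypothetical helpers; projection keeps duplicates.

```python
# Hypothetical instance of R(A, B, C, D, E).
R = [
    {"A": 3, "B": "cat", "C": 1, "D": 1, "E": "x"},
    {"A": 3, "B": "cat", "C": 2, "D": 9, "E": "x"},
    {"A": 3, "B": "dog", "C": 3, "D": 1, "E": "y"},
    {"A": 4, "B": "cat", "C": 4, "D": 1, "E": "z"},
]

def project(attrs, rel):
    # Bag projection: duplicates are kept, input order is preserved.
    return [{a: t[a] for a in attrs} for t in rel]

def select(pred, rel):
    return [t for t in rel if pred(t)]

p = lambda t: t["A"] == 3 and t["B"] == "cat"

late = project(["E"], select(p, R))                              # project last
early = project(["E"], select(p, project(["A", "B", "E"], R)))   # project early
```

Both orders give the same answer; the early projection only changes how wide the tuples are when the selection runs.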
But: what if we have indexes on A and B?
Index on B: B = “cat”;  index on A: A = 3
Intersect pointers to get pointers to matching tuples!
Then better to do projection later!
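The pointer intersection can be sketched with two in-memory indexes. Everything here is an assumption for illustration: the rows of R (reduced to attributes A, B, E), the use of list positions as record pointers, and the dictionaries standing in for real index structures.

```python
from collections import defaultdict

# Hypothetical stored relation, reduced to (A, B, E); list position = record pointer.
R = [
    (3, "cat", "e1"),
    (3, "dog", "e2"),
    (4, "cat", "e3"),
    (3, "cat", "e4"),
]

# Build one index per attribute: value -> set of record pointers.
index_A = defaultdict(set)
index_B = defaultdict(set)
for rid, (a, b, e) in enumerate(R):
    index_A[a].add(rid)
    index_B[b].add(rid)

# Intersect the two pointer sets, then fetch only the matching tuples.
matching = index_A[3] & index_B["cat"]
result = [R[rid][2] for rid in sorted(matching)]   # project E at the end
```

The indexes exist on the stored relation R, not on an intermediate projection of it, which is why projecting first would give up this access path.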
Which are “good” transformations?
σ_{p1 ∧ p2}(R) → σ_{p1}[σ_{p2}(R)]
σ_p(R ⋈ S) → [σ_p(R)] ⋈ S
R ⋈ S → S ⋈ R
π_x[σ_p(R)] → π_x{σ_p[π_{xz}(R)]}
Bottom line:
• Some heuristics:
  – Early selection is usually good
• No transformation is always good
• Rule application defines a search space
  – Need cost criteria to make decision
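The idea of a search space with a cost criterion can be illustrated with join ordering. The sketch below is not the optimizer's actual method: the cardinalities, the selectivities, and the cost function (total size of intermediate results of a left-deep plan) are all hypothetical choices made for the example.

```python
from itertools import permutations

# Hypothetical cardinalities and pairwise join selectivities.
sizes = {"R": 1000, "S": 100, "T": 10}
selectivity = {
    frozenset(("R", "S")): 0.01,
    frozenset(("S", "T")): 0.1,
    frozenset(("R", "T")): 1.0,   # no join condition: behaves like a cross product
}

def plan_cost(order):
    # Assumed cost: sum of estimated intermediate result sizes, left-deep plan.
    joined = [order[0]]
    size = sizes[order[0]]
    cost = 0.0
    for name in order[1:]:
        factor = 1.0
        for prev in joined:
            factor *= selectivity[frozenset((prev, name))]
        size = size * sizes[name] * factor
        cost += size
        joined.append(name)
    return cost

# Commutativity and associativity make every ordering an equivalent plan.
plans = {order: plan_cost(order) for order in permutations(sizes)}
best = min(plans, key=plans.get)
```

All six orderings compute the same join, yet their estimated costs differ widely, so the rewrite rules alone cannot pick a plan.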
In textbook: more transformations
• Chapter 16.2, 16.3.3
• More rewrite rules
• Other operations, such as duplicate elimination