cs 255: database system principles slides: from parse trees to logical query plans by:- arunesh...
TRANSCRIPT
CS 255: Database System Principles
slides: From Parse Trees to Logical Query Plans
By:- Arunesh Joshi Id:-006538558
Agenda
• Conversion to Relational Algebra. • Removing Sub queries From Conditions.• Improving the Logical Query Plan. • Grouping Associative/Commutative
Operators.
Parsing
• Goal is to convert a text string containing a query into a parse tree data structure:– leaves form the text string (broken into lexical
elements)– internal nodes are syntactic categories
• Uses standard algorithmic techniques from compilers– given a grammar for the language (e.g., SQL),
process the string and build the tree
Example: SQL querySELECT titleFROM StarsInWHERE starName IN (
SELECT nameFROM MovieStarWHERE birthdate LIKE ‘%1960’
);
(Find the movies with stars born in 1960)
Assume we have a simplified grammar for SQL.
Example: Parse Tree<Query>
<SFW>
SELECT <SelList> FROM <FromList> WHERE <Condition>
<Attribute> <RelName> <Tuple> IN <Query>
title StarsIn <Attribute> ( <Query> )
starName <SFW>
<Attribute> <RelName> <Attribute> LIKE <Pattern>
name MovieStar birthDate ‘%1960’
SELECT <SelList> FROM <FromList> WHERE <Condition>
The Preprocessor
• It replaces each reference to a view with a parse (sub)-tree that describes the view (i.e., a query)
• It does semantic checking:– are relations and views mentioned in the schema?– are attributes mentioned in the current scope?– are attribute types correct?
Convert Parse Tree to Relational Algebra
• The complete algorithm depends on specific grammar, which determines forms of the parse trees
• Here is a flavor of the approach
Conversion
• Suppose there are no subqueries.
• SELECT att-list FROM rel-list WHERE cond
is converted into
PROJatt-list(SELECTcond(PRODUCT(rel-list))), or
att-list(cond( X (rel-list)))
SELECT movieTitleFROM StarsIn, MovieStarWHERE starName = name AND birthdate LIKE '%1960';
<Query>
<SFW>
SELECT <SelList> FROM <FromList> WHERE <Condition>
<Attribute> <RelName> , <FromList> AND <Condition>
movieTitle StarsIn <RelName> <Attribute> LIKE <Pattern>
MovieStar birthdate '%1960'
<Condition>
<Attribute> = <Attribute>
starName name
Equivalent Algebraic Expression Tree
movieTitle
starname = name AND birthdate LIKE '%1960'
X
StarsIn MovieStar
Handling Subqueries
• Recall the (equivalent) query:SELECT titleFROM StarsInWHERE starName IN (
SELECT nameFROM MovieStarWHERE birthdate LIKE ‘%1960’
);
• Use an intermediate format called two-argument selection
title
StarsIn <condition>
<tuple> IN name
<attribute> birthdate LIKE ‘%1960’
starName MovieStar
Example: Two-Argument Selection
Converting Two-Argument Selection
• To continue the conversion, we need rules for replacing two-argument selection with a relational algebra expression
• Different rules depending on the nature of the sub query
• Here is shown an example for IN operator and uncorrelated query (sub query computes a relation independent of the tuple being tested)
Rules for IN
R <Condition>
t IN S R S
X
C
C is the condition that equatesattributes in t with correspondingattributes in S
Example: Logical Query Plan
title
starName=name
StarsIn name
birthdate LIKE ‘%1960’
MovieStar
What if Subquery is Correlated?
• Example is when subquery refers to the current tuple of the outer scope that is being tested
• More complicated to deal with, since subquery cannot be translated in isolation
• Need to incorporate external attributes in the translation
• Some details are in textbook
Improving the Logical Query Plan
• There are numerous algebraic laws concerning relational algebra operations
• By applying them to a logical query plan judiciously, we can get an equivalent query plan that can be executed more efficiently
• Next we'll survey some of these laws
Example: Improved Logical Query Plan
title
starName=name
StarsIn name
birthdate LIKE ‘%1960’
MovieStar
Associative and Commutative Operations
• product• natural join• set and bag union• set and bag intersection
associative: (A op B) op C = A op (B op C)commutative: A op B = B op A
Laws Involving Selection
• Selections usually reduce the size of the relation
• Usually good to do selections early, i.e., "push them down the tree"
• Also can be helpful to break up a complex selection into parts
Selection Splitting
• C1 AND C2 (R) = C1 ( C2 (R))
• C1 OR C2 (R) = ( C1 (R)) Uset ( C2 (R))if R is a set
• C1 ( C2 (R)) = C2 ( C1 (R))
Selection and Binary Operators
• Must push selection to both arguments:– C (R U S) = C (R) U C (S)
• Must push to first arg, optional for 2nd:– C (R - S) = C (R) - S
– C (R - S) = C (R) - C (S)
• Push to at least one arg with all attributes mentioned in C:– product, natural join, theta join, intersection– e.g., C (R X S) = C (R) X S, if R has all the atts in C
Pushing Selection Up the Tree
• Suppose we have relations– StarsIn(title,year,starName)– Movie(title,year,len,inColor,studioName)
• and a view– CREATE VIEW MoviesOf1996 AS
SELECT *FROM MovieWHERE year = 1996;
• and the query– SELECT starName, studioName
FROM MoviesOf1996 NATURAL JOIN StarsIn;
The Straightforward Tree
starName,studioName
year=1996 StarsIn
MovieRemember the ruleC(R S) = C(R) S ?
The Improved Logical Query Plan
starName,studioName
year=1996 StarsIn
Movie
starName,studioName
year=1996
Movie StarsIn
starName,studioName
year=1996 year=1996
Movie StarsIn
push selectionup tree
push selectiondown tree
Grouping Assoc/Comm Operators
• Groups together adjacent joins, adjacent unions, and adjacent intersections as siblings in the tree
• Sets up the logical QP for future optimization when physical QP is constructed: determine best order for doing a sequence of joins (or unions or intersections)
U D E FU
UA
B C
D E F
A B C
Thank You