Post on 21-Dec-2015
Discussion #23 1/32
Discussion #23
Relational Algebra
Discussion #23 2/32
Topics
• Algebras
• Relational Algebra
  – use of standard notation
  – set operators ∪, ∩, −
  – renaming ρ
  – selection σ
  – projection π
  – cross product ×
  – join |×|
• Queries (from English)
• Query optimization
• SQL
Discussion #23 3/32
Relational Algebra
• What is an algebra?
  – a pair: (set of values, set of operations)
  – related notions: ADT, type, class, object
  e.g. stack: (set of all stacks, {pop, push, top, …})
       integer: (set of all integers, {+, -, *, ÷})
• What is relational algebra?
  – (set of relations, set of relational operators)
  – {∪, ∩, −, ρ, σ, π, ×, |×|}
Discussion #23 4/32
Relational Algebra is Closed
• Closed: all operations produce values in the value set
  – (reals, {+, *, −}) closed
  – (reals, {+, *, −, ÷}) not closed (divide by 0)
  – (reals, {+, *, >}) not closed (T/F not in value set)
  – (computer reals, {+, *, −}) not closed (overflow, roundoff)
  – (relations, relational operators) closed
• Implication: we can always nest relational operators; we can't for algebras that are not closed.
  – e.g. after overflow, can do nothing
  – e.g. can't always nest: (2 < 3) + 5 = ?
Discussion #23 5/32
Set Operations: ∪, ∩, and −
• Relations are sets; thus set operations should work.
• Examples:
R =  A  B
     1  2
     2  2
     2  3
S =  A  B
     2  2
     2  3
     4  2
     5  5
R ∪ S =  A  B
         1  2
         2  2
         2  3
         4  2
         5  5
R ∩ S =  A  B
         2  2
         2  3
R − S =  A  B
         1  2
S − R =  A  B
         4  2
         5  5
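The four results above can be checked mechanically. The following is a minimal sketch, assuming each relation is modelled as a Python set of (A, B) tuples written in schema order; the variable names are illustrative and not part of the slides.

```python
# Tuples are written in schema order (A, B); a relation is a set of tuples.
R = {(1, 2), (2, 2), (2, 3)}
S = {(2, 2), (2, 3), (4, 2), (5, 5)}

union = R | S          # R ∪ S
intersection = R & S   # R ∩ S
r_minus_s = R - S      # R − S
s_minus_r = S - R      # S − R

for name, rel in [("R ∪ S", union), ("R ∩ S", intersection),
                  ("R − S", r_minus_s), ("S − R", s_minus_r)]:
    print(name, sorted(rel))
```

Because both operands use the same schema AB, they are union compatible, which is what makes the plain set operators meaningful here.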
Discussion #23 6/32
Set Operations (continued …)
• Definition: schema(R) = {A, B} = AB, i.e. the set of attributes
• We sometimes write R(AB) to mean the relation R with schema AB.
• Definition: union compatible
  – schema(R) = schema(S)
  – required precondition for ∪, ∩, −
• Definitions:
  – R ∪ S = { t | t ∈ R ∨ t ∈ S }
  – R ∩ S = { t | t ∈ R ∧ t ∈ S }
  – R − S = { t | t ∈ R ∧ t ∉ S }
Discussion #23 7/32
Tuple Restriction: [X]
• Restriction is a tuple operator (not a relational operator).
• t[X] restricts tuple t to the attributes in X.
      A  B  C
t  =  1  2  3
t[A] = (1)    t[AC] = (1,3)
t = (1,2,3)
t[A] = (1,2,3)[A]
     = {(A,1), (B,2), (C,3)}[A]
     = {(A,1)}
     = (1)
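Tuple restriction can be sketched in a few lines, assuming a tuple is modelled as a frozenset of (attribute, value) pairs, mirroring the {(A,1), (B,2), (C,3)} view above. The helper name `restrict` is hypothetical.

```python
def restrict(t, attrs):
    """t[X]: keep only the (attribute, value) pairs whose attribute is in attrs."""
    return frozenset((a, v) for (a, v) in t if a in attrs)

t = frozenset({("A", 1), ("B", 2), ("C", 3)})
t_a = restrict(t, {"A"})         # t[A]
t_ac = restrict(t, {"A", "C"})   # t[AC]
print(sorted(t_a), sorted(t_ac))
```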
Discussion #23 8/32
Renaming: ρ
• ρ_{A←B} R renames attribute A to be B.
  – A must be in schema(R)
  – B must not be in schema(R)
• Example: let
R =  A  B
     1  2
     2  2
     2  3
Q =  A  C
     2  2
     3  2
R ∪ Q = ?  Not union compatible
• But with ρ:
ρ_{C←B} Q =  A  B
             2  2
             3  2
R ∪ ρ_{C←B} Q =  A  B
                 1  2
                 2  2
                 2  3
                 3  2
Discussion #23 9/32
Renaming (continued…)
• Q = ρ_{A←B} R renames attribute A to B; the result is Q.
• Precondition:
  – A ∈ schema(R)
  – B ∉ schema(R)
• Postcondition:
  – schema(Q) = (schema(R) − {A}) ∪ {B}
  – Q = { t' | ∃t (t ∈ R ∧ t' = (t − {(A, t[A])}) ∪ {(B, t[A])}) }
R = { {(A,1), (C,2)}, {(A,2), (C,2)} }
Q = ρ_{A←B} R = { {(B,1), (C,2)}, {(B,2), (C,2)} }
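The precondition and postcondition for renaming translate directly into code. This is a sketch under the assumption that a relation is a non-empty set of frozensets of (attribute, value) pairs; `schema` and `rename` are hypothetical helper names, and reading the schema from an arbitrary tuple is a shortcut that does not work for empty relations.

```python
def schema(rel):
    """Attributes of a non-empty relation, read off an arbitrary tuple."""
    return {a for (a, _) in next(iter(rel))}

def rename(rel, old, new):
    """Rename attribute old to new, enforcing both preconditions."""
    attrs = schema(rel)
    if old not in attrs:
        raise ValueError(f"{old} is not in the schema")
    if new in attrs:
        raise ValueError(f"{new} is already in the schema")
    # Replace the pair (old, v) by (new, v) in every tuple.
    return {frozenset((new if a == old else a, v) for (a, v) in t) for t in rel}

R = {frozenset({("A", 1), ("C", 2)}), frozenset({("A", 2), ("C", 2)})}
Q = rename(R, "A", "B")
print(sorted(sorted(t) for t in Q))
```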
Discussion #23 10/32
Selection: σ
• The selection operation selects the tuples that satisfy a condition.
R =  A  B
     1  2
     2  2
     2  3
σ_{A=1} R =  A  B
             1  2
σ_{B=2} R =  A  B
             1  2
             2  2
σ_{A=2 ∧ B≥2} R =  A  B
                   2  2
                   2  3
σ_{A=3} R =  A  B
Note: empty, but still retain the schema
• Precondition: each attribute mentioned in P must be in schema(R).
• Postcondition: σ_{P} R = { t | t ∈ R ∧ P(t) }
                 schema(σ_{P} R) = schema(R)
Meaning: apply predicate P to tuple t by substituting into P appropriate t values.
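Selection is a filter over the set of tuples. The sketch below assumes tuples are frozensets of (attribute, value) pairs and that the predicate P is passed as a Python function over a dict view of the tuple; `row` and `select` are hypothetical helper names.

```python
def row(**values):
    """Build a tuple as a frozenset of (attribute, value) pairs."""
    return frozenset(values.items())

def select(rel, pred):
    """Selection: keep the tuples t of rel for which pred(t) holds."""
    return {t for t in rel if pred(dict(t))}

R = {row(A=1, B=2), row(A=2, B=2), row(A=2, B=3)}

a_is_1 = select(R, lambda t: t["A"] == 1)
b_is_2 = select(R, lambda t: t["B"] == 2)
a_is_3 = select(R, lambda t: t["A"] == 3)   # no tuple qualifies
print(len(a_is_1), len(b_is_2), len(a_is_3))
```

A bare Python set cannot carry a schema when it is empty, so the "empty, but still retain the schema" point would need a separate schema field in a fuller model.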
Discussion #23 11/32
Projection: π
The projection operation restricts the tuples in a relation to the attributes designated in the operation.
R =  A  B
     1  2
     2  2
     2  3
Q =  A  B  C
     1  1  1
     2  1  1
     3  4  5
π_{A} R =  A
           1
           2
π_{B} R =  B
           2
           3
π_{BC} Q =  B  C
            1  1
            4  5
π_{AB} R = R = π_{A,B} R = π_{{A,B}} R
Precondition: X ⊆ schema(R)
Postcondition: π_{X} R = { t' | ∃t (t ∈ R ∧ t' = t[X]) }
               schema(π_{X} R) = X
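Projection applies tuple restriction to every tuple, and the set representation removes the duplicates that result. A minimal sketch, assuming tuples are frozensets of (attribute, value) pairs; `row` and `project` are hypothetical helper names.

```python
def row(**values):
    """Build a tuple as a frozenset of (attribute, value) pairs."""
    return frozenset(values.items())

def project(rel, attrs):
    """Projection: restrict every tuple to attrs; the set drops duplicates."""
    return {frozenset((a, v) for (a, v) in t if a in attrs) for t in rel}

R = {row(A=1, B=2), row(A=2, B=2), row(A=2, B=3)}
Q = {row(A=1, B=1, C=1), row(A=2, B=1, C=1), row(A=3, B=4, C=5)}

p_a = project(R, {"A"})
p_b = project(R, {"B"})
p_bc = project(Q, {"B", "C"})
print(len(p_a), len(p_b), len(p_bc))
```

Note that each projection has fewer tuples than its input: (2,2) and (2,3) collapse to one A value, and the first two tuples of Q collapse to one (B, C) pair.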
Discussion #23 12/32
Cross Product: ×
Standard Cartesian product adapted for relational algebra
R =  A  B
     1  2
     2  2
S =  C  D
     1  1
     2  2
     3  3
R × S =  A  B  C  D
         1  2  1  1
         1  2  2  2
         1  2  3  3
         2  2  1  1
         2  2  2  2
         2  2  3  3
Discussion #23 13/32
Cross Product (continued…)
R =  A  B
     1  2   = t'
     2  2
S =  C  D
     1  1
     2  2
     3  3   = t''
t' = { (A,1), (B,2) }
t'' = { (C,3), (D,3) }
t' ∪ t'' = { (A,1), (B,2), (C,3), (D,3) }
Precondition: schema(R) ∩ schema(S) = ∅
Postcondition: R × S = { t | ∃t' ∃t'' (t' ∈ R ∧ t'' ∈ S ∧ t = t' ∪ t'') }
               schema(R × S) = schema(R) ∪ schema(S)
Discussion #23 14/32
Cross Product (continued…)
What if R and S have the same attribute, e.g. A?
R =  A  B
     1  2   = t' = { (A,1), (B,2) }
     2  2
S =  C  A
     1  1   = t'' = { (C,1), (A,1) }
     2  2
     3  3   = t''' = { (C,3), (A,3) }
t' ∪ t'' = { (A,1), (B,2), (C,1), (A,1) }
Can't do cross product.
Solution: rename, ρ_{A←A'} S
R × ρ_{A←A'} S =  A  B  C  A'
                  1  2  1  1
                  1  2  2  2
                  1  2  3  3
                  2  2  1  1
                  2  2  2  2
                  2  2  3  3
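The disjoint-schema precondition and the rename workaround can be sketched together. This assumes relations are non-empty sets of frozensets of (attribute, value) pairs; `row`, `schema`, `rename`, and `cross` are hypothetical helper names.

```python
def row(**values):
    """Build a tuple as a frozenset of (attribute, value) pairs."""
    return frozenset(values.items())

def schema(rel):
    """Attributes of a non-empty relation, read off an arbitrary tuple."""
    return {a for (a, _) in next(iter(rel))}

def rename(rel, old, new):
    """Rename attribute old to new in every tuple."""
    return {frozenset((new if a == old else a, v) for (a, v) in t) for t in rel}

def cross(r, s):
    """Cross product: union every pair of tuples; schemas must be disjoint."""
    shared = schema(r) & schema(s)
    if shared:
        raise ValueError(f"schemas share attributes: {sorted(shared)}")
    return {t1 | t2 for t1 in r for t2 in s}

R = {row(A=1, B=2), row(A=2, B=2)}
S = {row(C=1, A=1), row(C=2, A=2), row(C=3, A=3)}

product = cross(R, rename(S, "A", "A'"))   # rename first, then cross
print(len(product))
```

Calling `cross(R, S)` directly raises `ValueError`, which is the "can't do cross product" case on the slide.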
Discussion #23 15/32
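Natural Join: |×|
R =  A  B
     1  2
     2  2
S =  B  C
     1  2
     2  1
     3  2
R |×| S = π_{ABC} σ_{B=B'} (R × ρ_{B←B'} S)
Renaming:
ρ_{B←B'} S =  B'  C
              1   2
              2   1
              3   2
Cross Product:
R × ρ_{B←B'} S =  A  B  B'  C
                  1  2  1   2
                  1  2  2   1
                  1  2  3   2
                  2  2  1   2
                  2  2  2   1
                  2  2  3   2
Selection:
σ_{B=B'} (R × ρ_{B←B'} S) =  A  B  B'  C
                             1  2  2   1
                             2  2  2   1
Projection:
R |×| S =  A  B  C
           1  2  1
           2  2  1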
Discussion #23 16/32
Join (continued …)
• In general, we can equate 0, 1, 2, or more attributes using |×|.
• A join is defined as:
  schema(R |×| S) = schema(R) ∪ schema(S)
  R |×| S = { t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ S }
• There are no preconditions; join always works.
Discussion #23 17/32
Join (continued…)
0 attributes in common (full cross product):
R =  A  B
     1  1
     2  3
     4  1
S =  C  D
     1  1
     1  5
R |×| S =  A  B  C  D
           1  1  1  1
           1  1  1  5
           2  3  1  1
           2  3  1  5
           4  1  1  1
           4  1  1  5
1 attribute in common:
R =  A  B
     1  2
     2  2
     2  3
S =  B  C
     1  1
     2  2
     3  3
R |×| S =  A  B  C
           1  2  2
           2  2  2
           2  3  3
2 attributes in common:
R =  A  B  C
     1  2  3
     2  2  4
     2  3  5
S =  A  B  D
     1  1  1
     2  2  2
     2  2  1
R |×| S =  A  B  C  D
           2  2  4  2
           2  2  4  1
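All three cases fall out of one definition: pair up tuples that agree on every shared attribute. The following sketch assumes tuples are frozensets of (attribute, value) pairs; `row` and `natural_join` are hypothetical helper names, and the nested loop is chosen for clarity rather than speed.

```python
def row(**values):
    """Build a tuple as a frozenset of (attribute, value) pairs."""
    return frozenset(values.items())

def natural_join(r, s):
    """Natural join: merge tuple pairs that agree on all shared attributes."""
    result = set()
    for t1 in r:
        d1 = dict(t1)
        for t2 in s:
            d2 = dict(t2)
            # With no shared attributes this test is vacuously true,
            # so the join degenerates to the full cross product.
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                result.add(t1 | t2)
    return result

# 0 attributes in common
j0 = natural_join({row(A=1, B=1), row(A=2, B=3), row(A=4, B=1)},
                  {row(C=1, D=1), row(C=1, D=5)})
# 1 attribute in common
j1 = natural_join({row(A=1, B=2), row(A=2, B=2), row(A=2, B=3)},
                  {row(B=1, C=1), row(B=2, C=2), row(B=3, C=3)})
# 2 attributes in common
j2 = natural_join({row(A=1, B=2, C=3), row(A=2, B=2, C=4), row(A=2, B=3, C=5)},
                  {row(A=1, B=1, D=1), row(A=2, B=2, D=2), row(A=2, B=2, D=1)})
print(len(j0), len(j1), len(j2))
```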
Discussion #23 18/32
Join (continued…)
• We can use renaming to control the |×|
R =  A  B
     1  2
     2  2
S =  B  C
     1  2
     2  1
     3  2
R |×| ρ_{C←A} S =  A  B
                   1  2
S' = ρ_{C←A} S =  B  A        A  B
                  1  2    =   2  1
                  2  1        1  2
                  3  2        2  3
R |×| S' =  A  B
            1  2
• BTW, observe equivalence with intersection
Discussion #23 19/32
Relational Algebra Expressions
• Relational operators are closed. Thus we can nest expressions:
R =  A  B
     1  2
     3  4
S =  B  C  D
     2  5  1
     2  7  2
     3  2  3
     4  5  4
R |×| S =  A  B  C  D
           1  2  5  1
           1  2  7  2
           3  4  5  4
π_{D} σ_{C=5} (R |×| S) =  D
                           1
                           4
• Unary operators have precedence over binary operators; binary operators are left associative.
• We can now do something very useful: ask and answer with relational algebra (almost) any query we can dream up.
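Closure is what lets the output of one operator feed the next. The sketch below evaluates the nested expression "project D from (select C = 5 from (R join S))" on the relations of this slide, assuming tuples are frozensets of (attribute, value) pairs; all function names are hypothetical.

```python
def row(**values):
    """Build a tuple as a frozenset of (attribute, value) pairs."""
    return frozenset(values.items())

def select(rel, pred):
    return {t for t in rel if pred(dict(t))}

def project(rel, attrs):
    return {frozenset((a, v) for (a, v) in t if a in attrs) for t in rel}

def natural_join(r, s):
    return {t1 | t2 for t1 in r for t2 in s
            if all(dict(t1)[a] == dict(t2)[a]
                   for a in dict(t1).keys() & dict(t2).keys())}

R = {row(A=1, B=2), row(A=3, B=4)}
S = {row(B=2, C=5, D=1), row(B=2, C=7, D=2),
     row(B=3, C=2, D=3), row(B=4, C=5, D=4)}

joined = natural_join(R, S)
# Every operator returns a relation, so the calls nest freely.
answer = project(select(joined, lambda t: t["C"] == 5), {"D"})
print(sorted(dict(t)["D"] for t in answer))
```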
Discussion #23 20/32
Relational Algebra Queries
• List the prerequisites for EE200.
  π_{Prerequisite} σ_{Course='EE200'} cp =  Prerequisite
                                            EE005
                                            CS100
• When does CS101 meet?
  π_{Day,Hour} σ_{Course='CS101'} cdh =  Day  Hour
                                         M    9AM
                                         W    9AM
                                         F    9AM
• When and where does EE200 meet?
  π_{Day,Hour,Room} σ_{Course='EE200'} (cdh |×| cr) =  Day  Hour  Room
                                                       Tu   10AM  25 Ohm Hall
                                                       W    1PM   25 Ohm Hall
                                                       Th   10AM  25 Ohm Hall
  Our answers are in (cdh |×| cr). We select Course to be EE200. Then, project on Day, Hour, Room.
Discussion #23 21/32
Queries (continued…)
• Where can I find Snoopy at 9 am on Monday?
  π_{Room} σ_{Name='Snoopy' ∧ Day='M' ∧ Hour='9AM'} (snap |×| csg |×| cr |×| cdh) =  Room
                                                                                    Turing Aud.
  snap: StudentID, Name ('Snoopy'), Address, Phone
  csg:  Course, StudentID, Grade
  cr:   Course, Room*
  cdh:  Course, Day ('M'), Hour ('9AM')
• Can we rewrite the query so that it runs more efficiently?
• What rules should we use?
  – Associativity and commutativity of join
  – Distributive laws for select and project
• What strategy should we use?
  – Eliminate unnecessary operations
  – Make joins as small as possible before execution
Discussion #23 22/32
Query Optimization
• "Intuitively" we can write
  π_{Room} σ_{Name='Snoopy' ∧ Day='M' ∧ Hour='9AM'} (snap |×| csg |×| cr |×| cdh)
  as
  π_{Room} (σ_{Name='Snoopy'} snap |×| csg |×| cr |×| σ_{Day='M' ∧ Hour='9AM'} cdh)
• Why does this execute faster?
• What laws hold that will let us do this?
  R |×| S = S |×| R
  σ_{P1 ∧ P2} E = σ_{P1} σ_{P2} E
  σ_{P} (R |×| S) = R |×| σ_{P} S (if all the attributes of P are in S)
• How do we know they hold?
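Before proving the laws, it can help to spot-check them on small data. The sketch below is only an empirical check, not a proof; it assumes tuples are frozensets of (attribute, value) pairs, and the relations, predicates, and function names are illustrative choices rather than anything given in the slides.

```python
def row(**values):
    """Build a tuple as a frozenset of (attribute, value) pairs."""
    return frozenset(values.items())

def select(rel, pred):
    return {t for t in rel if pred(dict(t))}

def natural_join(r, s):
    return {t1 | t2 for t1 in r for t2 in s
            if all(dict(t1)[a] == dict(t2)[a]
                   for a in dict(t1).keys() & dict(t2).keys())}

def c_is_5(t):      # predicate over attribute C, which only S has
    return t["C"] == 5

def d_gt_1(t):      # predicate over attribute D, which only S has
    return t["D"] > 1

R = {row(A=1, B=2), row(A=3, B=4)}
S = {row(B=2, C=5, D=1), row(B=2, C=7, D=2),
     row(B=3, C=2, D=3), row(B=4, C=5, D=4)}

# Law 1: join is commutative.
commutes = natural_join(R, S) == natural_join(S, R)

# Law 2: a conjunction splits into cascaded selections.
conj = select(S, lambda t: c_is_5(t) and d_gt_1(t))
cascade = select(select(S, d_gt_1), c_is_5)

# Law 3: selection pushes through join when P only mentions S's attributes.
select_late = select(natural_join(R, S), c_is_5)
select_early = natural_join(R, select(S, c_is_5))

print(commutes, conj == cascade, select_late == select_early)
```

Selecting early shrinks S from four tuples to two before the join, which is the "make joins as small as possible" strategy in miniature.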
Discussion #23 23/32
Proofs for Laws
• To prove σ_{P1 ∧ P2} E = σ_{P1} σ_{P2} E, we need to prove that two sets are equal. We prove A = B by showing A ⊆ B ∧ B ⊆ A. We show that A ⊆ B by showing that x ∈ A ⇒ x ∈ B.
• Thus, we can do two proofs to prove σ_{P1 ∧ P2} E = σ_{P1} σ_{P2} E as follows:
  1. t ∈ σ_{P1 ∧ P2} E               premise
  2. t ∈ E ∧ (P1 ∧ P2)(t)            def.: σ_{P} R = { t | t ∈ R ∧ P(t) }
  3. t ∈ E ∧ P1(t) ∧ P2(t)           identical substitutions & operations
  4. t ∈ E ∧ P2(t) ∧ P1(t)           commutative
  5. t ∈ σ_{P2} E ∧ P1(t)            def. of σ
  6. t ∈ σ_{P1} σ_{P2} E             def. of σ

  1. t ∈ σ_{P1} σ_{P2} E             premise
  2. … just go backwards from 6 to 1 in the proof above
Discussion #23 24/32
Alternate Proof
Thus, we can prove σ_{P1 ∧ P2} E = σ_{P1} σ_{P2} E as follows:
(Derive the right-hand side from the left-hand side.)
  σ_{P1 ∧ P2} E
  = { t | t ∈ E ∧ (P1 ∧ P2)(t) }      def.: σ_{P} R = { t | t ∈ R ∧ P(t) }
  = { t | t ∈ E ∧ P1(t) ∧ P2(t) }     identical substitutions & operations
  = { t | t ∈ E ∧ P2(t) ∧ P1(t) }     commutative
  = { t | t ∈ σ_{P2} E ∧ P1(t) }      def. of σ
  = { t | t ∈ σ_{P1} σ_{P2} E }       def. of σ
  = σ_{P1} σ_{P2} E                   def. of a relation
Discussion #23 25/32
Proofs for Laws (continued …)
• To prove σ_{P} (R |×| S) = R |×| σ_{P} S, where all attributes of P are in S, we again need to prove that two sets are equal.
• As before, we can convert the lhs to the rhs.
  σ_{P} (R |×| S)
  = { t | t ∈ σ_{P} (R |×| S) }                                     def. of a relation
  = { t | t ∈ R |×| S ∧ P(t) }                                      def.: σ_{P} R = { t | t ∈ R ∧ P(t) }
  = { t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ S ∧ P(t) }              def.: R |×| S = { t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ S }
  = { t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ S ∧ P(t[schema(S)]) }   all attributes of P are in S
  = { t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ σ_{P} S }               def. of σ
  = { t | t ∈ R |×| σ_{P} S }                                       def. of |×|
  = R |×| σ_{P} S                                                   def. of a relation
Discussion #23 26/32
SQL
Correspondence with Relational Algebra
Assume we have relations R(AB) and S(BC).

  select A
  from R
  where B = 1
  Relational algebra: π_{A} σ_{B=1} R

  select B from R
  except
  select B from S
  Relational algebra: π_{B} R − π_{B} S

  select A, R.B, C
  from R, S
  where R.B = S.B
  Relational algebra: π_{A, R.B, C} σ_{R.B = S.B} (R × S) = R |×| S
Discussion #23 27/32
SQL
Correspondence with Relational Algebra
Assume we have relations R(AB) and S(BC).

  select A
  from R
  where B = 1
  Relational algebra: π_{A} σ_{B=1} R

  select R.B from R
  where R.B not in (select S.B from S)
  Relational algebra: π_{B} R − π_{B} S

  select *
  from R natural join S
  Relational algebra: R |×| S
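The correspondences can be tried out with Python's built-in `sqlite3` module. This is a sketch on made-up sample rows (the slides give no data for this R and S), using SQLite's dialect; the helper `q` is hypothetical and sorts rows only to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table R (A integer, B integer);
    create table S (B integer, C integer);
    -- Sample rows are assumptions made for this illustration.
    insert into R values (1, 1), (2, 2), (3, 1), (4, 5);
    insert into S values (1, 7), (2, 8), (3, 9);
""")

def q(sql):
    """Run a query and return its rows in sorted order."""
    return sorted(conn.execute(sql).fetchall())

proj_sel = q("select A from R where B = 1")
diff_except = q("select B from R except select B from S")
diff_not_in = q("select R.B from R where R.B not in (select S.B from S)")
join_where = q("select A, R.B, C from R, S where R.B = S.B")
join_natural = q("select * from R natural join S")

print(proj_sel, diff_except, diff_not_in)
print(join_where == join_natural)
```

One difference worth keeping in mind: SQL tables are bags, so the `not in` form can return duplicate rows where `except` and the algebraic difference would not; on this sample data the two happen to agree.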
Discussion #23 28/32
SQL Queries
• List the prerequisites for EE200.
  select Prerequisite
  from cp
  where Course = 'EE200'

  Prerequisite
  EE005
  CS100
• When does CS101 meet?
  select Day, Hour
  from cdh
  where Course = 'CS101'

  Day  Hour
  M    9AM
  W    9AM
  F    9AM
• When and where does EE200 meet?
  select cdh.Course, Day, Hour, Room
  from cdh, cr
  where cdh.Course = 'EE200' and cdh.Course = cr.Course

  Course  Day  Hour  Room
  EE200   Tu   10AM  25 Ohm Hall
  EE200   W    1PM   25 Ohm Hall
  EE200   Th   10AM  25 Ohm Hall
Discussion #23 29/32
SQL Queries
• List the prerequisites for EE200.
  select Prerequisite
  from cp
  where Course = 'EE200'

  Prerequisite
  EE005
  CS100
• When does CS101 meet?
  select Day, Hour
  from cdh
  where Course = 'CS101'

  Day  Hour
  M    9AM
  W    9AM
  F    9AM
• When and where does EE200 meet?
  select Course, Day, Hour, Room
  from cdh natural join cr
  where cdh.Course = 'EE200'

  Course  Day  Hour  Room
  EE200   Tu   10AM  25 Ohm Hall
  EE200   W    1PM   25 Ohm Hall
  EE200   Th   10AM  25 Ohm Hall
Discussion #23 30/32
SQL Queries
• List all prerequisite courses.
  select Prerequisite
  from cp

  Prerequisite
  CS100
  EE005
  CS100
  CS101
  CS120
  CS101
  CS121
  CS205

  select distinct Prerequisite
  from cp

  Prerequisite
  CS100
  CS101
  CS120
  CS121
  CS205
  EE005
Discussion #23 31/32
SQL Queries
• Where can I find Snoopy at 9 am on Monday?
  select Room
  from snap, csg, cr, cdh
  where Name = 'Snoopy' and Day = 'M' and Hour = '9AM'
    and snap.StudentID = csg.StudentID
    and csg.Course = cr.Course
    and cr.Course = cdh.Course

  Room
  Turing Aud.
• List all prereqs of CS750 (including prereqs of prereqs.)
• Not possible with the basic select-from-where SQL used here (unless nesting depth is known); recursive queries (WITH RECURSIVE) were added in SQL:1999
• Is possible with Datalog
  Rules: prereqOf(x, y) :- cp(y, x).
         prereqOf(x, y) :- prereqOf(x, z), cp(y, z).
  Query: prereqOf(x, 'CS750')?
• To gain more power and flexibility, we typically embed SQL in a high-level language.
Discussion #23 32/32
SQL Queries
• List all prereqs of CS750 (including prereqs of prereqs.)
  select cp.Prerequisite
  from cp
  where cp.Course = 'CS750'
  union
  select cp1.Prerequisite
  from cp cp1, cp cp2
  where cp1.Course = cp2.Prerequisite
    and cp2.Course = 'CS750'
  union
  …
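The chain of unions has no fixed length because the prerequisite depth is not known in advance. The two Datalog rules for prereqOf avoid this by being applied until nothing new is derived. The sketch below shows that fixpoint idea in Python, assuming cp is a set of (Course, Prerequisite) pairs; the sample rows and the function name `prereq_closure` are made up for illustration.

```python
def prereq_closure(cp):
    """Return all pairs (x, y) where x is a direct or indirect prerequisite of y."""
    # Rule 1: prereqOf(x, y) :- cp(y, x).
    prereq_of = {(p, c) for (c, p) in cp}
    while True:
        # Rule 2: prereqOf(x, y) :- prereqOf(x, z), cp(y, z).
        derived = {(x, y)
                   for (x, z) in prereq_of
                   for (y, z2) in cp
                   if z == z2}
        if derived <= prereq_of:      # fixpoint: nothing new was derived
            return prereq_of
        prereq_of |= derived

# Hypothetical sample data: (Course, Prerequisite) pairs.
cp = {("CS750", "CS700"), ("CS700", "CS300"),
      ("CS300", "CS101"), ("EE200", "EE005")}

# Query: prereqOf(x, 'CS750')?
answer = sorted(x for (x, y) in prereq_closure(cp) if y == "CS750")
print(answer)
```

This is also a small instance of embedding the query logic in a high-level language: the loop supplies the iteration that a fixed-depth union of select statements cannot.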