slides 7 relational algebra
TRANSCRIPT
-
8/3/2019 Slides 7 Relational Algebra
1/16
1
Relational Algebra
Lecture 7
2
Outline
Relational Algebra (Section 6.1)
3
Relational Algebra
Formalism for creating new relations from existing ones
Its place in the big picture:
Declarative query language (SQL, relational calculus) → Algebra (relational algebra) → Implementation
4
Relational Algebra
Five operators:
Union: ∪
Difference: −
Selection: σ
Projection: Π
Cartesian Product: ×
Derived or auxiliary operators:
Intersection, complement
Joins (natural, equi-join, theta join, semi-join)
Renaming: ρ
5
1. Union and 2. Difference
R1 ∪ R2
Example: ActiveEmployees ∪ RetiredEmployees
R1 − R2
Example: AllEmployees − RetiredEmployees
6
What about Intersection?
It is a derived operator: R1 ∩ R2 = R1 − (R1 − R2)
Also expressed as a join (will see later)
Example
UnionizedEmployees ∩ RetiredEmployees
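The identity above can be checked on small data. The following is a minimal sketch that models relations as Python sets of tuples; the employee names are hypothetical sample data, not taken from the slides.

```python
# Relations modelled as sets of 1-tuples (an assumption made for illustration).
unionized = {("John",), ("Tony",), ("Alice",)}   # hypothetical UnionizedEmployees
retired = {("Tony",), ("Alice",), ("Bob",)}      # hypothetical RetiredEmployees

union = unionized | retired          # R1 ∪ R2
difference = unionized - retired     # R1 − R2

# Intersection derived from difference alone: R1 ∩ R2 = R1 − (R1 − R2)
derived_intersection = unionized - (unionized - retired)

print(sorted(union))
print(sorted(difference))
print(sorted(derived_intersection))
```

The derived result agrees with Python's built-in set intersection on this data, which is what the identity predicts.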
7
3. Selection
Returns all tuples which satisfy a condition
Notation: σ_c(R), where c is the condition (a different symbol is sometimes used)
Examples
σ_{Salary > 40000}(Employee)
σ_{name = 'Smith'}(Employee)
The condition c can use =, <, ≤, >, ≥, <>
[in SQL: SELECT * FROM Employee
WHERE Salary > 40000]
8
Selection Example
Find all employees with salary more than $40,000.
Employee
SSN        Name   DepartmentID  Salary
999999999  John   1             30,000
777777777  Tony   1             32,000
888888888  Alice  2             45,000
σ_{Salary > 40000}(Employee)
SSN        Name   DepartmentID  Salary
888888888  Alice  2             45,000
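The selection example can be sketched in a few lines of Python. This is an illustration only: representing a relation as a list of dicts and the helper name `select` are assumptions, not part of the lecture.

```python
# The Employee relation from the slide, as a list of dicts (one dict per tuple).
employee = [
    {"SSN": "999999999", "Name": "John", "DepartmentID": 1, "Salary": 30000},
    {"SSN": "777777777", "Name": "Tony", "DepartmentID": 1, "Salary": 32000},
    {"SSN": "888888888", "Name": "Alice", "DepartmentID": 2, "Salary": 45000},
]

def select(relation, condition):
    """Selection: keep exactly the tuples for which the condition holds."""
    return [row for row in relation if condition(row)]

# Selection with condition Salary > 40000
high_paid = select(employee, lambda row: row["Salary"] > 40000)
print(high_paid)
```

Selection never changes the schema; it only filters rows, which is why the result keeps all four attributes.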
9
4. Projection
Eliminates columns, then removes duplicates
Notation: Π_{A1,…,An}(R), where A1, …, An are the attributes (a different symbol is sometimes used)
Example: project to social-security number and names:
Π_{SSN, Name}(Employee)
Output schema: Answer(SSN, Name)
[In SQL: SELECT DISTINCT SSN, Name FROM Employee]
10
Projection Example
Employee
SSN        Name   DepartmentID  Salary
999999999  John   1             30,000
777777777  Tony   1             32,000
888888888  Alice  2             45,000
Π_{SSN, Name}(Employee)
SSN        Name
999999999  John
777777777  Tony
888888888  Alice
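A small Python sketch of set-semantics projection follows. The helper name `project` and the list-of-dicts representation are assumptions made for illustration. Projecting onto DepartmentID is added here because it makes the duplicate-elimination step visible, which the SSN, Name example cannot do.

```python
employee = [
    {"SSN": "999999999", "Name": "John", "DepartmentID": 1, "Salary": 30000},
    {"SSN": "777777777", "Name": "Tony", "DepartmentID": 1, "Salary": 32000},
    {"SSN": "888888888", "Name": "Alice", "DepartmentID": 2, "Salary": 45000},
]

def project(relation, attributes):
    """Projection: keep only the listed columns, then remove duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:          # duplicate elimination (set semantics)
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

ssn_name = project(employee, ["SSN", "Name"])
departments = project(employee, ["DepartmentID"])
print(ssn_name)
print(departments)
```

John and Tony share department 1, so the second projection returns two rows rather than three.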
11
5. Cartesian Product
Combine each tuple in R1 with each tuple in R2
Notation: R1 × R2
Example: Employee × Dependents
Very rare in practice; mainly used to express joins
[In SQL: SELECT * FROM R1, R2]
12
Cartesian Product Example
Employee
Name  SSN
John  999999999
Tony  777777777
Dependents
EmployeeSSN  Dname
999999999    Emily
777777777    Joe
Employee × Dependents
Name  SSN        EmployeeSSN  Dname
John  999999999  999999999    Emily
John  999999999  777777777    Joe
Tony  777777777  999999999    Emily
Tony  777777777  777777777    Joe
Note the output
13
Cartesian Product and SQL
select * from Employee, Dependents;
+------+-----------+-------------+-------+
| Name | SSN       | EmployeeSSN | Dname |
+------+-----------+-------------+-------+
| John | 999999999 | 999999999   | Emily |
| Tony | 777777777 | 999999999   | Emily |
| John | 999999999 | 777777777   | Joe   |
| Tony | 777777777 | 777777777   | Joe   |
+------+-----------+-------------+-------+
select * from Employee, Dependents where SSN=EmployeeSSN;
+------+-----------+-------------+-------+
| Name | SSN       | EmployeeSSN | Dname |
+------+-----------+-------------+-------+
| John | 999999999 | 999999999   | Emily |
| Tony | 777777777 | 777777777   | Joe   |
+------+-----------+-------------+-------+
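The two queries above correspond to a Cartesian product followed by a selection. A minimal Python sketch, assuming relations are lists of dicts with disjoint attribute names (the helper name `cartesian_product` is hypothetical):

```python
from itertools import product

employee = [
    {"Name": "John", "SSN": "999999999"},
    {"Name": "Tony", "SSN": "777777777"},
]
dependents = [
    {"EmployeeSSN": "999999999", "Dname": "Emily"},
    {"EmployeeSSN": "777777777", "Dname": "Joe"},
]

def cartesian_product(r1, r2):
    """Cartesian product: pair every tuple of r1 with every tuple of r2.

    Assumes the two relations have disjoint attribute names.
    """
    return [{**a, **b} for a, b in product(r1, r2)]

# select * from Employee, Dependents
pairs = cartesian_product(employee, dependents)

# select * from Employee, Dependents where SSN = EmployeeSSN
matched = [row for row in pairs if row["SSN"] == row["EmployeeSSN"]]

print(len(pairs), len(matched))
```

The product has 2 × 2 = 4 rows; the selection keeps only the 2 rows in which the employee and the dependent actually belong together.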
14
Relational Algebra
Five operators:
Union: ∪
Difference: −
Selection: σ
Projection: Π
Cartesian Product: ×
Derived or auxiliary operators:
Intersection, complement
Joins (natural, equi-join, theta join, semi-join)
Renaming: ρ
15
Renaming
Changes the schema, not the instance
Schema: R(A1, …, An)
Notation: ρ_{B1,…,Bn}(R)
Example:
ρ_{LastName, SocSocNo}(Employee)
Output schema: Answer(LastName, SocSocNo)
[in SQL: SELECT Name AS LastName, SSN AS SocSocNo
FROM Employee]
16
Renaming Example
Employee
Name  SSN
John  999999999
Tony  777777777
ρ_{LastName, SocSocNo}(Employee)
LastName  SocSocNo
John      999999999
Tony      777777777
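Renaming can be sketched as follows. Because Python dicts have no fixed column positions, this hypothetical `rename` helper takes the old attribute names explicitly, whereas the algebraic notation lists only the new names in schema order.

```python
employee = [
    {"Name": "John", "SSN": "999999999"},
    {"Name": "Tony", "SSN": "777777777"},
]

def rename(relation, old_names, new_names):
    """Renaming: change attribute names only; the stored values are untouched."""
    mapping = dict(zip(old_names, new_names))
    return [{mapping[attr]: value for attr, value in row.items()} for row in relation]

renamed = rename(employee, ["Name", "SSN"], ["LastName", "SocSocNo"])
print(renamed)
```

The values in every row are identical before and after; only the schema differs.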
17
Natural Join
Notation: R1 ⋈ R2
Meaning: R1 ⋈ R2 = Π_A(σ_C(R1 × R2))
Where:
The selection σ_C checks equality of all common attributes (equality on common attribute names)
The projection Π_A eliminates the duplicate common attributes
[in SQL: SELECT DISTINCT R1.A, R1.B, R2.C FROM R1, R2 WHERE R1.B = R2.B
Schema: R1(A,B), R2(B,C)]
18
Natural Join Example
Employee
Name  SSN
John  999999999
Tony  777777777
Dependents
SSN        Dname
999999999  Emily
777777777  Joe
Employee ⋈ Dependents =
Π_{Name, SSN, Dname}(σ_{SSN=SSN2}(Employee × ρ_{SSN2, Dname}(Dependents)))
Name  SSN        Dname
John  999999999  Emily
Tony  777777777  Joe
19
Natural Join
R =
A  B
X  Y
X  Z
Y  Z
Z  V
S =
B  C
Z  U
V  W
Z  V
R ⋈ S =
A  B  C
X  Z  U
X  Z  V
Y  Z  U
Y  Z  V
Z  V  W
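A natural join can be sketched directly from its definition: match tuples on all common attribute names and keep each common attribute once. The sketch below assumes relations are non-empty lists of dicts and uses R(A, B) = {(X,Y), (X,Z), (Y,Z), (Z,V)} and S(B, C) = {(Z,U), (V,W), (Z,V)} as input; the helper name `natural_join` is hypothetical.

```python
def natural_join(r, s):
    """Natural join: equality on all common attributes, common columns kept once."""
    if not r or not s:
        return []
    common = [attr for attr in r[0] if attr in s[0]]
    result = []
    for x in r:
        for y in s:
            if all(x[attr] == y[attr] for attr in common):
                merged = {**x, **y}       # common attributes collapse into one column
                if merged not in result:  # set semantics: no duplicate tuples
                    result.append(merged)
    return result

R = [{"A": "X", "B": "Y"}, {"A": "X", "B": "Z"}, {"A": "Y", "B": "Z"}, {"A": "Z", "B": "V"}]
S = [{"B": "Z", "C": "U"}, {"B": "V", "C": "W"}, {"B": "Z", "C": "V"}]

for row in natural_join(R, S):
    print(row["A"], row["B"], row["C"])
```

The tuple (X, Y) of R has no partner in S because no S tuple has B = Y, so it does not appear in the result.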
20
Natural Join
Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R ⋈ S?
Given R(A, B, C), S(D, E), what is R ⋈ S?
Given R(A, B), S(A, B), what is R ⋈ S?
21
Theta Join
A join that involves a predicate
R1 ⋈_θ R2 = σ_θ(R1 × R2)
Here θ can be any condition
22
Equi-join
A theta join where θ is an equality
R1 ⋈_{A=B} R2 = σ_{A=B}(R1 × R2)
Equality on two different attributes
Example:
Employee ⋈_{SSN=SSN} Dependents
Most useful join in practice (how does it differ from natural join?)
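A theta join is a selection over a Cartesian product, and an equi-join is the special case where the predicate is an equality. In the sketch below, the helpers `qualify` and `theta_join` are hypothetical; attributes are prefixed with the relation name so that the two SSN columns stay distinct.

```python
employee = [
    {"Name": "John", "SSN": "999999999"},
    {"Name": "Tony", "SSN": "777777777"},
]
dependents = [
    {"SSN": "999999999", "Dname": "Emily"},
    {"SSN": "777777777", "Dname": "Joe"},
]

def qualify(relation, name):
    """Prefix every attribute with the relation name, e.g. SSN -> Employee.SSN."""
    return [{f"{name}.{attr}": value for attr, value in row.items()} for row in relation]

def theta_join(r1, r2, theta):
    """Theta join: keep the pairs from the Cartesian product that satisfy theta.

    Assumes the two relations have disjoint attribute names.
    """
    return [{**a, **b} for a in r1 for b in r2 if theta({**a, **b})]

# Equi-join: theta is an equality between one attribute from each side.
result = theta_join(
    qualify(employee, "Employee"),
    qualify(dependents, "Dependents"),
    lambda row: row["Employee.SSN"] == row["Dependents.SSN"],
)
print(result)
```

In this sketch both SSN columns appear in the output, whereas a natural join would keep the common attribute only once.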
23
Semijoin
R ⋉ S = Π_{A1,…,An}(R ⋈ S)
Where A1, …, An are the attributes in R
Example:
Employee ⋉ Dependents
24
Semijoin - Example
Employee
Name     EmpId  DeptName
Harry    3415   Finance
Sally    2241   Sales
George   3401   Finance
Harriet  2202   Sales
Dept
DeptName    Manager
Sales       Harriet
Production  Charles
Employee ⋉ Dept
Name     EmpId  DeptName
Sally    2241   Sales
Harriet  2202   Sales
Only attributes of Employee are selected
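The semijoin example can be reproduced with a short Python sketch. The helper name `semijoin` and the list-of-dicts representation are assumptions made for illustration.

```python
employee = [
    {"Name": "Harry", "EmpId": 3415, "DeptName": "Finance"},
    {"Name": "Sally", "EmpId": 2241, "DeptName": "Sales"},
    {"Name": "George", "EmpId": 3401, "DeptName": "Finance"},
    {"Name": "Harriet", "EmpId": 2202, "DeptName": "Sales"},
]
dept = [
    {"DeptName": "Sales", "Manager": "Harriet"},
    {"DeptName": "Production", "Manager": "Charles"},
]

def semijoin(r, s):
    """Semijoin: the tuples of r that have at least one natural-join partner in s."""
    if not r or not s:
        return []
    common = [attr for attr in r[0] if attr in s[0]]
    s_keys = {tuple(y[attr] for attr in common) for y in s}
    return [x for x in r if tuple(x[attr] for attr in common) in s_keys]

result = semijoin(employee, dept)
print(result)
```

No Manager column appears in the result: the second relation acts only as a filter on the first.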
25
Semijoins in Distributed Databases
Semijoins are used in distributed databases
[Figure: Employee(SSN, Name, ...) and Dependents(SSN, Dname, Age, ...) at different sites, connected by a network]
Query: Employee ⋈_{SSN=SSN} (σ_{age>71}(Dependents))
T = Π_SSN(σ_{age>71}(Dependents))
R = Employee ⋉ T
Answer = R ⋈ σ_{age>71}(Dependents)
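The semijoin strategy can be sketched on toy data. Everything below is hypothetical (names, SSNs, ages); the two lists stand in for relations held at two different sites, and no real network is involved. The point is only how few tuples need to cross between the sites.

```python
# Hypothetical data held at the Employee site.
employee = [
    {"SSN": "111", "Name": "Ann"},
    {"SSN": "222", "Name": "Bob"},
    {"SSN": "333", "Name": "Cy"},
]
# Hypothetical data held at the Dependents site.
dependents = [
    {"SSN": "111", "Dname": "Olga", "Age": 80},
    {"SSN": "111", "Dname": "Pat", "Age": 12},
    {"SSN": "333", "Dname": "Quinn", "Age": 40},
]

# Dependents site: select age > 71, then project onto SSN. Only T is shipped.
old_dependents = [d for d in dependents if d["Age"] > 71]
t = {d["SSN"] for d in old_dependents}

# Employee site: semijoin of Employee with T. Only the matching employees are shipped back.
r = [e for e in employee if e["SSN"] in t]

# Dependents site: join R with the selected dependents to produce the answer.
answer = [{**e, **d} for e in r for d in old_dependents if e["SSN"] == d["SSN"]]

print(len(t), "SSN value(s) shipped;", len(r), "employee tuple(s) shipped back")
print(answer)
```

Here one SSN and one employee tuple cross the network instead of the whole Employee relation.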
26
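Complex RA Expressions
[Figure: expression tree with leaves Person, Purchase, Person, Product; selections σ_{name=fred} and σ_{name=gizmo}; projections Π_pid, Π_ssn and Π_name; joins on seller-ssn=ssn, pid=pid and buyer-ssn=ssn]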
27
Application: Query Rewriting for Optimization
Original plan:
Π_sname(σ_{bid=100 ∧ rating>5}(Reserves ⋈_{sid=sid} Sailors))
Rewritten plan:
Π_sname(σ_{bid=100}(Reserves) ⋈_{sid=sid} σ_{rating>5}(Sailors))
where σ_{bid=100}(Reserves) is computed by a scan and written to temp T1, and σ_{rating>5}(Sailors) by a scan and written to temp T2
The earlier we process selections, the fewer tuples we need to manipulate higher up in the tree (predicate pushdown)
Disadvantages?
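The effect of pushing selections below the join can be illustrated by counting how many tuple pairs a naive nested-loop join has to examine under each plan. The Reserves and Sailors rows below are hypothetical sample data, and the pair count is only a rough stand-in for real execution cost.

```python
# Hypothetical sample data.
reserves = [
    {"sid": 1, "bid": 100},
    {"sid": 2, "bid": 101},
    {"sid": 3, "bid": 100},
    {"sid": 4, "bid": 102},
]
sailors = [
    {"sid": 1, "sname": "Ann", "rating": 7},
    {"sid": 2, "sname": "Bob", "rating": 9},
    {"sid": 3, "sname": "Cy", "rating": 3},
    {"sid": 4, "sname": "Di", "rating": 8},
]

# Plan 1: join first, select afterwards.
pairs_plan1 = len(reserves) * len(sailors)
joined = [{**r, **s} for r in reserves for s in sailors if r["sid"] == s["sid"]]
plan1 = {row["sname"] for row in joined if row["bid"] == 100 and row["rating"] > 5}

# Plan 2: push both selections below the join (temps T1 and T2).
t1 = [r for r in reserves if r["bid"] == 100]
t2 = [s for s in sailors if s["rating"] > 5]
pairs_plan2 = len(t1) * len(t2)
plan2 = {s["sname"] for r in t1 for s in t2 if r["sid"] == s["sid"]}

print(plan1, plan2)
print(pairs_plan1, "pairs examined without pushdown,", pairs_plan2, "with pushdown")
```

Both plans return the same sailor names, but the join in the second plan sees far fewer tuples. One cost the sketch makes visible is that T1 and T2 must be materialized first.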
28
Algebraic Laws (Examples)
Commutative and Associative Laws
R ∩ S = S ∩ R, R ∩ (S ∩ T) = (R ∩ S) ∩ T
R ⋈ S = S ⋈ R, R ⋈ (S ⋈ T) = (R ⋈ S) ⋈ T
Laws involving selection
σ_{C AND C'}(R) = σ_C(σ_{C'}(R)) = σ_C(R) ∩ σ_{C'}(R)
σ_C(R ⋈ S) = σ_C(R) ⋈ S, when C involves only attributes of R
Laws involving projections
Π_M(Π_N(R)) = Π_M(R), when M ⊆ N
29
Operations on Bags
A bag = a set with repeated elements
All operations need to be defined carefully on bags
{a,b,b,c} ∪ {a,b,b,b,e,f,f} = {a,a,b,b,b,b,b,c,e,f,f}
{a,b,b,b,c,c} − {b,c,c,c,d} = {a,b,b}
σ_C(R): preserve the number of occurrences
Π_A(R): no duplicate elimination
Cartesian product, join: no duplicate elimination
Important: Relational engines work on bags, not sets!
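The two bag examples can be reproduced with Python's `collections.Counter`, which behaves like a multiset: `+` adds occurrence counts and `-` subtracts them, dropping any element whose count falls to zero or below. Using `Counter` as the bag representation is a choice made here for illustration.

```python
from collections import Counter

# Bag union: occurrence counts add up.
bag_union = Counter("abbc") + Counter("abbbeff")

# Bag difference: counts are subtracted and never go below zero.
bag_difference = Counter("abbbcc") - Counter("bcccd")

print(sorted(bag_union.elements()))
print(sorted(bag_difference.elements()))
```

In the difference, c occurs twice on the left and three times on the right, so it disappears; d occurs only on the right and has no effect.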
30
Finally: RA has Limitations!
Cannot compute transitive closure
Find all direct and indirect relatives of Fred
Cannot express in RA! Need to write a C program
Name1  Name2  Relationship
Fred   Mary   Father
Mary   Joe    Cousin
Mary   Bill   Spouse
Nancy  Lou    Sister
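What a general-purpose program adds is a loop that repeats until no new tuples appear, which a fixed relational algebra expression cannot do. The sketch below uses Python rather than C, takes the pairs Fred–Mary, Mary–Joe, Mary–Bill and Nancy–Lou as input, and assumes that "relative" is symmetric; the function name `all_relatives` is hypothetical.

```python
# (Name1, Name2) pairs; the Relationship column is not needed for reachability.
pairs = [("Fred", "Mary"), ("Mary", "Joe"), ("Mary", "Bill"), ("Nancy", "Lou")]

def all_relatives(person, pairs):
    """Return everyone reachable from person through any chain of relationships."""
    neighbours = {}
    for a, b in pairs:                       # assumption: relationships are symmetric
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    found = set()
    frontier = {person}
    while frontier:                          # iterate until no new relatives appear
        next_frontier = set()
        for p in frontier:
            for q in neighbours.get(p, ()):
                if q != person and q not in found:
                    found.add(q)
                    next_frontier.add(q)
        frontier = next_frontier
    return found

print(sorted(all_relatives("Fred", pairs)))
```

The number of loop iterations depends on the data, not on the query, which is why no single expression with a fixed number of joins covers every instance.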
31
Conclusion
Relational algebra is important for SQL
We mainly need projections, selections, and joins
Joins are typically time-consuming for large databases