lecture 4 relational algebra. introduction relational algebra defines the theoretical way of...
TRANSCRIPT
LECTURE 4
RELATIONAL ALGEBRA
Introduction•Relational algebra defines the theoretical
way of manipulating table contents through a number of relational operators
• These relational operators include;▫SELECT (or RESTRICT)▫PROJECT▫ JOIN▫PRODUCT▫ INTERSECT▫UNION▫DIFFERENCE▫DIVIDE
Introduction (cont’d)•The relational operators have the property
of closure, i.e., relational algebra operators are used on existing tables to produce new tables
•The relational operators are classed as being unary or binary
•Unary operators such as SELECT and PROJECT, can be applied to one relation
•Binary operators such as JOIN are applied on two relations
SELECTion•SELECT or RESTRICT, can be used to list all
of the row values, or return only the row values that match a specified criterion.
•The SELECT operator is denoted by and is formally defined as:▫(R) or (RELATION) where (R) is the set of specified tuples of the relation R and is the predicate (or criterion) to extract the required tuples.
• It is possible to create more complex criteria by using the logical operators AND, OR, and NOT
SELECT (cont’d)
OrderNum PartNum21608 AT9421610 DR9321610 DW1121613 KL6221614 KT0321617 BV0621617 CD5221619 DR9321623 KV29
OrderLine (OrderLine)
(OrderLine)
OrderNum PartNum21608 AT9421610 DR9321610 DW1121613 KL6221614 KT0321617 BV0621617 CD5221619 DR9321623 KV29
OrderNum PartNum
21608 AT94
PROJECTion
•This operator returns all values for
selected attributes.
•The PROJECT operator is denoted by and
formally defined as:
▫ or
where the projection of the relation R,
denoted by is the set of specified attributes
a1…..an of the relation R
PROJECT (cont’d)
OrderNum PartNum21608 AT9421610 DR9321610 DW1121613 KL6221614 KT0321617 BV0621617 CD5221619 DR9321623 KV29
OrderLine OrderNum(OrderLine)OrderNum216082161021610216132161421617216172161921623
UNION
• The UNION set operator combines all tuples from
two relations, excluding duplicate tuples.
• The relations must have the same attribute
characteristics (the columns and domains must
be identical) to be used in the UNION.
• When two or more tables share the same
number of columns, i.e., have the same degree,
and when the share the same (or compatible)
domains, they are said to be union-
compatible
UNION(cont’d)
• The UNION operator is denoted by and formally
defined as:
▫ The union of relations and denoted by with
degree n, is the relation where for each i (i = 1,
2, …n), ai and bi must have compatible domains.
▫ The degree of R3 is the same as that of R1 and
R2. However, the cardinality of R3 is a+b, only if
a and b are the cardinalities of R1 and R2
respectively
UNION (cont’d)
OrderNum PartNum21608 AT9421610 DR9321610 DW1121613 KL62
OrderLine1OrderNum PartNum21614 KT0321617 BV0621617 CD5221619 DR9321623 KV29
OrderLine2
OrderNum PartNum21608 AT9421610 DR9321610 DW1121613 KL6221614 KT0321617 BV0621617 CD5221619 DR9321623 KV29
OrderLine1 OrderLine2
INTERSECT
• The INTERSECT operator denoted as , returns only
the tuples that appear in both relations
• The tables must be union-compatible to give valid
results.
• The INTERSECT operator is formally defined as:
▫ The intersect of relations and denoted by with
degree n, is the relation that includes only those
tuples of R1 that also appear in R2 where for each i (i
= 1, 2, …n), ai and bi must have compatible domains.
INTERSECT(cont’d)
PartNum
DR93
DW11
BV06
CD52
PartNumAT94DR93DW11KL62KT03KV29
Part1 Part2Part1 Part2
PartNumDR93DW11
DIFFERENCE
• The DIFFERENCE operator returns all tuples in one
relation that are not found in the other relation.
• The DIFFERENCE operator requires that the two
relations be union-compatible.
• The DIFFERENCE operator is formally defined as:
▫ The difference of relations and denoted by with
degree m, is the relation that includes all the tuples
that are in R1 but not in R2 where for each i (i = 1,
2, …m), ai and bi must have compatible domains.
DIFFERENCE (cont’d)
PartNum
DR93
DW11
BV06
CD52
PartNumAT94DR93DW11KL62KT03DR93KV29
Part1 Part2Part1 Part2
PartNumAT94KL62KT03KV29
CARTESIAN PRODUCT
• The CARTESIAN PRODUCT is usually written as with the
new resulting relation containing all the attributes which
are present in and along with all possible combinations of
tuples from both and
• It can be formally defined as:
▫ The CARTESIAN PRODUCT of two relations with cardinality i and
with cardinality j is a relation with degree , cardinality and
attributes . This can be denoted as
• The CARTESIAN PRODUCT is not a very useful operation by
itself, as it combines many tuples that have no association
with each other. However, when used in conjunction with
the RESTRICT (SELECT) operator, it becomes a very
important operator known as a JOIN
CARTESIAN PRODUCT (cont’d)
OrderNum
OrderDate
21608 10/20/2010
21610 10/20/2010
21613 10/21/2010
PartNum Description
DR93 Gas Range
DW11 Washer
OrderNum OrderDate PartNum Description21608 10/20/2010 DR93 Gas Range21610 10/20/2010 DR93 Gas Range21613 10/21/2010 DR93 Gas Range21608 10/20/2010 DW11 Washer21610 10/20/2010 DW11 Washer21613 10/21/2010 DW11 Washer
Part
Orders
Product of Orders and Part
DIVISION
• The division operation produces a new relation by selecting
the tuples in one relation, , that match every row in another
relation, . It is the inverse of the CARTESIAN PRODUCT.
• DIVISION, denoted by , can be formally defined as:
▫ The DIVISION of two relations with cardinality i and with
cardinality j is a relation with degree and cardinality .
Division (cont’d)
OrderNum21610
PartNum
DR93
DW11
OrderNum PartNum
21608 AT9421610 DR9321610 DW1121613 KL6221614 KT0321617 BV0621617 CD5221619 DR9321623 KV29
OrderLine Part
OrderLine Part
JOIN•The JOIN operation is one of the essential
operations of relational algebra. • It is said to be the real power behind the
relational database, allowing the use of independent tables linked by common attributes
•The JOIN of two relations R1 and R2 is a restriction on their Cartesian product R1XR2 to meet a specified criterion.
•The join itself is defined on an attribute a of R1 and b of R2 where the attributes a and b share the same domain.
JOIN(cont’d)
• The JOIN operator is formally defined as:
▫ The join of two relations R1(a1, a2,…., an) and R2(b1, b2,
…, bm) is a relation R3 with degree and attributes (a1,
a2,…, an, b1, b2,…, bm)
• Types of join operations
▫ THETA JOIN
▫ EQUIJOIN
▫ NATURAL JOIN
▫ LEFT OUTER JOIN
▫ RIGHT OUTER JOIN
THETA JOIN AND EQUIJOIN
• EQUIJOIN is on of the most commonly used joins
which links tables on the basis of an equality
condition that compares specified columns of
each table
• The outcome of the equijoin does not eliminate
duplicate columns , and the condition or
criterion used must be explicitly defined.
• The equijoin takes its name from the equality
comparison operator (=) used in the condition
THETA JOIN AND EQUIJOIN(cont’d)
• If any other comparison operator is used the join
is called a THETA JOIN, denoted by θ (θ-join)
▫ Therefore, theta represents a predicate which
consists of the following comparison operators (<,
<=, >=, <>)
• EQUIJOIN is a special type of THETA JOIN
• Let R1(a1, a2,…., an) and R2(b1, b2,…, bm) be
relations which may have different schemas.
▫ Then, θ-join of R1 and R2 is denoted as R1 θR2
▫ The equijoin is denoted as R1 R1.a = R2.bR2
THETA JOIN AND EQUIJOIN(cont’d)
• It is possible to express the θ-join and the equijoin in
terms of restriction and Cartesian product operations
▫ Eg; equijoin R1 R1.a = R2.bR2 may also be written as
R1.a = R2.b(R1 XR2 )
• Looking at the θ-join and the equijoin in this way allows
for some rules to be created which helps in the
computation of such joins on two relations:
▫ Compute R1 XR2 . This first performs a Cartesian product to
form all possible combinations of the rows of R1 and R2
▫ Restrict the Cartesian product to only those rows where
the values in certain columns match
Equijoin Example
DEPT_CODE DEPT_NAMEACCT AccountingBIOL BiologyCIS Computer Info. SystemsENGL English
STU_NUM
STU_LNAME
STU_FNAME
STU_DOB DEPT_CODE
321452 Bowser William 12 February 1972
BIOL
324257 Smithson Anne 15 November 1977
CIS
324258 Brewer Juliette 23 August 1966 ACCT
324269 Oblonski Walter 16 September 1973
CIS
324273 smith John 30 December 1955
ENGL
STUDENT
DEPARTMENT
Equijoin Example (cont’d)
STU_NUM STU_LNAME
STU_FNAME
STU_DOB S.DEPT_CODE
D.DEPT_CODE DEPT_NAME
321452 Bowser William 12 February 1972
BIOL ACCT Accounting
321452 Bowser William 12 February 1972
BIOL BIOL Biology
321452 Bowser William 12 February 1972
BIOL CIS Computer Info. Systems
321452 Bowser William 12 February 1972
BIOL ENGL English
324257 Smithson Anne 15 November 1977
CIS ACCT Accounting
324257 Smithson Anne 15 November 1977
CIS BIOL Biology
324257 Smithson Anne 15 November 1977
CIS CIS Computer Info. Systems
324257 Smithson Anne 15 November 1977
CIS ENGL English
324258 Brewer Juliette 23 August 1966 ACCT ACCT Accounting
324258 Brewer Juliette 23 August 1966 ACCT BIOL Biology
324258 Brewer Juliette 23 August 1966 ACCT CIS Computer Info. Systems
324258 Brewer Juliette 23 August 1966 ACCT ENGL English
324269 Oblonski Walter 16 September 1973
CIS ACCT Accounting
324269 Oblonski Walter 16 September 1973
CIS BIOL Biology
324269 Oblonski Walter 16 September 1973
CIS CIS Computer Info. Systems
324269 Oblonski Walter 16 September 1973
CIS ENGL English
324273 smith John 30 December 1955
ENGL ACCT Accounting
324273 smith John 30 December 1955
ENGL BIOL Biology
324273 smith John 30 December 1955
ENGL CIS Computer Info. Systems
324273 smith John 30 December 1955
ENGL ENGL English
Cartesian product (STUDENT X DEPARTMENT)
Equijoin Example (cont’d)
STU_NUM
STU_LNAME STU_FNAME
STU_DOB S.DEPT_CODE
D.DEPT_CODE
DEPT_NAME
321452 Bowser William 12 February 1972
BIOL BIOL Biology
324257 Smithson Anne 15 November 1977
CIS CIS Computer Info. Systems
324258 Brewer Juliette 23 August 1966 ACCT ACCT Accounting
324269 Oblonski Walter 16 September 1973
CIS CIS Computer Info. Systems
324273 smith John 30 December 1955
ENGL ENGL English
STUDENT_IN_DEPT = STUDENT.DEPT_CODE = DEPARTMENT.DEPT_CODE(STUDENT X DEPARTMENT )
NATURAL JOIN
• The natural join operation is the most common
variant of the joins and requires that the two
operant relations must have at least one common
attribute, i.e., attributes that share the same
domain. The common column(s) is (are) referred to
as the join column(s)
• The natural join is in fact an equijoin, however, in
addition, the duplicate attributes are dropped with
the resulting relation containing one less column
than that of the equijoin
NATURAL JOIN(cont’d)
• Let R1 be a relation having attributes (a1, a2,…,
an, y), R2 be another relation having attributes
(b1, b2,…, bmy) where y is a set of common
attributes (join column(s)) which share the
same domain
• The natural join operator is defined as:
▫ The natural join of R1 and R2, denoted R1 X R2,
consists of combining the tuples of R1 and R2 to
build a new relation R3, such that if , ,, then
NATURAL JOIN(cont’d)
• The common set of attributes y appears only once in R3; the
notation correspond to the attribute value of a tuple of R1
• The steps required to perform the natural join of two
relations are:
1. Compute R1 X R2.
2. Select those tuples where . Only the rows are selected
where the attribute values in the join column(s) are equal
3. Perform a PROJECT operation on either to the result in step
2, and call it y in the final relation. This is to ensure that the
final relation results in a single copy of each attribute in the
joining column, thereby eliminating duplicates
Natural Join ExampleCUSTOMER
AGENT
CUS_CODE CUS_LNAME
CUS_POSTCODE
AGENT_CODE
1132445 Walker M1 5RT 2311217782 Adares NW6 4RT 1251312243 Rakowski 678954 1671321242 Rodriguez NW6 2WS 1251542311 Smithson N4 3YP 4211657399 Vanloo 67543W 231
AGENT_CODE AGENT_PHONE
125 01812439887167 01813426778231 01812431124333 01131234445
Natural Join Example (cont’d)
Cartesian product (CUSTOMER X AGENT)C.CUS_CODE C.CUS_LNAME C.CUS_POSTCODE C.AGENT_CODE A.AGENT_CODE A.AGENT_PHONE
1132445 Walker M1 5RT 231 125 018124398871132445 Walker M1 5RT 231 167 018134267781132445 Walker M1 5RT 231 231 018124311241132445 Walker M1 5RT 231 333 011312344451217782 Adares NW6 4RT 125 125 018124398871217782 Adares NW6 4RT 125 167 018134267781217782 Adares NW6 4RT 125 231 018124311241217782 Adares NW6 4RT 125 333 011312344451312243 Rakowski 678954 167 125 018124398871312243 Rakowski 678954 167 167 018134267781312243 Rakowski 678954 167 231 018124311241312243 Rakowski 678954 167 333 011312344451321242 Rodriguez NW6 2WS 125 125 018124398871321242 Rodriguez NW6 2WS 125 167 018134267781321242 Rodriguez NW6 2WS 125 231 018124311241321242 Rodriguez NW6 2WS 125 333 011312344451542311 Smithson N4 3YP 421 125 018124398871542311 Smithson N4 3YP 421 167 018134267781542311 Smithson N4 3YP 421 231 018124311241542311 Smithson N4 3YP 421 333 011312344451657399 Vanloo 67543W 231 125 018124398871657399 Vanloo 67543W 231 167 018134267781657399 Vanloo 67543W 231 231 018124311241657399 Vanloo 67543W 231 333 01131234445
Natural Join Example (cont’d)CUSTOMER X AGENT
CUS_CODE CUS_LNAME CUS_POSTCODE
AGENT_CODE
AGENT_PHONE
1132445 Walker M1 5RT 231 01812431124
1217782 Adares NW6 4RT 125 01812439887
1312243 Rakowski 678954 167 01813426778
1321242 Rodriguez NW6 2WS 125 01812439887
1657399 Vanloo 67543W 231 01812431124
OUTER JOIN
• When using the theta and natural join, it is possible
that some tuples in the joined relations do not have
identical values for the common attributes. As a
result these tuples will be ‘lost’.
• If it is required that all tuples from the original
tables be shown in the resulting relation, then it is
necessary to have a join which keeps all the tuples
in R1 which have no corresponding values in R2. The
tuples in the R2 will have null values. This type of
join is known as the outer join
OUTER JOIN (cont’d)
• There are 3 common types of the outer join
▫ Left outer join – keeps data from the left-hand relation
▫ Right outer join – keeps data from the right-hand
relation
▫ Full outer join – keeps data from both relations
• The steps for determining an outer join are very
similar to those for computing a natural join, except
that data from the left or right side of the relation,
depending on whether one is performing a left or
right outer join is included.
OUTER JOIN (cont’d)
• The stages in performing a left outer join are:
1. Compute R1 X R2.
2. Select those tuples where . Only the rows are selected where
the attribute values in the join column(s) are equal
3. Select those tuples in R1 that do not have matching values in
R2 , so
4. Perform a PROJECT operation on either to the result in step
2, and call it y in the final relation. This is to ensure that the
final relation results in a single copy of each attribute in the
joining column, thereby eliminating duplicates. Finally,
project the rest of attributes in R1 and R2 , except y, and drop
the R1 and R2 in the final relation
Left Outer Join Example (cont’d)
•A left outer join of CUSTOMER and
AGENT, will return all the tuples in the
CUSTOMER relation, including those that
do not have a matching value in the
AGENT relation.CUS_CODE CUS_LNAME CUS_POSTCODE
AGENT_CODE AGENT_PHONE
1132445 Walker M1 5RT 231 01812431124
1217782 Adares NW6 4RT 125 01812439887
1312243 Rakowski 678954 167 01813426778
1321242 Rodriguez NW6 2WS 125 01812439887
1657399 Vanloo 67543W 231 01812431124
1542311 Smithson N4 3YP 421 NULL
Right Outer Join Example (cont’d)
•A right outer join of CUSTOMER and
AGENT, will return all the tuples in the
AGENT relation, including those that do
not have a matching value in the
CUSTOMER relation.CUS_CODE CUS_LNAME CUS_POSTCOD
EAGENT_CODE AGENT_PHON
E1132445 Walker M1 5RT 231 01812431124
1217782 Adares NW6 4RT 125 01812439887
1312243 Rakowski 678954 167 01813426778
1321242 Rodriguez NW6 2WS 125 01812439887
1657399 Vanloo 67543W 231 01812431124
NULL NULL NULL 333 01131234445
CONSTRUCTING QUERIES USING RELATIONAL ALGEBRAIC EXPRESSIONS
• The main purpose of relational algebra is to
provide a way to create and manipulate relations
(tables) in a database.
• During the lifetime of a database, users will ask
many different kinds of queries. The task of
building a query involves breaking the query
down into a number of smaller steps, where
each step generates a set of intermediate results
which are then used in the steps of the query.
CONSTRUCTING QUERIES USING RELATIONAL ALGEBRAIC EXPRESSIONS
• The following steps should be followed when
building a query using relational algebraic
expressions
1. List all the attributes needed to give the
answer
2. Select all the relations needed based on the
list of attributes.
3. Specify the relational operators and the
intermediate results that are needed
Building Queries Example - Car Maintenance Database
• In a small database that stores information about the
maintenance of cars, each car is required to undergo an
inspection each year to test to see if it is roadworthy.
After each inspection a maintenance record is created
and any repairs that are needed are recorded. A repair
can require new parts to be purchased and fitted. If a car
needs a repair then the EVALUATION is set to FAIL until
all the repairs are completed and then it is set to PASS.
The Entity Relationship Diagram (ERD) and tables are
shown below.
Car Maintenance Example (cont’d)Car inspection ERD
Car Maintenance Example (cont’d)
REGISTRATION
CAR_MAKE
CAR_MODEL
CAR_COLOUR MODEL_YEAR
LINCENCE_NO
3679MR82 Toyota Corolla Blue 2006 1967fr89768
E-TS865 Nissan Micro Red 2004 1973Smith121
PE57UVP Peugeot 407 Blue 2007 1990byt3212
PISE567 Volkswagen
Eos Lime 2006 DF-678-WV
ROMA482 Volkswagen
Golf GT Black 2007 AQ-123-AV
Z-BA975 Peugeot 207 Black 2007 1980vrt7312PART_NO PART_NAME PART_COST12390 Paint
sealantsGhȻ14.95
12391 Wiper GhȻ19.9512392 Brake pads GhȻ24.9912393 Brake Discs GhȻ49.5412395 Spark Plugs GhȻ0.9912396 Airbag GhȻ24.9512397 Tyres GhȻ25.00
CAR
PART
Car Maintenance Example (cont’d)
INSPECTION_CODE REGISTRATION INSPECTION_DATE EVALUATION
100036 PE57UVP 10/05/2008 FAIL100390 ROMA482 01/09/2008 106750 E-TS865 01/03/2006 PASS122456 Z-BA975 03/10/2008 FAIL145678 PISE567 30/09/2007 PASS200450 E-TS865 21/02/2005 PASS200456 E-TS865 01/04/2007 FAIL
MAINTENANCE_RECORD
INSPECTION_CODE PART_NO106750 12396106750 12397100036 12393200450 12391100036 12397200450 12392200456 12397
REPAIR
Car Maintenance Example (cont’d)
•Consider the following query asked by a user:▫‘List all information about cars where the
model year is after 2006’•To answer this query, one must first
interpret that ‘List all information about cars’ means list all attributes in the relation CAR. The user only wants to see information on cars where the attribute MODEL_YEAR>2006. using the relational operator SELECT we can write the query as a relational algebraic expression as:
Car Maintenance Example (cont’d)
REGISTRATION
CAR_MAKE CAR_MODEL
CAR_COLOUR MODEL_YEAR
LINCENCE_NO
PE57UVP Peugeot 407 Blue 2007 1990byt3212
ROMA482 Volkswagen
Golf GT Black 2007 AQ-123-AV
Z-BA975 Peugeot 207 Black 2007 1980vrt7312
Car Maintenance Assignment
1. Display all the part names and their
prices where the cost of the part is
greater than GhȻ20.00
2. List the car registration and model
details and part numbers for all cars
where the model year is 2007, where an
inspection was carried out after
01/03/2008, which resulted in a part
being required for the repair.