# Relational Algebra, Ch. 7.4 – 7.6 (John Ortiz, Lecture 4)


Relational Algebra, Ch. 7.4 – 7.6

John Ortiz

Lecture 4 Relational Algebra 2

Relational Query Languages

Query languages allow manipulation and retrieval of data from a database.

Relational QLs are simple and powerful: they have a strong formal foundation based on logic, which allows for much optimization.

Query languages != programming languages! They are not intended for complex calculations. They support easy, efficient access to large data sets.


Preliminaries

A query is applied to relation instances, and the result of a query is also a relation instance.

Schemas of input & result relations are fixed (determined by relations & query language constructs).

A query is specified against schemas (regardless of instances).

Attributes may be referenced either by names or by positions (two notation systems).


Relational Algebra

Basic Operations:

Selection (σ): choose a subset of rows.

Projection (π): choose a subset of columns.

Cross Product (×): combine two tables.

Union (∪): unique tuples from either table.

Set difference (−): tuples in R1 not in R2.

Renaming (ρ): change names of tables & columns.

Additional Operations (for convenience):

Intersection, joins (very useful), division, outer joins, aggregate functions, etc.


Selection

Format: σ_{selection-condition}(R). Choose tuples that satisfy the selection condition. Result has identical schema as the input.

Example: σ_{Major = 'CS'}(Students)

Students:

    SID  Name  GPA  Major
    456  John  3.4  CS
    457  Carl  3.2  CS
    678  Ken   3.5  Math

Result:

    SID  Name  GPA  Major
    456  John  3.4  CS
    457  Carl  3.2  CS

Selection condition is a Boolean expression including =, ≠, <, ≤, >, ≥, and, or, not.
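The selection example can be sketched in Python. This is only an illustration: modeling a relation as a list of dicts and the helper name `select` are assumptions made here, not part of the lecture.

```python
# Students relation from the slide, modeled as a list of dicts (an assumption).
students = [
    {"SID": 456, "Name": "John", "GPA": 3.4, "Major": "CS"},
    {"SID": 457, "Name": "Carl", "GPA": 3.2, "Major": "CS"},
    {"SID": 678, "Name": "Ken", "GPA": 3.5, "Major": "Math"},
]


def select(relation, condition):
    """Selection: keep the tuples for which the Boolean condition holds."""
    return [row for row in relation if condition(row)]


# Selection with condition Major = 'CS' applied to Students.
cs_students = select(students, lambda row: row["Major"] == "CS")
for row in cs_students:
    print(row)
```

The result keeps all four attributes of the input, which mirrors the rule that selection does not change the schema.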


Projection

Format: π_{attribute-list}(R). Retain only those columns in the attribute-list. Result must eliminate duplicates.

Example: π_{Major}(Students)

Students:

    SID  Name  GPA  Major
    456  John  3.4  CS
    457  Carl  3.2  CS
    678  Ken   3.5  Math

Result:

    Major
    CS
    Math

Operations can be composed:

π_{Name, GPA}(σ_{Major = 'CS'}(Students))
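A minimal Python sketch of projection with duplicate elimination, and of composing it with selection. The list-of-dicts model and the helper names `select` and `project` are assumptions for illustration.

```python
# Students relation from the slide, modeled as a list of dicts (an assumption).
students = [
    {"SID": 456, "Name": "John", "GPA": 3.4, "Major": "CS"},
    {"SID": 457, "Name": "Carl", "GPA": 3.2, "Major": "CS"},
    {"SID": 678, "Name": "Ken", "GPA": 3.5, "Major": "Math"},
]


def select(relation, condition):
    """Selection: keep the tuples that satisfy the condition."""
    return [row for row in relation if condition(row)]


def project(relation, attributes):
    """Projection: keep only the listed attributes and drop duplicate tuples."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:  # set semantics: each tuple appears once
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


# Projection on Major: two CS rows collapse into one.
majors = project(students, ["Major"])

# Composition: project Name and GPA from the CS students.
cs_name_gpa = project(select(students, lambda r: r["Major"] == "CS"), ["Name", "GPA"])

print(majors)
print(cs_name_gpa)
```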


Cross Product

Format: R1 × R2. Each row of R1 is paired with each row of R2.

Result schema consists of all attributes of R1 followed by all attributes of R2.

Problem: Columns may have identical names. Use notation R.A, or renaming attributes.

Only some rows make sense. Often need a selection to follow.


Example of Cross Product

Students:

    SID  Name  GPA  Major
    456  John  3.4  CS
    457  Carl  3.2  CS
    678  Ken   3.5  Math

Awards:

    SID  Amount  Year
    456  1500    1998
    678  3000    2000

Students × Awards:

    SID  Name  GPA  Major  SID  Amount  Year
    456  John  3.4  CS     456  1500    1998
    456  John  3.4  CS     678  3000    2000
    457  Carl  3.2  CS     456  1500    1998
    457  Carl  3.2  CS     678  3000    2000
    678  Ken   3.5  Math   456  1500    1998
    678  Ken   3.5  Math   678  3000    2000
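The cross product of Students and Awards can be sketched in Python as follows. Qualifying each attribute as `Relation.Attribute` is one way to handle the two `SID` columns; the helper name `cross` and the dict-based model are assumptions for illustration.

```python
from itertools import product

students = [
    {"SID": 456, "Name": "John", "GPA": 3.4, "Major": "CS"},
    {"SID": 457, "Name": "Carl", "GPA": 3.2, "Major": "CS"},
    {"SID": 678, "Name": "Ken", "GPA": 3.5, "Major": "Math"},
]
awards = [
    {"SID": 456, "Amount": 1500, "Year": 1998},
    {"SID": 678, "Amount": 3000, "Year": 2000},
]


def cross(r1, name1, r2, name2):
    """Cross product: pair every row of r1 with every row of r2.

    Attribute names are qualified (R.A notation) to avoid name clashes.
    """
    return [
        {
            **{f"{name1}.{k}": v for k, v in a.items()},
            **{f"{name2}.{k}": v for k, v in b.items()},
        }
        for a, b in product(r1, r2)
    ]


pairs = cross(students, "Students", awards, "Awards")

# Only some rows make sense, so a selection usually follows.
matched = [row for row in pairs if row["Students.SID"] == row["Awards.SID"]]

print(len(pairs))    # 3 rows x 2 rows = 6 pairs
print(len(matched))  # rows where both SID values agree
```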


Renaming

Format: ρ_{S}(R) or ρ_{S(A1, A2, …)}(R): change the name of relation R, and names of attributes of R.

Example: ρ_{CS_Students}(σ_{Major = 'CS'}(Students))

Students:

    SID  Name  GPA  Major
    456  John  3.4  CS
    457  Carl  3.2  CS
    678  Ken   3.5  Math

CS_Students:

    SID  Name  GPA  Major
    456  John  3.4  CS
    457  Carl  3.2  CS


Union, Intersection, Set Difference

Format: R1 ∪ R2 (R1 ∩ R2, R1 − R2). Return all tuples that belong to either R1 or R2 (to both R1 and R2; to R1 but not to R2).

Requirement: R1 and R2 are union compatible. They have the same number of attributes, and corresponding attributes have the same domains.

Schema of result is identical to that of R1. May need renaming.

Duplicates are eliminated.


Examples of Set Operations

TAs:

    SID  Name  GPA   Major
    456  John  3.4   CS
    457  Carl  3.2   CS
    678  Ken   3.5   Math

RAs:

    SID  Name  GPA   Major
    456  John  3.4   CS
    223  Bob   2.95  Ed

TAs ∪ RAs:

    SID  Name  GPA   Major
    456  John  3.4   CS
    457  Carl  3.2   CS
    678  Ken   3.5   Math
    223  Bob   2.95  Ed

TAs ∩ RAs:

    SID  Name  GPA   Major
    456  John  3.4   CS

TAs − RAs:

    SID  Name  GPA   Major
    457  Carl  3.2   CS
    678  Ken   3.5   Math
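Because both relations are union compatible, the three set operations map directly onto Python's built-in `set` type. The sketch below models each tuple as a plain Python tuple in (SID, Name, GPA, Major) order, which is an assumption made for illustration.

```python
# Each relation is a set of (SID, Name, GPA, Major) tuples.
tas = {
    (456, "John", 3.4, "CS"),
    (457, "Carl", 3.2, "CS"),
    (678, "Ken", 3.5, "Math"),
}
ras = {
    (456, "John", 3.4, "CS"),
    (223, "Bob", 2.95, "Ed"),
}

union = tas | ras          # tuples in either relation, duplicates eliminated
intersection = tas & ras   # tuples in both relations
difference = tas - ras     # tuples in TAs but not in RAs

for label, relation in [
    ("union", union),
    ("intersection", intersection),
    ("difference", difference),
]:
    print(label, sorted(relation))
```

John appears in both inputs but only once in the union, which is the duplicate elimination the slide mentions.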


Joins

Theta Join. Format: R1 ⋈_{join-condition} R2. Returns tuples in σ_{join-condition}(R1 × R2).

Equijoin. Same as Theta Join except the join-condition contains only equalities.

Natural Join. Same as Equijoin except that equality conditions are on common attributes and duplicate columns are eliminated.


Examples of Joins

Theta Join. Students ⋈_{Students.Age <= Profs.Age} Profs

Students:

    SID  Name  GPA  Age  Prof
    456  John  3.4  29   123
    457  Carl  3.2  35   123
    678  Ken   3.5  25   154

Profs:

    PID  Pname  Age  Dept
    123  John   35   CS
    154  Scott  28   Math

Result:

    SID  Name  GPA  Age  Prof  PID  Pname  Age  Dept
    456  John  3.4  29   123   123  John   35   CS
    457  Carl  3.2  35   123   123  John   35   CS
    678  Ken   3.5  25   154   123  John   35   CS
    678  Ken   3.5  25   154   154  Scott  28   Math


Examples of Joins (cont.)

Equijoin. Students ⋈_{Prof = PID AND Name = Pname} Profs

Result:

    SID  Name  GPA  Age  Prof  PID  Pname  Age  Dept
    456  John  3.4  29   123   123  John   35   CS

Natural Join. Students ⋈ Profs

Result:

    SID  Name  GPA  Age  Prof  PID  Pname  Dept
    457  Carl  3.2  35   123   123  John   CS
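The three join examples can be reproduced with a short Python sketch. The helper names `theta_join` and `natural_join` and the list-of-dicts model are assumptions for illustration. The theta join returns pairs of rows rather than merged rows, so the two `Age` attributes stay distinct.

```python
students = [
    {"SID": 456, "Name": "John", "GPA": 3.4, "Age": 29, "Prof": 123},
    {"SID": 457, "Name": "Carl", "GPA": 3.2, "Age": 35, "Prof": 123},
    {"SID": 678, "Name": "Ken", "GPA": 3.5, "Age": 25, "Prof": 154},
]
profs = [
    {"PID": 123, "Pname": "John", "Age": 35, "Dept": "CS"},
    {"PID": 154, "Pname": "Scott", "Age": 28, "Dept": "Math"},
]


def theta_join(r1, r2, condition):
    """Theta join: a selection applied to the cross product of r1 and r2."""
    return [(a, b) for a in r1 for b in r2 if condition(a, b)]


def natural_join(r1, r2):
    """Natural join: equality on all common attributes, one copy of each kept."""
    common = set(r1[0]) & set(r2[0]) if r1 and r2 else set()
    return [
        {**a, **b}
        for a in r1
        for b in r2
        if all(a[k] == b[k] for k in common)
    ]


# Theta join with condition Students.Age <= Profs.Age.
theta = theta_join(students, profs, lambda s, p: s["Age"] <= p["Age"])

# Equijoin: the condition contains only equalities.
equi = theta_join(
    students, profs, lambda s, p: s["Prof"] == p["PID"] and s["Name"] == p["Pname"]
)

# Natural join: the only common attribute here is Age.
natural = natural_join(students, profs)

print(len(theta), len(equi), len(natural))
```

In this data the natural join matches on `Age` alone, which is why Carl (35) pairs with Prof. John (35) even though the match has no advising meaning.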


Some Questions About Joins

What is the result of R1 ⋈ R2 if they do not have a common attribute?

What is the result of R ⋈ R?

Consider relations

Students(SSN, Name, GPA, Major, Age, PSSN)

Profs(PSSN, Name, Office, Age, Dept)

Which type of join should be used to find pairs of names of students and their advisors?

Can a natural join be used? How?


Division

Format: R1 ÷ R2. Restriction: Every attribute in R2 is in R1.

For R1(A1, ..., An, B1, ..., Bm) ÷ R2(B1, ..., Bm) and T = π_{A1, ..., An}(R1), return the subset of T, say W, such that every tuple in W × R2 is in R1.

W is the largest subset of T such that (W × R2) ⊆ R1.


An Example of Division

Takes ÷ CS_Req

Takes:

    SID  CNO
    456  CS210
    456  CS321
    456  CS135
    457  CS210
    457  CS321
    532  CS210
    678  CS321

CS_Req:

    CNO
    CS210
    CS321

Result:

    SID
    456
    457

What is the meaning of this expression?
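The division example can be checked with a small Python sketch. It assumes the simplest case from the slide, where R1 has one quotient attribute and one divisor attribute; the helper name `divide` is hypothetical.

```python
# Takes as a set of (SID, CNO) pairs; CS_Req as a set of CNO values.
takes = {
    (456, "CS210"), (456, "CS321"), (456, "CS135"),
    (457, "CS210"), (457, "CS321"),
    (532, "CS210"),
    (678, "CS321"),
}
cs_req = {"CS210", "CS321"}


def divide(r1, r2):
    """Division: the values a such that (a, b) is in r1 for every b in r2."""
    candidates = {a for a, _ in r1}  # T, the projection of r1 on the A attribute
    return {a for a in candidates if all((a, b) in r1 for b in r2)}


result = divide(takes, cs_req)
print(sorted(result))
```

Students 532 and 678 each miss one required course, so they drop out of the result.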


Grouping & Aggregate Functions

Format: group_attributes F aggregate_functions (r)

Partition a relation into groups.

Apply aggregate function to each group.

Output grouping and aggregation values, one tuple per group.

Ex: Major F count(SID), avg(GPA) (Students)

Students:

    SID  Name  GPA  Major
    456  John  3.4  CS
    457  Carl  3.2  CS
    678  Ken   3.5  Math

Result:

    Major  count(SID)  avg(GPA)
    CS     2           3.3
    Math   1           3.5
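The partition, aggregate, and output steps of grouping can be sketched in Python. The helper name `group_count_avg` is hypothetical, and it hard-codes the two aggregates from the slide's example rather than accepting arbitrary functions.

```python
from collections import defaultdict

students = [
    {"SID": 456, "Name": "John", "GPA": 3.4, "Major": "CS"},
    {"SID": 457, "Name": "Carl", "GPA": 3.2, "Major": "CS"},
    {"SID": 678, "Name": "Ken", "GPA": 3.5, "Major": "Math"},
]


def group_count_avg(relation, group_attr, count_attr, avg_attr):
    """Group by one attribute, then compute a count and an average per group."""
    groups = defaultdict(list)
    for row in relation:  # step 1: partition the relation into groups
        groups[row[group_attr]].append(row)
    return [              # steps 2 and 3: aggregate, one output tuple per group
        {
            group_attr: key,
            f"count({count_attr})": len(rows),
            f"avg({avg_attr})": sum(r[avg_attr] for r in rows) / len(rows),
        }
        for key, rows in groups.items()
    ]


result = group_count_avg(students, "Major", "SID", "GPA")
for row in result:
    print(row["Major"], row["count(SID)"], round(row["avg(GPA)"], 2))
```

The average is rounded only for display, since floating-point division may not produce exactly 3.3.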


Dangling Tuples in Join

Usually, only a subset of tuples of each relation will actually participate in a join.

Tuples of a relation not participating in a join are dangling tuples.

How do we keep dangling tuples in the result of a join? (Why do we want to do that?)

Use null values to indicate a “no-join” situation.


Outer Joins

Left Outer Join. Format: R1 ⟕ R2. Similar to a natural join but keep all dangling tuples of R1.

Right Outer Join. Format: R1 ⟖ R2. Similar to a natural join but keep all dangling tuples of R2.

(Full) Outer Join. Format: R1 ⟗ R2. Similar to a natural join but keep all dangling tuples of both R1 & R2.

Can also have Theta Outer Joins.


Examples of Outer Joins

Left Outer Join. Students ⟕ Awards

Students:

    SID  Name  GPA  Major
    456  John  3.4  CS
    457  Carl  3.2  CS
    678  Ken   3.5  Math

Awards:

    SID  Amount  Year
    456  1500    1998
    678  3000    2000

Result:

    SID  Name  GPA  Major  Amount  Year
    456  John  3.4  CS     1500    1998
    457  Carl  3.2  CS     Null    Null
    678  Ken   3.5  Math   3000    2000
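A minimal Python sketch of the left outer join of Students and Awards, with `None` standing in for Null. The helper name `left_outer_join` and its explicit `key` and `r2_attrs` parameters are assumptions for illustration.

```python
students = [
    {"SID": 456, "Name": "John", "GPA": 3.4, "Major": "CS"},
    {"SID": 457, "Name": "Carl", "GPA": 3.2, "Major": "CS"},
    {"SID": 678, "Name": "Ken", "GPA": 3.5, "Major": "Math"},
]
awards = [
    {"SID": 456, "Amount": 1500, "Year": 1998},
    {"SID": 678, "Amount": 3000, "Year": 2000},
]


def left_outer_join(r1, r2, key, r2_attrs):
    """Join on `key`, keeping dangling tuples of r1 padded with None (Null)."""
    result = []
    for a in r1:
        matches = [b for b in r2 if b[key] == a[key]]
        if matches:
            result.extend({**a, **b} for b in matches)
        else:
            # Dangling tuple: no partner in r2, so fill r2's attributes with None.
            result.append({**a, **{attr: None for attr in r2_attrs}})
    return result


joined = left_outer_join(students, awards, "SID", ["Amount", "Year"])
for row in joined:
    print(row)
```

Carl has no award, so his row survives with `None` in `Amount` and `Year` instead of being dropped as it would be in a natural join.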


Relational Algebra Exercises

Find the result of these expressions.

R S R R.C=S.C S

B,E((B,C R) (E<7 S))

(π_{A,B} R) − ρ_{S(A,B)}(π_{D,C} S)

R:

    A  B  C  D
    1  2  3  4
    2  2  5  1
    3  4  2  6
    4  2  5  3

S:

    D  C  E
    1  2  3
    3  4  7
    4  5  5
    5  2  7


Queries In Relational Algebra

Consider the following database schema:

Students(SSN, Name, GPA, Age, MajorDept)

Enrollment(SSN, CourseNo, Grade)

Courses(CourseNo, Title, DName)

Departments(DName, Location, Phone)

Two methods: use temporary relations, or write one expression per query.


Queries In Relational Algebra

List student name and course title such that the student has an A in the course and the course is not offered by the student’s major department.

Find those students who got an A in any course.

Find the department of the students and the courses.

Find the final answer.
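The three steps can be followed in Python using temporary relations. The schema is the one given in the lecture, but every data row below is made up for illustration, and the variable names are hypothetical.

```python
# Hypothetical sample rows; only the attribute names come from the lecture's schema.
students = [
    {"SSN": 1, "Name": "Ann", "GPA": 3.9, "Age": 20, "MajorDept": "CS"},
    {"SSN": 2, "Name": "Bo", "GPA": 3.1, "Age": 22, "MajorDept": "Math"},
]
enrollment = [
    {"SSN": 1, "CourseNo": "CS101", "Grade": "A"},
    {"SSN": 1, "CourseNo": "MA201", "Grade": "A"},
    {"SSN": 2, "CourseNo": "MA201", "Grade": "A"},
    {"SSN": 2, "CourseNo": "CS101", "Grade": "B"},
]
courses = [
    {"CourseNo": "CS101", "Title": "Intro to CS", "DName": "CS"},
    {"CourseNo": "MA201", "Title": "Calculus", "DName": "Math"},
]

# Step 1: enrollments with an A (a selection on Grade).
a_grades = [e for e in enrollment if e["Grade"] == "A"]

# Step 2: attach the student's department and the course's department
# (joins on SSN and on CourseNo).
with_depts = [
    {**s, **e, **c}
    for e in a_grades
    for s in students if s["SSN"] == e["SSN"]
    for c in courses if c["CourseNo"] == e["CourseNo"]
]

# Step 3: keep rows where the course is outside the major department,
# then project Name and Title (a set removes duplicates).
answer = sorted(
    {(r["Name"], r["Title"]) for r in with_depts if r["MajorDept"] != r["DName"]}
)
print(answer)
```

Each temporary relation corresponds to one step on the slide; the same query could also be written as a single nested expression.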


Summary

Relational model provides simple yet powerful formal query languages.

Relational algebra is procedural and used for internal representation of queries.

Several ways to express a given query. DBMS should choose the most efficient plan.

Any language able to express all relational algebra queries is relationally complete.


Summary (cont.)

Lots of useful properties.

σ_{C1}(σ_{C2}(R)) = σ_{C2}(σ_{C1}(R)) = σ_{C1 and C2}(R)

π_{L1}(π_{L2}(R)) = π_{L1}(R), if L1 ⊆ L2

R1 × R2 = R2 × R1

R1 × (R2 × R3) = (R1 × R2) × R3

R1 ⋈ R2 = R2 ⋈ R1

R1 ⋈ (R2 ⋈ R3) = (R1 ⋈ R2) ⋈ R3
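The selection and projection properties can be spot-checked on a small relation. The sketch below is a check on one sample instance, not a proof. The helpers `select` and `project`, the conditions `c1` and `c2`, and the sample data are assumptions for illustration.

```python
students = [
    {"SID": 456, "Name": "John", "GPA": 3.4, "Major": "CS"},
    {"SID": 457, "Name": "Carl", "GPA": 3.2, "Major": "CS"},
    {"SID": 678, "Name": "Ken", "GPA": 3.5, "Major": "Math"},
]


def select(relation, condition):
    """Selection: keep the tuples that satisfy the condition."""
    return [row for row in relation if condition(row)]


def project(relation, attributes):
    """Projection with duplicate elimination."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def c1(row):
    return row["Major"] == "CS"


def c2(row):
    return row["GPA"] > 3.3


# Selections commute and cascade into a single conjunctive selection.
first = select(select(students, c2), c1)
second = select(select(students, c1), c2)
combined = select(students, lambda r: c1(r) and c2(r))
print(first == second == combined)

# A projection on L1 absorbs an inner projection on L2 when L1 is a subset of L2.
l1, l2 = ["Major"], ["Major", "GPA"]
print(project(project(students, l2), l1) == project(students, l1))
```

Equivalences like these are what let a DBMS reorder operations when choosing an efficient plan.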


Look Ahead

Next topic: Translation from ER/EER to relational model

Read from the textbook: Chapter 14.1 – 14.2