Relational Algebra and MySQL
Lecture 5, CS157B: Relational Algebra and MySQL
Prof. Sin Min Lee
Department of Computer Science
San Jose State University
Functional Dependency
For A, B ⊆ R(…), we say FD: A → B if
• for all t1, t2 ∈ R: t1[A] = t2[A] ⇒ t1[B] = t2[B]
Definition: A ⊆ R(…) is a key if FD: A → R
Definition: A ⊆ R(…) is a candidate key if FD: A → R
and no proper subset of A is a key.
4. Find all the candidate keys for the following table (Mid1 Study Guide)
R(A B C D)
A B C D
1 2 3 4
2 2 3 5
3 2 5 1
1 2 5 6
S(A B C D) contains the same four rows.
Queries:
Q1: A → R(…)? No, A is not a key
Q2: B → R(…)? No, B is not a key
Q3: C → R(…)? No, C is not a key
Q4: D → R(…)? Yes, D is a key
Q5: AB → R(…)? No, AB together cannot form a key
Q6: AC → R(…)? Yes, AC together form a key
Q7: BC → R(…)? No, BC together cannot form a key
So the candidate keys are D and AC (neither A nor C alone is a key).
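The queries above can be checked mechanically. The following is a minimal sketch, assuming we only test the four rows shown (a single instance can refute a key but cannot prove a dependency for the schema in general). The helper names `is_key` and `candidate_keys` are illustrative and not part of the lecture.

```python
from itertools import combinations

# Instance of R(A, B, C, D) from the exercise above.
ATTRS = ("A", "B", "C", "D")
ROWS = [
    (1, 2, 3, 4),
    (2, 2, 3, 5),
    (3, 2, 5, 1),
    (1, 2, 5, 6),
]

def is_key(attrs, rows=ROWS):
    """True if no two rows agree on every attribute in attrs."""
    idx = [ATTRS.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(projected)

def candidate_keys(rows=ROWS):
    """Keys of this instance that contain no smaller key."""
    found = []
    for size in range(1, len(ATTRS) + 1):
        for combo in combinations(ATTRS, size):
            if any(set(k) <= set(combo) for k in found):
                continue  # contains a smaller key, so it is not minimal
            if is_key(combo, rows):
                found.append(combo)
    return found

print(candidate_keys())
```

Attribute sets are tried in order of increasing size, so any superset of an already-found key is skipped, which is exactly the "no proper subset is a key" condition.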
Relational Algebra
Basic operations:
Selection (σ): selects a subset of rows from a relation.
Projection (π): deletes unwanted columns from a relation.
Cross-product (×): allows us to combine two relations.
Set-difference (−): tuples in reln. 1, but not in reln. 2.
Union (∪): tuples in reln. 1 or in reln. 2.
Additional operations: intersection, join, division, renaming. Not essential, but (very!) useful.
Since each operation returns a relation, operations can be composed! (Algebra is "closed".)
Projection
π sname,rating (S2):
sname rating
yuppy 9
lubber 8
guppy 5
rusty 10
π age (S2):
age
35.0
55.5
Deletes attributes that are not in the projection list.
Schema of result contains exactly the fields in the projection list, with the same names that they had in the (only) input relation.
Projection operator has to eliminate duplicates! (Why? What are the consequences?)
Note: real systems typically don't do duplicate elimination unless the user explicitly asks for it.
Selection
σ rating>8 (S2):
sid sname rating age
28 yuppy 9 35.0
58 rusty 10 35.0
π sname,rating (σ rating>8 (S2)):
sname rating
yuppy 9
rusty 10
Selects rows that satisfy selection condition.
Schema of result identical to schema of (only) input relation.
Result relation can be the input for another relational algebra operation! (Operator composition.)
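Selection, projection and their composition can be sketched in a few lines. This is an illustrative sketch only: relations are modeled as lists of dictionaries, S2 is taken from the example tables in this lecture, and the helper names `select` and `project` are assumptions made for the example.

```python
# Instance S2, as shown in the example tables of this lecture.
S2 = [
    {"sid": 28, "sname": "yuppy", "rating": 9, "age": 35.0},
    {"sid": 31, "sname": "lubber", "rating": 8, "age": 55.5},
    {"sid": 44, "sname": "guppy", "rating": 5, "age": 35.0},
    {"sid": 58, "sname": "rusty", "rating": 10, "age": 35.0},
]

def select(rel, pred):
    """Selection: keep the rows that satisfy the condition."""
    return [row for row in rel if pred(row)]

def project(rel, fields):
    """Projection: keep only the listed fields and eliminate duplicates."""
    seen, out = set(), []
    for row in rel:
        key = tuple(row[f] for f in fields)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(fields, key)))
    return out

# Projection on age: the three 35.0 values collapse into one row.
ages = project(S2, ["age"])

# Composition: projection applied to the result of a selection.
high = project(select(S2, lambda r: r["rating"] > 8), ["sname", "rating"])
print(ages)
print(high)
```

Because each helper returns a relation in the same representation it accepts, the calls nest freely, which is the closure property described above.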
Union, Intersection, Set-Difference
All of these operations take two input relations, which must be union-compatible:
Same number of fields.
`Corresponding’ fields have the same type.
What is the schema of the result?
S1 ∪ S2:
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
44 guppy 5 35.0
28 yuppy 9 35.0
S1 ∩ S2:
sid sname rating age
31 lubber 8 55.5
58 rusty 10 35.0
S1 − S2:
sid sname rating age
22 dustin 7 45.0
Cross-Product
Each row of S1 is paired with each row of R1.
Result schema has one field per field of S1 and R1, with field names `inherited’ if possible.
Conflict: both S1 and R1 have a field called sid.
Renaming operator: ρ(C(1 → sid1, 5 → sid2), S1 × R1)
S1 × R1:
(sid) sname rating age (sid) bid day
22 dustin 7 45.0 22 101 10/10/96
22 dustin 7 45.0 58 103 11/12/96
31 lubber 8 55.5 22 101 10/10/96
31 lubber 8 55.5 58 103 11/12/96
58 rusty 10 35.0 22 101 10/10/96
58 rusty 10 35.0 58 103 11/12/96
Joins
Condition Join: R ⋈c S = σc(R × S)
Result schema same as that of cross-product.
Fewer tuples than cross-product. Filters out tuples not satisfying the join condition.
Sometimes called a theta-join.
S1 ⋈ S1.sid<R1.sid R1:
(sid) sname rating age (sid) bid day
22 dustin 7 45.0 58 103 11/12/96
31 lubber 8 55.5 58 103 11/12/96
Joins
Equi-Join: a special case of condition join where the condition c contains only equalities.
Result schema similar to cross-product, but only one copy of fields for which equality is specified.
Natural Join: equijoin on all common fields.
π sid,..,age,bid,.. (S1 ⋈sid R1):
sid sname rating age bid day
22 dustin 7 45.0 101 10/10/96
58 rusty 10 35.0 103 11/12/96
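The three join-related operators can be compared side by side on the S1 and R1 instances used in these examples. This is a minimal sketch with relations as lists of tuples; the function names are illustrative, and the natural join is hard-wired to the single common field sid (position 0 in both relations), which is an assumption of this example.

```python
# S1(sid, sname, rating, age) and R1(sid, bid, day) from the examples.
S1 = [(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)]
R1 = [(22, 101, "10/10/96"), (58, 103, "11/12/96")]

def cross(r, s):
    """Cross-product: every row of r paired with every row of s."""
    return [a + b for a in r for b in s]

def theta_join(r, s, cond):
    """Condition join: cross-product followed by a selection."""
    return [row for row in ((a, b) for a in r for b in s) if cond(*row)]

def natural_join_on_sid(r, s):
    """Equijoin on sid that keeps only one copy of the sid field."""
    return [a + b[1:] for a in r for b in s if a[0] == b[0]]

product = cross(S1, R1)
less_than = [a + b for a, b in theta_join(S1, R1, lambda a, b: a[0] < b[0])]
natural = natural_join_on_sid(S1, R1)
print(len(product), less_than, natural)
```

The condition join returns a subset of the cross-product rows, while the natural join additionally drops the duplicate sid column.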
Division
Not supported as a primitive operator, but useful for expressing queries like: Find sailors who have reserved all boats.
Precondition: in A/B, the attributes in B must be included in the schema for A. Also, the result has attributes A−B.
SALES(supId, prodId); PRODUCTS(prodId)
Relations SALES and PRODUCTS must be built using projections.
SALES/PRODUCTS: the ids of the suppliers supplying ALL products.
Examples of Division A/B
A:
sno pno
s1 p1
s1 p2
s1 p3
s1 p4
s2 p1
s2 p2
s3 p2
s4 p2
s4 p4
B1: pno = p2
B2: pno = p2, p4
B3: pno = p1, p2, p4
A/B1: sno = s1, s2, s3, s4
A/B2: sno = s1, s4
A/B3: sno = s1
Expressing A/B Using Basic Operators
Division is not an essential op; just a useful shorthand. (Also true of joins, but joins are so common that systems implement joins specially. Division is NOT implemented in SQL.)
Idea: For SALES/PRODUCTS, compute all suppliers for which there exists at least one product they do not supply.
An x value is disqualified if, by attaching a y value from B, we obtain an xy tuple that is not in A.
Disqualified suppliers: A = π sid((π sid(Sales) × Products) − Sales)
The answer is π sid(Sales) − A
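The "disqualified values" idea can be sketched directly with Python sets. This is an illustrative sketch, using the relation A and the divisors B1, B2, B3 from the division examples above; the function name `divide` is an assumption made for the example.

```python
# A(sno, pno) and the three divisors from the division examples.
A = {
    ("s1", "p1"), ("s1", "p2"), ("s1", "p3"), ("s1", "p4"),
    ("s2", "p1"), ("s2", "p2"),
    ("s3", "p2"),
    ("s4", "p2"), ("s4", "p4"),
}
B1 = {"p2"}
B2 = {"p2", "p4"}
B3 = {"p1", "p2", "p4"}

def divide(a, b):
    """A/B built from projection, cross-product and set-difference."""
    xs = {x for x, _ in a}                    # projection of A on x
    all_pairs = {(x, y) for x in xs for y in b}  # cross-product with B
    missing = all_pairs - a                   # xy tuples that are not in A
    disqualified = {x for x, _ in missing}    # x values with a missing y
    return xs - disqualified

print(sorted(divide(A, B1)), sorted(divide(A, B2)), sorted(divide(A, B3)))
```

Each line of `divide` corresponds to one operator in the expression above, so only the basic operators are used.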
EQUALITY JOIN, NATURAL JOIN, JOIN, SEMI-JOIN
Equality join connects tuples from two relations that match on certain attributes. The specified joining columns are kept in the resulting relation.
  ∏name(σdname='toy'(Emp ⋈(dno) Dept))
Natural join connects tuples from two relations that match on the specified common attributes; only one copy of each joining column is kept.
  ∏name(σdname='toy'(Emp ⋈ Dept)), a natural join on dno
How is an equality join between Emp and Dept using dno different from a natural join between Emp and Dept using dno?
  Equality join: SS#, name, age, salary, Emp.dno, Dept.dno, …
  Natural join: SS#, name, age, salary, dno, dname, …
Join is similar to equality join but allows other comparison operators: the condition has the form att op att, where op ∈ {=, ≠, ≤, ≥, <, >}.
EXAMPLE JOIN
Equality join: Emp ⋈(dno) Dept
EMP
SS# Name Age Salary dno
1 Joe 24 20000 2
2 Mary 20 25000 1
3 Bob 22 27000 1
4 Kathy 30 30000 2
5 Shideh 4 4000 1
Dept
dno dname floor mgrss#
1 Toy 1 5
2 Shoe 2 1
Result
SS# Name Age Salary EMP.dno Dept.dno dname floor mgrss#
1 Joe 24 20000 2 2 Shoe 2 1
2 Mary 20 25000 1 1 Toy 1 5
3 Bob 22 27000 1 1 Toy 1 5
4 Kathy 30 30000 2 2 Shoe 2 1
5 Shideh 4 4000 1 1 Toy 1 5
EXAMPLE JOIN
Natural join of Emp and Dept on dno, using the same EMP and Dept instances as in the equality-join example:
SS# Name Age Salary dno dname floor mgrss#
1 Joe 24 20000 2 Shoe 2 1
2 Mary 20 25000 1 Toy 1 5
3 Bob 22 27000 1 Toy 1 5
4 Kathy 30 30000 2 Shoe 2 1
5 Shideh 4 4000 1 Toy 1 5
EXAMPLE JOIN
Join: Emp ⋈(Salary > 5 * x.Salary) ρx(Emp), using the same EMP instance as in the previous examples:
SS# Name Age Salary dno x.SS# x.Name x.Age x.Salary x.dno
2 Mary 20 25000 1 5 Shideh 4 4000 1
3 Bob 22 27000 1 5 Shideh 4 4000 1
4 Kathy 30 30000 2 5 Shideh 4 4000 1
Joe (20000) is excluded because 20000 is not greater than 5 * 4000.
EQUALITY JOIN, NATURAL JOIN, JOIN, SEMI-JOIN (Cont…)
Example: retrieve the names of employees who earn more than Joe:
  ∏name(Emp ⋈(sal > x.sal) σname='Joe'(ρx(Emp)))
Semi-join selects the tuples of one relation that join with another, keeping only that relation's columns. It is equivalent to a join followed by a projection:
  Emp ⋉(dno) Dept ≡ ∏SS#, name, age, salary, dno(Emp ⋈(dno) Dept)
JOIN OPERATORS
Condition Joins:
- Defined as a cross-product followed by a selection:
  R ⋈c S = σc(R × S) (⋈ is called the bow-tie)
  where c is the condition.
- Example: given the sample relational instances S1 and R1, the condition join S1 ⋈S1.sid<R1.sid R1 yields two tuples: sailors 22 and 31, each paired with the reservation made under sid 58.
Equijoin: special case of the condition join where the join condition consists solely of equalities between two fields in R and S, connected by the logical AND operator (∧).
Example: given the two sample relational instances S1 and R1, the operator S1 ⋈S1.sid=R1.sid R1 yields one tuple per matching sid (here, sailors 22 and 58).
SQL
SQL (Structured Query Language) is the standard language for commercial DBMSs.
SEQUEL (Structured English QUEry Language) was originally defined by IBM for System R.
Standardization of SQL began in the 80s.
The current standard named on these slides is SQL-99; later revisions (e.g. SQL:2003) have since been published.
SQL is more than a query language: it includes a DDL, a DML and administration commands.
SQL is an example of a transform-oriented language: a language designed to use relations to transform inputs into required outputs.
04/19/23
Basic structure of an SQL query
General Structure: SELECT, ALL / DISTINCT, *, AS, FROM, WHERE
Comparison: IN, BETWEEN, LIKE "% _"
Grouping: GROUP BY, HAVING, COUNT( ), SUM( ), AVG( ), MAX( ), MIN( )
Display Order: ORDER BY, ASC / DESC
Logical Operators: AND, OR, NOT
Output: INTO TABLE / CURSOR, TO FILE [ADDITIVE], TO PRINTER, TO SCREEN
Union: UNION
The Situation: Student Particulars
field type width contents
id numeric 4 student id number
name character 10 name
dob date 8 date of birth
sex character 1 sex: M / F
class character 2 class
hcode character 1 house code: R, Y, B, G
dcode character 3 district code
remission logical 1 fee remission
mtest numeric 2 Math test score
General Structure
SELECT [ALL / DISTINCT] expr1 [AS col1], expr2 [AS col2] ;
FROM tablename WHERE condition
The query will select rows from the source tablename and output the result in table form.
Expressions expr1, expr2 can be: (1) a column, or (2) an expression of functions and fields.
col1, col2 are their corresponding column names in the output table.
Note: the trailing ; after the SELECT list appears to be the line-continuation mark of the FoxPro-style dialect used in these examples (compare USE student and the .T. / .F. logical values), not a statement terminator.
3.2 SQL
SELECT a1, ..., an
FROM R1, R2, …, Rm
WHERE Con1 AND … AND Conk
This means: π a1, ..., an ( σ Con1 ( … ( σ Conk ( R1 × R2 × … × Rm ) ) … ) )
Example:
Find the SSN and tax for each person.
SELECT SSN, Tax
FROM Taxrecord, Taxtable
WHERE wages + interest + capital_gain = income
AS – keyword used to rename relations
Two SQL expressions can be combined by:
INTERSECT
UNION
MINUS – set difference
Example:
Find the names of the streets that intersect.
SELECT S.NAME, T.NAME
FROM Streets AS S, Streets AS T
WHERE S.X = T.X and
S.Y = T.Y
Example: Assume we have the relations:
Broadcast ( Radio, X , Y )
Town ( Name, X, Y )
Find the parts of Lincoln, NE that can
be reached by at least one Radio station.
(SELECT X, Y
FROM Town
WHERE Name = 'Lincoln')
INTERSECT
(SELECT X, Y
FROM Broadcast)
Example:
Find the SSN and tax for each person.
πSSN,Tax ( σwages+interest+capital_gain = income ( Taxrecord × Taxtable ) )
Example:
Find the area of Lincoln reached by a radio station.
( πX,Y ( σName='Lincoln' Town ) ) ∩ ( πX,Y Broadcast )
Another way of connecting SQL expressions
is using the IN keyword.
SELECT ……..
FROM ……..
WHERE a IN ( SELECT b
FROM …..
WHERE ….. )
SQL with aggregation –
SELECT aggregate_function
FROM …….
WHERE ……
aggregate_function –
Max (c1a1 + ……..+ cnan), Min (c1a1 + ……..+ cnan), where ai are attributes and ci are constants
Sum(a), Avg(a), Count(a), where a is an attribute that is constant in each constraint tuple
Example:
Package(Serial_No, From, Destination, Weight)
Postage (Weight , Fee)
Find the total postage of all packages sent
from Omaha.
SELECT Sum(Fee)
FROM Package, Postage
WHERE Package.Weight = Postage.Weight AND
Package.From = 'Omaha'
GROUP BY –
SELECT a1, …, an, aggregate_function
FROM …..
WHERE ……
GROUP BY a1, ..., ak
• Evaluates the basic SQL query
• Groups the tuples according to different values of a1, …, ak
• Applies the aggregate function to each group separately
• {a1, …, an} ⊆ {a1, …, ak}: every non-aggregated attribute in the SELECT list must also appear in the GROUP BY list
Example:
Find the total postage sent out from each city.
SELECT Package.From, Sum(Postage.Fee)
FROM Package, Postage
WHERE Package.Weight = Postage.Weight
GROUP BY Package.From
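The two postage queries can be tried with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the rows inserted below are made-up sample data, not from the lecture, and because FROM is a reserved word the column is written as "From" in double quotes. An ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Package (Serial_No INTEGER, "From" TEXT,
                          Destination TEXT, Weight INTEGER);
    CREATE TABLE Postage (Weight INTEGER, Fee REAL);
    -- Hypothetical sample data for illustration only.
    INSERT INTO Postage VALUES (1, 2.0), (2, 3.5);
    INSERT INTO Package VALUES
        (1, 'Omaha', 'Lincoln', 1),
        (2, 'Omaha', 'Chicago', 2),
        (3, 'Lincoln', 'Omaha', 2);
""")

# Total postage of all packages sent from Omaha (single aggregate).
omaha_total = conn.execute("""
    SELECT SUM(Fee)
    FROM Package, Postage
    WHERE Package.Weight = Postage.Weight AND Package."From" = 'Omaha'
""").fetchone()[0]

# Total postage per city of origin (one aggregate value per group).
per_city = conn.execute("""
    SELECT Package."From", SUM(Postage.Fee)
    FROM Package, Postage
    WHERE Package.Weight = Postage.Weight
    GROUP BY Package."From"
    ORDER BY Package."From"
""").fetchall()

print(omaha_total, per_city)
```

The grouped query first evaluates the join and then applies SUM once per distinct value of the origin city, following the three steps listed above.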
General Structure
DISTINCT will eliminate duplication in the output while ALL will keep all duplicated rows.
condition can be: (1) an inequality, or (2) a string comparison, combined using the logical operators AND, OR, NOT.
General Structure
Before using SQL, open the student file:
USE student
eg. 1 List all the student records.
SELECT * FROM student
Result
id name dob sex class mtest hcode dcode remission
9801 Peter 06/04/86 M 1A 70 R SSP .F.
9802 Mary 01/10/86 F 1A 92 Y HHM .F.
9803 Johnny 03/16/86 M 1A 91 G SSP .T.
9804 Wendy 07/09/86 F 1B 84 B YMT .F.
9805 Tobe 10/17/86 M 1B 88 R YMT .F.
: : : : : : : : :
General Structure
eg. 2 List the names and house code of 1A students.
SELECT name, hcode, class FROM student ;
WHERE class="1A"
[Figure: the condition class="1A" applied to the Class column (1A, 1A, 1A, 1B, 1B, …) keeps only the 1A rows]
Result
name hcode class
Peter R 1A
Mary Y 1A
Johnny G 1A
Luke G 1A
Bobby B 1A
Aaron R 1A
: : :
General Structure
eg. 3 List the residential district of the Red House members.
SELECT DISTINCT dcode FROM student ;
WHERE hcode="R"
Result
dcode
HHM
KWC
MKK
SSP
TST
YMT
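The student examples can be reproduced in standard SQL with Python's sqlite3 module. This is a sketch, not the lecture's own environment: it loads only the five student rows listed in eg. 1 (so the results are shorter than those on the slides), stores the logical remission field as 0/1, uses single-quoted string literals, and adds ORDER BY for a predictable output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        id INTEGER, name TEXT, dob TEXT, sex TEXT, class TEXT,
        mtest INTEGER, hcode TEXT, dcode TEXT, remission INTEGER
    )
""")
# The five rows shown in eg. 1; .F./.T. are stored as 0/1.
rows = [
    (9801, "Peter", "06/04/86", "M", "1A", 70, "R", "SSP", 0),
    (9802, "Mary", "01/10/86", "F", "1A", 92, "Y", "HHM", 0),
    (9803, "Johnny", "03/16/86", "M", "1A", 91, "G", "SSP", 1),
    (9804, "Wendy", "07/09/86", "F", "1B", 84, "B", "YMT", 0),
    (9805, "Tobe", "10/17/86", "M", "1B", 88, "R", "YMT", 0),
]
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows)

# eg. 2: names and house codes of 1A students.
eg2 = conn.execute(
    "SELECT name, hcode, class FROM student WHERE class = '1A' ORDER BY id"
).fetchall()

# eg. 3: distinct residential districts of Red House members.
eg3 = conn.execute(
    "SELECT DISTINCT dcode FROM student WHERE hcode = 'R' ORDER BY dcode"
).fetchall()

print(eg2)
print(eg3)
```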
Data Manipulation
SELECT: query data in the database
INSERT: insert data into a table
UPDATE: update data in a table
DELETE: delete data from a table
Source: Database Systems Connolly/Begg
Retrieve all columns and all rows
SELECT firstColumn,…,lastColumn
FROM tableName;
SELECT *
FROM tableName;
Use of Distinct
SELECT DISTINCT columnName
FROM tableName;
Without DISTINCT, columnName holds: A, A, B, B, C, D
With SELECT DISTINCT, the result is: A, B, C, D
Calculated fields
SELECT columnName/2
FROM tableName;
Example with columnName = price:
price (stored): 10.00, 6.00, 12.00
price/2 (result): 5.00, 3.00, 6.00
Comparison Search Condition
= equals
<> is not equal to (ISO standard)
!= is not equal to (allowed in some dialects)
< is less than
> is greater than
<= is less than or equal to
>= is greater than or equal to
Source: Database Systems Connolly/Begg
Comparison Search Condition
An expression is evaluated left to right.
Subexpressions in brackets are evaluated first.
NOTs are evaluated before ANDs and ORs.
ANDs are evaluated before ORs.
Source: Database Systems Connolly/Begg
Range Search Condition
SELECT columnName
FROM tableName
WHERE columnName BETWEEN 20 AND 30;
is equivalent to:
SELECT columnName
FROM tableName
WHERE columnName >= 20 AND columnName <= 30;
Set membership search condition
SELECT columnName
FROM tableName
WHERE columnName
IN ('name1', 'name2');
is equivalent to:
SELECT columnName
FROM tableName
WHERE columnName = 'name1'
OR columnName = 'name2';
Pattern matching symbols
% represents any sequence of zero
or more characters (wildcard).
_ represents any single character
Source: Database Systems Connolly/Begg
Pattern match search condition
'h%' : begins with the character h.
'h_ _ _' : four character string beginning with the character h.
'%e' : any sequence of characters, of length at least 1, ending with the character e.
'%CS157B%' : any sequence of characters of any length containing CS157B
Source: Database Systems Connolly/Begg
Pattern match search condition
LIKE 'h%'
begins with the character h.
NOT LIKE 'h%'
does not begin with the character h.
Source: Database Systems Connolly/Begg
Pattern match search condition
To search for a string that includes a
pattern-matching character, e.g. the literal string
'15%'
use an escape character to represent
the pattern-matching character.
LIKE '15#%' ESCAPE '#'
Source: Database Systems Connolly/Begg
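The difference between % as a wildcard and % as an escaped literal can be seen with a small sqlite3 experiment. This is an illustrative sketch; the table name notes and its four sample values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (comment TEXT)")
# Hypothetical sample values for illustration only.
conn.executemany(
    "INSERT INTO notes VALUES (?)",
    [("15%",), ("150",), ("15",), ("h15% off",)],
)

# '%' acts as a wildcard: anything that begins with 15 matches.
wildcard = conn.execute(
    "SELECT comment FROM notes WHERE comment LIKE '15%' ORDER BY comment"
).fetchall()

# '#%' is an escaped, literal percent sign: only the exact string 15% matches.
literal = conn.execute(
    "SELECT comment FROM notes WHERE comment LIKE '15#%' ESCAPE '#' "
    "ORDER BY comment"
).fetchall()

print(wildcard)
print(literal)
```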
NULL search condition
DOES NOT WORK (a NULL is not an empty string, and comparing anything with NULL yields unknown rather than true)
comment = ' '
comment != ' '
DOES WORK
comment IS NULL
comment IS NOT NULL
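A short sqlite3 sketch shows why equality tests miss NULLs. The table viewing and its three rows are assumptions made for this example: one row has a real comment, one has NULL, and one has an empty string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE viewing (id INTEGER, comment TEXT)")
# Hypothetical rows: a real comment, a NULL, and an empty string.
conn.executemany(
    "INSERT INTO viewing VALUES (?, ?)",
    [(1, "too small"), (2, None), (3, "")],
)

def ids(where):
    """Return the ids of the rows that satisfy the given WHERE condition."""
    query = f"SELECT id FROM viewing WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(query)]

eq_empty = ids("comment = ''")          # row 2 (NULL) is not returned
ne_empty = ids("comment != ''")         # row 2 (NULL) is not returned either
is_null = ids("comment IS NULL")
not_null = ids("comment IS NOT NULL")
print(eq_empty, ne_empty, is_null, not_null)
```

The NULL row is returned by neither the = nor the != test, so the two results together do not cover the table; only IS NULL finds it.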
Sorting
The ORDER BY clause consists of a list of column
identifiers that the result is to be sorted on, separated by commas.
It allows the retrieved rows to be ordered in ascending (ASC) or descending (DESC) order.
Source: Database Systems Connolly/Begg
Sorting
Column identifier may be:
A column name
A column number (deprecated)
Source: Database Systems Connolly/Begg
Sorting
SELECT type, rent
FROM tableName
ORDER BY type, rent ASC;
Source: Database Systems Connolly/Begg
Before sorting:
type rent
Flat 650
Apt 450
Flat 600
Apt 500
After ORDER BY type, rent ASC:
type rent
Apt 450
Apt 500
Flat 600
Flat 650
Aggregate Functions
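COUNT returns the number of values in a specified column.
SUM returns the sum of the values in a specified column.
AVG returns the average of the values in a specified column.
MIN returns the smallest value in a specified column.
MAX returns the largest value in a specified column.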
Source: Database Systems Connolly/Begg
Use of COUNT( * )
How many students in CS157B?
SELECT COUNT( * ) AS my_count
FROM CS157B;
my_count
40
GROUP BY clause
When GROUP BY is used, each item in the SELECT list must be single-valued per group.
The SELECT clause may contain only:
Column names
Aggregate functions
Constants
An expression involving combinations of the above
Source: Database Systems Connolly/Begg
Grouping
SELECT dept, COUNT(staffNo) AS my_count,
SUM(salary)
FROM tableName
GROUP BY dept
ORDER BY dept;
Input table:
dept staffNo Salary
A 1 200.00
B 1 200.00
C 1 200.00
A 2 100.00
B 2 100.00
Result:
dept my_count Salary
A 2 300.00
B 2 300.00
C 1 200.00
Restricting Grouping
HAVING clause
is used with the GROUP BY clause.
filters groups into the resulting table.
includes at least one aggregate function.
WHERE clause
filters individual rows into the resulting table.
Aggregate functions cannot be used.
Source: Database Systems Connolly/Begg
SELECT dept, COUNT(staffNo) AS my_count, SUM(salary) AS my_sum
FROM Staff
GROUP BY dept
HAVING COUNT(staffNo) > 1
ORDER BY dept;
Source: Database Systems Connolly/Begg
Input table:
dept staffNo Salary
A 1 200.00
B 1 200.00
C 1 200.00
A 2 100.00
B 2 100.00
Result:
dept my_count my_sum
A 2 300.00
B 2 300.00
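The grouping and HAVING examples can be run against the five Staff rows shown in the input table. This is a minimal sqlite3 sketch; the aliases are written as my_count and my_sum (without spaces) so that they are valid identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (dept TEXT, staffNo INTEGER, salary REAL)")
# The five rows of the input table in the grouping example.
conn.executemany(
    "INSERT INTO Staff VALUES (?, ?, ?)",
    [("A", 1, 200.0), ("B", 1, 200.0), ("C", 1, 200.0),
     ("A", 2, 100.0), ("B", 2, 100.0)],
)

# GROUP BY alone: one output row per department.
grouped = conn.execute("""
    SELECT dept, COUNT(staffNo) AS my_count, SUM(salary) AS my_sum
    FROM Staff
    GROUP BY dept
    ORDER BY dept
""").fetchall()

# HAVING filters whole groups: department C (one member) is dropped.
having = conn.execute("""
    SELECT dept, COUNT(staffNo) AS my_count, SUM(salary) AS my_sum
    FROM Staff
    GROUP BY dept
    HAVING COUNT(staffNo) > 1
    ORDER BY dept
""").fetchall()

print(grouped)
print(having)
```

WHERE could not express this filter, because the count is only known after the rows have been grouped.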
Subqueries
SELECT columnNameA
FROM tableName1
WHERE columnNameB = (SELECT columnNameB
FROM tableName2
WHERE condition);
Source: Database Systems Connolly/Begg
result from inner SELECT applied as a condition for the outer SELECT
Subquery with Aggregate Function
SELECT fName, salary -
( SELECT AVG(salary)
FROM Staff ) AS salDiff
FROM Staff
WHERE salary > ( SELECT AVG(salary)
FROM Staff );
Source: Database Systems Connolly/Begg
List all staff whose salary is greater than the average salary, and show by how much their salary exceeds the average.
Nested Subqueries: Use of IN
SELECT property
FROM PropertyForRent
WHERE staff IN (
SELECT staff
FROM Staff
WHERE branch = (
SELECT branch
FROM Branch
WHERE street = '112 A St'));
Source: Database Systems Connolly/Begg
Selects branch at 112 A St
Nested Subqueries: Use of IN
SELECT property
FROM PropertyForRent
WHERE staff IN(
SELECT staff
FROM Staff
WHERE branch = ( branch ) );
Source: Database Systems Connolly/Begg
Selects the staff members who work at that branch.
Nested Subqueries: Use of IN
SELECT property
FROM PropertyForRent
WHERE staff IN ( staff who work
at the branch on '112 A St' );
Source: Database Systems Connolly/Begg
Since more than one row may be selected, "=" cannot be used; IN is used instead.
Use of ANY/SOME
SELECT name, salary
FROM Staff
WHERE salary > SOME( SELECT salary
FROM Staff
WHERE branch = ‘A’ );
Source: Database Systems Connolly/Begg
Inner SELECT result: {2000, 3000, 4000}
Outer result: {list of staff with salary greater than 2000}
Use of ALL
SELECT name, salary
FROM Staff
WHERE salary > ALL( SELECT salary
FROM Staff
WHERE branch = ‘A’ );
Source: Database Systems Connolly/Begg
Inner SELECT result: {2000, 3000, 4000}
Outer result: {list of staff with salary greater than 4000}
Use of Any/Some and All
If the subquery is empty:
ALL returns true
ANY returns false
The ISO standard allows SOME to be
used interchangeably with ANY.
Source: Database Systems Connolly/Begg
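The behaviour of ANY/SOME and ALL, including the empty-subquery case, mirrors Python's built-in any() and all(). The sketch below models only the comparison logic, with the subquery result {2000, 3000, 4000} from the examples above; it ignores NULL handling, and the function names are illustrative.

```python
def greater_than_some(value, subquery_rows):
    """salary > SOME (subquery): true if value exceeds at least one row."""
    return any(value > s for s in subquery_rows)

def greater_than_all(value, subquery_rows):
    """salary > ALL (subquery): true if value exceeds every row."""
    return all(value > s for s in subquery_rows)

branch_a = [2000, 3000, 4000]  # inner SELECT result from the examples

# 2500 exceeds 2000, so SOME holds, but it does not exceed 4000.
print(greater_than_some(2500, branch_a), greater_than_all(2500, branch_a))

# With an empty subquery result, SOME is false and ALL is true.
print(greater_than_some(2500, []), greater_than_all(2500, []))
```

This also explains the two result descriptions above: > SOME is equivalent to exceeding the minimum (2000), and > ALL to exceeding the maximum (4000).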