M.P. Johnson, DBMS, Stern/NYU, Spring 2008
1
C20.0046: Database Management SystemsLecture #10
M.P. JohnsonStern School of Business, NYUSpring, 2008
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
2
Agenda Subqueries, etc.
Sets, etc.
Nulls Outer joins
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
3
Operators on subqueries Several new operators applied to (unary)
selections:1. IN R2. EXISTS R3. UNIQUE R4. s > ALL R5. s > ANY R6. x IN R
> is just an example op Each expression can be negated with NOT
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
4
Next: ALL opEmployees(name, job, divid, salary)Find which employees are paid more than all the programmers
SELECT nameFROM EmployeesWHERE salary > ALL (SELECT salary FROM Employees WHERE job='programmer')
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
5
ANY/SOME opEmployees(name, job, divid, salary)Find which employees are paid more than at least one vice president
SELECT nameFROM EmployeesWHERE salary > ANY (SELECT salary FROM Employees WHERE job='VP')
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
6
ANY/SOME opEmployees(name, job, divid, salary)Find which employees are paid more than at least one vice president
SELECT nameFROM EmployeesWHERE salary > SOME (SELECT salary FROM Employees WHERE job='VP')
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
7
Existential/Universal conditionsEmployees(name, job, divid, salary)
Division(name, id, head)
Find all divisions with an employee whose salary is > 100000
Existential: easy!
SELECT DISTINCT Division.nameFROM Employees JOIN Division ON divid=idWHERE salary > 100000
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
8
Universal conditionsEmployees(name, job, divid, salary)Division(name, id, head)
Find all divisions in which everyone makes > 100000
Universal: less easy
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
9
Universal conditions
Idea: find divisions with some poor employee, and throw them out
(SELECT DISTINCT name from Division)
MINUS
(SELECT DISTINCT Division.nameFROM Employees JOIN Division ON divid=idWHERE salary > 100000)
Employees(name, job, divid, salary)Division(name, id, head)Find all divisions in which everyone makes > 100000
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
10
Or, universal with IN
2. Select the divisions we didn’t find:
1. Find the other divisions: in which someone makes <= 100000:
SELECT nameFROM DivisionWHERE id IN (SELECT divid FROM Employees WHERE salary <= 100000
SELECT nameFROM DivisionWHERE id NOT IN (SELECT divid FROM Employees WHERE salary <= 100000
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
11
Or, universal with ALL
Using <= ALL
Employees(name, job, divid, salary)Division(name, id, head)
Find all divisions in which everyone makes > 100000
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
12
Next: correlated subqueries Acc(name,bal,type…) Q: Who has the largest balance?
Can we do this with subqueries?
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
13
Acc(name,bal,type,…) Q: Find holder of largest account
(Later, could use MAX, but still need a subquery here…)
SELECT nameFROM AccWHERE bal >= ALL (SELECT bal FROM Acc)
Correlated Queries
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
14
Correlated Queries So far, subquery executed once;
result used for higher query More complicated: correlated queries
“[T]he subquery… [is] evaluated many times, once for each assignment of a value to some term in the subquery that comes from a tuple variable outside the subquery” (Ullman, p286).
Q: What does this mean? A: That subqueries refer to vars from outer queries
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
15
Acc(name,bal,type,…) Q2: Find holder of largest account of each type
SELECT name, typeFROM AccWHERE bal >= ALL (SELECT bal FROM Acc WHERE type=type)
Correlated Queries
correlation
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
16
Acc(name,bal,type,…) Q2: Find holder of largest account of each type
Note:1. scope of variables2. this can still be expressed as single SFW
SELECT name, typeFROM Acc a1WHERE bal >= ALL (SELECT bal FROM Acc WHERE type=a1.type)
Correlated Queries
correlation
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
17
New topic: R.A./SQL Set Operators Relations are sets have set-theoretic ops
Venn diagrams
Union: R1 R2 Example:
ActiveEmployees RetiredEmployees
Difference: R1 – R2 Example:
AllEmployees – RetiredEmployees = ActiveEmployees
Intersection: R1 R2 Example:
RetiredEmployees UnionizedEmployees
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
18
Set operations - exampleName Address Gender BirthdateFisher 123 Maple F 9/9/99Hamill 456 Oak M 8/8/88
Name Address Gender BirthdateFisher 123 Maple F 9/9/99Ford 345 Palm M 7/7/77
R:
S:
Name Address Gender BirthdateFisher 123 Maple F 9/9/99Hamill 456 Oak M 8/8/88Ford 345 Palm M 7/7/77
R S:
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
19
Set operations - exampleName Address Gender BirthdateFisher 123 Maple F 9/9/99Hamill 456 Oak M 8/8/88
Name Address Gender BirthdateFisher 123 Maple F 9/9/99Ford 345 Palm M 7/7/77
R:
S:
R S: Name Address Gender BirthdateFisher 123 Maple F 9/9/99
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
20
Set operations - exampleName Address Gender BirthdateFisher 123 Maple F 9/9/99Hamill 456 Oak M 8/8/88
Name Address Gender BirthdateFisher 123 Maple F 9/9/99Ford 345 Palm M 7/7/77
R:
S:
R - S: Name Address Gender BirthdateHamill 456 Oak M 8/8/88
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
21
Set ops in SQL Orthodox SQL has set operators:
UNION, INTERSECT, EXCEPT Oracle SQL uses MINUS rather than EXCEPT See the Ullman page on more differences
These ops applied to queries:
(SELECT name FROM Person WHERE City = 'New York')
INTERSECT(SELECT custname FROM Purchase WHERE store='Kim''s')
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
22
Boat examples Reserve(ssn,bmodel,color)
Q: Find ssns of sailors who reserved red boats or green boats
SELECT DISTINCT ssnFROM reserveWHERE color = 'red' OR color = 'green'
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
23
Boat examples Reserve(ssn,bmodel,color)
Q: Find ssns of sailors who reserved red boats and green boats
SELECT DISTINCT ssnFROM reserveWHERE color = 'red' AND color = 'green'
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
24
Boat examples Reserve(ssn,bmodel,color)
Q: Find ssns of sailors who reserved red boats and green boats
SELECT DISTINCT r1.ssnFROM reserve r1, reserve r2WHERE r1.ssn = r2.ssn AND r1.color = 'red' AND r2.color = 'green'
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
25
Boat examples Reserve(ssn,bmodel,color)
Q: Find ssns of sailors who reserved red boats and green boats
(SELECT DISTINCT ssn FROM reserve WHERE color = 'red') INTERSECT(SELECT DISTINCT ssn
FROM reserve WHERE color = 'green')
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
26
Boat examples Reserve(ssn,bmodel,color)
Q: Find ssns of sailors who reserved red boats or green boats
(SELECT DISTINCT ssn FROM reserve WHERE color = 'red') UNION (SELECT DISTINCT ssn
FROM reserve WHERE color = 'green')
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
27
Boat examples Reserve(ssn,bmodel,color)
Q: Find ssns of sailors who reserved red boats but not green boats
(SELECT DISTINCT ssn FROM reserve WHERE color = 'red') EXCEPT (SELECT DISTINCT ssn
FROM reserve WHERE color = 'green')
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
28
(SELECT name, address FROM Cust1)
UNION(SELECT name FROM Cust2)
Union-Compatibility Situation: Cust1(name,address,…), Cust2(name,…) Want: report of all customer names and addresses
(if known) Can’t do:
Both tables must have same sequence of types Applies to all set ops
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
29
Union-Compatibility Situation: Cust1(name,address,…), Cust2(name,…) Want: report of all customer names and addresses
(if known) But can do:
Resulting field names taken from first table
(SELECT name, address FROM Cust1)
UNION(SELECT name, '(N/A)' FROM Cust2)
Result(name, address)
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
30
First Unintuitive SQLism Looking for R (S T)
But what happens if T is empty?
See transcript of this in Oracle on sales
SELECT R.AFROM R, S, TWHERE R.A=S.A OR R.A=T.A
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
31
New topic: Nulls in SQL If we don’t have a value, can put a NULL
Null can mean several things: Value does not exists Value exists but is unknown Value not applicable
But null is not the same as 0 See Douglas Foster Wallace…
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
32
Null Values x = NULL 4*(3-x)/7 = NULL x = NULL x + 3 – x = NULL x = NULL 3 + (x-x) = NULL x = NULL x = 'Joe' is UNKNOWN
In general: no row using null fields appear in the selection test will pass the test With one exception
Pace Boole, SQL has three boolean values: FALSE = 0 TRUE = 1 UNKNOWN = 0.5
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
33
Null values in boolean expressions C1 AND C2 = min(C1, C2) C1 OR C2 = max(C1, C2) NOT C1 = 1 – C1
height > 6 = UNKNOWN UNKNOWN OR weight > 190 = UNKOWN (age < 25) AND UNKNOWN = UNKNOWN
E.g.age=20height=NULLweight=180
SELECT *FROM PersonWHERE (age < 25) AND (height > 6 OR weight > 190)
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
34
Comparing null and non-nulls The schema specifies whether null is allowed for
each attribute NOT NULL to forbid Nulls are allowed by default
Unexpected behavior:
Some Persons are not included! The “trichotomy law” does not hold!
SELECT *FROM PersonWHERE age < 25 OR age >= 25
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
35
Testing for null values Can test for NULL explicitly:
x IS NULL x IS NOT NULL
But: x = NULL is never true
Now it includes all Persons
SELECT *FROM PersonWHERE age < 25 OR age >= 25 OR age IS NULL
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
36
Null/logic review TRUE AND UNKNOWN = ?
TRUE OR UNKNOWN = ?
UNKNOWN OR UNKNOWN = ?
X = NULL = ?
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
37
Next: Outer join Like inner join except that dangling tuples are
included, padded with nulls
Left outerjoin: dangling tuples from left are included Nulls appear “on the right”
Right outerjoin: dangling tuples from right are included Nulls appear “on the left”
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
38
Cross join - exampleName Address Gender BirthdateHanks 123 Palm Rd M 01/01/60Taylor 456 Maple Av F 02/02/40Lucas 789 Oak St M 03/03/55
Name Address NetworthSpielberg 246 Palm Rd 10MTaylor 456 Maple Av 20MLucas 789 Oak St 30M
MovieStar
MovieExec
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
39
Name Address G. Birthdate Name Address Net
Hanks 123 Palm Rd M 01/01/60
Taylor 456 Maple Av F 02/02/40 Taylor 456 Maple Av 20M
Lucas 789 Oak St M 03/03/55 Lucas 789 Oak St 30M
Spielberg 246 Palm Rd 10M
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
40
Outer Join - ExampleSELECT * FROM MovieStar LEFT OUTER JOIN MovieExec ON MovieStart.name=MovieExec.name
SELECT * FROM MovieStar RIGHT OUTER JOIN MovieExec ON MovieStart.name=MovieExec.name
Name Address G. Birthdate Name Address Net
Hanks 123 Palm Rd M 01/01/60 Null Null Null
Taylor 456 Maple Av F 02/02/40 Taylor 456 Maple Av 20M
Lucas 789 Oak St M 03/03/55 Lucas 789 Oak St 30M
Null Null Null Null Spielberg 246 Palm Rd 10M
Name Address G. Birthdate Name Address Net
Hanks 123 Palm Rd M 01/01/60 Null Null Null
Taylor 456 Maple Av F 02/02/40 Taylor 456 Maple Av 20M
Lucas 789 Oak St M 03/03/55 Lucas 789 Oak St 30M
Null Null Null Null Spielberg 246 Palm Rd 10M
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
41
Outer Join - Example
Name Address Gender Birthdate
Hanks 123 Palm Rd M 01/01/60
Taylor 456 Maple Av F 02/02/40
Lucas 789 Oak St M 03/03/55
Name Address Networth
Spielberg 246 Palm Rd 10M
Taylor 456 Maple Av 20M
Lucas 789 Oak St 30M
MovieStar MovieExec
SELECT * FROM MovieStar FULL OUTER JOIN MovieExec ON MovieStart.name=MovieExec.name
Name Address G. Birthdate Name Address Net
Hanks 123 Palm Rd M 01/01/60 Null Null Null
Taylor 456 Maple Av F 02/02/40 Taylor 456 Maple Av 20M
Lucas 789 Oak St M 03/03/55 Lucas 789 Oak St 30M
Null Null Null Null Spielberg 246 Palm Rd 10M
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
42
New-style outer joins Outer joins may be left, right, or full
FROM A LEFT [OUTER] JOIN B; FROM A RIGHT [OUTER] JOIN B; FROM A FULL [OUTER] JOIN B;
OUTER is optional If OUTER is included, then FULL is the default
Q: How to remember left v. right? A: It indicates the side whose rows are always
included
M.P. Johnson, DBMS, Stern/NYU, Spring 2008
43
Review Examples from sqlzoo.net
SELECT LFROM R1, …, Rn
WHERE C
L(C(R1 x … Rn)