1 Database Systems Lecture #7 Yan Pan School of Software, SYSU 2011


Page 1

Database Systems
Lecture #7

Yan Pan

School of Software, SYSU

2011

Page 2

Agenda

Subqueries, etc.

Page 3

Relation operators

Basic operators:
  Selection: $\sigma$
  Projection: $\pi$
  Cartesian product: $\times$

Other set-theoretic ops:
  Union: $\cup$
  Intersection: $\cap$
  Difference: $-$

Additional operators:
  Joins (natural, equijoin, theta join, semijoin)
  Renaming: $\rho$
  Grouping…

Page 4

New-style join types

Cross joins (simplest):
  FROM A CROSS JOIN B

Inner joins (regular joins):
  FROM A [INNER] JOIN B ON …

Natural join:
  FROM A NATURAL JOIN B;
  Joins on common fields and merges

Outer joins (later)
  No dangling rows

Page 5

SQL e.g. with tuple vars

Reps(ssn, name, etc.)
Clients(ssn, name, rssn)

Q: Who are George’s clients, in SQL?

Conceptually:
  $\pi_{\text{Clients.name}}(\sigma_{\text{Reps.name}=\text{'George'} \wedge \text{Reps.ssn}=\text{rssn}}(\text{Reps} \times \text{Clients}))$

Page 6

New topic: Subqueries

Powerful feature of SQL: one clause can contain other SQL queries
  Anywhere where a value or relation is allowed

Several ways:
  Selection → single constant (scalar) in SELECT
  Selection → single constant (scalar) in WHERE
  Selection → relation in WHERE
  Selection → relation in FROM

Page 7

Standard multi-table example

Purchase(prodname, buyerssn, etc.)
Person(name, ssn, etc.)
Q: What did Christo buy?

As usual, need to AND on equality identifying ssn’s row and buyerssn’s row

  SELECT Purchase.prodname
  FROM Purchase, Person
  WHERE buyerssn = ssn AND name = 'Christo'

Page 8

Subquery motivation

Purchase(prodname, buyerssn, etc.)
Person(name, ssn, etc.)
Q: What did Christo buy?

Natural intuition:
  Go find Christo’s ssn
  Then find purchases

  SELECT ssn
  FROM Person
  WHERE name = 'Christo'

  SELECT Purchase.prodname
  FROM Purchase
  WHERE buyerssn = Christo’s-ssn

Page 9

Subqueries

Subquery: copy in Christo’s selection for his ssn:

  SELECT Purchase.prodname
  FROM Purchase
  WHERE buyerssn = (SELECT ssn
                    FROM Person
                    WHERE name = 'Christo')

The subquery returns one value, so the = is valid
  If it returns more (or fewer), we get a run-time error
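The join form and the scalar-subquery form of the Christo query can be compared side by side. The following is a minimal sketch using Python's built-in sqlite3 module; the sample rows are invented for illustration. One assumption to keep in mind: SQLite does not raise the run-time error mentioned on the slide when a scalar subquery returns several rows, it silently uses the first row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person(name TEXT, ssn INTEGER);
    CREATE TABLE Purchase(prodname TEXT, buyerssn INTEGER);
    -- Hypothetical sample data
    INSERT INTO Person VALUES ('Christo', 1), ('Martha', 2);
    INSERT INTO Purchase VALUES ('umbrella', 1), ('fabric', 1), ('oven', 2);
""")

# Join version: pair every purchase with its buyer, then filter.
join_rows = conn.execute("""
    SELECT Purchase.prodname
    FROM Purchase, Person
    WHERE buyerssn = ssn AND name = 'Christo'
    ORDER BY prodname
""").fetchall()

# Subquery version: look up Christo's ssn first, then filter purchases.
subquery_rows = conn.execute("""
    SELECT Purchase.prodname
    FROM Purchase
    WHERE buyerssn = (SELECT ssn FROM Person WHERE name = 'Christo')
    ORDER BY prodname
""").fetchall()

print(join_rows)
print(subquery_rows)
```

Both queries should list the same products as long as exactly one person is named Christo.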

Page 10

Operators on subqueries

Several new operators applied to (unary) selections:
  1. IN R
  2. EXISTS R
  3. UNIQUE R
  4. s > ALL R
  5. s > ANY R
  6. x IN R

> is just an example op
Each expression can be negated with NOT

Page 11

Guifeng Zheng, DBMS, SS/SYSU

Subqueries with IN

Product(prodname,maker), Person(name,ssn), Purchase(buyerssn,product)
Q: Find companies Martha bought products from

Strategy:
  1. Find Martha’s ssn
  2. Find products listed with that ssn as buyer
  3. Find company names of those products

  SELECT DISTINCT Product.maker
  FROM Product
  WHERE prodname IN (SELECT product
                     FROM Purchase
                     WHERE buyerssn = (SELECT ssn
                                       FROM Person
                                       WHERE name = 'Martha'))
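The three-step strategy for Martha's companies maps directly onto two nested subqueries. Here is a small runnable sketch with Python's sqlite3 module; the product, maker, and buyer rows are hypothetical sample data, not part of the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product(prodname TEXT, maker TEXT);
    CREATE TABLE Person(name TEXT, ssn INTEGER);
    CREATE TABLE Purchase(buyerssn INTEGER, product TEXT);
    -- Hypothetical sample data
    INSERT INTO Product VALUES ('oven', 'Acme'), ('mixer', 'Bolt'),
                               ('toaster', 'Acme'), ('kite', 'Cirrus');
    INSERT INTO Person VALUES ('Christo', 1), ('Martha', 2);
    INSERT INTO Purchase VALUES (2, 'oven'), (2, 'mixer'),
                                (2, 'toaster'), (1, 'kite');
""")

makers = conn.execute("""
    SELECT DISTINCT Product.maker
    FROM Product
    WHERE prodname IN (SELECT product                 -- step 2
                       FROM Purchase
                       WHERE buyerssn = (SELECT ssn   -- step 1
                                         FROM Person
                                         WHERE name = 'Martha'))
    ORDER BY maker                                    -- step 3 is the outer SELECT
""").fetchall()

print(makers)
```

DISTINCT matters here: Martha bought two Acme products, but Acme should be reported once.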

Page 12

Subqueries returning relations

Equivalent to:

  SELECT DISTINCT Product.maker
  FROM Product, Purchase, Person
  WHERE prodname = product AND buyerssn = ssn AND name = 'Martha'

Or:

  SELECT DISTINCT Product.maker
  FROM Product JOIN Purchase ON prodname=product
               JOIN Person ON buyerssn=ssn
  WHERE name = 'Martha'

Page 13

FROM subqueries

Motivation for another way: suppose we’re given Martha’s purchases
  Then could just cross with Product to get product makers

Substitute (named) subquery for Martha’s purchases

  SELECT maker
  FROM Product, (SELECT product
                 FROM Purchase
                 WHERE buyerssn = (SELECT ssn
                                   FROM Person
                                   WHERE name = 'Martha')) Marthas
  WHERE Product.prodname = Marthas.product

Page 14

Complex RA Expressions

Scenario:
  1. Purchase(pid, seller-ssn, buyer-ssn, etc.)
  2. Person(ssn, name, etc.)
  3. Product(pid, name, etc.)

Q: Who (give names) bought gizmos from Dick?
Where to start?
  Purchase uses pid, ssn, so must get them…

Page 15

Complex RA Expressions

[Figure: relational-algebra expression tree. Leaves: Person, Purchase, Person, Product. Selections: name='Dick' (on Person) and name='Gizmo' (on Product). Projections: ssn and pid. Joins: seller-ssn=ssn, pid=pid, buyer-ssn=Person.ssn. Root: projection onto name.]

Page 16

Translation to SQL

  (the names of the people who bought gadgets from Dick)

We’re converting the tree on the last slide into SQL
  The result of the query should be the names indicated above
One step at a time, we’ll make the query more complete, until we’ve translated the English-language description to an actual SQL query
We’ll also simplify the query when possible

Page 17

Translation to SQL

Blue type = actual SQL
Black italics = description of subquery

  SELECT DISTINCT name buyer
  FROM (the info, along with buyer names, for purchases of gadgets sold by Dick)

Note: the subquery above consists of purchase records, except with the info describing the buyers attached
  In the results, the column header for name will be 'buyer'

Page 18

Translation to SQL

  SELECT DISTINCT name buyer
  FROM (SELECT *
        FROM Person, (the purchases of gadgets from Dick) P2
        WHERE Person.ssn = P2.buyer-ssn)

Note: the subquery in this version is being given the name P2
We’re pairing our rows from Person with rows from P2

Page 19

Translation to SQL

  SELECT DISTINCT name buyer
  FROM Person, (the purchases of gadgets from Dick) P2
  WHERE Person.ssn = P2.buyer-ssn

We simplified by combining the two SELECTs

Page 20

Translation to SQL

  SELECT DISTINCT name buyer
  FROM Person, (SELECT *
                FROM Purchase
                WHERE seller-ssn = (Dick’s ssn)
                  AND pid = (the id of gadget)) P2
  WHERE Person.ssn = P2.buyer-ssn

P2 is still the name of the subquery
It’s just been filled in with a query that contains two subqueries
  Outer parentheses are bolded for clarity

Page 21

Translation to SQL

  SELECT DISTINCT name buyer
  FROM Person, (SELECT *
                FROM Purchase
                WHERE seller-ssn = (SELECT ssn FROM Person WHERE name='Dick')
                  AND pid = (the id of gadget)) P2
  WHERE Person.ssn = P2.buyer-ssn

Now the subquery to find Dick’s ssn is filled in

Page 22

Translation to SQL

  SELECT DISTINCT name buyer
  FROM Person, (SELECT *
                FROM Purchase
                WHERE seller-ssn = (SELECT ssn FROM Person WHERE name='Dick')
                  AND pid = (SELECT pid FROM Product WHERE name='Gadget')) P2
  WHERE Person.ssn = P2.buyer-ssn

And now the subquery to find Gadget’s product id is filled in, too
Note: the SQL is simplified by using subqueries
  Not used in relational algebra

Page 23

George’s-neighbors with subqueries

People(ssn, name, street, city, state)
Q: Who lives on George’s street?

A: First, find George:
  $\sigma_{\text{name}=\text{'George'}}(\text{People})$

And get George’s street/city/state:
  $\pi_{\text{street}}(\sigma_{\text{name}=\text{'George'}}(\text{People}))$

Look up people on that street…

Page 24

Next: ALL op

Employees(name, job, divid, salary)
Find which employees are paid more than all the programmers

  SELECT name
  FROM Employees
  WHERE salary > ALL (SELECT salary
                      FROM Employees
                      WHERE job='programmer')

Page 25

ANY/SOME op

Employees(name, job, divid, salary)
Find which employees are paid more than at least one vice president

  SELECT name
  FROM Employees
  WHERE salary > ANY (SELECT salary
                      FROM Employees
                      WHERE job='VP')

Page 26

ANY/SOME op

Employees(name, job, divid, salary)
Find which employees are paid more than at least one vice president

  SELECT name
  FROM Employees
  WHERE salary > SOME (SELECT salary
                       FROM Employees
                       WHERE job='VP')

Page 27

Existential/Universal Conditions

Employees(name, job, divid, salary)
Division(name, id, head)

Find all divisions with an employee whose salary is > 100000

  SELECT DISTINCT Division.name
  FROM Employees, Division
  WHERE salary > 100000 AND divid=id

Existential: easy!

Page 28

Existential/Universal Conditions

Employees(name, job, divid, salary)
Division(name, id, head)

Find all divisions in which everyone makes > 100000

Universal: hard!

Page 29

Existential/universal with IN

1. Find the other divisions, in which someone makes <= 100000:

  SELECT name
  FROM Division
  WHERE id IN (SELECT divid
               FROM Employees
               WHERE salary <= 100000)

2. Select the divisions we didn’t find:

  SELECT name
  FROM Division
  WHERE id NOT IN (SELECT divid
                   FROM Employees
                   WHERE salary <= 100000)
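The universal condition ("everyone makes > 100000") is answered by negating the existential one ("someone makes <= 100000"). The sketch below runs both steps with Python's sqlite3 module on hypothetical rows. Note that a division with no employees at all also passes the NOT IN test, since it contains no low earner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees(name TEXT, job TEXT, divid INTEGER, salary INTEGER);
    CREATE TABLE Division(name TEXT, id INTEGER, head TEXT);
    -- Hypothetical sample data; Legal has no employees
    INSERT INTO Division VALUES ('Research', 1, 'Ann'), ('Sales', 2, 'Bob'),
                                ('Legal', 3, 'Eve');
    INSERT INTO Employees VALUES ('Ann', 'head', 1, 150000),
                                 ('Cy', 'programmer', 1, 120000),
                                 ('Bob', 'head', 2, 130000),
                                 ('Di', 'clerk', 2, 60000);
""")

# Step 1: divisions in which someone makes <= 100000.
some_low = conn.execute("""
    SELECT name FROM Division
    WHERE id IN (SELECT divid FROM Employees WHERE salary <= 100000)
    ORDER BY name
""").fetchall()

# Step 2: the divisions we did not find, i.e. everyone makes > 100000.
all_high = conn.execute("""
    SELECT name FROM Division
    WHERE id NOT IN (SELECT divid FROM Employees WHERE salary <= 100000)
    ORDER BY name
""").fetchall()

print(some_low)
print(all_high)
```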

Page 30

Next: correlated subqueries

Acc(name,bal,type…)
Q: Who has the largest balance?

Can we do this with subqueries?

Page 31

Correlated Queries

Acc(name,bal,type,…)
Q: Find holder of largest account

  SELECT name
  FROM Acc
  WHERE bal >= ALL (SELECT bal
                    FROM Acc)

Page 32

Correlated Queries

So far, subquery executed once; result used for higher query
More complicated: correlated queries

“[T]he subquery… [is] evaluated many times, once for each assignment of a value to some term in the subquery that comes from a tuple variable outside the subquery” (Ullman, p286).

Q: What does this mean?
A: That subqueries refer to vars from outer queries

Page 33

Correlated Queries

Acc(name,bal,type,…)
Q2: Find holder of largest account of each type

  SELECT name, type
  FROM Acc
  WHERE bal >= ALL (SELECT bal
                    FROM Acc
                    WHERE type=type)

correlation

Page 34

Correlated Queries

Acc(name,bal,type,…)
Q2: Find holder of largest account of each type

  SELECT name, type
  FROM Acc as a1
  WHERE bal >= ALL (SELECT bal
                    FROM Acc
                    WHERE type=a1.type)

correlation

Note:
  1. scope of variables
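The scope issue can be observed directly. The sketch below uses Python's sqlite3 module with hypothetical accounts. SQLite does not implement the ALL operator, so the example swaps in the comparison bal >= (SELECT MAX(bal) …), which gives the same answer provided no balance is NULL. In the first query the inner type=type compares the inner table with itself and is always true; in the second, a1.type refers to the outer row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Acc(name TEXT, bal INTEGER, type TEXT);
    -- Hypothetical sample data
    INSERT INTO Acc VALUES ('Al', 100, 'checking'), ('Bea', 300, 'checking'),
                           ('Cal', 200, 'savings'), ('Dee', 50, 'savings');
""")

# Unqualified type = type: both sides bind to the inner Acc, so the
# subquery ignores the outer row and returns the overall maximum.
uncorrelated = conn.execute("""
    SELECT name, type
    FROM Acc
    WHERE bal >= (SELECT MAX(bal) FROM Acc WHERE type = type)
    ORDER BY type
""").fetchall()

# a1.type refers to the outer row: the subquery is re-evaluated per type.
correlated = conn.execute("""
    SELECT name, type
    FROM Acc AS a1
    WHERE bal >= (SELECT MAX(bal) FROM Acc WHERE type = a1.type)
    ORDER BY type
""").fetchall()

print(uncorrelated)
print(correlated)
```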

Page 35

New topic: R.A./SQL Set Operators

Relations are sets → have set-theoretic ops
  Venn diagrams

Union: $R_1 \cup R_2$
  Example: $\text{ActiveEmployees} \cup \text{RetiredEmployees}$

Difference: $R_1 - R_2$
  Example: $\text{AllEmployees} - \text{RetiredEmployees} = \text{ActiveEmployees}$

Intersection: $R_1 \cap R_2$
  Example: $\text{RetiredEmployees} \cap \text{UnionizedEmployees}$

Page 36

Set operations - example

R:
  Name    Address    Gender  Birthdate
  Fisher  123 Maple  F       9/9/99
  Hamill  456 Oak    M       8/8/88

S:
  Name    Address    Gender  Birthdate
  Fisher  123 Maple  F       9/9/99
  Ford    345 Palm   M       7/7/77

$R \cup S$:
  Name    Address    Gender  Birthdate
  Fisher  123 Maple  F       9/9/99
  Hamill  456 Oak    M       8/8/88
  Ford    345 Palm   M       7/7/77

Page 37

Set operations - example

R:
  Name    Address    Gender  Birthdate
  Fisher  123 Maple  F       9/9/99
  Hamill  456 Oak    M       8/8/88

S:
  Name    Address    Gender  Birthdate
  Fisher  123 Maple  F       9/9/99
  Ford    345 Palm   M       7/7/77

$R \cap S$:
  Name    Address    Gender  Birthdate
  Fisher  123 Maple  F       9/9/99

Page 38

Set operations - example

R:
  Name    Address    Gender  Birthdate
  Fisher  123 Maple  F       9/9/99
  Hamill  456 Oak    M       8/8/88

S:
  Name    Address    Gender  Birthdate
  Fisher  123 Maple  F       9/9/99
  Ford    345 Palm   M       7/7/77

$R - S$:
  Name    Address    Gender  Birthdate
  Hamill  456 Oak    M       8/8/88

Page 39

Set ops in SQL

UNION, INTERSECT, EXCEPT
  Oracle SQL uses MINUS rather than EXCEPT

These ops applied to queries:

  (SELECT name FROM Person WHERE City = 'New York')
  INTERSECT
  (SELECT custname FROM Purchase WHERE store='Kim''s')

Page 40

Boat examples

Reserve(ssn,bmodel,color)

Q: Find ssns of sailors who reserved red boats or green boats

  SELECT DISTINCT ssn
  FROM reserve
  WHERE color = 'red' OR color = 'green'

Page 41

Boat examples

Reserve(ssn,bmodel,color)

Q: Find ssns of sailors who reserved red boats and green boats

  SELECT DISTINCT ssn
  FROM reserve
  WHERE color = 'red' AND color = 'green'

(No single row has both colors, so this query returns no rows.)

Page 42

Boat examples

Reserve(ssn,bmodel,color)

Q: Find ssns of sailors who reserved red boats and green boats

  SELECT DISTINCT r1.ssn
  FROM reserve r1, reserve r2
  WHERE r1.ssn = r2.ssn AND r1.color = 'red' AND r2.color = 'green'

Page 43

Boat examples

Reserve(ssn,bmodel,color)

Q: Find ssns of sailors who reserved red boats and green boats

  (SELECT DISTINCT ssn
   FROM reserve
   WHERE color = 'red')
  INTERSECT
  (SELECT DISTINCT ssn
   FROM reserve
   WHERE color = 'green')
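The three attempts at "red and green" can be run against the same data to see how they differ. This is a minimal sketch with Python's sqlite3 module and hypothetical reservations. One assumption about the engine: SQLite does not accept parentheses around the operands of INTERSECT, so the compound query is written without them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Reserve(ssn INTEGER, bmodel TEXT, color TEXT);
    -- Hypothetical sample data: only sailor 1 reserved both colors
    INSERT INTO Reserve VALUES (1, 'A', 'red'), (1, 'B', 'green'),
                               (2, 'A', 'red'), (3, 'C', 'green');
""")

# AND on a single row: no row is both red and green.
single_row = conn.execute("""
    SELECT DISTINCT ssn FROM Reserve
    WHERE color = 'red' AND color = 'green'
""").fetchall()

# Self-join: one tuple variable per color, tied together by ssn.
self_join = conn.execute("""
    SELECT DISTINCT r1.ssn
    FROM Reserve r1, Reserve r2
    WHERE r1.ssn = r2.ssn AND r1.color = 'red' AND r2.color = 'green'
""").fetchall()

# Set intersection of the red sailors and the green sailors.
intersect = conn.execute("""
    SELECT ssn FROM Reserve WHERE color = 'red'
    INTERSECT
    SELECT ssn FROM Reserve WHERE color = 'green'
""").fetchall()

print(single_row, self_join, intersect)
```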

Page 44

Boat examples

Reserve(ssn,bmodel,color)

Q: Find ssns of sailors who reserved red boats or green boats

  (SELECT DISTINCT ssn
   FROM reserve
   WHERE color = 'red')
  UNION
  (SELECT DISTINCT ssn
   FROM reserve
   WHERE color = 'green')

Page 45

Boat examples

Reserve(ssn,bmodel,color)

Q: Find ssns of sailors who reserved red boats but not green boats

  (SELECT DISTINCT ssn
   FROM reserve
   WHERE color = 'red')
  EXCEPT
  (SELECT DISTINCT ssn
   FROM reserve
   WHERE color = 'green')

Page 46

Union-Compatibility

Situation: Cust1(name,address,…), Cust2(name,…)
Want: report of all customer names and addresses (if known)
Can’t do:

  (SELECT name, address FROM Cust1)
  UNION
  (SELECT name FROM Cust2)

Both tables must have same sequence of types
  Applies to all set ops

Page 47

Union-Compatibility

Situation: Cust1(name,address,…), Cust2(name,…)
Want: report of all customer names and addresses (if known)
But can do:

  (SELECT name, address FROM Cust1)
  UNION
  (SELECT name, '(N/A)' FROM Cust2)

Resulting field names taken from first table:
  Result(name, address)

Page 48

New topic: Nulls in SQL

If we don’t have a value, can put a NULL

Null can mean several things:
  Value does not exist
  Value exists but is unknown
  Value not applicable

But null is not the same as 0

Page 49

Null Values

x = NULL → 4*(3-x)/7 = NULL
x = NULL → x + 3 - x = NULL
x = NULL → 3 + (x-x) = NULL
x = NULL → x = 'Joe' is UNKNOWN

In general: no row in which null fields appear in the selection test will pass the test
  With one exception

Pace Boole, SQL has three boolean values:
  FALSE = 0
  TRUE = 1
  UNKNOWN = 0.5

Page 50

Null values in boolean expressions

C1 AND C2 = min(C1, C2)
C1 OR C2 = max(C1, C2)
NOT C1 = 1 - C1

  SELECT *
  FROM Person
  WHERE (age < 25) AND (height > 6 OR weight > 190)

E.g. age=20, height=NULL, weight=180:
  height > 6 = UNKNOWN
  UNKNOWN OR weight > 190 = UNKNOWN
  (age < 25) AND UNKNOWN = UNKNOWN
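The min/max/1-minus rules can be checked by hand for the example row. The following plain-Python sketch encodes FALSE, UNKNOWN, and TRUE as 0, 0.5, and 1 as on the slide; the helper names (and3, or3, not3, compare) are made up for this illustration.

```python
FALSE, UNKNOWN, TRUE = 0.0, 0.5, 1.0

def and3(a, b):
    """Three-valued AND: the minimum of the two truth values."""
    return min(a, b)

def or3(a, b):
    """Three-valued OR: the maximum of the two truth values."""
    return max(a, b)

def not3(a):
    """Three-valued NOT: 1 minus the truth value."""
    return 1.0 - a

def compare(value, predicate):
    """A comparison involving NULL (None here) is UNKNOWN."""
    if value is None:
        return UNKNOWN
    return TRUE if predicate(value) else FALSE

# The example row from the slide.
age, height, weight = 20, None, 180

tall_or_heavy = or3(compare(height, lambda h: h > 6),
                    compare(weight, lambda w: w > 190))
result = and3(compare(age, lambda a: a < 25), tall_or_heavy)

print(tall_or_heavy, result)
```

A WHERE clause keeps a row only when the condition is TRUE, so this row is not selected even though the condition is not FALSE either.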

Page 51

Comparing null and non-nulls

The schema specifies whether null is allowed for each attribute
  NOT NULL to forbid
  Nulls are allowed by default

Unexpected behavior:

  SELECT *
  FROM Person
  WHERE age < 25 OR age >= 25

Some Persons are not included!
The “trichotomy law” does not hold!

Page 52

Testing for null values

Can test for NULL explicitly:
  x IS NULL
  x IS NOT NULL

But: x = NULL is never true

  SELECT *
  FROM Person
  WHERE age < 25 OR age >= 25 OR age IS NULL

Now it includes all Persons
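The behavior of NULL in comparisons is easy to reproduce. The sketch below uses Python's sqlite3 module with a hypothetical Person table in which one age is unknown (Python's None is stored as NULL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person(name TEXT, age INTEGER)")
# Hypothetical sample data; Cy's age is NULL
conn.executemany("INSERT INTO Person VALUES (?, ?)",
                 [("Ann", 20), ("Bob", 30), ("Cy", None)])

def names(where):
    """Return the sorted names of the Persons selected by the WHERE clause."""
    sql = "SELECT name FROM Person WHERE " + where + " ORDER BY name"
    return [row[0] for row in conn.execute(sql)]

trichotomy = names("age < 25 OR age >= 25")            # Cy is missing
equals_null = names("age = NULL")                      # never true
with_is_null = names("age < 25 OR age >= 25 OR age IS NULL")

print(trichotomy, equals_null, with_is_null)
```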

Page 53

Null/logic review

TRUE AND UNKNOWN = ?
TRUE OR UNKNOWN = ?
UNKNOWN OR UNKNOWN = ?
X = NULL = ?

Page 54

Next: Outer join

Like inner join except that dangling tuples are included, padded with nulls

Left outerjoin: dangling tuples from left are included
  Nulls appear “on the right”

Right outerjoin: dangling tuples from right are included
  Nulls appear “on the left”

Page 55

Cross join - example

MovieStar:
  Name    Address       Gender  Birthdate
  Hanks   123 Palm Rd   M       01/01/60
  Taylor  456 Maple Av  F       02/02/40
  Lucas   789 Oak St    M       03/03/55

MovieExec:
  Name       Address       Networth
  Spielberg  246 Palm Rd   10M
  Taylor     456 Maple Av  20M
  Lucas      789 Oak St    30M

Page 56

  Name    Address       G.  Birthdate  Name       Address       Net
  Hanks   123 Palm Rd   M   01/01/60
  Taylor  456 Maple Av  F   02/02/40   Taylor     456 Maple Av  20M
  Lucas   789 Oak St    M   03/03/55   Lucas      789 Oak St    30M
                                       Spielberg  246 Palm Rd   10M

Page 57

Outer Join - Example

  SELECT *
  FROM MovieStar LEFT OUTER JOIN MovieExec
       ON MovieStar.name=MovieExec.name

  SELECT *
  FROM MovieStar RIGHT OUTER JOIN MovieExec
       ON MovieStar.name=MovieExec.name

  Name    Address       G.    Birthdate  Name       Address       Net
  Hanks   123 Palm Rd   M     01/01/60   Null       Null          Null
  Taylor  456 Maple Av  F     02/02/40   Taylor     456 Maple Av  20M
  Lucas   789 Oak St    M     03/03/55   Lucas      789 Oak St    30M
  Null    Null          Null  Null       Spielberg  246 Palm Rd   10M

(The Hanks row appears only in the left outer join; the Spielberg row appears only in the right outer join.)

Page 58

Outer Join - Example

MovieStar:
  Name    Address       Gender  Birthdate
  Hanks   123 Palm Rd   M       01/01/60
  Taylor  456 Maple Av  F       02/02/40
  Lucas   789 Oak St    M       03/03/55

MovieExec:
  Name       Address       Networth
  Spielberg  246 Palm Rd   10M
  Taylor     456 Maple Av  20M
  Lucas      789 Oak St    30M

  SELECT *
  FROM MovieStar FULL OUTER JOIN MovieExec
       ON MovieStar.name=MovieExec.name

  Name    Address       G.    Birthdate  Name       Address       Net
  Hanks   123 Palm Rd   M     01/01/60   Null       Null          Null
  Taylor  456 Maple Av  F     02/02/40   Taylor     456 Maple Av  20M
  Lucas   789 Oak St    M     03/03/55   Lucas      789 Oak St    30M
  Null    Null          Null  Null       Spielberg  246 Palm Rd   10M
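The MovieStar/MovieExec outer joins can be reproduced with Python's sqlite3 module. This sketch rests on one assumption about the engine: older SQLite releases only accept LEFT OUTER JOIN, so the right outer join is written as a left outer join with the tables swapped, and the full outer join as the UNION of the two. Only the two name columns are selected to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MovieStar(name TEXT, address TEXT, gender TEXT, birthdate TEXT);
    CREATE TABLE MovieExec(name TEXT, address TEXT, networth TEXT);
    INSERT INTO MovieStar VALUES ('Hanks', '123 Palm Rd', 'M', '01/01/60'),
                                 ('Taylor', '456 Maple Av', 'F', '02/02/40'),
                                 ('Lucas', '789 Oak St', 'M', '03/03/55');
    INSERT INTO MovieExec VALUES ('Spielberg', '246 Palm Rd', '10M'),
                                 ('Taylor', '456 Maple Av', '20M'),
                                 ('Lucas', '789 Oak St', '30M');
""")

LEFT = """SELECT s.name, e.name
          FROM MovieStar s LEFT OUTER JOIN MovieExec e ON s.name = e.name"""
# Same as MovieStar RIGHT OUTER JOIN MovieExec, with the tables swapped.
RIGHT = """SELECT s.name, e.name
           FROM MovieExec e LEFT OUTER JOIN MovieStar s ON s.name = e.name"""

left_rows = set(conn.execute(LEFT))
right_rows = set(conn.execute(RIGHT))
full_rows = set(conn.execute(LEFT + " UNION " + RIGHT))

print(sorted(left_rows, key=str))
print(sorted(right_rows, key=str))
print(sorted(full_rows, key=str))
```

NULL padding shows up as None: Hanks has no matching exec, and Spielberg has no matching star.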

Page 59

New-style outer joins

Outer joins may be left, right, or full
  FROM A LEFT [OUTER] JOIN B;
  FROM A RIGHT [OUTER] JOIN B;
  FROM A FULL [OUTER] JOIN B;

OUTER is optional
  If OUTER is included, then FULL is the default

Q: How to remember left v. right?
A: It indicates the side whose rows are always included

Page 60

Next: Grouping & Aggregation

In SQL:
  aggregation operators in SELECT,
  grouping in GROUP BY clause

Recall aggregation operators:
  sum, avg, min, max, count
    strings, numbers, dates
  Each applies to scalars
  Count also applies to row: count(*)
  Can DISTINCT inside aggregation op: count(DISTINCT x)

Grouping: group rows that agree on single value
  Each group becomes one row in result

Page 61

Aggregation functions

Numerical: SUM, AVG, MIN, MAX
Char: MIN, MAX
  In lexicographic/alphabetic order
Any attribute: COUNT
  Number of values

  A  B
  1  2
  3  4
  1  2
  1  2

SUM(B) = 10
AVG(A) = 1.5
MIN(A) = 1
MAX(A) = 3
COUNT(A) = 4

Page 62

Straight aggregation

In R.A.: $\gamma_{\text{sum}(x) \rightarrow \text{total}}(R)$
In SQL:

  SELECT SUM(x) total
  FROM R

Just put the aggregation op in SELECT
NB: aggreg. ops applied to each non-null val
  count(x) counts the number of non-null vals in field x
  Use count(*) to count the number of rows

Page 63

Straight aggregation example

COUNT applies to duplicates, unless otherwise stated:

  SELECT Count(category)
  FROM Product
  WHERE year > 1995

  same as Count(*), except excludes nulls

Better:

  SELECT COUNT(DISTINCT category)
  FROM Product
  WHERE year > 1995

Can we say:

  SELECT category, COUNT(category)
  FROM Product
  WHERE year > 1995
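The three flavors of COUNT differ only on duplicates and nulls, which a small table makes visible. The sketch below uses Python's sqlite3 module with hypothetical products, one of which has a NULL category. Standard SQL rejects a query that selects a plain column next to an aggregate without grouping on it, so the last query adds a GROUP BY clause, which is an addition to the slide's version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product(name TEXT, category TEXT, year INTEGER);
    -- Hypothetical sample data; p3 has no category, p5 is too old
    INSERT INTO Product VALUES ('p1', 'toy', 1996), ('p2', 'toy', 1997),
                               ('p3', NULL, 1998), ('p4', 'tool', 1999),
                               ('p5', 'tool', 1990);
""")

count_rows, count_values, count_distinct = conn.execute("""
    SELECT COUNT(*),                  -- every selected row
           COUNT(category),           -- non-null categories, duplicates included
           COUNT(DISTINCT category)   -- different non-null categories
    FROM Product
    WHERE year > 1995
""").fetchone()

per_category = dict(conn.execute("""
    SELECT category, COUNT(category)
    FROM Product
    WHERE year > 1995
    GROUP BY category
"""))

print(count_rows, count_values, count_distinct)
print(per_category)
```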

Page 64

Straight aggregation example

Purchase(product, date, price, quantity)

Q: Find total sales for the entire database:

  SELECT SUM(price * quantity)
  FROM Purchase

Q: Find total sales of bagels:

  SELECT SUM(price * quantity)
  FROM Purchase
  WHERE product = 'bagel'

Page 65

Largest balance again

Acc(name,bal,type)
Q: Who has the largest balance?
Q: Who has the largest balance of each type?

Can we do these with aggregation functions?

Page 66

Straight grouping

Group rows together by field values
Produces one row for each group
  I.e., by each (combin. of) grouped val(s)
Don’t select non-grouped fields

Reduces to DISTINCT selections:

  SELECT product
  FROM Purchase
  GROUP BY product

  SELECT DISTINCT product
  FROM Purchase

Page 67

Grouping & aggregation

Sometimes want to group and compute aggregations by group
  Aggregation op applied to rows in group, not to all rows in table

Purchase(product, date, price, quantity)
Find total sales for products that sold for > 0.50:

  SELECT product, SUM(price*quantity) total
  FROM Purchase
  WHERE price > .50
  GROUP BY product

Page 68

Illustrated G&A example

Purchase:
  Product  Date   Price  Quantity
  Bagel    10/21  0.85   15
  Banana   10/22  0.52   7
  Banana   10/19  0.52   17
  Bagel    10/20  0.85   20

Page 69

Illustrated G&A example

First compute the FROM-WHERE
Then GROUP BY product:

  Product  Date   Price  Quantity
  Banana   10/19  0.52   17
  Banana   10/22  0.52   7
  Bagel    10/20  0.85   20
  Bagel    10/21  0.85   15

Page 70

Illustrated G&A example

Finally, aggregate and select:

  SELECT product, SUM(price*quantity) total
  FROM Purchase
  WHERE price > .50
  GROUP BY product

  Product  TotalSales
  Bagel    $29.75
  Banana   $12.48
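The illustrated example can be replayed on the four Purchase rows shown earlier. This is a minimal sketch with Python's sqlite3 module; the totals are rounded to cents in Python because the prices are stored as floating-point numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Purchase(product TEXT, date TEXT, price REAL, quantity INTEGER);
    INSERT INTO Purchase VALUES ('Bagel', '10/21', 0.85, 15),
                                ('Banana', '10/22', 0.52, 7),
                                ('Banana', '10/19', 0.52, 17),
                                ('Bagel', '10/20', 0.85, 20);
""")

# FROM-WHERE first, then GROUP BY product, then one SUM per group.
rows = conn.execute("""
    SELECT product, SUM(price * quantity) AS total
    FROM Purchase
    WHERE price > .50
    GROUP BY product
    ORDER BY product
""").fetchall()

totals = {product: round(total, 2) for product, total in rows}
print(totals)
```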

Page 71

Illustrated G&A example

GROUP BY may be reduced to (a possibly more complicated) subquery:

  SELECT product, SUM(price*quantity) total
  FROM Purchase
  WHERE price > .50
  GROUP BY product

  SELECT DISTINCT x.product,
         (SELECT SUM(y.price*y.quantity)
          FROM Purchase y
          WHERE x.product = y.product AND y.price > .50) total
  FROM Purchase x
  WHERE x.price > .50

Page 72

Multiple aggregations

For every product, what is the total sales and max quantity sold?

  SELECT product, SUM(price * quantity) SumSales,
         MAX(quantity) MaxQuantity
  FROM Purchase
  WHERE price > .50
  GROUP BY product

  Product  SumSales  MaxQuantity
  Banana   $12.48    17
  Bagel    $29.75    20

Page 73

Another grouping/aggregation e.g.

Movie(title, year, length, studioName)

Q: How many total minutes of film have been produced by each studio?

Strategy: Divide movies into groups per studio, then add lengths per group

Page 74

Another grouping/aggregation e.g.

  Title                Year  Length  Studio
  Star Wars            1977  120     Fox
  Jedi                 1980  105     Fox
  Aviator              2004  800     Miramax
  Pulp Fiction         1995  110     Miramax
  Lost in Translation  2003  95      Universal

  SELECT studio, sum(length) totalLength
  FROM Movies
  GROUP BY studio

Page 75

Another grouping/aggregation e.g.

  Title                Year  Length  Studio
  Star Wars            1977  120     Fox
  Jedi                 1980  105     Fox
  Aviator              2004  800     Miramax
  Pulp Fiction         1995  110     Miramax
  Lost in Translation  2003  95      Universal

  SELECT studio, sum(length) length
  FROM Movies
  GROUP BY studio

Page 76

Another grouping/aggregation e.g.

  Title                Year  Length  Studio
  Star Wars            1977  120     Fox
  Jedi                 1980  105     Fox
  Aviator              2004  800     Miramax
  Pulp Fiction         1995  110     Miramax
  Lost in Translation  2003  95      Universal

  SELECT studio, sum(length) totalLength
  FROM Movies
  GROUP BY studio

  Studio     Length
  Fox        225
  Miramax    910
  Universal  95

Page 77

Grouping/aggregation example

StarsIn(SName,Title,Year)
Q: Find the year of each star’s first movie

  SELECT sname, min(year) firstyear
  FROM StarsIn
  GROUP BY sname

Q: Find the span of each star’s career
  Look up first and last movies

Page 78

Account types again

Acc(name,bal,type)
Q: Who has the largest balance of each type?

Can we do this with grouping/aggregation?

Page 79

G & A for constructed relations

Movie(title,year,producerSsn,length)
MovieExec(name,ssn,netWorth)

Can do the same thing for larger, non-atomic relations
Q: How many mins. of film did each producer make?

  SELECT name, sum(length) total
  FROM Movie, MovieExec
  WHERE producerSsn = ssn
  GROUP BY name

What happens to non-producer movie-execs?

Page 80

HAVING clauses

Sometimes want to limit which rows may be grouped
Q: How many mins. of film did each rich producer make?
  Rich = netWorth > 10000000

  SELECT name, sum(length) total
  FROM Movie, MovieExec
  WHERE producerSsn = ssn
  GROUP BY name
  HAVING netWorth > 10000000

Q: Is HAVING necessary here?
A: No, could just add rich req. to WHERE

Page 81

HAVING clauses

Sometimes want to limit which rows may be grouped
Q: How many mins. of film did each old producer make?
  Old = made movies before 1930

  SELECT name, sum(length) total
  FROM Movie, MovieExec
  WHERE producerSsn = ssn
  GROUP BY name
  HAVING min(year) < 1930

Q: Is HAVING necessary here?
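One way to explore the closing question is to run the HAVING query next to a version that moves the year test into WHERE. The sketch below uses Python's sqlite3 module; the movies and executives are hypothetical. With this data the two queries disagree: WHERE discards the later movies before they are summed, while HAVING tests the whole group and keeps all of its rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movie(title TEXT, year INTEGER, producerSsn INTEGER, length INTEGER);
    CREATE TABLE MovieExec(name TEXT, ssn INTEGER, netWorth INTEGER);
    -- Hypothetical sample data; Cy produced nothing
    INSERT INTO MovieExec VALUES ('Ann', 1, 5000000), ('Bob', 2, 20000000),
                                 ('Cy', 3, 1000);
    INSERT INTO Movie VALUES ('Old Film', 1925, 1, 90),
                             ('New Film', 1950, 1, 100),
                             ('Modern', 2000, 2, 120);
""")

# Group-level condition: keep producers whose earliest movie predates 1930.
with_having = conn.execute("""
    SELECT name, SUM(length) AS total
    FROM Movie, MovieExec
    WHERE producerSsn = ssn
    GROUP BY name
    HAVING MIN(year) < 1930
""").fetchall()

# Row-level condition: only the pre-1930 movies themselves are summed.
with_where = conn.execute("""
    SELECT name, SUM(length) AS total
    FROM Movie, MovieExec
    WHERE producerSsn = ssn AND year < 1930
    GROUP BY name
""").fetchall()

print(with_having)
print(with_where)
```

Cy appears in neither result: the join condition producerSsn = ssn drops executives who produced no movie.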