
Theory, Practice & Methodology of Relational Database Design and Programming

Copyright © Ellis Cohen 2002-2008

Collection Operators

These slides are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.

For more information on how you may use them, please see http://www.openlineconsult.com/db

© Ellis Cohen 2001-2008

Overview of Lecture

Union

Intersect & Except

Collection Operators & Joins

Transitive Closure & Recursive Views


Union

Need for Unions

Emps
    empno  ename   deptno  …  hiredate
    7499   ALLEN   30      …  20-FEB-81
    7654   MARTIN  30      …  28-SEP-81
    7698   BLAKE   30      …  01-MAY-81
    7839   KING    10      …  17-NOV-81
    7844   TURNER  30      …  08-SEP-81
    7986   STERN   50      …  23-NOV-99

Projs
    pno    pname          …  pstart
    60492  Running Amuck  …  12-FEB-82
    60493  Cooling Off    …  01-JAN-05
    60498  Lifting Off    …  01-JAN-05

Suppose you wanted to generate a combined list of important events (employee hirings + project starts), with the columns Event Name and Event Date.

Result of Unions

    evname         evdate
    ALLEN          20-FEB-81
    MARTIN         28-SEP-81
    BLAKE          01-MAY-81
    KING           17-NOV-81
    TURNER         08-SEP-81
    STERN          23-NOV-99
    Running Amuck  12-FEB-82
    Cooling Off    01-JAN-05
    Lifting Off    01-JAN-05

    SELECT ename AS evname, hiredate AS evdate FROM Emps
    UNION
    SELECT pname AS evname, pstart AS evdate FROM Projs
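The union above can be tried out with a small sketch. It assumes SQLite through Python's sqlite3 module rather than Oracle, stores dates as ISO text, and uses a hypothetical three-row sample of the slide's data.

```python
import sqlite3

# Hypothetical miniature versions of the slide's Emps and Projs tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emps  (empno INTEGER, ename TEXT, hiredate TEXT);
    CREATE TABLE Projs (pno INTEGER, pname TEXT, pstart TEXT);
    INSERT INTO Emps  VALUES (7499, 'ALLEN', '1981-02-20');
    INSERT INTO Emps  VALUES (7839, 'KING',  '1981-11-17');
    INSERT INTO Projs VALUES (60492, 'Running Amuck', '1982-02-12');
""")

# Both SELECTs produce (name, date) pairs, so UNION can stack them.
events = conn.execute("""
    SELECT ename AS evname, hiredate AS evdate FROM Emps
    UNION
    SELECT pname AS evname, pstart AS evdate FROM Projs
""").fetchall()

# Without ORDER BY the row order is unspecified, so sort before printing.
for evname, evdate in sorted(events):
    print(evname, evdate)
```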

ORDER BY

    evname         evdate
    ALLEN          20-FEB-81
    BLAKE          01-MAY-81
    Cooling Off    01-JAN-05
    KING           17-NOV-81
    Lifting Off    01-JAN-05
    MARTIN         28-SEP-81
    Running Amuck  12-FEB-82
    STERN          23-NOV-99
    TURNER         08-SEP-81

    SELECT ename AS evname, hiredate AS evdate FROM Emps
    UNION
    SELECT pname AS evname, pstart AS evdate FROM Projs
    ORDER BY evname

An ORDER BY at the end applies to the entire result set, so it must name a column of the combined result (evname), not a column of one of the source tables.

UNION Syntax

    SELECT ('Hired ' || ename) AS evname, hiredate AS evdate
      FROM Emps
    UNION
    SELECT ('Started ' || pname) AS evname, pstart AS evdate
      FROM Projs
      WHERE pstart IS NOT NULL
    ORDER BY evdate

• A single ORDER BY goes at the end
• Column types (not names) must conform
• The first SELECT determines the column names
• The column aliases in the second SELECT are not needed, but put them in!
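As a hedged check of these rules, the sketch below runs the same shape of query in SQLite via Python's sqlite3 module (an assumption, since the slides target Oracle). The tables are hypothetical, and the second SELECT deliberately carries no aliases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emps  (ename TEXT, hiredate TEXT);
    CREATE TABLE Projs (pname TEXT, pstart TEXT);
    INSERT INTO Emps  VALUES ('KING',  '1981-11-17');
    INSERT INTO Emps  VALUES ('ALLEN', '1981-02-20');
    INSERT INTO Projs VALUES ('Running Amuck', '1982-02-12');
    INSERT INTO Projs VALUES ('Unscheduled', NULL);
""")

cur = conn.execute("""
    SELECT ('Hired ' || ename) AS evname, hiredate AS evdate FROM Emps
    UNION
    SELECT ('Started ' || pname), pstart FROM Projs WHERE pstart IS NOT NULL
    ORDER BY evdate
""")

# Column names of the combined result come from the first SELECT.
column_names = [col[0] for col in cur.description]
rows = cur.fetchall()

print(column_names)
for row in rows:
    print(row)
```

The single ORDER BY sorts the combined rows, and the project with a NULL start date is filtered out before the union.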

Simple Case Expressions

    SELECT ename,
           (CASE WHEN sal < 1500 THEN 'UNDERPAID'
                 WHEN sal > 3000 THEN 'OVERPAID'
                 ELSE to_char(sal) END) AS salary,
           sal
    FROM Emps

Rewrite this using UNION instead of CASE.

Union of Disjoint Cases

    SELECT ename, 'UNDERPAID' AS salary, sal
      FROM Emps WHERE sal < 1500
    UNION
    SELECT ename, to_char(sal) AS salary, sal
      FROM Emps WHERE sal BETWEEN 1500 AND 3000
    UNION
    SELECT ename, 'OVERPAID' AS salary, sal
      FROM Emps WHERE sal > 3000
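The following sketch compares the CASE form with the UNION form on a hypothetical four-row Emps table. It assumes SQLite through Python's sqlite3 module, so CAST(sal AS TEXT) stands in for Oracle's to_char(sal).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emps (ename TEXT, sal INTEGER);
    INSERT INTO Emps VALUES ('SMITH', 800);
    INSERT INTO Emps VALUES ('ALLEN', 1600);
    INSERT INTO Emps VALUES ('FORD',  3000);
    INSERT INTO Emps VALUES ('KING',  5000);
""")

case_rows = conn.execute("""
    SELECT ename,
           (CASE WHEN sal < 1500 THEN 'UNDERPAID'
                 WHEN sal > 3000 THEN 'OVERPAID'
                 ELSE CAST(sal AS TEXT) END) AS salary,
           sal
    FROM Emps
""").fetchall()

union_rows = conn.execute("""
    SELECT ename, 'UNDERPAID' AS salary, sal FROM Emps WHERE sal < 1500
    UNION
    SELECT ename, CAST(sal AS TEXT) AS salary, sal
      FROM Emps WHERE sal BETWEEN 1500 AND 3000
    UNION
    SELECT ename, 'OVERPAID' AS salary, sal FROM Emps WHERE sal > 3000
""").fetchall()

# Row order is unspecified in both queries, so compare sorted copies.
print(sorted(case_rows) == sorted(union_rows))
```

The two forms agree here because every sal is non-NULL. A row with a NULL sal satisfies none of the three WHERE clauses, so it would be dropped by the UNION form while the CASE form would keep it through the ELSE branch.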

Unions with Duplicates

    SELECT job FROM Emps WHERE sal >= 3000

    ANALYST
    PRESIDENT
    ANALYST

    SELECT job FROM Emps WHERE hiredate > '1-jan-82'

    ANALYST
    CLERK
    CLERK

What's the result of

    SELECT job FROM Emps WHERE sal >= 3000
    UNION
    SELECT job FROM Emps WHERE hiredate > '1-jan-82'

Counted Union vs. Set Union

Query 1:

    SELECT job FROM Emps WHERE sal >= 3000

    ANALYST
    PRESIDENT
    ANALYST

Query 2:

    SELECT job FROM Emps WHERE hiredate > '1-jan-82'

    ANALYST
    CLERK
    CLERK

Counted Union includes all tuples:

    SELECT job FROM Emps WHERE sal >= 3000
    UNION ALL
    SELECT job FROM Emps WHERE hiredate > '1-jan-82'

    ANALYST
    PRESIDENT
    ANALYST
    ANALYST
    CLERK
    CLERK

Set Union eliminates duplicates:

    SELECT job FROM Emps WHERE sal >= 3000
    UNION
    SELECT job FROM Emps WHERE hiredate > '1-jan-82'

    ANALYST
    CLERK
    PRESIDENT

Counted & Set Union

    SELECT job FROM Emps1
    UNION
    SELECT job FROM Emps2

is equivalent to

    WITH CountedJobs AS (
      SELECT job FROM Emps1
      UNION ALL
      SELECT job FROM Emps2 )
    SELECT DISTINCT job FROM CountedJobs

Use UNION ALL whenever duplicates cannot occur or do not matter, since it does not require removing duplicates, which can be expensive.
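A minimal sketch of counted versus set union, assuming SQLite via Python's sqlite3 module. The five employees are hypothetical rows chosen so that the two subqueries return the job lists used on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emps (ename TEXT, job TEXT, sal INTEGER, hiredate TEXT);
    INSERT INTO Emps VALUES ('SCOTT',  'ANALYST',   3000, '1982-12-09');
    INSERT INTO Emps VALUES ('KING',   'PRESIDENT', 5000, '1981-11-17');
    INSERT INTO Emps VALUES ('FORD',   'ANALYST',   3000, '1981-12-03');
    INSERT INTO Emps VALUES ('ADAMS',  'CLERK',     1100, '1983-01-12');
    INSERT INTO Emps VALUES ('MILLER', 'CLERK',     1300, '1982-01-23');
""")

HIGH_PAID = "SELECT job FROM Emps WHERE sal >= 3000"
RECENT = "SELECT job FROM Emps WHERE hiredate > '1982-01-01'"


def jobs(sql):
    """Run a one-column query and return its values as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


counted = jobs(f"{HIGH_PAID} UNION ALL {RECENT}")   # keeps every tuple
set_union = jobs(f"{HIGH_PAID} UNION {RECENT}")     # removes duplicates
distinct_of_counted = jobs(
    f"WITH CountedJobs AS ({HIGH_PAID} UNION ALL {RECENT}) "
    "SELECT DISTINCT job FROM CountedJobs"
)

print(counted)
print(set_union)
print(distinct_of_counted)
```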

Disjunctive Joins vs Unions

List the names of all employees who are project or department managers.

Compare

    SELECT DISTINCT ename FROM Emps, Projs
    WHERE job = 'DEPTMGR' OR empno = pmgr

vs

    SELECT ename FROM Emps
      WHERE job = 'DEPTMGR'
    UNION
    SELECT DISTINCT ename
      FROM (Emps JOIN Projs ON empno = pmgr)

Which is better? Suppose there are no projects?

Disjunctive Joins w/o Projects

If there are no projects in Projs:

    SELECT ename FROM Emps WHERE job = 'DEPTMGR'     -- just the dept managers
    UNION
    SELECT DISTINCT ename FROM Emps, Projs
      WHERE empno = pmgr                             -- empty

    SELECT DISTINCT ename FROM Emps, Projs
    WHERE job = 'DEPTMGR' OR empno = pmgr            -- empty

Projs is empty, so ANY inner join with Projs will be empty too!

Beware of disjunctive joins!
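The pitfall can be reproduced with a short sketch. It assumes SQLite via Python's sqlite3 module, a hypothetical two-row Emps table, and a Projs table that is deliberately left empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emps  (empno INTEGER, ename TEXT, job TEXT);
    CREATE TABLE Projs (pno INTEGER, pname TEXT, pmgr INTEGER);
    INSERT INTO Emps VALUES (7839, 'KING',  'DEPTMGR');
    INSERT INTO Emps VALUES (7499, 'ALLEN', 'SALESMAN');
""")  # no rows are inserted into Projs

# Disjunctive join: the cross product with an empty Projs has no rows at all.
disjunctive = conn.execute("""
    SELECT DISTINCT ename FROM Emps, Projs
    WHERE job = 'DEPTMGR' OR empno = pmgr
""").fetchall()

# Union: the department managers survive even though the join part is empty.
with_union = conn.execute("""
    SELECT ename FROM Emps WHERE job = 'DEPTMGR'
    UNION
    SELECT DISTINCT ename FROM Emps JOIN Projs ON empno = pmgr
""").fetchall()

print(disjunctive)
print(with_union)
```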


Intersect & Except

Intersection

    SELECT empno FROM Emps
      WHERE job = 'CLERK'      -- (1)
    INTERSECT                  -- (3)
    SELECT empno FROM Emps
      WHERE deptno = 20        -- (2)

1) Generates the empno's of clerks
2) Generates the empno's of employees in dept 20
3) Generates the empno's common to both, i.e. the clerks in department 20

This can also be written as

    SELECT empno FROM Emps
    WHERE job = 'CLERK' AND deptno = 20

Of course, this kind of rewriting won't work when the intersected results are from different tables.
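A small sketch checks that the INTERSECT form and the AND form agree when both sides read the same table. It assumes SQLite via Python's sqlite3 module and a hypothetical four-row Emps table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emps (empno INTEGER, job TEXT, deptno INTEGER);
    INSERT INTO Emps VALUES (7369, 'CLERK',   20);
    INSERT INTO Emps VALUES (7876, 'CLERK',   20);
    INSERT INTO Emps VALUES (7900, 'CLERK',   30);
    INSERT INTO Emps VALUES (7788, 'ANALYST', 20);
""")

via_intersect = sorted(conn.execute("""
    SELECT empno FROM Emps WHERE job = 'CLERK'
    INTERSECT
    SELECT empno FROM Emps WHERE deptno = 20
"""))

via_and = sorted(conn.execute("""
    SELECT empno FROM Emps WHERE job = 'CLERK' AND deptno = 20
"""))

print(via_intersect)
print(via_and)
```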

Intersection with Duplicates

    SELECT job FROM Emps WHERE deptno = 20

    CLERK
    MANAGER
    ANALYST
    CLERK
    ANALYST

    SELECT job FROM Emps WHERE hiredate > '1-jan-82'

    ANALYST
    CLERK
    CLERK

What's the result of

    SELECT job FROM Emps WHERE deptno = 20
    INTERSECT
    SELECT job FROM Emps WHERE hiredate > '1-jan-82'

INTERSECT ALL vs INTERSECT

Query 1:

    SELECT job FROM Emps WHERE deptno = 20

    CLERK
    MANAGER
    ANALYST
    CLERK
    ANALYST

Query 2:

    SELECT job FROM Emps WHERE hiredate > '1-jan-82'

    ANALYST
    CLERK
    CLERK

Counted Intersect counts the number of copies of each tuple on each side, computes the minimum, and keeps that many copies:

    SELECT job FROM Emps WHERE deptno = 20
    INTERSECT ALL
    SELECT job FROM Emps WHERE hiredate > '1-jan-82'

    ANALYST
    CLERK
    CLERK

Set Intersect eliminates duplicates:

    SELECT job FROM Emps WHERE deptno = 20
    INTERSECT
    SELECT job FROM Emps WHERE hiredate > '1-jan-82'

    ANALYST
    CLERK

INTERSECT ALL is not in Oracle.
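The "minimum of the counts" rule can be illustrated without a database. The sketch below is plain Python (not SQL) and models each query result as a multiset with collections.Counter, using the job lists from the slide.

```python
from collections import Counter

dept20_jobs = ["CLERK", "MANAGER", "ANALYST", "CLERK", "ANALYST"]
recent_jobs = ["ANALYST", "CLERK", "CLERK"]

# Counted intersect: for each job keep min(count on left, count on right).
counted = Counter(dept20_jobs) & Counter(recent_jobs)
counted_rows = sorted(counted.elements())

# Set intersect: duplicates are eliminated.
set_rows = sorted(set(dept20_jobs) & set(recent_jobs))

print(counted_rows)
print(set_rows)
```

CLERK appears twice on both sides, so the counted result keeps two copies. ANALYST appears only once on the right, so one copy survives.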

Counted & Set Intersect

    SELECT job FROM Emps1
    INTERSECT
    SELECT job FROM Emps2

is equivalent to

    SELECT DISTINCT job FROM Emps1
    INTERSECT ALL
    SELECT DISTINCT job FROM Emps2

Difference

    SELECT empno FROM Emps
      WHERE job = 'CLERK'      -- (1)
    EXCEPT                     -- (3)
    SELECT empno FROM Emps
      WHERE deptno = 20        -- (2)

1) Generates the empno's of clerks
2) Generates the empno's of employees in dept 20
3) Generates the empno's in (1) which are not in (2)

This can also be written as

    SELECT empno FROM Emps
    WHERE job = 'CLERK' AND (deptno != 20 OR deptno IS NULL)

Of course, this kind of rewriting won't work when the differenced results are from different tables.

Oracle uses MINUS in place of EXCEPT.
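The deptno IS NULL term in the rewritten query matters. The sketch below, which assumes SQLite via Python's sqlite3 module (SQLite spells the operator EXCEPT) and a hypothetical Emps table with one unassigned clerk, compares the three forms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emps (empno INTEGER, job TEXT, deptno INTEGER);
    INSERT INTO Emps VALUES (7369, 'CLERK',   20);
    INSERT INTO Emps VALUES (7934, 'CLERK',   10);
    INSERT INTO Emps VALUES (8214, 'CLERK',   NULL);
    INSERT INTO Emps VALUES (7788, 'ANALYST', 20);
""")


def empnos(sql):
    """Run a one-column query and return its values as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


via_except = empnos("""
    SELECT empno FROM Emps WHERE job = 'CLERK'
    EXCEPT
    SELECT empno FROM Emps WHERE deptno = 20
""")

rewritten = empnos("""
    SELECT empno FROM Emps
    WHERE job = 'CLERK' AND (deptno != 20 OR deptno IS NULL)
""")

# Without the IS NULL test, NULL != 20 is unknown and the row is filtered out.
naive = empnos("SELECT empno FROM Emps WHERE job = 'CLERK' AND deptno != 20")

print(via_except, rewritten, naive)
```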

Difference with Duplicates

    SELECT job FROM Emps WHERE sal >= 3000

    ANALYST
    PRESIDENT
    ANALYST

    SELECT job FROM Emps WHERE hiredate > '1-jan-82'

    ANALYST
    CLERK
    CLERK

What's the result of

    SELECT job FROM Emps WHERE sal >= 3000
    EXCEPT
    SELECT job FROM Emps WHERE hiredate > '1-jan-82'

EXCEPT ALL vs EXCEPT

Query 1:

    SELECT job FROM Emps WHERE sal >= 3000

    ANALYST
    PRESIDENT
    ANALYST

Query 2:

    SELECT job FROM Emps WHERE hiredate > '1-jan-82'

    ANALYST
    CLERK
    CLERK

Counted Difference counts the number of copies of each tuple on each side, computes the difference, and keeps that many copies:

    SELECT job FROM Emps WHERE sal >= 3000
    EXCEPT ALL
    SELECT job FROM Emps WHERE hiredate > '1-jan-82'

    ANALYST
    PRESIDENT

Set Difference eliminates duplicates before taking the difference:

    SELECT job FROM Emps WHERE sal >= 3000
    EXCEPT
    SELECT job FROM Emps WHERE hiredate > '1-jan-82'

    PRESIDENT

EXCEPT ALL is not in Oracle.
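Counted difference can also be modelled outside SQL. This sketch is plain Python using collections.Counter on the job lists from the slide. Counter subtraction drops any job whose count would fall to zero or below.

```python
from collections import Counter

high_paid_jobs = ["ANALYST", "PRESIDENT", "ANALYST"]
recent_jobs = ["ANALYST", "CLERK", "CLERK"]

# Counted difference: subtract the counts, keeping only positive results.
counted = Counter(high_paid_jobs) - Counter(recent_jobs)
counted_rows = sorted(counted.elements())

# Set difference: duplicates are removed first, then the difference is taken.
set_rows = sorted(set(high_paid_jobs) - set(recent_jobs))

print(counted_rows)
print(set_rows)
```

One of the two ANALYST copies is cancelled by the single ANALYST on the right, so one copy remains in the counted result but none in the set result.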

Counted & Set Difference

    SELECT job FROM Emps1
    EXCEPT
    SELECT job FROM Emps2

is equivalent to

    SELECT DISTINCT job FROM Emps1
    EXCEPT ALL
    SELECT DISTINCT job FROM Emps2

Collection Operators & Joins

Representing Outer Joins using Collection Operators

Suppose SQL did not have an Outer Join operator. Could you use collection operators instead?

Depts
    deptno  dname       …
    10      ACCOUNTING  …
    20      RESEARCH    …
    30      SALES       …
    40      OPERATIONS  …

Emps
    empno  ename  …  deptno
    7782   CLARK  …  10
    7499   ALLEN  …  30
    8214   JOJO   …
    7698   BLAKE  …  30

Write a SQL expression (without using Outer Joins) to get the same result as

    SELECT dname, ename FROM Depts NATURAL LEFT JOIN Emps

    DNAME        ENAME
    ------------ ------
    ACCOUNTING   CLARK
    RESEARCH
    SALES        ALLEN
    SALES        BLAKE
    OPERATIONS

Outer Join = Inner Join + Remainder

Depts
    deptno  dname       …
    10      ACCOUNTING  …
    20      RESEARCH    …
    30      SALES       …
    40      OPERATIONS  …

Emps
    empno  ename  …  deptno
    7782   CLARK  …  10
    7499   ALLEN  …  30
    8214   JOAN   …
    7698   BLAKE  …  30

Outer Join:

    DNAME        ENAME
    ------------ ------
    ACCOUNTING   CLARK
    RESEARCH
    SALES        ALLEN
    SALES        BLAKE
    OPERATIONS

Inner Join:

    DNAME        ENAME
    ------------ ------
    ACCOUNTING   CLARK
    SALES        ALLEN
    SALES        BLAKE

Depts not in Inner Join:

    DNAME        ENAME
    ------------ ------
    RESEARCH
    OPERATIONS

Representing Outer Joins using Collection Operators

    SELECT dname, ename FROM Depts NATURAL LEFT JOIN Emps

can be written as

    WITH EmptyDepts AS (           -- remaining Depts
      SELECT deptno FROM Depts
      EXCEPT
      SELECT deptno FROM Emps )
    SELECT dname, ename            -- inner join
      FROM Emps NATURAL JOIN Depts
    UNION
    SELECT dname, NULL             -- remaining Depts with NULL ename
      FROM EmptyDepts NATURAL JOIN Depts
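The rewrite can be checked against a real outer join. The sketch below assumes SQLite via Python's sqlite3 module, which supports NATURAL LEFT JOIN, and uses hypothetical tables mirroring the Depts and Emps rows from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Depts (deptno INTEGER, dname TEXT);
    CREATE TABLE Emps  (empno INTEGER, ename TEXT, deptno INTEGER);
    INSERT INTO Depts VALUES (10, 'ACCOUNTING');
    INSERT INTO Depts VALUES (20, 'RESEARCH');
    INSERT INTO Depts VALUES (30, 'SALES');
    INSERT INTO Depts VALUES (40, 'OPERATIONS');
    INSERT INTO Emps VALUES (7782, 'CLARK', 10);
    INSERT INTO Emps VALUES (7499, 'ALLEN', 30);
    INSERT INTO Emps VALUES (8214, 'JOAN',  NULL);
    INSERT INTO Emps VALUES (7698, 'BLAKE', 30);
""")

outer = conn.execute(
    "SELECT dname, ename FROM Depts NATURAL LEFT JOIN Emps"
).fetchall()

emulated = conn.execute("""
    WITH EmptyDepts AS (
        SELECT deptno FROM Depts
        EXCEPT
        SELECT deptno FROM Emps)
    SELECT dname, ename FROM Emps NATURAL JOIN Depts
    UNION
    SELECT dname, NULL FROM EmptyDepts NATURAL JOIN Depts
""").fetchall()

# Row order is unspecified, so compare as sets plus a row-count check.
print(set(outer) == set(emulated), len(outer), len(emulated))
```

Because UNION removes duplicates, the emulation would differ from the outer join if two employees in one department shared a name. The sample data avoids that case.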

Intersection & Inner Joins

Given two tables, e.g.

    T1( a, b, c )
    T2( a, b, c )

    SELECT * FROM T1
    INTERSECT
    SELECT * FROM T2

is identical to

    SELECT * FROM (T1 NATURAL JOIN T2)

assuming
• T1 and T2 have the same attributes
• Neither T1 nor T2 has duplicate tuples
• No matching tuples in T1 or T2 contain NULL values
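A hedged sketch of this equivalence, and of why the NULL assumption is needed, using SQLite via Python's sqlite3 module with hypothetical rows. In SQLite, INTERSECT treats two NULLs as equal when comparing rows, while the equality test inside a natural join does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (a INTEGER, b INTEGER, c INTEGER);
    CREATE TABLE T2 (a INTEGER, b INTEGER, c INTEGER);
    INSERT INTO T1 VALUES (1, 2, 3);
    INSERT INTO T1 VALUES (4, 5, 6);
    INSERT INTO T2 VALUES (1, 2, 3);
    INSERT INTO T2 VALUES (7, 8, 9);
""")

INTERSECT_SQL = "SELECT * FROM T1 INTERSECT SELECT * FROM T2"
JOIN_SQL = "SELECT * FROM T1 NATURAL JOIN T2"

# With all three assumptions satisfied, the two queries agree.
clean_intersect = conn.execute(INTERSECT_SQL).fetchall()
clean_join = conn.execute(JOIN_SQL).fetchall()

# Violate the third assumption: a matching tuple that contains NULL.
conn.execute("INSERT INTO T1 VALUES (0, NULL, 0)")
conn.execute("INSERT INTO T2 VALUES (0, NULL, 0)")
null_intersect = set(conn.execute(INTERSECT_SQL))
null_join = set(conn.execute(JOIN_SQL))

print(clean_intersect, clean_join)
print(null_intersect, null_join)
```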

Unions and Full Outer Joins

Given two tables, e.g.

    T1( a, b, c )
    T2( a, b, c )

    SELECT * FROM T1
    UNION
    SELECT * FROM T2

is identical to

    SELECT * FROM (T1 NATURAL FULL JOIN T2)

assuming
• T1 and T2 have the same attributes
• Neither T1 nor T2 has duplicate tuples
• No matching tuples in T1 or T2 contain NULL values

Transitive Closure & Recursive Views

Hierarchy Induced by Reflexive Relationships

[Figure: tree of employees rooted at KING. Node labels recovered from the slide: empnos 7839, 7566, 7698, 7782, 7788, 7902, 7369, 7876, 7934, 7499, 7900; names KING, JONES, BLAKE, CLARK, FORD, SMITH; one node is annotated with mgr: 7902.]

Hierarchy induced by the manages relationship, implemented by the mgr attribute.

&x is the root of a subtree.

1st Level Query Exercise

Emps( empno, ename, mgr, … )

Write a query that lists the empno and ename of &x, as well as all the employees directly managed by &x.

[Figure: subtree rooted at &x = 7566, with 7788 and 7902 directly below it; name labels JONES and FORD.]

1st Level Query Answer

LEVEL 0 & 1: Find &x and 1st level (i.e. direct) reports

    SELECT empno, ename FROM Emps
      WHERE empno = &x
    UNION
    SELECT empno, ename FROM Emps
      WHERE mgr = &x

2nd Level Query Exercise

Write a query that lists the empno of &x, as well as all the employees directly managed by &x, as well as all the employees they directly manage.

[Figure: subtree rooted at &x = 7566, with 7788 and 7902 below it, and 7369 and 7876 at the next level.]

2nd Level Query Answer

LEVEL 0 & 1: Find &x and 1st level (i.e. direct) reports

    SELECT empno, ename FROM Emps
      WHERE empno = &x
    UNION
    SELECT empno, ename FROM Emps
      WHERE mgr = &x

LEVEL 0 & 1 & 2: Find &x and 1st and 2nd level reports

    SELECT empno, ename FROM Emps
      WHERE empno = &x
    UNION
    SELECT empno, ename FROM Emps
      WHERE mgr = &x
    UNION
    SELECT e2.empno, e2.ename FROM Emps e1, Emps e2
      WHERE e2.mgr = e1.empno AND e1.mgr = &x

What if you want to go one level deeper?
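The level 0 to 2 query can be tried with a sketch like the following. It assumes SQLite via Python's sqlite3 module, replaces the &x substitution variable with a named parameter :x, and uses a hypothetical six-employee hierarchy (KING over JONES, JONES over SCOTT and FORD, and so on).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emps (empno INTEGER, ename TEXT, mgr INTEGER);
    INSERT INTO Emps VALUES (7839, 'KING',  NULL);
    INSERT INTO Emps VALUES (7566, 'JONES', 7839);
    INSERT INTO Emps VALUES (7788, 'SCOTT', 7566);
    INSERT INTO Emps VALUES (7902, 'FORD',  7566);
    INSERT INTO Emps VALUES (7876, 'ADAMS', 7788);
    INSERT INTO Emps VALUES (7369, 'SMITH', 7902);
""")

LEVELS_0_TO_2 = """
    SELECT empno, ename FROM Emps WHERE empno = :x
    UNION
    SELECT empno, ename FROM Emps WHERE mgr = :x
    UNION
    SELECT e2.empno, e2.ename FROM Emps e1, Emps e2
      WHERE e2.mgr = e1.empno AND e1.mgr = :x
"""

under_jones = sorted(conn.execute(LEVELS_0_TO_2, {"x": 7566}))
under_king = sorted(conn.execute(LEVELS_0_TO_2, {"x": 7839}))

print(under_jones)
print(under_king)
```

Starting from JONES, two levels reach the whole subtree. Starting from KING, the same query misses ADAMS and SMITH, who sit three levels down, which is why a fixed number of UNION branches is not enough in general.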

3rd Level Query Answer

LEVEL 0 & 1 & 2 & 3: Find &x and 1st, 2nd and 3rd level reports

    SELECT empno, ename FROM Emps
      WHERE empno = &x
    UNION
    SELECT empno, ename FROM Emps
      WHERE mgr = &x
    UNION
    SELECT e2.empno, e2.ename FROM Emps e1, Emps e2
      WHERE e2.mgr = e1.empno AND e1.mgr = &x
    UNION
    SELECT e3.empno, e3.ename
      FROM Emps e1, Emps e2, Emps e3
      WHERE e3.mgr = e2.empno AND e2.mgr = e1.empno AND e1.mgr = &x

And another level deeper?

4th Level Query Answer

LEVEL 0 & 1 & 2 & 3 & 4: Find &x and 1st, 2nd, 3rd & 4th level reports

    SELECT empno, ename FROM Emps
      WHERE empno = &x
    UNION
    SELECT empno, ename FROM Emps
      WHERE mgr = &x
    UNION
    SELECT e2.empno, e2.ename FROM Emps e1, Emps e2
      WHERE e2.mgr = e1.empno AND e1.mgr = &x
    UNION
    SELECT e3.empno, e3.ename
      FROM Emps e1, Emps e2, Emps e3
      WHERE e3.mgr = e2.empno AND e2.mgr = e1.empno AND e1.mgr = &x
    UNION
    SELECT e4.empno, e4.ename
      FROM Emps e1, Emps e2, Emps e3, Emps e4
      WHERE e4.mgr = e3.empno AND e3.mgr = e2.empno
        AND e2.mgr = e1.empno AND e1.mgr = &x

Transitive Closure

Find &x plus all employees who work under &x at any level in the hierarchy.

Cannot be implemented directly using SQL learned so far!

Can be implemented by recursive views, defined in SQL-99, but not available in commercial RDBs

Oracle instead supports hierarchical queries

Recursive Views

    CREATE RECURSIVE VIEW Remps AS (
      SELECT empno, ename, 0 AS level
        FROM Emps WHERE empno = &x
      UNION
      SELECT e.empno, e.ename, (1 + r.level) AS level
        FROM Remps r, Emps e
        WHERE r.empno = e.mgr )

Recursive Views are in the SQL-99 standard, but not implemented in Oracle
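The same recursion can be sketched with a recursive common table expression. This is not the CREATE RECURSIVE VIEW statement from the slide: it assumes SQLite's WITH RECURSIVE syntax via Python's sqlite3 module, a named parameter :x in place of &x, a column called lvl in place of level, and a hypothetical six-employee hierarchy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emps (empno INTEGER, ename TEXT, mgr INTEGER);
    INSERT INTO Emps VALUES (7839, 'KING',  NULL);
    INSERT INTO Emps VALUES (7566, 'JONES', 7839);
    INSERT INTO Emps VALUES (7788, 'SCOTT', 7566);
    INSERT INTO Emps VALUES (7902, 'FORD',  7566);
    INSERT INTO Emps VALUES (7876, 'ADAMS', 7788);
    INSERT INTO Emps VALUES (7369, 'SMITH', 7902);
""")

TRANSITIVE_CLOSURE = """
    WITH RECURSIVE Remps(empno, ename, lvl) AS (
        -- base case: the root employee at level 0
        SELECT empno, ename, 0 FROM Emps WHERE empno = :x
        UNION
        -- recursive step: anyone managed by an employee already found
        SELECT e.empno, e.ename, r.lvl + 1
          FROM Remps r, Emps e
          WHERE r.empno = e.mgr
    )
    SELECT empno, ename, lvl FROM Remps ORDER BY lvl, empno
"""

subtree = conn.execute(TRANSITIVE_CLOSURE, {"x": 7566}).fetchall()
for row in subtree:
    print(row)
```

The recursion stops on its own once a step finds no new employees, so no fixed depth has to be written into the query.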

Hierarchical Queries in Oracle

Find &x plus all employees who work under &x, along with their level in the hierarchy.

    SELECT empno, ename, level, prior empno, prior ename
      FROM Emps
      START WITH empno = &x
      CONNECT BY mgr = prior empno

• START WITH empno = &x: start with the employee whose empno is &x
• prior empno: the empno at the prior level of the hierarchy
• level: the current level of the hierarchy being evaluated
• Hierarchical queries are Oracle only

[Figure: subtree rooted at &x = 7566, containing 7788, 7902, 7369 and 7876; one node is annotated with mgr: 7902.]