Oracle University
Live Virtual Seminar SQL Masterclass
Rob van Wijk • 2011
About me
• Working with Oracle and SQL since 1995
• From: Utrecht, Netherlands
• Blog:
• Forums:
Agenda
Part One: Do More With SQL
Analytic Functions
15-minute break
Part Two: SQL Model Clause
one-hour break
Part Three: Recursive Subquery Factoring
15-minute break
Part Four: Grouping & Aggregating
Frequently Occurring SQL Problems
Part 1a: Do More With SQL
Goals
• As practical as possible
• As little regurgitation of documentation as possible
• Lots of example scripts
• Recognizable problems
• Do as much as possible in SQL and avoid shipping
records for processing to PL/SQL or even Java at a
middle tier.
If you want to build a ship, don't drum up the men to gather wood, divide the work and give orders.
Instead, teach them to yearn for the vast and endless sea.
– Antoine de Saint Exupéry
One SQL engine versus …
… two engines.
procedural engine
SQL engine
context switches
dmws1.sql
You risk wrong results because the queries start at different times: under the default READ COMMITTED isolation level, each query sees the data as of its own start time.
dmws2.sql
Using SELECT statements in your DML
• INSERT INTO … SELECT …
• DELETE … WHERE rowid IN ( SELECT … )
• The trouble with UPDATE statements
• Use MERGE
• Updatable Join Views
uj1.sql
uj2.sql
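The set-based DML bullets above can be sketched as follows. This is a minimal illustration, not the content of uj1.sql or uj2.sql: it assumes Python's built-in sqlite3 module as a stand-in for an Oracle session, and the emp_archive table is a hypothetical name introduced for the example.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; emp_archive is a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (empno INTEGER, ename TEXT, sal INTEGER, deptno INTEGER);
    CREATE TABLE emp_archive (empno INTEGER, ename TEXT, sal INTEGER, deptno INTEGER);
    INSERT INTO emp VALUES
        (7782, 'CLARK', 2450, 10), (7839, 'KING', 5000, 10),
        (7934, 'MILLER', 1300, 10), (7369, 'SMITH', 800, 20),
        (7902, 'FORD', 3000, 20);
""")

# INSERT INTO ... SELECT: one statement copies all department 10 rows,
# so no row is fetched into the client and sent back.
conn.execute("INSERT INTO emp_archive SELECT * FROM emp WHERE deptno = 10")

# DELETE ... WHERE rowid IN (SELECT ...): one statement removes the copied rows.
conn.execute("""
    DELETE FROM emp
    WHERE rowid IN (SELECT rowid FROM emp WHERE deptno = 10)
""")

archived = conn.execute("SELECT COUNT(*) FROM emp_archive").fetchone()[0]
remaining = [row[0] for row in conn.execute("SELECT ename FROM emp ORDER BY ename")]
```

Both statements run entirely inside the SQL engine, which is the point of the slide: no per-row round trip to a procedural layer.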
Part 1b: Analytic Functions
Analytic Functions: Topics
• Introduction
• Mind set
• Evaluation order
• Main syntax
• Examples
• Window clause
Analytic Functions: Introduction
Of every employee please show me:
• His name
• The department he’s working in
• His salary
• The cumulative salary per department
• Percentage of salary within the department
• Percentage of salary within the company
where employees are sorted by department and salary
af1a.sql
af1b.sql
af1c.sql
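One possible shape of the query asked for above, as a hedged sketch rather than the seminar's own af1 scripts. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later for window functions) as a stand-in for Oracle and uses the EMP rows shown later in the deck; the column aliases are made up for the example.

```python
import sqlite3

EMP = [  # (ename, deptno, sal), taken from the EMP listing in this deck
    ("CLARK", 10, 2450), ("KING", 10, 5000), ("MILLER", 10, 1300),
    ("ADAMS", 20, 1100), ("FORD", 20, 3000), ("JONES", 20, 2975),
    ("SCOTT", 20, 3000), ("SMITH", 20, 800), ("ALLEN", 30, 1600),
    ("BLAKE", 30, 2850), ("JAMES", 30, 950), ("MARTIN", 30, 1250),
    ("TURNER", 30, 1500), ("WARD", 30, 1250),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, deptno INTEGER, sal INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", EMP)

rows = conn.execute("""
    SELECT ename, deptno, sal,
           -- running total inside the department (ename breaks salary ties)
           SUM(sal) OVER (PARTITION BY deptno ORDER BY sal, ename) AS cum_sal,
           -- share of the department total
           ROUND(100.0 * sal / SUM(sal) OVER (PARTITION BY deptno), 1) AS pct_dept,
           -- share of the company total (empty OVER clause = all rows)
           ROUND(100.0 * sal / SUM(sal) OVER (), 1) AS pct_company
    FROM emp
    ORDER BY deptno, sal, ename
""").fetchall()
```

Every employee row is kept and the table is read once, with no self join. In Oracle the two percentage columns could also be written with RATIO_TO_REPORT, which SQLite does not provide.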
Analytic Functions: Introduction
• Since 8.1.6 Enterprise Edition
• Look like well known aggregate functions like SUM,
COUNT and AVG
• … but they don’t aggregate
• Prevent self joins
• Have been extended with new functions and new
options in more recent versions
af2.sql
Analytic Functions: Mind set
Don’t think “rows” …
     EMPNO ENAME      JOB              MGR HIREDATE        SAL       COMM     DEPTNO
---------- ---------- --------- ---------- -------- ---------- ---------- ----------
      7782 CLARK      MANAGER         7839 09-06-81       2450                    10
      7839 KING       PRESIDENT            17-11-81       5000                    10
      7934 MILLER     CLERK           7782 23-01-82       1300                    10
      7876 ADAMS      CLERK           7788 23-05-87       1100                    20
      7902 FORD       ANALYST         7566 03-12-81       3000                    20
      7566 JONES      MANAGER         7839 02-04-81       2975                    20
      7788 SCOTT      ANALYST         7566 19-04-87       3000                    20
      7369 SMITH      CLERK           7902 17-12-80        800                    20
      7499 ALLEN      SALESMAN        7698 20-02-81       1600        300         30
      7698 BLAKE      MANAGER         7839 01-05-81       2850                    30
      7900 JAMES      CLERK           7698 03-12-81        950                    30
      7654 MARTIN     SALESMAN        7698 28-09-81       1250       1400         30
      7844 TURNER     SALESMAN        7698 08-09-81       1500          0         30
      7521 WARD       SALESMAN        7698 22-02-81       1250        500         30
Analytic Functions: Mind set
… but think “sets”
     EMPNO ENAME      JOB              MGR HIREDATE        SAL       COMM     DEPTNO
---------- ---------- --------- ---------- -------- ---------- ---------- ----------
      7782 CLARK      MANAGER         7839 09-06-81       2450                    10
      7839 KING       PRESIDENT            17-11-81       5000                    10
      7934 MILLER     CLERK           7782 23-01-82       1300                    10
      7876 ADAMS      CLERK           7788 23-05-87       1100                    20
      7902 FORD       ANALYST         7566 03-12-81       3000                    20
      7566 JONES      MANAGER         7839 02-04-81       2975                    20
      7788 SCOTT      ANALYST         7566 19-04-87       3000                    20
      7369 SMITH      CLERK           7902 17-12-80        800                    20
      7499 ALLEN      SALESMAN        7698 20-02-81       1600        300         30
      7698 BLAKE      MANAGER         7839 01-05-81       2850                    30
      7900 JAMES      CLERK           7698 03-12-81        950                    30
      7654 MARTIN     SALESMAN        7698 28-09-81       1250       1400         30
      7844 TURNER     SALESMAN        7698 08-09-81       1500          0         30
      7521 WARD       SALESMAN        7698 22-02-81       1250        500         30

     EMPNO ENAME      JOB              MGR HIREDATE        SAL       COMM     DEPTNO
---------- ---------- --------- ---------- -------- ---------- ---------- ----------
      7782 CLARK      MANAGER         7839 09-06-81       2450                    10
      7839 KING       PRESIDENT            17-11-81       5000                    10
      7934 MILLER     CLERK           7782 23-01-82       1300                    10

      7876 ADAMS      CLERK           7788 23-05-87       1100                    20
      7902 FORD       ANALYST         7566 03-12-81       3000                    20
      7566 JONES      MANAGER         7839 02-04-81       2975                    20
      7788 SCOTT      ANALYST         7566 19-04-87       3000                    20
      7369 SMITH      CLERK           7902 17-12-80        800                    20

      7499 ALLEN      SALESMAN        7698 20-02-81       1600        300         30
      7698 BLAKE      MANAGER         7839 01-05-81       2850                    30
      7900 JAMES      CLERK           7698 03-12-81        950                    30
      7654 MARTIN     SALESMAN        7698 28-09-81       1250       1400         30
      7844 TURNER     SALESMAN        7698 08-09-81       1500          0         30
      7521 WARD       SALESMAN        7698 22-02-81       1250        500         30
Analytic Functions: Evaluation order
• Last
• Even after evaluating HAVING clause
• And after ROWNUM has been assigned
• But before ORDER BY clause
• Filtering on outcome of analytic function: nest the
query using an inline view or use subquery factoring
af3a.sql
af3b.sql
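Because analytic functions are evaluated after WHERE and HAVING, a filter on their outcome has to sit one query level higher. The sketch below shows the inline-view variant; it is an illustration under assumptions (Python's sqlite3 as a stand-in for Oracle, EMP rows from this deck), not the af3 scripts themselves.

```python
import sqlite3

EMP = [  # (ename, deptno, sal), taken from the EMP listing in this deck
    ("CLARK", 10, 2450), ("KING", 10, 5000), ("MILLER", 10, 1300),
    ("ADAMS", 20, 1100), ("FORD", 20, 3000), ("JONES", 20, 2975),
    ("SCOTT", 20, 3000), ("SMITH", 20, 800), ("ALLEN", 30, 1600),
    ("BLAKE", 30, 2850), ("JAMES", 30, 950), ("MARTIN", 30, 1250),
    ("TURNER", 30, 1500), ("WARD", 30, 1250),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, deptno INTEGER, sal INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", EMP)

# The inner query computes the analytic value; only the outer query,
# which sees it as an ordinary column, is allowed to filter on it.
above_avg = conn.execute("""
    SELECT ename, deptno, sal
    FROM (SELECT ename, deptno, sal,
                 AVG(sal) OVER (PARTITION BY deptno) AS avg_sal
          FROM emp)
    WHERE sal > avg_sal
    ORDER BY deptno, ename
""").fetchall()
```

The same nesting can be written with a WITH clause (subquery factoring) instead of an inline view.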
Analytic Functions: Main syntax
<function> (<argument>, <argument>, …)
OVER
(<partition clause>
<order by clause>
<window clause>
)
Analytic Functions: The functions
LAG              FIRST / LAST   PERCENT_RANK
LEAD             COUNT          PERCENTILE_DISC
FIRST_VALUE      SUM            PERCENTILE_CONT
LAST_VALUE       MAX            CORR
NTH_VALUE        MIN            COVAR_POP
RANK             AVG            VARIANCE
DENSE_RANK       NTILE          VAR_x (2 times)
RATIO_TO_REPORT  CUME_DIST      STDDEV_x (3 times)
ROW_NUMBER       LISTAGG        REGR_x (9 times)
Analytic Functions: Partition clause
PARTITION BY <expression> [,<expression>]*
Lets the analytic function operate on the subset of rows that share the same values for the PARTITION BY expressions.
af4.sql
Analytic Functions: Order By clause
ORDER BY <expression> [ASC|DESC] [NULLS FIRST|NULLS LAST], …
Its presence changes the default window of an analytic function from the whole partition to a running window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
af5.sql
Analytic Functions: Example 1
Top N queries
What exactly do I mean by:
“Show me the top 3 earning employees per department”
• RANK
• DENSE_RANK
• ROW_NUMBER
af6.sql
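How the three ranking functions answer the "top 3" question differently can be seen on department 20 of the EMP data, where FORD and SCOTT tie. This is a sketch under assumptions (Python's sqlite3 as a stand-in for Oracle), not the af6.sql script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, sal INTEGER)")
conn.executemany(  # department 20 of the EMP listing in this deck
    "INSERT INTO emp VALUES (?, ?)",
    [("ADAMS", 1100), ("FORD", 3000), ("JONES", 2975),
     ("SCOTT", 3000), ("SMITH", 800)],
)

ranked = conn.execute("""
    SELECT ename, sal,
           RANK()       OVER (ORDER BY sal DESC)        AS rnk,   -- ties share a rank, gap follows
           DENSE_RANK() OVER (ORDER BY sal DESC)        AS drnk,  -- ties share a rank, no gap
           ROW_NUMBER() OVER (ORDER BY sal DESC, ename) AS rn     -- ties broken by ename
    FROM emp
    ORDER BY sal DESC, ename
""").fetchall()

# "Top 3" according to each function: column index 2, 3 and 4 of `ranked`.
top3 = {
    name: [row[0] for row in ranked if row[idx] <= 3]
    for name, idx in (("RANK", 2), ("DENSE_RANK", 3), ("ROW_NUMBER", 4))
}
```

Here DENSE_RANK returns four employees for "top 3" (the three highest distinct salaries), while RANK and ROW_NUMBER return three. Without the extra `ename` sort key, ROW_NUMBER would pick between tied rows arbitrarily.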
Analytic Functions: Example 2
1. David Zabriskie (USA) 0.58:31
2. Ivan Basso (ITA) + 0:17
3. Paolo Savoldelli (ITA) + 0:44
4. Marzio Bruseghin (ITA) + 0:48
5. Serguei Gonchar (UKR) s.t.
6. Vladimir Karpets (RUS) + 1:07
7. Markus Fothen (GER) + 1:15
8. Thomas Dekker (NLD) + 1:23
9. Jan Hruska (CZE) + 1:34
10. Danilo di Luca (ITA) s.t.
af7.sql
Analytic Functions: Example 3
• Requirement: non-overlapping & consecutive periods
• Columns Startdate and maybe Enddate
• Optimize to retrieve current period
• Options:
1) No Enddate column and use correlated subquery
2) Enddate column and database trigger code to check
requirement
3) No Enddate column and use analytic function
af8.sql
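Option 3 can be sketched as follows: only the start date is stored, and the end date is derived with LEAD, so periods cannot overlap or leave gaps by construction. The rates table and its values are hypothetical, and Python's sqlite3 stands in for Oracle; this is not the af8.sql script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per period, identified by its start date only.
conn.execute("CREATE TABLE rates (startdate TEXT, rate REAL)")
conn.executemany(
    "INSERT INTO rates VALUES (?, ?)",
    [("2011-01-01", 4.0), ("2011-04-01", 4.5), ("2011-09-01", 5.0)],
)

# The end of a period is the start of the next one (exclusive end date);
# the current period is the row for which LEAD finds no successor.
periods = conn.execute("""
    SELECT startdate,
           LEAD(startdate) OVER (ORDER BY startdate) AS enddate,
           rate
    FROM rates
    ORDER BY startdate
""").fetchall()

current = [p for p in periods if p[1] is None]
```

The trade-off named on the slide remains: retrieving the current period now needs the analytic function (or an ordered lookup on startdate) instead of a simple predicate on a stored Enddate column.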
Analytic Functions: Example 4
• Bills can be of type “Prepayment” or “Settlement”
• Bill lines have an amount.
• Each customer pays a prepayment each month. The
bill contains one bill line with the amount.
• Each customer receives a settlement bill once a year.
• How to calculate the previous prepayment amount?
This is the amount before the last settlement bill.
af9.sql
Analytic Functions: Example 5
TIME QUANTITY
-------- -----------
12:22:01 100
12:22:03 200
12:22:04 300
12:22:06 200
12:22:45 100
12:22:46 200
12:23:12 100
12:23:12 200
MIN(TIME) MAX(TIME) QUANTITY
--------- --------- -----------
12:22:01 12:22:06 800
12:22:45 12:22:46 300
12:23:12 12:23:12 300
af10.sql
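One way to get from the first listing to the second is to flag each row that starts a new group with LAG and then turn the flags into group numbers with a running SUM. The grouping rule is not stated on the slide, so the threshold below (a new group starts when the gap to the previous row exceeds 3 seconds) is an assumption chosen to reproduce the output shown; the table name trades is also hypothetical, and Python's sqlite3 stands in for Oracle.

```python
import sqlite3

def to_seconds(hms):
    """Convert 'HH:MM:SS' to seconds since midnight."""
    h, m, s = map(int, hms.split(":"))
    return h * 3600 + m * 60 + s

def to_hms(seconds):
    """Convert seconds since midnight back to 'HH:MM:SS'."""
    return f"{seconds // 3600:02d}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"

DATA = [("12:22:01", 100), ("12:22:03", 200), ("12:22:04", 300), ("12:22:06", 200),
        ("12:22:45", 100), ("12:22:46", 200), ("12:23:12", 100), ("12:23:12", 200)]
MAX_GAP = 3  # assumption: the slide does not state the rule

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (t INTEGER, quantity INTEGER)")
conn.executemany("INSERT INTO trades VALUES (?, ?)",
                 [(to_seconds(t), q) for t, q in DATA])

rows = conn.execute("""
    WITH flagged AS (
        -- 1 when the row starts a new group (first row, or gap too large)
        SELECT t, quantity,
               CASE WHEN t - LAG(t) OVER (ORDER BY t) <= ? THEN 0 ELSE 1 END AS new_group
        FROM trades
    ), numbered AS (
        -- running count of group starts; the default RANGE window gives
        -- rows with identical times the same group number
        SELECT t, quantity, SUM(new_group) OVER (ORDER BY t) AS grp
        FROM flagged
    )
    SELECT MIN(t), MAX(t), SUM(quantity)
    FROM numbered
    GROUP BY grp
    ORDER BY grp
""", (MAX_GAP,)).fetchall()

summary = [(to_hms(lo), to_hms(hi), qty) for lo, hi, qty in rows]
```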
Analytic Functions: Window clause
• Total set: ROWS BETWEEN UNBOUNDED
PRECEDING AND UNBOUNDED FOLLOWING
• Anchored set / running aggregate: ROWS BETWEEN
UNBOUNDED PRECEDING AND CURRENT ROW
• ROW / RANGE
af13.sql
af11.sql
af12.sql
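The difference between ROWS and RANGE only shows when the ORDER BY expression has ties. The sketch below uses four department 30 salaries from the EMP data, two of which are equal; it assumes Python's sqlite3 as a stand-in for Oracle and is not one of the af11 to af13 scripts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("JAMES", 950), ("MARTIN", 1250), ("WARD", 1250), ("TURNER", 1500)],
)

rows = conn.execute("""
    SELECT sal,
           -- physical offset: each row adds exactly itself
           SUM(sal) OVER (ORDER BY sal
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_sum,
           -- logical offset: all rows with the same sal are included together
           SUM(sal) OVER (ORDER BY sal
                          RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS range_sum,
           -- total set
           SUM(sal) OVER (ORDER BY sal
                          ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS total
    FROM emp
    ORDER BY sal, rows_sum
""").fetchall()

rows_sums = [r[1] for r in rows]
range_sums = [r[2] for r in rows]
totals = [r[3] for r in rows]
```

With ROWS, which of the two tied employees receives the lower running sum is arbitrary; with RANGE both receive the same value.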
Part 2: SQL Model Clause
SQL Model Clause: Topics
• Introduction
• Syntax
• Examples
• Performance
• Alternatives
• Conclusion
SQL Model Clause: Introduction
• Treat data as multidimensional arrays
• Complex calculations across rows
• Syntax which resembles logic programming (Prolog)
• Can prevent exporting data to external applications
like Excel/Numbers
• No more multiple copies of the data on multiple PCs
SQL Model Clause: Syntax (1)
• Divide columns into three groups: PARTITION,
DIMENSION and MEASURES
• Every partition is a separate array
• Dimensions identify a cell in every partition
• Measures are the columns you want to (re-)calculate
• The rules tell you how the data is to be manipulated
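The roles of the three column groups can be pictured with plain data structures. The following is a conceptual Python emulation only, not MODEL clause syntax and not how Oracle implements it; the sample rows and the rule (store the partition total in a new cell with dimension value 0) are assumptions made for illustration.

```python
from collections import defaultdict

rows = [  # (deptno, empno, sal): a few rows in the style of the EMP table
    (10, 7782, 2450), (10, 7839, 5000), (20, 7369, 800),
]

# PARTITION BY deptno : one independent array per department.
# DIMENSION BY empno  : the key that addresses a cell inside a partition.
# MEASURES sal        : the value stored in each cell.
partitions = defaultdict(dict)
for deptno, empno, sal in rows:
    partitions[deptno][empno] = sal

# A rule is applied to every partition separately. This one creates a new
# cell at dimension value 0 holding the sum of all cells in the partition,
# which shows up as an extra row in the result.
for cells in partitions.values():
    cells[0] = sum(cells.values())

result = sorted(
    (deptno, empno, sal)
    for deptno, cells in partitions.items()
    for empno, sal in cells.items()
)
```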
SQL Model Clause: Examples
• A model clause that does nothing
• Adding an extra row to the result set
• RETURN UPDATED ROWS
• The difference between MEASURES and PARTITION
mc4.sql
mc3.sql
mc2.sql
mc1.sql
SQL Model Clause: Example from the doc
mc5.sql
SQL Model Clause: More examples
• ANY
• CV()
• FOR
• Iterating
• Reference models
• Difference between NULL and NAV
• IS PRESENT, PRESENTV and PRESENTNNV
mc11.sql
mc10.sql
mc9.sql
mc8.sql
mc7.sql
mc6.sql
mc12.sql
SQL Model Clause: Complete syntax
MODEL
  [<global reference options>]
  [<reference models>]
  [MAIN <main-name>]
    [PARTITION BY (<cols>)]
    DIMENSION BY (<cols>)
    MEASURES (<cols>)
    [<reference options>]
    [RULES] <rule options>
    (<rule>, <rule>, …, <rule>)

<global reference options> ::= <reference options> <ret-opt>
<ret-opt> ::= RETURN {ALL|UPDATED} ROWS
<reference options> ::= [IGNORE NAV | KEEP NAV] [UNIQUE DIMENSION | UNIQUE SINGLE REFERENCE]
<rule options> ::= [UPDATE | UPSERT | UPSERT ALL] [AUTOMATIC ORDER | SEQUENTIAL ORDER] [ITERATE (<number>) [UNTIL <condition>]]
<reference models> ::= REFERENCE <ref-name> ON (<query>) DIMENSION BY (<cols>) MEASURES (<cols>) <reference options>
SQL Model Clause: Examples
• Financial spreadsheet
• Fibonacci
• OTN-question
• Interest and rates
mc13.sql
mc14.sql
mc15.sql
mc16.sql
SQL Model Clause: Performance
• Internal hash-tables in PGA
• SEQUENTIAL ORDER: SQL MODEL ORDERED [FAST]
• AUTOMATIC ORDER: SQL MODEL [A]CYCLIC
• FAST: left-side cell references are single-cell
references, and aggregates on the right side, if any,
are simple arithmetic non-distinct aggregates such as
SUM, COUNT and AVG.
mc17.sql
SQL Model Clause: Advanced examples
• Calculating die probabilities
• Exponential Moving Average
X = (K * (C - P)) + P
Where:
X = Current EMA (i.e. EMA to be calculated)
C = Current original data value
K = Smoothing Constant
P = Previous EMA
• Sudoku solver
mc18.sql
mc19.sql
mc20.sql
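The Exponential Moving Average formula above, X = (K * (C - P)) + P, can be checked with a few lines of procedural code before expressing it in SQL. This is a minimal Python sketch; seeding the first EMA with the first data value is an assumption, since the slide does not say how the series starts.

```python
def ema(values, k):
    """Exponential moving average: X = (K * (C - P)) + P for each value C.

    Assumption: the first EMA equals the first data value.
    """
    result = []
    previous = None
    for current in values:
        if previous is None:
            x = current                      # seed value (assumed)
        else:
            x = k * (current - previous) + previous
        result.append(x)
        previous = x                         # P for the next row
    return result

smoothed = ema([10, 20, 30], k=0.5)
```

Each result depends on the previous result, not on the previous input row, which is why the calculation needs a Model clause or recursion rather than a plain analytic function.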
Part 3: Recursive Subquery Factoring
Recursive Subquery Factoring: Topics
• Subquery Factoring
• Concepts
• Recursive Examples
• Simulating Connect By
• Performance
• More recursive examples
Subquery Factoring
• Since version 9
• Lets you assign a name to a subquery block
• Modular Programming in SQL
• Also known as “WITH clause” or “Common Table
Expressions”
• From the second factored subquery onward: a comma
instead of “WITH”
• /*+ MATERIALIZE */ and /*+ INLINE */
• Must use each factored subquery?
rsf1.sql
rsf2.sql
rsf3.sql
Recursive Subquery Factoring: Concepts
• Since version 11.2
• Lets you query hierarchical data
• More powerful than CONNECT BY
• Anchor member UNION ALL recursive member
• Recursive member cannot contain: DISTINCT, Model
clause, aggregate functions and analytic functions
• SEARCH DEPTH / BREADTH FIRST
• CYCLE
rsf4.sql
rsf5.sql
rsf6.sql
Recursive Subquery Factoring: Examples
• Fibonacci
• fib(0) = 0
• fib(1) = 1
• fib(n+2) = fib(n+1) + fib(n)
• Interest and rates
rsf7.sql
rsf8.sql
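The Fibonacci definition above maps directly onto an anchor member and a recursive member. The sketch below runs it through Python's sqlite3 as a stand-in; it is not the rsf7.sql script, and SQLite's syntax differs slightly from Oracle's, which omits the RECURSIVE keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

fib = conn.execute("""
    WITH RECURSIVE fib (n, a, b) AS (
        SELECT 0, 0, 1                -- anchor member: fib(0) = 0, fib(1) = 1
        UNION ALL
        SELECT n + 1, b, a + b        -- recursive member: shift the pair one step
        FROM fib
        WHERE n < 10                  -- stop condition
    )
    SELECT n, a FROM fib ORDER BY n
""").fetchall()

values = [a for _, a in fib]
```

Carrying both fib(n) and fib(n+1) in each row lets the recursive member compute the next value from a single previous row.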
Simulating Connect By
• LEVEL
• SYS_CONNECT_BY_PATH
• CONNECT_BY_ROOT
• CONNECT_BY_ISCYCLE
• CONNECT_BY_ISLEAF
rsf9.sql
rsf10.sql
rsf11.sql
rsf12.sql
rsf13.sql
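The CONNECT BY features listed above can be simulated by carrying extra columns through the recursion. The sketch below covers LEVEL, SYS_CONNECT_BY_PATH, CONNECT_BY_ROOT and CONNECT_BY_ISLEAF on part of the EMP hierarchy; it assumes Python's sqlite3 as a stand-in for Oracle, the column aliases are made up, and it is not one of the rsf9 to rsf13 scripts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, ename TEXT, mgr INTEGER)")
conn.executemany(  # part of the EMP listing in this deck
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(7839, "KING", None), (7566, "JONES", 7839), (7788, "SCOTT", 7566),
     (7876, "ADAMS", 7788), (7782, "CLARK", 7839), (7934, "MILLER", 7782)],
)

tree = conn.execute("""
    WITH RECURSIVE tree (empno, ename, lvl, path, root) AS (
        SELECT empno, ename, 1, '/' || ename, ename   -- START WITH mgr IS NULL
        FROM emp
        WHERE mgr IS NULL
        UNION ALL
        SELECT e.empno, e.ename,
               t.lvl + 1,                             -- LEVEL
               t.path || '/' || e.ename,              -- SYS_CONNECT_BY_PATH
               t.root                                 -- CONNECT_BY_ROOT
        FROM emp e
        JOIN tree t ON e.mgr = t.empno                -- CONNECT BY PRIOR empno = mgr
    )
    SELECT t.ename, t.lvl, t.path, t.root,
           NOT EXISTS (SELECT 1 FROM emp c
                       WHERE c.mgr = t.empno) AS is_leaf   -- CONNECT_BY_ISLEAF
    FROM tree t
    ORDER BY t.path
""").fetchall()
```

Ordering by the path gives a depth-first listing here; Oracle offers the SEARCH DEPTH FIRST clause for that purpose, and the CYCLE clause for what CONNECT_BY_ISCYCLE reports.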
Recursive Subquery Factoring: Performance
• /*+ CONNECT_BY_FILTERING */
• /*+ NO_CONNECT_BY_FILTERING */
rsf14.sql
More Recursive Examples
• Calculating die probabilities
• Exponential Moving Average
X = (K * (C - P)) + P
Where:
X = Current EMA (i.e. EMA to be calculated)
C = Current original data value
K = Smoothing Constant
P = Previous EMA
• Sudoku solver
rsf15.sql
rsf16.sql
rsf17.sql
Part 4a: Grouping & Aggregating
aog1.sql
Grouping & Aggregating: Topics
• Introduction
• GROUPING SETS
• ROLLUP
• CUBE
• Combining and calculating
• Supporting functions
• Inner workings
• MIN/MAX … KEEP … (DENSE_RANK FIRST/LAST … )
Grouping & Aggregating: Grouping Sets (1)
GROUP BY expr1, …, exprn
≡
GROUP BY GROUPING SETS
( (expr1, …, exprn) )
aog2.sql
Grouping & Aggregating: Grouping Sets (2)
GROUP BY GROUPING SETS
( (expr11, …, expr1n), …, (exprx1, …, exprxm) )
≡
GROUP BY expr11, … expr1n
UNION ALL
…
UNION ALL
GROUP BY exprx1, …, exprxm
aog3.sql
Grouping & Aggregating: ROLLUP (1)
GROUP BY ROLLUP ( set1, …, setn )
≡
GROUP BY GROUPING SETS
( (set1, …, setn), (set1, …, set(n-1)), …, (set1), () )
Grouping & Aggregating: ROLLUP (2)
ROLLUP (set1, …, setN)
with N ≥ 1
leads to N+1 GROUPING SETS
Grouping & Aggregating: ROLLUP (3)
Example:
GROUP BY ROLLUP ( (deptno), (job,mgr), (empno) )
≡
GROUP BY GROUPING SETS
( (deptno,job,mgr,empno)
, (deptno,job,mgr)
, (deptno)
, () )
aog4.sql
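The ROLLUP expansion rule can be written down as a small function to check examples like the one above. This is a Python sketch of the rule only; the function name is made up and nothing here reflects how the database evaluates ROLLUP.

```python
def rollup(*sets):
    """Expand ROLLUP(set1, ..., setN) into its N + 1 grouping sets."""
    grouping_sets = []
    for keep in range(len(sets), -1, -1):
        # Concatenate the columns of the first `keep` sets;
        # keep == 0 yields the empty grouping set (), the grand total.
        grouping_sets.append(tuple(col for s in sets[:keep] for col in s))
    return grouping_sets

example = rollup(("deptno",), ("job", "mgr"), ("empno",))
```

Note that a composite set such as (job,mgr) is dropped as a whole, so no grouping set contains job without mgr.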
Grouping & Aggregating: CUBE (1)
GROUP BY CUBE ( set1, …, setn )
≡
GROUP BY GROUPING SETS
(all possible combinations between () and (set1, …, setn) )
Grouping & Aggregating: CUBE (2)
CUBE (set1, …, setN)
with N ≥ 1
leads to 2^N GROUPING SETS
Grouping & Aggregating: CUBE (3)
Follows Pascal’s triangle
0 sets: 1
1 set: 1 1
2 sets: 1 2 1
3 sets: 1 3 3 1
4 sets: 1 4 6 4 1
Grouping & Aggregating: CUBE (4)
Example:
GROUP BY CUBE ( (deptno), (job,mgr), (empno) )
≡
GROUP BY GROUPING SETS
( (deptno,job,mgr,empno)
, (deptno,job,mgr), (deptno,empno), (job,mgr,empno)
, (deptno), (job,mgr), (empno)
, () )
aog5.sql
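The CUBE expansion is every combination of the given sets, which a short function can enumerate. This is a Python sketch of the rule only; the function name is made up for the example.

```python
from itertools import combinations

def cube(*sets):
    """Expand CUBE(set1, ..., setN) into its 2^N grouping sets."""
    grouping_sets = []
    for size in range(len(sets), -1, -1):
        # All ways to choose `size` of the N sets, in their original order.
        for chosen in combinations(sets, size):
            grouping_sets.append(tuple(col for s in chosen for col in s))
    return grouping_sets

example = cube(("deptno",), ("job", "mgr"), ("empno",))

# Number of grouping sets built from 3, 2, 1 and 0 of the sets:
# a row of Pascal's triangle.
per_size = [len(list(combinations(range(3), size))) for size in (3, 2, 1, 0)]
```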
Grouping & Aggregating: Calculating (1)
GROUP BY deptno, ROLLUP(empno)
?
Grouping & Aggregating: Calculating (2)
GROUP BY deptno, ROLLUP(empno)
≡
GROUP BY GROUPING SETS (deptno)
, GROUPING SETS ( empno, () )
Grouping & Aggregating: Calculating (3)
Cartesian product!
GROUP BY deptno, ROLLUP(empno)
≡
GROUP BY GROUPING SETS (deptno)
, GROUPING SETS ( (empno), () )
≡
GROUP BY GROUPING SETS
( (deptno,empno), (deptno) )
aog6.sql
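The Cartesian product rule for comma-separated GROUP BY items can be expressed in a few lines. This is a Python sketch of the rule, with each GROUP BY item written out by hand as its list of grouping sets; the function name is made up for the example.

```python
from itertools import product

def combine(*items):
    """Combine GROUP BY items: take one grouping set from each item,
    in every possible combination, and concatenate their columns."""
    return [sum(choice, ()) for choice in product(*items)]

deptno = [("deptno",)]            # a plain expression is one grouping set
rollup_empno = [("empno",), ()]   # ROLLUP(empno) expands to two grouping sets

result = combine(deptno, rollup_empno)
```

The number of resulting grouping sets is the product of the numbers contributed by each item, here 1 × 2.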
Grouping & Aggregating: Calculating (4)
Question:
How many grouping sets does the clause below yield?
GROUP BY ROLLUP(deptno,job)
, CUBE(mgr,hiredate)
aog7.sql
Grouping & Aggregating: Functions
GROUPING
GROUPING_ID
GROUP_ID
aog8.sql
Grouping & Aggregating: Inner working (1)
SORT GROUP BY
Versus
HASH GROUP BY
Grouping & Aggregating: Inner working (2)
ROLLUP (DEPTNO,EMPNO)
incoming set
SORT GROUP BY
grouping set ( (deptno,empno) )
10 7782 2450
10 7839 5000
10 7934 1300
20 7369 800
20 7566 2975
20 7788 3000
20 7876 1100
20 7902 3000
30 7499 1600
30 7521 1250
30 7654 1250
30 7698 2850
30 7844 1500
30 7900 950
SORT GROUP BY
grouping set ( (deptno) )
10 NULL 8750
20 NULL 10875
30 NULL 9400
SORT GROUP BY
grouping set ( () )
NULL NULL 29025
aog9.sql
Grouping & Aggregating: Inner working (3)
CUBE (DEPTNO,JOB)
incoming set: 14 rows
SORT GROUP BY (deptno,job): 9 rows
GENERATE CUBE: 36 rows
• deptno not null & job not null
• deptno not null & job null
• deptno null & job not null
• deptno null & job null
SORT GROUP BY (deptno,job): 18 rows
aog10.sql
Grouping & Aggregating: Inner working (4)
TEMP TABLE TRANSFORMATION
  LOAD AS SELECT (into input table)
    TABLE ACCESS FULL (EMP)
  LOAD AS SELECT (into output table)
    HASH GROUP BY
      TABLE ACCESS FULL (input table)
  VIEW
    TABLE ACCESS FULL (output table)
temporary input table: SYS_TEMP_...
temporary output table: SYS_TEMP_...
The second LOAD AS SELECT iterates as many times as
there are grouping sets
aog11.sql
Grouping & Aggregating: Inner working (5)
Optimize towards a ROLLUP or CUBE execution, if possible?
aog12.sql
Grouping & Aggregating: Agg. Functions (1)
• COUNT
• SUM
• AVG
• MAX
• MIN
• STDDEV
• VARIANCE
• LISTAGG
aog13.sql
Grouping & Aggregating: Agg. Functions (2)
• MAX(…) KEEP (DENSE_RANK FIRST ORDER BY …)
• MAX(…) KEEP (DENSE_RANK LAST ORDER BY …)
• MIN(…) KEEP (DENSE_RANK FIRST ORDER BY …)
• MIN(…) KEEP (DENSE_RANK LAST ORDER BY …)
aog14.sql
Frequently Occurring Problems
• Row / Number Generation
• Interval Based Row Generation
• Splitting Comma Separated Strings
• String Aggregation
• Pivoting
• Unpivoting
• Tabibitosan
Part 4b: Frequently Occurring Problems
Row / Number Generation
fop1.sql
Interval Based Row Generation
fop2.sql
Splitting Comma Separated Strings
fop3.sql
String Aggregation
fop4.sql
Pivoting
fop5.sql
Unpivoting
fop6.sql
Tabibitosan
fop7.sql
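Tabibitosan is commonly described as a technique for grouping consecutive values: subtracting a row number from each value gives the same result for all members of an unbroken run. The sketch below illustrates that idea on made-up numbers; it assumes Python's sqlite3 as a stand-in for Oracle and is not the fop7.sql script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)",
                 [(1,), (2,), (3,), (5,), (6,), (10,)])  # hypothetical values

ranges = conn.execute("""
    SELECT MIN(n) AS first_n, MAX(n) AS last_n
    FROM (SELECT n,
                 -- within a run of consecutive values, n and the row number
                 -- grow in step, so their difference stays constant
                 n - ROW_NUMBER() OVER (ORDER BY n) AS grp
          FROM t)
    GROUP BY grp
    ORDER BY first_n
""").fetchall()
```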
Thank you for your attention