7/31/2019 T15 Beyond SQL Rahimi
Outline
Data Warehousing concepts
On-Line Analytical Processing (OLAP)
Using SQL to perform analytic functions
Oracle's advanced analytic functions
Acknowledgement:
Some of the material in this presentation is based on the book Databases and Transaction Processing, from Addison Wesley.
Copyright 2011 Saeed Rahimi
Data Warehouse Concepts: OLTP vs. OLAP
On-Line Transaction Processing (OLTP)
  Short-burst transactions
  Frequent modifications: updates, inserts, deletes
  Normalized (3NF, BCNF or 4NF)
  Transactions access only a small fraction of the database
On-Line Analytic Processing (OLAP)
  Main use is in decision support and business analysis
  Complex, aggregate and time-based queries
  Almost never updates (bulk update or load)
  Queries access a large portion of the database
  Queries usually take longer to run
  Normalized (1NF, 2NF)
Data Warehouse Concepts: OLTP vs. OLAP example
OLTP query:
  Update the balance of an account to show a deposit
  Insert into the item table to show the addition of new inventory
Transaction:
  Check the balance of checking and savings
  Transfer $500 from savings to checking
OLAP query:
  How many skis did we sell in the Northeast and Midwest regions of the US during the last quarter of the last five years?
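The contrast between the two workloads can be sketched in SQL. This is only an illustration: the tables Account (Account_Id, Balance) and Ski_Sales (Region, Sale_Date, Quantity), and the account identifiers, are hypothetical and do not come from the presentation.

```sql
-- OLTP: a short transaction that touches two rows.
-- Account(Account_Id, Balance) is an assumed table, not part of the slides.
UPDATE Account SET Balance = Balance - 500 WHERE Account_Id = 'SAV-1';
UPDATE Account SET Balance = Balance + 500 WHERE Account_Id = 'CHK-1';
COMMIT;

-- OLAP: a read-only aggregate that scans a large part of the data.
-- Ski_Sales(Region, Sale_Date, Quantity) is likewise an assumed table.
-- "Last five years" is read here as the five complete years before the current one.
SELECT Region,
       EXTRACT(YEAR FROM Sale_Date) AS Sale_Year,
       SUM(Quantity) AS Skis_Sold
FROM Ski_Sales
WHERE Region IN ('Northeast', 'Midwest')
  AND EXTRACT(MONTH FROM Sale_Date) IN (10, 11, 12)
  AND EXTRACT(YEAR FROM Sale_Date)
      BETWEEN EXTRACT(YEAR FROM SYSDATE) - 5 AND EXTRACT(YEAR FROM SYSDATE) - 1
GROUP BY Region, EXTRACT(YEAR FROM Sale_Date)
ORDER BY Region, Sale_Year;
```

The first statement group modifies a handful of rows and commits; the second never writes and aggregates over years of history, which is the access pattern a warehouse is built for.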
Data Warehouse Concepts: OLAP, Traditional vs. Newer Applications
Traditional
  Uses data the enterprise gathers in its usual activities, perhaps in its OLTP system
  Queries are ad hoc, perhaps designed and carried out-
Newer Applications
  Gather data (information) actively
  Buy it if necessary
  Mine the information to find better ways of serving the customer and/or selling more products
  Hire professionals to do so
Data Warehouse Concepts: Example, Traditional vs. Newer Applications
Traditional
  How many skis were sold in all Northeast warehouses in the years 2004 and 2005?
Newer
  Prepare a profile of the skiers among the residents of the Northeast region
  Customize our advertising and marketing to actively sell the product types these residents would want
The newer approach requires Data Mining
  Finding nuggets of gold in the vast sea of information collected in the warehouse
Data Warehouse Concepts: Data Warehouse Life Cycle
[Figure: data warehouse life cycle. Stages shown: Operational Systems, Extract, Transform, Load, Store, Analyze, Performance, Users. Features labeled around the Warehouse: Bitmap Join, Materialized Views, SQL Merge, Multitable Insert, External Tables.]
Data Warehouse Concepts: Data Mining
Data Mining is the art of knowledge discovery
Knowledge is used to better the business
Data mining vs. OLAP
  OLAP: What percentage of people who make over $50,000 defaulted on their mortgage in the year 2010?
  Data Mining: How can information about salary, net worth, and other historical data be used to predict who will default on their mortgage?
Data Warehouse Concepts: Data warehouse as a database
OLAP applications are based on a table called the fact table
For example, a supermarket application might be based on the fact table Sales
  Sales (Market_Id, Product_Id, Time_Id, Sales_Amt)
The table is viewed as multidimensional
  The first three columns are the dimensions, representing supermarkets, products and time intervals
  The fourth column, Sales_Amt, is a function of the other three
Data Warehouse Concepts: A Cube
The fact table can be viewed as a three-dimensional cube
Each entry in this three-dimensional view represents a specific sales amount for a given market and product in a specific time period
Data Warehouse Concepts: Dimension Tables
The dimensions of the fact table can be further described with dimension tables
Sales (Market_id, Product_Id, Time_Id, Sales_Amt)
Dimension Tables
Market (Market_Id, City, State, Region)
Product (Product_Id, Name, Category, Price)
Time (Time_Id, Week, Month, Quarter)
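To illustrate why the dimension tables matter, here is a minimal sketch of a query that describes the facts through two of them. It assumes only the Sales, Market and Product tables listed above; the alias Total_Sale is chosen for illustration.

```sql
-- Total sales per region and product category:
-- the fact table supplies the measure, the dimension tables supply the descriptions.
SELECT M.Region, P.Category, SUM(S.Sales_Amt) AS Total_Sale
FROM Sales S
JOIN Market M ON M.Market_Id = S.Market_Id
JOIN Product P ON P.Product_Id = S.Product_Id
GROUP BY M.Region, P.Category
ORDER BY M.Region, P.Category;
```

Neither Region nor Category is stored in Sales; they become available only by joining on the dimension keys.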
Data Warehouse Concepts: Star Schema
[Figure: star schema with the Sales fact table at the center, connected to the Time, Product and Market dimension tables.]
Data Warehouse Concepts: Schema of the Sales data warehouse
[Figure: schema diagram of the Sales data warehouse. Sales (Sales_Amt N8,2) is related to the following three tables, each identified by its first column.]
Time (Time_Id A2, Week A10, Month A10, Quarter A8)
Market (Market_Id A2, City A20, State A20, Region A20)
Product (Product_Id A2, Name A20, Category A20, Price N6,2)
Data Warehouse Concepts: Warehouse's dimension tables
Market
create table MARKET (
   MARKET_ID CHAR(2) not null,
   CITY CHAR(20),
   STATE CHAR(20),
   REGION CHAR(20),
   constraint PK_MARKET primary key (MARKET_ID)
);
Product
create table PRODUCT (
   PRODUCT_ID CHAR(2) not null,
   NAME CHAR(20),
   CATEGORY CHAR(20),
   PRICE NUMBER(6,2),
   constraint PK_PRODUCT primary key (PRODUCT_ID)
);
Time
create table TIME (
   TIME_ID CHAR(2) not null,
   WEEK CHAR(10),
   MONTH CHAR(10),
   QUARTER CHAR(8),
   constraint PK_TIME primary key (TIME_ID)
);
Data Warehouse Concepts: Warehouse's fact table
create table SALES (
   MARKET_ID CHAR(2) not null,
   PRODUCT_ID CHAR(2) not null,
   TIME_ID CHAR(2) not null,
   SALES_AMT NUMBER(8,2),
   constraint PK_SALES primary key (MARKET_ID, PRODUCT_ID, TIME_ID),
   constraint FK_SALES_MARKET_MARKET foreign key (MARKET_ID)
      references MARKET (MARKET_ID),
   constraint FK_SALES_PRODUCT_PRODUCT foreign key (PRODUCT_ID)
      references PRODUCT (PRODUCT_ID),
   constraint FK_SALES_TIME_TIME foreign key (TIME_ID)
      references TIME (TIME_ID)
);
Data Warehouse Concepts: Star Schema of the Warehouse
[Figure: star schema of the warehouse, with the Sales fact table at the center and the three dimension tables around it.]
Time Dimension Table: Time (Time_Id CHAR(2), Week CHAR(10), Month CHAR(10), Quarter CHAR(8))
Market Dimension Table: Market (Market_Id CHAR(2), City CHAR(20), State CHAR(20), Region CHAR(20))
Product Dimension Table: Product (Product_Id CHAR(2), Name CHAR(20), Category CHAR(20), Price NUMBER(6,2))
Sales Fact Table: Sales (Market_Id CHAR(2), Product_Id CHAR(2), Time_Id CHAR(2), Sales_Amt NUMBER(8,2))
Data Warehouse Concepts: The warehouse tables
SQL> select * from sales;
MI PI TI  SALES_AMT
-- -- -- ----------
M1 P1 T1       1000
M1 P2 T1       2000
M1 P3 T1       1500
M1 P4 T1       2500
M2 P1 T1        500
M2 P2 T1        800
M2 P3 T1          0
M2 P4 T1       3333
M3 P1 T1       5000
M3 P2 T1       8000
M3 P3 T1         10
M3 P4 T1       3300
M1 P1 T2       1001
M1 P2 T2       2001
M1 P3 T2       1501
M1 P4 T2       2501
M2 P1 T2        501
M2 P2 T2        801
...
36 rows selected.
SQL> select * from Market;
MI CITY                 STATE                REGION
-- -------------------- -------------------- ----------
M1 Stony Brook          New York             East
M2 Newark               New Jersey           East
M3 Oakland              California           West
SQL> select * from Product;
PI NAME                 CATEGORY                  PRICE
-- -------------------- -------------------- ----------
P1 Beer                 Drink                      1.98
P2 Diapers              Soft Goods                 2.98
P3 Cold Cuts            Meat                       3.98
P4 Soda                 Drink                      1.25
SQL> select * from Time;
TI WEEK       MONTH      QUARTER
-- ---------- ---------- --------
T1 Wk-1       January    First
T2 Wk-24      June       Second
T3 Wk-52      December   Fourth
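Data Warehouse Concepts: Constellation schema
A data warehouse may use more than one fact table
These fact tables may share the same dimension tables, forming a fact constellation schema
(A snowflake schema, by contrast, is a star schema whose dimension tables are themselves normalized into further tables)
[Figure: schema containing the tables Time, Product, Market, Sales, Warehouse and Inventory.]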
An Introduction to
OLAP Basic Operations
Using SQL
Data Warehouse Concepts
OLAP Operations
Aggregation
OLAP queries usually total (aggregate) information in the fact table
For example, to find the total sales of each product, in each market, for each quarter, we use the following:
SELECT S.Market_Id, S.Product_Id, S.Time_Id, SUM (S.Sales_Amt) AS Total_Sale
FROM Sales S
GROUP BY S.Market_Id, S.Product_Id, S.Time_Id
ORDER BY S.Time_Id;
Data Warehouse Concepts
SELECT S.Market_Id, S.Product_Id, S.Time_Id, SUM (S.Sales_Amt) AS Total_Sale
FROM Sales S
GROUP BY S.Market_Id, S.Product_Id, S.Time_Id
ORDER BY S.Time_Id;
MA PR TI TOTAL_SALE
-- -- -- ----------
M1 P1 T1       1000
M2 P1 T1        500
M3 P1 T1       5000
M1 P2 T1       2000
M2 P2 T1        800
M3 P2 T1       8000
M1 P3 T1       1500
M2 P3 T1          0
M3 P3 T1         10
M1 P4 T1       2500
M2 P4 T1       3333
M3 P4 T1       3300
M1 P1 T2       1001
M2 P1 T2        501
M3 P1 T2       5001
M1 P2 T2       2001
M2 P2 T2        801
M3 P2 T2       8001
M1 P3 T2       1501
Not all rows are shown
Data Warehouse Concepts: OLAP Operations
The query on the previous page returns a three-dimensional view of the results (a cube)
We can collapse the time dimension and show sales for each product in each market. This is a two-dimensional view of the same result.
SELECT S.Market_Id, S.Product_Id, SUM (S.Sales_Amt) AS Total_Sale
FROM Sales S
GROUP BY S.Market_Id, S.Product_Id;
MI PI TOTAL_SALES
-- -- -----------
M1 P1 3003
M1 P2 6003
M1 P3 4503
M1 P4 7503
M2 P1 1503
M2 P2 2403
M2 P3 3
M2 P4 7000
M3 P1 15003
M3 P2 24003
M3 P3 33
M3 P4 9903
Data Warehouse Concepts: OLAP Operations
The same result can be viewed as follows.
This is a pivoted view of the same results.
Total Sales          Market_Id
Product_Id      M1      M2      M3
P1            3003    1503   15003
P2            6003    2403   24003
P3            4503       3      33
P4            7503    7000    9903
Data Warehouse Concepts: Pivoted view of the same result
select Product_id,
   sum(case when Market_ID = 'M1'
            then Sales_Amt
            else NULL end) as M1,
   sum(case when Market_ID = 'M2'
            then Sales_Amt
            else NULL end) as M2,
   sum(case when Market_ID = 'M3'
            then Sales_Amt
            else NULL end) as M3
FROM Sales
GROUP BY Product_Id;
PR         M1         M2         M3
-- ---------- ---------- ----------
P1       3003       1503      15003
P2       6003       2403      24003
P3       4503          3         33
P4       7503       7000       9903
Data Warehouse Concepts: OLAP Operations
We can now get the total sales of all products in each market over all quarters as:
SELECT S.Market_Id, SUM (S.Sales_Amt) AS Total_Sale
FROM Sales S
GROUP BY S.Market_Id;
MA TOTAL_SALE
-- ----------
M1      21012
M2      10909
M3      48942
And, finally, get the total sales over all products for all markets for all time periods as:
SELECT SUM (S.Sales_Amt) AS Total_Sale
FROM Sales S;
TOTAL_SALE
----------
     80863
Data Warehouse Concepts: OLAP Operations
Drilling Down
Some dimension tables represent a hierarchy
For example:
  Market dimension has: City -> State -> Region
  Time dimension has: Week -> Month -> Quarter
When we execute queries that move down a hierarchy (e.g., from aggregation over regions to aggregation over states) we are drilling down.
  We are adding more columns of the dimension to the query
  To be able to drill down, we must have access to more specific information.
Data Warehouse Concepts: OLAP Operations
Dimensions do not always form a hierarchy
Some dimensions may form a lattice
For example, the time dimension can be represented as a lattice
  Weeks are not contained in months
  We can roll up days into weeks or months, but we can only roll up weeks into quarters
Data Warehouse Concepts: OLAP Operations
Drilling Down Example:
The first query aggregates total sales for products in each region
The second query drills down to state level.
SELECT S.Product_Id, M.Region, SUM (S.Sales_Amt)
FROM Sales S, Market M
WHERE M.Market_Id = S.Market_Id
GROUP BY M.Region, S.Product_Id;
SELECT S.Product_Id, M.State, SUM (S.Sales_Amt)
FROM Sales S, Market M
WHERE M.Market_Id = S.Market_Id
GROUP BY M.State, S.Product_Id;
Data Warehouse Concepts
OLAP Operations
Rolling Up
When we execute queries that move up the hierarchy (e.g., from states to regions) we are rolling up
We can roll up in the hierarchy or use the results of previous queries from lower aggregates
Data Warehouse Concepts: OLAP Operations
Rolling Up:
The following query creates a table containing the total sales for each state:
CREATE TABLE State_Sales AS
SELECT S.Product_Id, M.State, SUM (S.Sales_Amt) Sales_Amt
FROM Sales S, Market M
WHERE M.Market_Id = S.Market_Id
GROUP BY M.State, S.Product_Id;
Table created.
select * from state_sales;
PR STATE                 SALES_AMT
-- -------------------- ----------
P1 California                15003
P2 California                24003
P3 California                   33
P4 California                 9903
P1 New Jersey                 1503
P2 New Jersey                 2403
P3 New Jersey                    3
P4 New Jersey                 7000
P1 New York                   3003
P2 New York                   6003
P3 New York                   4503
P4 New York                   7503
12 rows selected.
Data Warehouse Concepts OLAP Operations
Rolling Up: Example
Then we can use the following to roll up the total sales for each region as
SELECT T.Product_Id, R.Region, SUM (T.Sales_Amt)
FROM State_Sales T,
(SELECT DISTINCT M.Region, M.State FROM Market M) R
WHERE R.State = T.State
GROUP BY R.Region, T.Product_Id;
PR REGION SUM(T.SALES_AMT)
-- -------------------- ----------------
P1 East 4506
P2 East 8406
P3 East 4506
P4 East 14503
P1 West 15003
P2 West 24003
P3 West 33
P4 West 9903
8 rows selected.
Data Warehouse Concepts: OLAP Operations
Pivot
Pivoting is changing the orientation of the cube.
Dimensions that we are pivoting on are used in the GROUP BY clause; aggregation (SUM) is used on the remaining attributes
PR QUARTER SUM(SALES_AMT)
-- -------- --------------
P3 Fourth 1516
P1 First 6500
P2 First 10800
P2 Second 10803
P2 Fourth 10806
P1 Second 6503
P3 Second 1513
P1 Fourth 6506
P3 First 1510
P4 First 9133
P4 Second 9136
P4 Fourth 6137
PR Q1 Q2 Q4
-- ---------- ---------- ----------
P1 6500 6503 6506
P2 10800 10803 10806
P3 1510 1513 1516
P4 9133 9136 6137
Data Warehouse Concepts
OLAP Operations
Product sales per quarter for all regions.
SELECT S.Product_Id, T.Quarter, SUM (Sales_Amt)
FROM Sales S, Time T
WHERE T.Time_Id = S.Time_Id
GROUP BY S.Product_Id, T.Quarter
ORDER BY S.Product_Id, T.Quarter;
PR QUARTER SUM(SALES_AMT)
-- -------- --------------
P1 First 6500
P1 Fourth 6506
P1 Second 6503
P2 First 10800
P2 Fourth 10806
P2 Second 10803
P3 First 1510
P3 Fourth 1516
P3 Second 1513
P4 First 9133
P4 Fourth 6137
P4 Second 9136
Data Warehouse Concepts: OLAP Operations
Pivoted results so that we can see the sales for each quarter over all products.
Note: T3 is Q4 and not Q3 in our time table
select S.Product_id,
   sum(case when S.Time_id = 'T1'
            then Sales_Amt
            else NULL end) as Q1,
   sum(case when S.Time_id = 'T2'
            then Sales_Amt
            else NULL end) as Q2,
   sum(case when S.Time_id = 'T3'
            then Sales_Amt
            else NULL end) as Q4
FROM Sales S
GROUP BY S.Product_Id;
PR         Q1         Q2         Q4
-- ---------- ---------- ----------
P1       6500       6503       6506
P2      10800      10803      10806
P3       1510       1513       1516
P4       9133       9136       6137
Data Warehouse Concepts: Pivot
Oracle 11g has a pivot operation that can also be used
Unpivot does exactly the opposite of the pivot
select * from (
   select Product_ID, Market_ID, Sales_Amt
   from Sales )
pivot
(
   Sum(Sales_Amt)
   for Market_ID in ('M1','M2','M3')
)
order by Product_ID;
PR       'M1'       'M2'       'M3'
-- ---------- ---------- ----------
P1       3003       1503      15003
P2       6003       2403      24003
P3       4503          3         33
P4       7503       7000       9903
Data Warehouse Concepts: Pivot
Can you write a pivot statement that generates the per-quarter report (columns Q1, Q2 and Q4 for each product) shown earlier?
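One possible answer, offered as a sketch rather than the author's solution: pivot on Time_ID instead of Market_ID and alias the three values. The aliases Q1, Q2 and Q4 are chosen here to match the earlier report and rely on T1, T2 and T3 standing for the first, second and fourth quarters in the sample Time table.

```sql
-- Per-product sales with one column per quarter, using the Oracle 11g PIVOT clause.
-- The inner query keeps only the columns needed; every column not named in the
-- PIVOT clause (here Product_ID) becomes an implicit grouping column.
select * from (
   select Product_ID, Time_ID, Sales_Amt
   from Sales )
pivot
(
   sum(Sales_Amt)
   for Time_ID in ('T1' as Q1, 'T2' as Q2, 'T3' as Q4)
)
order by Product_ID;
```

Aliasing the IN-list values also avoids the quoted column headings ('M1', 'M2', 'M3') that appear when the values are listed without aliases.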
Data Warehouse Concepts: Pivot
What if you do not know the exact number of values for Market_ID in the preceding pivot statement?
Data Warehouse Concepts: Pivot
Oracle has the capability to deal with any number of values in the IN construct of the pivot statement and generate an XML report.
SET LONG 99999
select * from (
   select Product_ID, Market_ID, Sales_Amt
   from Sales )
pivot xml (
   Sum(Sales_Amt)
   for Market_ID in (any)
)
order by Product_ID;
See answer on next page
Data Warehouse Concepts
PR
--
MARKET_ID_XML
--------------------------------------------------------------------------------
P1
[XML value listing M1 3003, M2 1503, M3 15003]
P2
[XML value beginning with M1 ...]
Data Warehouse Concepts
OLAP Operations
Slice
A slice is a subset of a cube corresponding to a single value for one or more members of the dimensions not in the subset
When we restrict a dimension to a single value, we are performing a slice
Slicing the Sales cube in the time dimension: total sales of each product in Wk-1
SELECT S.Product_Id, SUM (Sales_Amt)
FROM Sales S, Time T
WHERE T.Time_Id = S.Time_Id AND T.Week = 'Wk-1'
GROUP BY S.Product_Id;
Data Warehouse Concepts
OLAP Operations
Dice
The dice operation is a slice on more than two dimensions of a cube (or more than two consecutive slices).
When we use a GROUP BY clause in a query to specify part of a hierarchy, we are partitioning the multi-dimensional cube into sub-cubes. Therefore, we are dicing the cube
Example:
SELECT S.Product_Id, T.Quarter, SUM (Sales_Amt)
FROM Sales S, Time T
WHERE T.Time_Id = S.Time_Id
GROUP BY T.Quarter, S.Product_Id;
Data Warehouse Concepts
OLAP Operations: Dice
Dicing Sales in the time dimension: total sales for each product in each quarter.
SELECT S.Product_Id, T.Quarter, SUM (Sales_Amt)
FROM Sales S, Time T
WHERE T.Time_Id = S.Time_Id
GROUP BY T.Quarter, S.Product_Id
ORDER BY Product_ID;
PR QUARTER  SUM(SALES_AMT)
-- -------- --------------
P1 First              6500
P1 Fourth             6506
P1 Second             6503
P2 First             10800
P2 Fourth            10806
P2 Second            10803
P3 First              1510
P3 Fourth             1516
P3 Second             1513
P4 First              9133
P4 Fourth             6137
P4 Second             9136
OLAP Basic Operations
Using Oracle's Analytic Functions
Data Warehouse Concepts
OLAP Operations
OLAP queries use the GROUP BY clause of SQL to get the answer
Standard options for GROUP BY are limited
  It is not easy to formulate all OLAP needs in SQL-92
SQL:1999 extended SQL with additional aggregate functions to support OLAP needs
Oracle 11g supports these extensions
We will examine these functions next
Data Warehouse Concepts
OLAP Operations
The Cube Operator
Suppose we want to obtain a tabular view of the information that contains:
  Total sales of each product for each market
  Total sales of each market for each product
  And the grand total of all sales for all markets for all products!
What we are after is depicted on the next page.
Data Warehouse Concepts
OLAP Operations
Sales application in the form of a spreadsheet
Sum(Sales_Amt)            Market_Id
Product_Id      M1      M2      M3   Total
P1            3003    1503   15003   19509
P2            6003    2403   24003   32409
P3            4503       3      33    4539
P4            7503    7000    9903   24406
Total        21012   10909   48942   80863
Data Warehouse Concepts
OLAP Operations
To create this sheet using the standard SQL operations, we need to use the following three queries:
-- One query to calculate the entries, without the totals
--
SELECT S.Market_Id, S.Product_Id, SUM (S.Sales_Amt)
FROM Sales S
GROUP BY S.Market_Id, S.Product_Id;
MA PR SUM(S.SALES_AMT)
-- -- ----------------
M1 P1             3003
M1 P2             6003
M1 P3             4503
M1 P4             7503
M2 P1             1503
M2 P2             2403
M2 P3                3
M2 P4             7000
M3 P1            15003
M3 P2            24003
M3 P3               33
M3 P4             9903
Data Warehouse Concepts
OLAP Operations
-- One to calculate the row totals
--
SELECT S.Product_Id, SUM (Sales_Amt)
FROM Sales S
GROUP BY S.Product_Id;
PR SUM(SALES_AMT)
-- --------------
P1 19509
P2 32409
P3 4539
P4 24406
-- And one to calculate the column totals
--
SELECT S.Market_Id, SUM (Sales_Amt)
FROM Sales S
GROUP BY S.Market_Id;
MA SUM(SALES_AMT)
-- --------------
M1 21012
M2 10909
M3 48942
Question:
Can we use some of the queries we used before to generate the same results?
Give it a try (hint: see the pivoted CASE queries shown earlier)
select C.Product_id,
   sum(case when Market_ID = 'M1'
            then c.Sales_Amt
            else NULL end) as M1,
   sum(case when Market_ID = 'M2'
            then c.Sales_Amt
            else NULL end) as M2,
   sum(case when Market_ID = 'M3'
            then c.Sales_Amt
            else NULL end) as M3,
   sum(c.sales_amt) as Total
FROM (SELECT S.Market_Id, S.Product_Id, SUM (S.Sales_Amt) as sales_amt
      FROM Sales S
      GROUP BY S.Market_Id, S.Product_Id) C
GROUP BY C.Product_Id;
PR         M1         M2         M3      TOTAL
-- ---------- ---------- ---------- ----------
P4       7503       7000       9903      24406
P1       3003       1503      15003      19509
P2       6003       2403      24003      32409
P3       4503          3         33       4539
Data Warehouse Concepts
OLAP Operations
Using three queries is wasteful
  The first query does much of the work of the other two
  If we could save the result of the first query and then aggregate over Market_Id and Product_Id, we could compute the other queries more efficiently
The CUBE operator in SQL:1999 was designed to help with these types of requirements in OLAP
The CUBE function is used in the GROUP BY clause as
  GROUP BY CUBE (v1, v2, ..., vn)
This is equivalent to a collection of GROUP BYs, one for each subset of {v1, v2, ..., vn}
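The equivalence can be made concrete for two grouping columns. The sketch below spells out what GROUP BY CUBE (S.Market_Id, S.Product_Id) computes on the Sales table as four separate GROUP BYs glued together with UNION ALL; the column aliases are chosen for illustration.

```sql
-- One branch per subset of {Market_Id, Product_Id}; a column that is
-- aggregated away is reported as NULL, as CUBE itself does.
SELECT S.Market_Id AS Market_Id, S.Product_Id AS Product_Id, SUM(S.Sales_Amt) AS Total_Sale
FROM Sales S
GROUP BY S.Market_Id, S.Product_Id        -- subset {Market_Id, Product_Id}
UNION ALL
SELECT S.Market_Id, NULL, SUM(S.Sales_Amt)
FROM Sales S
GROUP BY S.Market_Id                      -- subset {Market_Id}
UNION ALL
SELECT NULL, S.Product_Id, SUM(S.Sales_Amt)
FROM Sales S
GROUP BY S.Product_Id                     -- subset {Product_Id}
UNION ALL
SELECT NULL, NULL, SUM(S.Sales_Amt)
FROM Sales S;                             -- empty subset: the grand total
```

With 3 markets and 4 products this yields 12 + 3 + 4 + 1 = 20 rows. For n grouping columns there are 2^n such branches, which is why a single CUBE is both shorter to write and easier for the database to optimize.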
Data Warehouse Concepts: OLAP Operations
Example using the CUBE
-- Doing the three queries in one using the CUBE operator
--
SELECT S.Market_Id, S.Product_Id, SUM (S.Sales_Amt)
FROM Sales S
GROUP BY CUBE (S.Market_Id, S.Product_Id);
MARK PROD SUM(S.SALES_AMT)
---- ---- ----------------
NULL NULL            80863
NULL P1              19509
NULL P2              32409
NULL P3               4539
NULL P4              24406
M1   NULL            21012
M1   P1               3003
M1   P2               6003
M1   P3               4503
M1   P4               7503
M2   NULL            10909
M2   P1               1503
M2   P2               2403
M2   P3                  3
M2   P4               7000
M3   NULL            48942
M3   P1              15003
M3   P2              24003
M3   P3                 33
M3   P4               9903
Questions:
  What do NULLs represent?
  How many of them are there?
  Why?
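A NULL produced by CUBE cannot be told apart from a NULL stored in the data just by looking at the value. As a sketch of one way to label the subtotal rows, the GROUPING function returns 1 when its argument has been aggregated away in the current row and 0 otherwise. The label 'ALL' and the column aliases are choices made for this illustration.

```sql
-- Replace the NULLs introduced by CUBE with an explicit 'ALL' marker.
SELECT CASE WHEN GROUPING(S.Market_Id) = 1 THEN 'ALL' ELSE S.Market_Id END AS Market,
       CASE WHEN GROUPING(S.Product_Id) = 1 THEN 'ALL' ELSE S.Product_Id END AS Product,
       SUM(S.Sales_Amt) AS Total_Sale
FROM Sales S
GROUP BY CUBE (S.Market_Id, S.Product_Id);
```

In the sample data no key column is ever NULL, so the label is cosmetic; the distinction starts to matter once a grouping column can legitimately hold NULL values.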
Data Warehouse Concepts
OLAP Operations
The ROLLUP Operator
ROLLUP is similar to CUBE except that instead of aggregating over all subsets of the arguments, it creates subsets moving from right to left
ROLLUP is also supported in SQL:1999
ROLLUP does exactly what it sounds like
  It first finds the fine-grained aggregations of the dimensions,
  Then, it uses them to calculate coarse-grained aggregations, and
  Uses these aggregations to find the grand total
Data Warehouse Concepts
OLAP Operations: ROLLUP Example
SELECT S.Market_Id, S.Product_Id, SUM (S.Sales_Amt)
FROM Sales S
GROUP BY ROLLUP (S.Market_Id, S.Product_Id)
ROLLUP first aggregates with the finest granularity, as if by
  GROUP BY S.Market_Id, S.Product_Id
Then with the next level of granularity, aggregating the product sales for each market
  GROUP BY S.Market_Id
And finally, using the total sales of all products in each market, it figures out the grand total, which corresponds to an empty GROUP BY clause
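The three levels described above can be written out by hand. The following sketch produces the same rows as GROUP BY ROLLUP (S.Market_Id, S.Product_Id) on the Sales table, one UNION ALL branch per level; the column aliases are chosen for illustration.

```sql
-- Level 1: finest granularity (market and product).
SELECT S.Market_Id AS Market_Id, S.Product_Id AS Product_Id, SUM(S.Sales_Amt) AS Total_Sale
FROM Sales S
GROUP BY S.Market_Id, S.Product_Id
UNION ALL
-- Level 2: drop the rightmost column, giving one subtotal per market.
SELECT S.Market_Id, NULL, SUM(S.Sales_Amt)
FROM Sales S
GROUP BY S.Market_Id
UNION ALL
-- Level 3: drop the remaining column, giving the grand total.
SELECT NULL, NULL, SUM(S.Sales_Amt)
FROM Sales S;
```

With 3 markets and 4 products this gives 12 + 3 + 1 = 16 rows. Unlike CUBE, there is no per-product branch, because ROLLUP only removes columns from the right.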
Data Warehouse Concepts
OLAP Operations
Example of ROLLUP
-- The ROLLUP operator
--
SELECT S.Market_Id, S.Product_Id, SUM (S.Sales_Amt)
FROM Sales S
GROUP BY ROLLUP (S.Market_Id, S.Product_Id);
MARK PROD SUM(S.SALES_AMT)
---- ---- ----------------
M1   P1               3003
M1   P2               6003
M1   P3               4503
M1   P4               7503
M1   NULL            21012
M2   P1               1503
M2   P2               2403
M2   P3                  3
M2   P4               7000
M2   NULL            10909
M3   P1              15003
M3   P2              24003
M3   P3                 33
M3   P4               9903
M3   NULL            48942
NULL NULL            80863
Data Warehouse Concepts
What does the following Rollup generate?
SELECT S.Market_Id, S.Product_Id, S.Time_ID, SUM (S.Sales_Amt)
FROM Sales S
GROUP BY ROLLUP (S.Market_Id, S.Product_Id, S.Time_ID);
MA PR TI SUM(S.SALES_AMT)
-- -- -- ----------------
M1 P1 T1             1000
M1 P1 T2             1001
M1 P1 T3             1002
M1 P1                3003
M1 P2 T1             2000
M1 P2 T2             2001
M1 P2 T3             2002
M1 P2                6003
M1 P3 T1             1500
M1 P3 T2             1501
M1 P3 T3             1502
M1 P3                4503
M1 P4 T1             2500
M1 P4 T2             2501
M1 P4 T3             2502
M1 P4                7503
M1                  21012
And so on
Data Warehouse Concepts
OLAP Operations: ROLLUP vs. CUBE
By contrast, the same query with CUBE
first aggregates with the finest granularity
  GROUP BY S.Market_Id, S.Product_Id
then with the next level of granularity (both subsets)
  GROUP BY S.Market_Id
  GROUP BY S.Product_Id
then the grand total with an empty GROUP BY
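The difference between the two operators can also be stated with GROUPING SETS, which names the groupings explicitly. The sketch below assumes the Sales table from the earlier slides; an empty pair of parentheses stands for the grand total.

```sql
-- Same rows as GROUP BY ROLLUP (S.Market_Id, S.Product_Id): three groupings.
SELECT S.Market_Id, S.Product_Id, SUM(S.Sales_Amt)
FROM Sales S
GROUP BY GROUPING SETS ((S.Market_Id, S.Product_Id),
                        (S.Market_Id),
                        ());

-- Same rows as GROUP BY CUBE (S.Market_Id, S.Product_Id): four groupings.
-- The only addition is the per-product grouping (S.Product_Id).
SELECT S.Market_Id, S.Product_Id, SUM(S.Sales_Amt)
FROM Sales S
GROUP BY GROUPING SETS ((S.Market_Id, S.Product_Id),
                        (S.Market_Id),
                        (S.Product_Id),
                        ());
```

In general ROLLUP over n columns expands to n + 1 groupings and CUBE to 2^n, so the two coincide only when n = 1.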
Data Warehousing
Oracle's Advanced OLAP Operations
Advanced OLAP Operations
For this portion of the presentation, we will use a general-purpose practice database
This database has the following schema
[Figure: entity-relationship diagram of the practice database, with one-to-one and one-to-many relationships between the tables.]
Tables named in the diagram: Department, Employee, Sectab, Wge_maint, Glasses, Emp_tools
Department: Department, Department_name
Employee: Payroll_number, Social_security_number, Fk_department, Last_name, First_name, Street, City, State, Phone, Current_position, Employment_date, Birth_date, Wages, Gender
Glasses: Fk_payroll_number, Purchase_date, Optician, Cost, Check_number
Emp_tools: Fk_payroll_number, Purchase_date, Tool_name, Tool_cost, Payroll_deduct, Payment, Last_payment, First_payment_dat
Other column groups shown: (Payroll_number, Security_option); (Tax_rate, Bottom_wage, Top_wage); (Fk_payroll_number, Fk_department_number, Classification, Classification_date, Old_wages, New_wages)
Advanced OLAP Operations
ROLLUP revisited
ROLLUP is used with the GROUP BY option
  GROUP BY ROLLUP (expr1, expr2)
To compute, Oracle will first group the data by expr2 within the different values of expr1
It rolls up these aggregates to figure out sub-totals for each value of expr1
And it adds up these sub-totals into a grand total
Advanced OLAP Operations
GROUP BY vs. ROLLUP
-- Group by - one expression
--
select department, SUM(wages)
from department, employee
where department = fk_department
group by department
order by 1;
DEPA SUM(WAGES)
---- ----------
INT       65000
POL       87700
WEL       52000
-- Rollup - one expression
--
select department, SUM(wages)
from department, employee
where department = fk_department
group by rollup(department)
order by 1;
DEPA SUM(WAGES)
---- ----------
INT       65000
POL       87700
WEL       52000
         204700
Advanced OLAP Operations
GROUP BY vs. ROLLUP
-- Group by - two expressions
--
select gender, department, SUM(wages)
from department, employee
where department = fk_department
group by department, gender
order by 1,2;
G DEPA SUM(WAGES)
- ---- ----------
F POL        9800
F WEL        7000
M INT       65000
M POL       77900
M WEL       45000
-- Rollup - two expressions
--
select gender, department, SUM(wages)
from department, employee
where department = fk_department
group by rollup(department, gender)
order by 1,2;
G DEPA SUM(WAGES)
- ---- ----------
F POL        9800
F WEL        7000
M INT       65000
M POL       77900
M WEL       45000
  INT       65000
  POL       87700
  WEL       52000
           204700
Advanced OLAP Operations
ROLLUP revisited
Example: figuring out sub-totals for each gender and then the grand total over all departments
select gender, department, sum(wages)
from department, employee
where department = fk_department
group by rollup (gender, department)
order by 1,2;
GENDER DEPARTMENT SUM(WAGES)
------ ---------- ----------
F      POL              9800   (aggregate value)
F      WEL              7000
F                      16800   (rollup: total female wages)
M      INT             65000
M      POL             77900
M      WEL             45000
M                     187900   (rollup: total male wages)
                      204700   (rollup: total wages)
Advanced OLAP Operations
Partial ROLLUP
If you place GROUP BY expressions outside the ROLLUP option, Oracle will:
  Aggregate values based on the expressions outside the ROLLUP
  Calculate ROLLUP subtotals on the expressions within the ROLLUP parameter list
  Compute ROLLUP values for each unique occurrence of the expressions outside the ROLLUP
  NOT figure out the grand total
Advanced OLAP Operations
Partial ROLLUP example:
Figuring out a partial rollup of wages per gender within a department
select gender, department, sum(wages)
from department, employee
where department = fk_department
group by rollup(gender), department
order by 1,2;
GENDER DEPARTMENT SUM(WAGES)
------ ---------- ----------
F      POL              9800   (aggregate or grouped value for the POL department)
F      WEL              7000
M      INT             65000
M      POL             77900   (aggregate or grouped value for the POL department)
M      WEL             45000
       INT             65000
       POL             87700   (rolled-up value for the POL department)
       WEL             52000
Advanced OLAP Operations
ROLLUP vs. Partial ROLLUP
ROLLUP
group by rollup (gender, department)
GENDER DEPARTMENT SUM(WAGES)
------ ---------- ----------
F      POL              9800
F      WEL              7000
F                      16800
M      INT             65000
M      POL             77900
M      WEL             45000
M                     187900
                      204700
Partial ROLLUP
group by rollup (gender), department
GENDER DEPARTMENT SUM(WAGES)
------ ---------- ----------
F      POL              9800
F      WEL              7000
M      INT             65000
M      POL             77900
M      WEL             45000
       INT             65000
       POL             87700
       WEL             52000
What does this ROLLUP generate?
select gender Gender, department, sum(wages)
from department, employee
where department = fk_department
group by rollup(department, gender)
order by 1,2;
Advanced OLAP Operations
ROLLUP vs. Partial ROLLUPROLLUP Partial ROLLUPgroup by rollup (gender), departmentWhat does this ROLLUP generate?
select gender Gender, department, sum(wages)
GENDER DEPARTMENT SUM(WAGES)
------ ---------- ----------
F POL 9800
F WEL 7000
M INT 65000
M POL 77900
M WEL 45000
INT 65000
POL 87700
WEL 52000
rom epartment, emp oyee
where department = fk_department
group by rollup(department, gender)order by 1,2;
Copyright (c) 2011 Saeed Rahimi 71
G DEPA SUM(WAGES)
- ---- ----------
F POL        9800
F WEL        7000
M INT       65000
M POL       77900
M WEL       45000
  INT       65000
  POL       87700
  WEL       52000
           204700
CUBE operator revisited
Unlike ROLLUP, CUBE figures out sub-totals for all combinations of the expressions inside the CUBE and then rolls them up
SQL> select gender, department, sum(wages)
2 from department, employee
3 where department = fk_department
4 group by cube (gender, department)
5 order by 1,2;
GENDER DEPARTMENT SUM(WAGES)
------ ---------- ----------
F      POL              9800
F      WEL              7000
F                      16800
M      INT             65000
M      POL             77900
M      WEL             45000
M                     187900
       INT             65000
       POL             87700
       WEL             52000
                      204700
11 rows selected.
Partial CUBE operator
Similar to partial ROLLUP, partial CUBE will calculate rollup values for each unique occurrence of the expression(s) outside the cube
SQL> select gender, department, sum(wages)
2 from department, employee
3 where department = fk_department
4 group by cube (gender), department
5 order by 1,2;
GENDER DEPARTMENT SUM(WAGES)
------ ---------- ----------
F      POL              9800
F      WEL              7000
M      INT             65000
M      POL             77900
M      WEL             45000
       INT             65000
       POL             87700
       WEL             52000
8 rows selected.
Why is this partial cube exactly the same as the partial ROLLUP?
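One way to reason about this question is to list the grouping sets each form produces. The Python sketch below is an illustration only: the helper names `rollup_sets`, `cube_sets` and `with_outside` are invented here and are not Oracle functions.

```python
from itertools import combinations


def rollup_sets(cols):
    """ROLLUP(a, b) -> (a, b), (a), (): columns are dropped from the right."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]


def cube_sets(cols):
    """CUBE(a, b) -> every subset of the listed columns."""
    return [combo for r in range(len(cols), -1, -1)
            for combo in combinations(cols, r)]


def with_outside(sets, outside):
    """Expressions outside the ROLLUP/CUBE are added to every grouping set."""
    return [s + tuple(outside) for s in sets]


print(rollup_sets(["gender", "department"]))
print(cube_sets(["gender", "department"]))

partial_rollup = with_outside(rollup_sets(["gender"]), ["department"])
partial_cube = with_outside(cube_sets(["gender"]), ["department"])
print(partial_rollup == partial_cube)
```

With a single expression inside the parentheses, ROLLUP and CUBE both yield the sets (gender) and (), so the two partial forms group the data identically.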
CUBE vs. Partial CUBE
Partial CUBE: group by cube (gender), department
GENDER DEPARTMENT SUM(WAGES)
------ ---------- ----------
F      POL              9800
F      WEL              7000
M      INT             65000
M      POL             77900
M      WEL             45000
       INT             65000
       POL             87700
       WEL             52000
CUBE: group by cube (gender, department)
GENDER DEPARTMENT SUM(WAGES)
------ ---------- ----------
F      POL              9800
F      WEL              7000
F                      16800
M      INT             65000
M      POL             77900
M      WEL             45000
M                     187900
       INT             65000
       POL             87700
       WEL             52000
                      204700
CUBE vs. ROLLUP
CUBE: group by cube (gender, department)
GENDER DEPARTMENT SUM(WAGES)
------ ---------- ----------
F      POL              9800
F      WEL              7000
F                      16800
M      INT             65000
M      POL             77900
M      WEL             45000
M                     187900
       INT             65000
       POL             87700
       WEL             52000
                      204700
ROLLUP: group by rollup (gender, department)
GENDER DEPARTMENT SUM(WAGES)
------ ---------- ----------
F      POL              9800
F      WEL              7000
F                      16800
M      INT             65000
M      POL             77900
M      WEL             45000
M                     187900
                      204700
The next page shows side-by-side views of the results for CUBE, ROLLUP, Partial CUBE and Partial ROLLUP
Any interesting observations?
CUBE
                     Department
Sum(Wages)      INT     POL     WEL    Total
Gender F          0    9800    7000    16800
Gender M      65000   77900   45000   187900
Total         65000   87700   52000   204700
Partial CUBE
                     Department
Sum(Wages)      INT     POL     WEL    Total
Gender F          0    9800    7000
Gender M      65000   77900   45000
Total         65000   87700   52000
ROLLUP
                     Department
Sum(Wages)      INT     POL     WEL    Total
Gender F          0    9800    7000    16800
Gender M      65000   77900   45000   187900
Total                                 204700
Partial ROLLUP
                     Department
Sum(Wages)      INT     POL     WEL    Total
Gender F          0    9800    7000
Gender M      65000   77900   45000
Total         65000   87700   52000
GROUPING Function
What is the problem with the CUBE and ROLLUP functions?
Difficulty in identifying the rows that are sub-totals
One way is to find the rows that contain NULL values
Expressions that are sub-totaled will have a value in the column that determines the ROLLUP
The other expressions will contain null values
This works well if the database does not have any null values in it; otherwise, it will be confusing
The GROUPING function can help with this.
GROUPING Function
Usage: GROUPING (expression)
More than one GROUPING function call is allowed in one SQL statement
GROUPING will return a 1 if the row is a sub-total row for the expression
GROUPING will return a 0 if the row is NOT a sub-total row for the expression
GROUPING Function
Example:
SQL> select gender, department, sum(wages),
2 grouping(gender) as gdr,
3 grouping(department) as dpt
4 from department, employee
5 where department = fk_department
6 group by rollup(gender, department)
7 order by 1,2;
G DEPA SUM(WAGES)        GDR        DPT
- ---- ---------- ---------- ----------
F POL        9800          0          0
F WEL        7000          0          0
F           16800          0          1
M INT       65000          0          0
M POL       77900          0          0
M WEL       45000          0          0
M          187900          0          1
           204700          1          1
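To make the GDR and DPT columns concrete, the following Python sketch rebuilds the same eight rows and attaches the two flags by hand. It is a hedged illustration: `rollup_with_grouping` and the `cells` input (the five detail sums from the slide) are assumptions introduced here, not anything Oracle exposes.

```python
from collections import defaultdict

# Detail sums copied from the slide output: (gender, department, wages).
cells = [
    ("F", "POL", 9800), ("F", "WEL", 7000),
    ("M", "INT", 65000), ("M", "POL", 77900), ("M", "WEL", 45000),
]


def rollup_with_grouping(rows):
    """Mimic ROLLUP(gender, department) with GROUPING(gender), GROUPING(department).

    Each result row is (gender, department, gdr, dpt, total), where a flag
    of 1 means the column was rolled up on that row.
    """
    totals = defaultdict(int)
    for gender, dept, wages in rows:
        totals[(gender, dept, 0, 0)] += wages  # detail row
        totals[(gender, None, 0, 1)] += wages  # department rolled up
        totals[(None, None, 1, 1)] += wages    # grand total
    return [key + (total,) for key, total in totals.items()]


rows = rollup_with_grouping(cells)
# The flags, not the NULLs, identify the per-gender sub-total rows.
subtotals = [r for r in rows if r[2] == 0 and r[3] == 1]
print(subtotals)
```

Because the flag is computed from the grouping itself, a genuine NULL stored in the department column would still carry a flag of 0.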
Making the report more readable
Use of the DECODE function
DECODE substitutes a replacement value for a column value
It acts as a complex IF-THEN-ELSE or a CASE statement
Making the report more readable
Example: Assume the following query results
SQL> select empno, deptno
2 from emp
3 order by deptno;
EMPNO DEPTNO
---------- ----------
7782 10
7839 10
7934 10
7369 20
7876 20
7902 20
7788 20
7566 20
7499 30
7698 30
7654 30
7900 30
7844 30
7521 30
Making the report more readable
Now assume we want to print Ten for 10, Twenty for 20 and Thirty for 30 in the deptno column
DECODE can do this very easily
SQL> select empno, decode (deptno,
2 10, 'Ten',
3 20, 'Twenty',
4 30, 'Thirty', 'OTHER')
5 from emp
6 order by deptno;
EMPNO DECODE
---------- ------
7782 Ten
7839 Ten
7934 Ten
7369 Twenty
7876 Twenty
7902 Twenty
7788 Twenty
7566 Twenty
7499 Thirty
7698 Thirty
7654 Thirty
7900 Thirty
7844 Thirty
7521 Thirty
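The lookup that DECODE performs can be sketched in a few lines of Python. This is an approximation written for illustration: the Python function `decode` below is an assumption that copies the argument order of the Oracle function, and it is not Oracle code.

```python
def decode(value, *args):
    """Mimic Oracle DECODE(value, search1, result1, ..., [default]).

    The arguments are search/result pairs. An odd number of trailing
    arguments means the last one is the default result.
    """
    pairs, default = args, None
    if len(args) % 2 == 1:
        pairs, default = args[:-1], args[-1]
    for search, result in zip(pairs[0::2], pairs[1::2]):
        if value == search:
            return result
    return default  # None stands in for NULL when no default is given


labels = [decode(deptno, 10, "Ten", 20, "Twenty", 30, "Thirty", "OTHER")
          for deptno in (10, 20, 30, 40)]
print(labels)
```

A department number that matches none of the search values, such as the hypothetical 40 used here, falls through to the default.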
Use of DECODE function
Example:
SQL> select decode(grouping(gender),1, 'Total Wages', gender) gender,
2 decode(grouping(department),1, 'Total Per Gender', department) department,
3 sum(wages)
4 from department, employee
5 where department = fk_department
6 group by rollup(gender, department)
7 order by 1;
GENDER      DEPARTMENT       SUM(WAGES)
----------- ---------------- ----------
F           POL                    9800
F           WEL                    7000
F           Total Per Gender      16800
M           INT                   65000
M           POL                   77900
M           WEL                   45000
M           Total Per Gender     187900
Total Wages Total Per Gender     204700
How do we generate the following report?
SQL> select decode(grouping(department),1, 'Total Wages', department) department,
2 decode(grouping(gender),1, 'Total Per Department', gender) gender,
3 sum(wages)
4 from department, employee
5 where department = fk_department
6 group by rollup(department, gender);
DEPARTMENT  GENDER               SUM(WAGES)
----------- -------------------- ----------
INT         Total Per Department      65000
INT         M                         65000
POL         M                         77900
POL         F                          9800
POL         Total Per Department      87700
WEL         F                          7000
WEL         M                         45000
WEL         Total Per Department      52000
Total Wages Total Per Department     204700
Use of DECODE function
Example: To suppress the extra Fs and Ms
SQL> break on gender;
SQL> select decode(grouping(gender),1, 'Total Wages', gender) gender,
2 decode(grouping(department),1, 'Total Per Gender', department) department,
3 sum(wages)
4 from department, employee
5 where department = fk_department
6 group by rollup(gender, department)
7 order by 1;
GENDER      DEPARTMENT       SUM(WAGES)
----------- ---------------- ----------
F           POL                    9800
            WEL                    7000
            Total Per Gender      16800
M           INT                   65000
            POL                   77900
            WEL                   45000
            Total Per Gender     187900
Total Wages Total Per Gender     204700
In the previous example, for the row that indicates the Total Wages, we should NOT print the Total Per Gender
How do we do that?
GENDER      DEPARTMENT       SUM(WAGES)
----------- ---------------- ----------
F           POL                    9800
            WEL                  7426.3
            Total Per Gender    17226.3
M           INT                   65000
            POL                   77900
            WEL                 47740.5
            Total Per Gender   190640.5
Total Wages                    207866.8
Do not print anything in the DEPARTMENT column of the Total Wages row!
break on gender;
select decode(grouping(gender),1, 'Total Wages', gender) gender,
decode(grouping(department),1,
decode(grouping(gender),0,'Total Per Gender',' '), department) department,
sum(wages)
from department, employee
where department = fk_department
group by rollup(gender, department)
order by 1;
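The nested DECODE in this query is easier to follow as plain branching. The Python sketch below restates the same decision for a single result row. The function `label_row` and its parameter names are hypothetical, chosen here to mirror GROUPING(gender) and GROUPING(department).

```python
def label_row(gender, department, grp_gender, grp_department):
    """Return the (gender, department) labels for one ROLLUP(gender, department) row.

    grp_gender and grp_department play the role of GROUPING(gender) and
    GROUPING(department): 1 on a rolled-up row, 0 otherwise.
    """
    gender_label = "Total Wages" if grp_gender == 1 else gender
    if grp_department == 1:
        # Inner DECODE: only per-gender sub-total rows get the label,
        # the grand-total row gets a blank instead.
        department_label = "Total Per Gender" if grp_gender == 0 else " "
    else:
        department_label = department
    return gender_label, department_label


print(label_row("F", "POL", 0, 0))
print(label_row("F", None, 0, 1))
print(label_row(None, None, 1, 1))
```

The inner test on the gender flag is what separates a per-gender sub-total from the grand total, since both rows have the department rolled up.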
The RANK Function
Oracle 11g provides several functions for ranking rows returned from a SELECT statement
The functions can calculate rankings, percentiles and n-tiles
These functions are performed after the select statement returns the rows and prior to printing the results
The RANK Function
Syntax
Rank() over (
[partition by expression, expression]
Order by expression [collate clause] [asc | desc]
[nulls first | nulls last])
Only Rank and Order by are mandatory clauses
That is because in order to rank rows, the result set must be sorted
The expression in the order by clause is used for ranking
Default sort order is ascending
By default, NULL values are considered the largest
Can change where NULL values will appear using nulls first or nulls last
The RANK function
Example:
Departments CEN and TRF do not have employees, so the sum of their wages is null
With a descending sort, nulls are ranked first by default
There are employees without salary in the database as well
SQL> select department DPT, sum(wages),
2 rank() over (order by sum(wages) desc)
3 as rank_all
4 from department, employee
5 where department = fk_department(+) -- Outer join department
6 group by department;
DPT  SUM(WAGES)   RANK_ALL
---- ---------- ----------
CEN                      1
TRF                      1
POL       87700          3
INT       65000          4
WEL       52000          5
Rank does not advance for equal values: CEN and TRF both get rank 1, and the next rank is 3
The RANK function
Example continued
NULLs last
SQL> select department DPT, sum(wages),
2 rank() over (order by sum(wages) desc nulls last) as rank_all
3 from department, employee
4 where department = fk_department(+) -- Outer join department
5 group by department;
DPT  SUM(WAGES)   RANK_ALL
---- ---------- ----------
POL       87700          1
INT       65000          2
WEL       52000          3
CEN                      4
TRF                      4
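The effect of NULLS LAST on the ranking can be reproduced with a short Python sketch. It is an illustration under stated assumptions: `rank_desc` and the `dept_totals` dictionary (department sums copied from the slides, with None standing in for NULL) are names invented here.

```python
def rank_desc(totals, nulls_last=False):
    """Mimic RANK() OVER (ORDER BY total DESC [NULLS LAST]) on a {name: total} dict.

    Oracle treats NULL as the largest value, so a descending sort puts
    NULLs first unless NULLS LAST is requested.
    """
    def sort_key(item):
        total = item[1]
        is_null = total is None
        # Sort on the NULL flag first, then on the total in descending order.
        return (is_null if nulls_last else not is_null, -(total or 0))

    ordered = sorted(totals.items(), key=sort_key)
    ranks, previous, rank = {}, object(), 0
    for position, (name, total) in enumerate(ordered, start=1):
        if total != previous:  # tied rows keep the rank of the first tied row
            rank = position
        ranks[name] = rank
        previous = total
    return ranks


dept_totals = {"CEN": None, "TRF": None, "POL": 87700, "INT": 65000, "WEL": 52000}
print(rank_desc(dept_totals))
print(rank_desc(dept_totals, nulls_last=True))
```

In both orderings the two departments without wages tie with each other, so one rank value is skipped after them or before them.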
The RANK function
Example continued
We will use the function nvl to provide values where there are none
Syntax: nvl(col_name, val)
If the column does not have a value, then val is used instead
In this example, we provide the value of 90,000 for null values of salary
In the result, the rank value of 4 is missing
SQL> select department, sum(nvl(wages,90000)) total_wages,
2 rank() over (order by sum(nvl(wages,90000)) desc)
3 as rank_all
4 from department, employee
5 where department = fk_department(+)
6 group by department;
DEPA TOTAL_WAGES RANK_ALL
---- ----------- ----------
INT 155000 1
WEL 142000 2
CEN 90000 3
TRF 90000 3
POL 87700 5
The following does not look right, since the INT department previously had total wages of 65000. Now it says 155000. What is going on?
SQL> select payroll_number, wages
2 from employee
3 where fk_department = 'INT';
PAYROLL_NUMBER      WAGES
-------------- ----------
            25       9500
            46       9500
            36      14000
            33      13000
            29
            28      11000
            22       8000
7 rows selected.
One employee without salary. NVL replaces this with 90000
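The arithmetic behind the jump from 65000 to 155000 can be checked with a short Python sketch. The helper `nvl` and the list `int_wages` (the seven INT wages listed above, with None for the missing salary) are assumptions made for illustration. The second calculation, which applies NVL to the sum rather than to each row, is a possible alternative and is not taken from the slides.

```python
def nvl(value, default):
    """Mimic Oracle NVL: return default when value is NULL (None here)."""
    return default if value is None else value


# Wages of the seven INT employees, None for the employee without a salary.
int_wages = [9500, 9500, 14000, 13000, None, 11000, 8000]

# sum(nvl(wages, 90000)): every NULL row contributes 90000 to the total.
per_row = sum(nvl(w, 90000) for w in int_wages)

# nvl(sum(wages), 90000): SUM skips NULLs, so the default is used only
# when the whole group has no wages at all.
known = [w for w in int_wages if w is not None]
per_group = nvl(sum(known) if known else None, 90000)

print(per_row, per_group)
```

Which form is appropriate depends on whether the 90,000 is meant for each employee without a salary or only for departments with no wages at all.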
The DENSE_RANK Function
The DENSE_RANK function does the same thing as the RANK function, except that tied rows do not cause rank values to be skipped
It makes sure that all ranks are used
Here is the same example as before using DENSE_RANK
SQL> select department, sum(nvl(wages, 90000)) total_wages,
2 dense_rank() over (order by sum(nvl(wages,90000)) desc) as rank_dense
3 from department, employee
4 where department = fk_department(+)
5 group by department;
DEPA TOTAL_WAGES RANK_DENSE
---- ----------- ----------
INT       155000          1
WEL       142000          2
CEN        90000          3
TRF        90000          3
POL        87700          4
Value of 4 is NOT missing
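The difference between the two functions comes down to how the counter moves after a tie. The Python sketch below computes both rankings for the five totals shown above. The function name `rank_and_dense_rank` is an assumption made here for illustration.

```python
def rank_and_dense_rank(values):
    """Return [(value, rank, dense_rank)] for values sorted in descending order."""
    result, rank, dense, previous = [], 0, 0, None
    for position, value in enumerate(sorted(values, reverse=True), start=1):
        if value != previous:
            rank = position  # RANK jumps to the row position after a tie
            dense += 1       # DENSE_RANK moves up by exactly one
        result.append((value, rank, dense))
        previous = value
    return result


totals = [155000, 142000, 90000, 90000, 87700]
for row in rank_and_dense_rank(totals):
    print(row)
```

The last row shows the contrast: the same total is ranked 5 by RANK and 4 by DENSE_RANK.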
Top-N and Bottom-N queries
Rank functions rank the rows of the result set
Top-N and Bottom-N queries return a portion of the ranked rows from the top or the bottom
There are two steps required to do this:
Create an inline view to develop the data and the rankings
Use the RANK expression in the where clause to identify the number of Top and Bottom ranked records
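The two steps can be sketched in Python as a ranking pass followed by a filter. This is an illustration only: `top_n` is a hypothetical helper, and the `employees` list is a small sample of names and wages taken from elsewhere in these slides, with None for a missing salary.

```python
# (last_name, wages) sample rows, None marks a missing salary.
employees = [
    ("CLINTON", 15000), ("BUSH", 14000), ("REAGAN", 13500),
    ("FORD", 13000), ("CARTER", 13000), ("EISENHOWER", None),
]


def top_n(rows, n):
    # Step 1 (the inline view): rank every row, treating a missing wage as 0.
    ordered = sorted(rows, key=lambda r: r[1] or 0, reverse=True)
    ranked, rank, previous = [], 0, None
    for position, (name, wages) in enumerate(ordered, start=1):
        current = wages or 0
        if current != previous:
            rank = position
        ranked.append((name, wages, rank))
        previous = current
    # Step 2 (the outer WHERE clause): keep the rows whose rank is within n.
    return [row for row in ranked if row[2] <= n]


print(top_n(employees, 3))
print(top_n(employees, 4))
```

Because tied rows share a rank, asking for the top four in this sample returns five rows.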
The Top-N rows example:
Display the top three salary earning people in the company; if salary is missing, replace it with 0
SQL> select last_name, first_name, wages, emp_wage_rank
2 from (select last_name, first_name, wages,
3 rank() over(order by nvl(wages,0) desc) as emp_wage_rank
4 from employee)
5 where emp_wage_rank
How do we get the three lowest paid employees in the organization?
LAST_NAME       FIRST_NAME           WAGES EMP_WAGE_RANK
--------------- --------------- ---------- -------------
EISENHOWER      DWIGHT                                 1
ROOSEVELT       ELEANOR                                1
ANTHONY         SUSANNE               7000             2
JOHNSON         ANDREW                7500             3
select last_name, first_name, wages, emp_wage_rank
from (select last_name, first_name, wages,
dense_rank() over(order by nvl(wages,0) asc) as emp_wage_rank
from employee)
where emp_wage_rank
ROW_NUMBER assigns a sequential number to each row of the result set
SQL> select last_name as LNAME, wages,
2 row_number() over(order by wages) as Row_number
3 from employee;
LNAME                WAGES ROW_NUMBER
--------------- ---------- ----------
ANTHONY               7000          1
JOHNSON               7500          2
ROOSEVELT             8000          3
TAFT                  8500          4
WILSON                9000          5
COOLIDGE              9500          6
MILLER                9500          7
DWORCZAK              9800          8
HOOVER               10000          9
ROOSEVELT            10400         10
TRUMAN               11000         11
KENNEDY              11500         12
JOHNSON              12000         13
NIXON                12500         14
FORD                 13000         15
CARTER               13000         16
REAGAN               13500         17
BUSH                 14000         18
CLINTON              15000         19
EISENHOWER                         20
ROOSEVELT                          21
21 rows selected.
Top-N and RANK functions allow selecting the top n rows in a set of records that the select statement returns
Consider the following:
We need to print the top two paid employees within each department
This requires ranking of employees based on their salaries within each department
The PARTITION BY clause can achieve this!
NOTE: This partitioning is NOT the same as physically partitioning tables for performance purposes
This is a logical (temporary) partitioning that exists only in memory and is lost after the query executes
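Ranking within each department amounts to restarting the rank counter for every group. The Python sketch below shows that idea on a small sample. `top_per_department` is a hypothetical helper, and the `rows` list holds a few department/name/wage rows taken from elsewhere in these slides, all with known wages so that NULL handling is left out.

```python
from itertools import groupby

# (department, last_name, wages) sample rows with no missing wages.
rows = [
    ("INT", "BUSH", 14000), ("INT", "FORD", 13000), ("INT", "TRUMAN", 11000),
    ("WEL", "REAGAN", 13500), ("WEL", "CARTER", 13000), ("WEL", "HOOVER", 10000),
]


def top_per_department(rows, n):
    """Mimic RANK() OVER (PARTITION BY department ORDER BY wages DESC) <= n."""
    result = []
    by_dept = sorted(rows, key=lambda r: (r[0], -r[2]))
    for dept, group in groupby(by_dept, key=lambda r: r[0]):
        rank, previous = 0, None
        for position, (_, name, wages) in enumerate(group, start=1):
            if wages != previous:
                rank = position  # the ranking restarts inside each partition
            if rank <= n:
                result.append((dept, name, wages, rank))
            previous = wages
    return result


for row in top_per_department(rows, 2):
    print(row)
```

The grouping exists only for the duration of the loop, which parallels the note that the partitioning is logical and temporary.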
The use of the PARTITION BY clause in conjunction with the RANK function
SQL> select department_name, last_name, first_name, wages, emp_wage_rank
2 from (select department_name, last_name, first_name, wages,
3 rank() over (partition by department_name order by nvl(wages,0) desc) as emp_wage_rank
4 from department, employee where department = fk_department)
5 where emp_wage_rank
Windowing
Oracle has functionality that allows you to calculate values based on a period of time (called a window)
The functions in this class can be used to compute moving, cumulative and centered aggregates
They include moving averages, moving sums, moving min/max, cumulative sum, and LAG/LEAD
These functions create a value that is based on values that precede or follow the record
The windowing functions can be used in the SELECT and ORDER BY clauses
The syntax
{Sum | Avg | Max | Min | Count | Stddev | Variance | First_value | Last_value}
({expression | *})
Over ([partition by expression [, ...]]
Order by expression [collate clause] [asc | desc] [nulls first | nulls last] [, ...]
Rows | range {{unbounded preceding | expression preceding} | between {unbounded preceding |
Windowing functions clauses
Over: Tells Oracle that the function will operate over a query result set.
Partition by: Determines how the data will be segmented for analysis.
Order By: Determines how the data will be sorted within the partition. Options are ASC (default), DESC, NULLS FIRST and NULLS LAST.
Rows | Range: These keywords determine the window used for the calculation. The ROWS keyword is used to specify the window as a set of rows. RANGE sets the window as a logical offset. These keywords cannot be used unless the order by clause is used.
Between .. AND: Determines the starting point and end point of the window. Omitting the between keyword and specifying only one end point will cause Oracle to consider that end point as the starting point. The end point will then be the current row.
Unbounded Preceding: Sets the first row of the partition as the starting point of the window.
Unbounded Following: Sets the last row of the partition as the end point of the window.
Current Row: Sets the current row as the starting point or as the end point of the window.
Windowing function: cumulative aggregate function
Example: find cumulative cost of tools purchased within POL department
SQL> select department, last_name, first_name, tool_cost, sum(tool_cost)
2 over (order by purchase_date rows unbounded preceding) balance
3 from department, employee, emp_tools
4 where department = fk_department
5 and payroll_number = fk_payroll_number
6 and department = 'POL';
DEPA LAST_NAME FIRST_NAME TOOL_COST BALANCE
---- --------------- --------------- ---------- ----------
POL JOHNSON ANDREW 5.95 5.95
POL JOHNSON ANDREW 10.75 16.7
POL WILSON WOODROW 4.95 21.65
POL WILSON WOODROW 100 121.65
POL WILSON WOODROW 12 133.65
POL ROOSEVELT FRANKLIN 12 145.65
POL ROOSEVELT FRANKLIN 8 153.65
POL NIXON RICHARD 12.75 166.4
POL NIXON RICHARD 5.75 172.15
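The BALANCE column is a running total over the rows in purchase-date order. A minimal Python sketch of the same calculation follows. The function `running_balance` and the `tool_costs` list (the nine POL tool costs in the order shown above) are assumptions made for illustration.

```python
from itertools import accumulate

# POL tool costs in purchase_date order, copied from the TOOL_COST column.
tool_costs = [5.95, 10.75, 4.95, 100, 12, 12, 8, 12.75, 5.75]


def running_balance(costs):
    """Mimic SUM(cost) OVER (ORDER BY ... ROWS UNBOUNDED PRECEDING).

    Each entry is the sum of the current row and every row before it.
    Rounding to cents only hides floating-point noise.
    """
    return [round(total, 2) for total in accumulate(costs)]


balance = running_balance(tool_costs)
print(balance)
```

With `rows unbounded preceding` the window always starts at the first row and ends at the current row, which is exactly what `accumulate` produces.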
Windowing function: cumulative aggregate function
Example: find cumulative cost of tools purchased within each department
SQL> select department, last_name, first_name, tool_cost, sum(tool_cost)
2 over (partition by department order by purchase_date rows unbounded preceding) balance
3 from department, employee, emp_tools
4 where department = fk_department
5 and payroll_number = fk_payroll_number;
DEPA LAST_NAME FIRST_NAME TOOL_COST BALANCE
---- --------------- --------------- ---------- ----------
INT ROOSEVELT THEODORE 34 34
INT ROOSEVELT THEODORE 290 324
INT COOLIDGE CALVIN 25 349
INT COOLIDGE CALVIN 10 359
INT EISENHOWER DWIGHT 25 384
INT EISENHOWER DWIGHT 200 584
INT EISENHOWER DWIGHT 150 734
INT FORD GERALD 12 746
INT FORD GERALD 0 746
INT FORD GERALD 0 746
INT BUSH GEORGE 2.75 748.75
INT BUSH GEORGE 35.95 784.7
INT BUSH GEORGE 7.5 792.2
INT MILLER KEVIN 100 892.2
INT MILLER KEVIN 0 892.2
POL JOHNSON ANDREW 5.95 5.95
POL JOHNSON ANDREW 10.75 16.7
POL WILSON WOODROW 4.95 21.65
POL WILSON WOODROW 100 121.65
POL WILSON WOODROW 12 133.65
POL ROOSEVELT FRANKLIN 12 145.65
POL ROOSEVELT FRANKLIN 8 153.65
POL NIXON RICHARD 12.75 166.4
POL NIXON RICHARD 5.75 172.15
Windowing: Moving Averages
Moving averages can be computed when the window is changed over time
A moving average can be computed if several items are added to the windowing function:
A range interval is needed to identify the number of values used
A time unit is needed. Common time units are year, month and day
The preceding and/or following keywords are needed to indicate which records in the ordered set will be used
Moving average function example
This example computes the moving average tools cost for INT department in the past 20 years
SQL> select department as DPT, to_char(purchase_date, 'YYYY-DD-MON'), tool_cost,
2 avg(tool_cost) over (order by purchase_date range interval '20' year preceding) as Average
3 from department, employee, emp_tools
4 where department = fk_department and payroll_number = fk_payroll_number and department = 'INT'
5 order by purchase_date;
DPT TO_CHAR(PUR TOOL_COST AVERAGE
---- ----------- ---------- ----------
INT 1903-01-FEB 34 34
INT 1905-10-MAR 290 162
INT 1922-01-OCT 25 116.333333
INT 1923-01-FEB 10 89.75
INT 1953-01-MAR 25 25
INT 1953-31-MAR 200 125
INT 1953-31-MAR 150 125
INT 1974-01-JAN 12 12
INT 1974-10-AUG 0 6
INT 1977-23-MAR 0 4
INT 1988-23-SEP 2.75 3.6875
INT 1988-10-NOV 35.95 10.14
INT 1989-23-FEB 7.5 9.7
INT 2001-08-APR 100 36.55
INT 2001-23-MAY 0 29.24
1903-01-FEB: 0 records in the previous 20 years, so the average is the row's own cost (34)
1905-10-MAR: 1 record in the previous 20 years: (34 + 290) / 2 = 162
1922-01-OCT: 2 records in the previous 20 years: (34 + 290 + 25) / 3 = 116.33
1974-10-AUG: 1 record in the previous 20 years: (12 + 0) / 2 = 6
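The 20-year window can be checked by hand with a small Python sketch. It is an approximation written for illustration: `moving_average` is a hypothetical helper, the `purchases` list holds the first seven INT rows from the output above, and the date arithmetic uses `date.replace`, which would need extra care for a 29 February purchase date.

```python
from datetime import date

# First seven INT purchases from the output above: (purchase_date, tool_cost).
purchases = [
    (date(1903, 2, 1), 34), (date(1905, 3, 10), 290),
    (date(1922, 10, 1), 25), (date(1923, 2, 1), 10),
    (date(1953, 3, 1), 25), (date(1953, 3, 31), 200), (date(1953, 3, 31), 150),
]


def moving_average(rows, years=20):
    """Mimic AVG(cost) OVER (ORDER BY d RANGE INTERVAL '20' YEAR PRECEDING)."""
    averages = []
    for current_date, _ in rows:
        start = current_date.replace(year=current_date.year - years)
        # RANGE frame: every row dated from `start` through the current date,
        # including other rows that share the current row's date.
        window = [cost for d, cost in rows if start <= d <= current_date]
        averages.append(round(sum(window) / len(window), 2))
    return averages


print(moving_average(purchases))
```

The two purchases dated 31 March 1953 get the same average because a RANGE window includes all rows with the same ordering value.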
Moving average function example
This example computes the total salary for employees in a 40 year period around the hire date of the current employee
SQL> select department as DPT, first_name, Last_name,
2 to_char(employment_date, 'YYYY-DD-MON') hire_date, wages,
3 SUM(wages) OVER (ORDER BY employment_date RANGE BETWEEN
4 INTERVAL '20' YEAR PRECEDING AND INTERVAL '20' YEAR FOLLOWING) ctrd_sum
5 from department, employee
6 where department = fk_department
7 and department = 'INT' order by 4;
DPT  FIRST_NAME      LAST_NAME       HIRE_DATE        WAGES   CTRD_SUM
---- --------------- --------------- ----------- ---------- ----------
INT  THEODORE        ROOSEVELT       1902-20-NOV       8000      17500
INT  CALVIN          COOLIDGE        1921-07-AUG       9500      17500
INT  HAROLD          TRUMAN          1945-15-APR      11000      11000
INT  DWIGHT          EISENHOWER      1953-20-MAR                 11000
INT  GERALD          FORD            1973-20-MAY      13000      27000
INT  GEORGE          BUSH            1988-05-JAN      14000      36500
INT  KEVIN           MILLER          2000-12-OCT       9500      23500
7 rows selected.
Questions
Question 1: Print the cost of tools per classification (position) within gender. Subtotal the costs for each gender
GENDER CURRENT_POSITIO  Tool Cost
------ --------------- ----------
F      ADMINISTRATOR
F      SALESPERSON 2        88.85
F      SYSTEM ANALYST       61.95
F                           150.8
M      CLERK 1                 20
M      CLERK 2               46.2
M      CONTROLLER             324
M      COUNSELER 2
M      GUARD 4                375
M      JANITOR                 35
M      LABORER 2               12
M      LABORER 3
M      MAINT. MAN 2            24
M      MAINT. MAN 3        116.95
M      PRESIDENT             28.7
M      PROGRAMMER 1
M      SALESPERSON 1         16.7
M      TREASURER             18.5
M      TREASURER CLERK
M      VICE PRESIDENT         123
M                         1140.05
                          1290.85
22 rows selected.
Question 1: Print the cost of tools per classification (position) within gender. Subtotal the costs for each gender
select gender, current_position, sum(tool_cost) "Tool Cost"
from employee, emp_tools
where payroll_number = fk_payroll_number(+)
group by rollup (gender, current_position)
order by 1,2;
Question 2: Determine the two employees in each department who had the lowest cost of eye glasses
DEPARTMENT LAST_NAME||','||FIRST_NAME       EYEGLASS_COST LOWEST_EYEGLASS_COST_RANK
---------- -------------------------------- ------------- -------------------------
INT        BUSH, GEORGE                                                           1
INT        EISENHOWER, DWIGHT                          15                         2
POL        CLINTON, WILLIAM                                                       1
POL        KENNEDY, JOHN                                                          1
POL        DWORCZAK, ALICE                                                        1
WEL        HOOVER, HERBERT                                                        1
WEL        ANTHONY, SUSANNE                           120                         2
7 rows selected.
Question 2: Determine the two employees in each department who had the lowest cost of eye glasses
select *
from (select department, last_name||', '||first_name, sum(cost) eyeglass_cost,
rank() over (partition by department
order by sum(cost) asc nulls first) Lowest_eyeglass_cost_rank
from department, employee, glasses
where department = fk_department
and payroll_number = fk_payroll_number(+)
group by department, last_name, first_name)
where lowest_eyeglass_cost_rank
Question 3: Create a checkbook style cumulative cost of eye glasses
PURCHASE_ LAST_NAME||','||FIRST_NAME EYEGLASS_COST BALANCE
--------- -------------------------------- ------------- ----------
12-MAR-03 ROOSEVELT, THEODORE 123 123
06-MAY-04 ROOSEVELT, THEODORE 145 268
08-NOV-10 TAFT, WILLIAM 145 413
01-JAN-17 WILSON, WOODROW 123 536
15-NOV-23 COOLIDGE, CALVIN 175 711
03-JUN-33 ROOSEVELT, FRANKLIN 129 840
01-JUL-33 ROOSEVELT, ELEANOR 134 974
20-JUL-35 ROOSEVELT, ELEANOR 143 1117
12-AUG-40 ANTHONY, SUSANNE 120 1237
12-OCT-47 TRUMAN, HAROLD 110 1347
31-MAR-53 EISENHOWER, DWIGHT 15 1362
31-JAN-64 JOHNSON, LYNDON 170 1532
31-MAY-67 JOHNSON, ANDREW 165 1697
23-JUN-70 NIXON, RICHARD 123 1820
01-FEB-74 FORD, GERALD 145 1965
08-SEP-77 CARTER, JIMMY 164 2129
12-AUG-79 CARTER, JIMMY 175 2304
23-OCT-83 REAGAN, RONALD 165 2469
21-DEC-00 MILLER, KEVIN 165 2634
19 rows selected.
Question 3: Create a checkbook style cumulative cost of eye glasses
select purchase_date, last_name||', '||first_name,
cost eyeglass_cost,
sum(cost) over (order by purchase_date
rows unbounded preceding) balance
from employee, glasses
where payroll_number = fk_payroll_number;
Question 4: Write a SQL statement that counts the number of eyeglasses within each of the four cost classes: less than $100, $100 to $125, $126 to $150 and above $150
cost cat           COUNT
------------- ----------
100 to 125             5
126 to 150             6
Above 150              7
Less than 100          1
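Question 4: Write a SQL statement that counts the number of eyeglasses within each of the four cost classes: less than $100, $100 to $125, $126 to $150 and above $150
select (case when cost < 100 then 'Less than 100'
when cost >= 100 and cost <= 125 then '100 to 125'
when cost >= 126 and cost <= 150 then '126 to 150'
when cost > 150 then 'Above 150' end)
"cost cat", count(*) as count
from glasses
group by (case when cost < 100 then 'Less than 100'
when cost >= 100 and cost <= 125 then '100 to 125'
when cost >= 126 and cost <= 150 then '126 to 150'
when cost > 150 then 'Above 150' end);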
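The bucketing for Question 4 can be cross-checked against the eyeglass costs listed under Question 3. The Python sketch below is an illustration only: `cost_category` is a hypothetical function, the `costs` list copies the nineteen EYEGLASS_COST values from that output, and the boundaries assume whole-dollar costs as in the question.

```python
from collections import Counter


def cost_category(cost):
    """Bucket one eyeglass cost into the four classes named in Question 4."""
    if cost < 100:
        return "Less than 100"
    if cost <= 125:
        return "100 to 125"
    if cost <= 150:
        return "126 to 150"
    return "Above 150"


# EYEGLASS_COST values copied from the Question 3 output.
costs = [123, 145, 145, 123, 175, 129, 134, 143, 120, 110,
         15, 170, 165, 123, 145, 164, 175, 165, 165]

counts = Counter(cost_category(c) for c in costs)
for category in sorted(counts):
    print(category, counts[category])
```

Counting by the derived label rather than by the raw cost corresponds to grouping by the whole case expression.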
Questions?
Contact information
[email protected] 962 5514
SQL Statement with Group By
The LAG and LEAD Functions
These functions are useful for computing the difference between values in different rows
The LAG and LEAD functions return the value of a preceding or following row to the current row
Syntax:
Lag (expression, record offset)
Lead (expression, record offset)
For example, offset 1 refers to the row immediately before the current row (for Lag) and the row immediately after the current row (for Lead)
The use of Lag and Lead
Example: find the wage difference between each employee and the employees hired immediately before and after
SQL> select last_name, wages,
2 lead(wages, 1) over(order by employment_date) - wages as "Lead Diff",
3 lag(wages, 1) over(order by employment_date) - wages as "Lag Diff"
4 from employee;
LAST_NAME            WAGES  Lead Diff   Lag Diff
--------------- ---------- ---------- ----------
ROOSEVELT             8000        500
TAFT                  8500        500       -500
WILSON                9000        500       -500
COOLIDGE              9500        500       -500
HOOVER               10000                  -500
ROOSEVELT
ROOSEVELT            10400      -3400
ANTHONY               7000       4000       3400
TRUMAN               11000                 -4000
EISENHOWER
KENNEDY              11500        500
JOHNSON              12000      -4500       -500
JOHNSON               7500       5000       4500
NIXON                12500        500      -5000
FORD                 13000          0       -500
CARTER               13000        500          0
REAGAN               13500        500       -500
BUSH                 14000       1000       -500
CLINTON              15000      -5200      -1000
DWORCZAK              9800       -300       5200
MILLER                9500                   300
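The blanks in the two difference columns follow from NULL arithmetic. The Python sketch below reproduces them for the first eight rows. It is a hedged illustration: `lag`, `lead` and `diff` are small helpers written here to imitate the Oracle functions, and the `wages` list copies the first eight WAGES values in hire-date order, with None for the missing salary.

```python
def lag(values, offset=1):
    """Value `offset` rows before each row, or None when there is no such row."""
    return [values[i - offset] if i - offset >= 0 else None
            for i in range(len(values))]


def lead(values, offset=1):
    """Value `offset` rows after each row, or None when there is no such row."""
    return [values[i + offset] if i + offset < len(values) else None
            for i in range(len(values))]


def diff(other, current):
    """Subtraction that yields None if either side is None, like NULL arithmetic."""
    return None if other is None or current is None else other - current


# First eight WAGES values in employment_date order, None for a missing salary.
wages = [8000, 8500, 9000, 9500, 10000, None, 10400, 7000]

lead_diff = [diff(o, w) for o, w in zip(lead(wages), wages)]
lag_diff = [diff(o, w) for o, w in zip(lag(wages), wages)]
print(lead_diff)
print(lag_diff)
```

The last entry of `lead_diff` is None only because the sample stops after eight rows. In the full result that row has a following employee.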
Question: Join table employee with itself (self join) to find out the name of the person hired immediately after each employee for department WEL
We are looking for this list
LAST_NAME       Hire Date  Next emp
--------------- ---------- -----------
TAFT            1908-06-01 HOOVER
HOOVER          1928-04-06 ROOSEVELT
ROOSEVELT       1932-03-20 ANTHONY
ANTHONY         1940-03-30 CARTER
CARTER          1976-07-10 REAGAN
REAGAN          1980-03-03
6 rows selected.
Question:
Does the use of self join provide the right answer?
If not, why?
How do we get the right answer?
Use of Lag function
Repeat the work but this time use the Lag (or Lead) functions to get the right answer
PAYROLL_NUMBER LAST_NAME       Prev emp
-------------- --------------- -------------------------
            20 ANTHONY         19 ROOSEVELT
            35 REAGAN          34 CARTER