Trivadis TechEvent 2016: Useful Oracle 12c Features for Data Warehousing, by Dani Schnider
Useful Oracle 12c Features for Data Warehousing
TechEvent, 9 September 2016
Dani Schnider, Trivadis AG
Oracle 12c: Many new Features – not only for DWH
Useful Oracle 12c Features for Data Warehousing2 19.09.2016
In-Memory Option
Multitenant Databases
SQL Pattern Matching
Information Lifecycle Management
Adaptive Query Optimization
JSON Support
Out-of-Place Materialized Views
Data Redaction
Temporal Validity
SQL Query Row Limits
UTL_CALL_STACK
PL/SQL in SQL WITH Clause
Invisible Columns
Online Statistics Gathering
Default Values Enhancements
IDENTITY Columns
APPROX_COUNT_DISTINCT
Partial Indexing
Asynchronous Global Index Maintenance
ETL Processes and Object Statistics
DBMS_STATS.gather_table_stats
(ownname => 'STG'
,tabname => 'T1');
DBMS_STATS.gather_table_stats
(ownname => 'STG'
,tabname => 'T2');
| 1 | INSERT STATEMENT | | 1500 |
| 2 | INSERT | T3 | 1500 |
| 3 | HASH JOIN | | 1500 |
| 4 | TABLE ACCESS FULL| T1 | 2000 |
| 5 | TABLE ACCESS FULL| T2 | 3000 |
Diagram: ETL mapping joins T1 (2000 rows) and T2 (3000 rows) to load T3 (1500 rows)
See presentation „So beschleunigen Sie Ihre ETL-Prozesse“ (DOAG BI 2015)
Online Statistics Gathering
After loading a table, statistics should always be gathered
Since Oracle 12c, table statistics are gathered automatically for the following cases:
– CREATE TABLE AS SELECT
CREATE TABLE T3 AS
SELECT ... FROM T1 JOIN T2 ON ...
– Direct-Load INSERT in empty table (after TRUNCATE)
INSERT /*+ append */ INTO T3
SELECT ...
FROM T1 JOIN T2 ON ...
Online Statistics Gathering
Use Cases:
Volatile tables in ETL job flow (Staging Area, Cleansing Area)
Auxiliary tables for intermediate results of load jobs
Restrictions:
No index statistics
No histograms
No statistics for partitions and subpartitions
See blog https://danischnider.wordpress.com/2015/12/23/online-statistics-gathering-in-oracle-12c/
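A minimal sketch of how the restrictions above might be handled after a direct-path load. It assumes the table T3 from the earlier slide exists in the current schema. The NOTES value STATS_ON_LOAD and the GATHER AUTO option are not shown on the slides and should be verified against the Oracle documentation for your release.

```sql
-- Check whether column statistics were gathered online during the load.
-- In 12c the NOTES column is expected to show STATS_ON_LOAD for such columns.
SELECT column_name, num_distinct, notes
  FROM user_tab_col_statistics
 WHERE table_name = 'T3';

-- Gather only what is still missing (e.g. index statistics, histograms)
-- rather than regathering the table statistics from scratch.
BEGIN
   DBMS_STATS.gather_table_stats
      (ownname => USER
      ,tabname => 'T3'
      ,options => 'GATHER AUTO');
END;
/
```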
Replace Missing Attributes with „Singletons“
INSERT INTO cls_products (product_code, product_desc)
SELECT product_code
, NVL(product_desc, 'Unknown')
FROM stg_products;
STG_PRODUCTS          CLS_PRODUCTS
Edradour 10 years     Edradour 10 years
Glenfarclas 105       Glenfarclas 105
Black Bowmore 1964    Black Bowmore 1964
NULL                  Unknown
Laphroaig 15 years    Laphroaig 15 years
See presentation „Fehlertolerante Ladeprozesse in Oracle“ (DOAG BI 2012)
DEFAULT Extensions in Oracle 12c
CREATE TABLE dwh_whisky
(dwh_id NUMBER(8) DEFAULT seq_whisky.NEXTVAL
,whisky_code VARCHAR2(8) NOT NULL
,whisky_name VARCHAR2(40) NOT NULL
,price NUMBER (6,2) DEFAULT ON NULL 0
,age VARCHAR2(3) DEFAULT ON NULL '< 7'
,distillery VARCHAR2(30) DEFAULT ON NULL 'Unknown Distillery'
,region VARCHAR2(30) DEFAULT ON NULL 'Unknown Region')
DEFAULT can be a sequence (finally!)
DEFAULT ON NULL: the default value is also applied when NULL is inserted explicitly
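The behavior of both DEFAULT extensions could be tried out with a small script like the following. The names demo_whisky and seq_demo_whisky are hypothetical, reduced from the dwh_whisky example, and are not part of the original slides.

```sql
-- Hypothetical demo objects (assumption, not from the slides)
CREATE SEQUENCE seq_demo_whisky;

CREATE TABLE demo_whisky
(dwh_id      NUMBER(8)    DEFAULT seq_demo_whisky.NEXTVAL
,whisky_name VARCHAR2(40) NOT NULL
,price       NUMBER(6,2)  DEFAULT ON NULL 0
,region      VARCHAR2(30) DEFAULT ON NULL 'Unknown Region');

-- Columns omitted: the defaults are applied, DWH_ID comes from the sequence
INSERT INTO demo_whisky (whisky_name)
VALUES ('Edradour 10 years');

-- Explicit NULLs: DEFAULT ON NULL replaces them with the default values
INSERT INTO demo_whisky (whisky_name, price, region)
VALUES ('Glenfarclas 105', NULL, NULL);

SELECT dwh_id, whisky_name, price, region
  FROM demo_whisky
 ORDER BY dwh_id;
```

With a plain DEFAULT (without ON NULL), an explicitly inserted NULL is stored as NULL. This is why the singleton replacement on the previous slide needed NVL in the load statement.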
IDENTITY Columns
CREATE TABLE dwh_whisky
(dwh_id NUMBER(8) GENERATED BY DEFAULT AS IDENTITY
,whisky_code VARCHAR2(8) NOT NULL
,whisky_name VARCHAR2(40) NOT NULL
,...)
GENERATED BY DEFAULT AS IDENTITY
GENERATED BY DEFAULT ON NULL AS IDENTITY
GENERATED ALWAYS AS IDENTITY
Automatic generation of sequence numbers („auto increment column“)
A sequence is created in the background (ISEQ$$_nnnnn)
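A short sketch to see the background sequence of an identity column. The table demo_identity is a hypothetical example, not from the slides. The dictionary view USER_TAB_IDENTITY_COLS is assumed to be available as in Oracle 12c.

```sql
-- Hypothetical demo table (assumption, not from the slides)
CREATE TABLE demo_identity
(id   NUMBER(8) GENERATED ALWAYS AS IDENTITY
,name VARCHAR2(40) NOT NULL);

INSERT INTO demo_identity (name) VALUES ('Edradour 10 years');
INSERT INTO demo_identity (name) VALUES ('Glenfarclas 105');

-- With GENERATED ALWAYS, an explicit value for ID is rejected:
-- INSERT INTO demo_identity (id, name) VALUES (42, 'Laphroaig 15 years');

SELECT id, name
  FROM demo_identity
 ORDER BY id;

-- Shows the generation type and the system-generated sequence (ISEQ$$_nnnnn)
SELECT column_name, generation_type, sequence_name
  FROM user_tab_identity_cols
 WHERE table_name = 'DEMO_IDENTITY';
```

GENERATED BY DEFAULT accepts explicit values for the column. GENERATED BY DEFAULT ON NULL additionally replaces explicit NULLs with the next sequence value.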
Function APPROX_COUNT_DISTINCT
New function APPROX_COUNT_DISTINCT
Same algorithm as for AUTO_SAMPLE_SIZE in DBMS_STATS (Oracle 11g)
Faster than COUNT(DISTINCT) for large data volumes
Result is only approximate (+/- 4%)
Number of distinct customers in SALES table:
SELECT COUNT(DISTINCT cust_id) FROM sales
Approximate number of distinct customers in SALES table:
SELECT APPROX_COUNT_DISTINCT(cust_id) FROM sales
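To get a feeling for the accuracy on your own data, both functions can be compared in one query. This is only a sketch, assuming the SALES table with a CUST_ID column from the slide. The column aliases are made up for illustration.

```sql
-- Compare exact and approximate distinct counts and show the deviation in percent
SELECT exact_cnt
     , approx_cnt
     , ROUND(100 * (approx_cnt - exact_cnt) / NULLIF(exact_cnt, 0), 2) AS deviation_pct
  FROM (SELECT COUNT(DISTINCT cust_id)        AS exact_cnt
             , APPROX_COUNT_DISTINCT(cust_id) AS approx_cnt
          FROM sales);
```

NULLIF avoids a division by zero if the table is empty. To compare runtimes, the two functions would have to be executed in separate statements.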
APPROX_COUNT_DISTINCT: Performance
See blog https://antognini.ch/2014/10/the-approx_count_distinct-function-a-test-case/
APPROX_COUNT_DISTINCT: Accuracy
See blog https://antognini.ch/2014/10/the-approx_count_distinct-function-a-test-case/
Global Indexes in Data Warehouse
DROP / TRUNCATE PARTITION sets global indexes to UNUSABLE
Index rebuild required
Problematic for rolling history windows (dropping of outdated partitions)
Whenever possible, avoid global indexes in Data Warehouses
Asynchronous Update of Global Indexes
In Oracle 12c, global indexes can be updated asynchronously:
Step 1: DROP or TRUNCATE PARTITION
ALTER TABLE sales DROP PARTITION p_2016_02 UPDATE INDEXES
Index is still usable, but contains „orphaned entries“
SELECT index_name, status, orphaned_entries
FROM user_indexes
WHERE index_name = 'SALES_PK'
INDEX_NAME STATUS ORPHANED_ENTRIES
-------------- -------- ----------------
SALES_PK VALID YES
Asynchronous Update of Global Indexes
Step 2: Delete orphaned entries from index with one of these alternatives:
– Scheduler Job SYS.PMO_DEFERRED_GIDX_MAINT_JOB
– dbms_part.cleanup_gidx
– ALTER INDEX REBUILD [PARTITION]
– ALTER INDEX [PARTITION] COALESCE CLEANUP
dbms_part.cleanup_gidx(schema_name_in => USER,
table_name_in => 'SALES')
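The manual alternatives could be run as in the sketch below. It assumes the SALES table and the global index SALES_PK from the previous slide in the current schema. Only one of the two cleanup alternatives is needed.

```sql
-- Which indexes of the table still contain orphaned entries?
SELECT index_name, status, orphaned_entries
  FROM user_indexes
 WHERE table_name = 'SALES';

-- Alternative A: clean up the global indexes of the table
BEGIN
   dbms_part.cleanup_gidx(schema_name_in => USER,
                          table_name_in  => 'SALES');
END;
/

-- Alternative B: clean up one index only
ALTER INDEX sales_pk COALESCE CLEANUP;
```

After the cleanup, the first query can be repeated to check that ORPHANED_ENTRIES no longer shows YES.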
Do Orphans Have to Be Deleted at All?
Yes
Orphans stay in index and are not overwritten
Index size increases because space is not released
Exception: unique indexes, if same value is inserted again
See Richard Foote's Oracle Blog
https://richardfoote.wordpress.com/2013/08/02/12c-asynchronous-global-index-maintenance-part-i-where-are-we-now/
https://richardfoote.wordpress.com/2013/08/06/12c-asynchronous-global-index-maintenance-part-ii-the-space-between/
https://richardfoote.wordpress.com/2013/08/07/12c-asynchronous-global-index-maintenance-part-iii-re-makere-model/
Partial Indexing – Use Cases
Selective queries are more frequent on current partitions
Index on newest partition should be created after loading data
Partitioning on status values with very different cardinalities
Creating Table with Partial Indexing
CREATE TABLE t_part (n NUMBER, name VARCHAR2(40))
INDEXING OFF
PARTITION BY RANGE (n)
(PARTITION p1 VALUES LESS THAN (100)
,PARTITION p2 VALUES LESS THAN (200) INDEXING OFF
,PARTITION p3 VALUES LESS THAN (300)
,PARTITION p4 VALUES LESS THAN (400)
,PARTITION p5 VALUES LESS THAN (500) INDEXING ON
,PARTITION p6 VALUES LESS THAN (600) INDEXING ON
)
Default on table level: INDEXING ON / OFF
Can be redefined for individual partitions
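The indexing attributes of the example table can be checked in the data dictionary. The following is a sketch, assuming the table T_PART from above in the current schema. The dictionary columns DEF_INDEXING and INDEXING are those introduced in Oracle 12c.

```sql
-- Default on table level
SELECT table_name, def_indexing
  FROM user_part_tables
 WHERE table_name = 'T_PART';

-- Effective setting per partition (P5 and P6 are expected to show ON)
SELECT partition_name, indexing
  FROM user_tab_partitions
 WHERE table_name = 'T_PART'
 ORDER BY partition_position;

-- The attribute can be changed later for an individual partition
ALTER TABLE t_part MODIFY PARTITION p4 INDEXING ON;
```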
Creating Partial Indexes
Global Partial Index
CREATE INDEX idx_part_local
ON t_part (name)
[GLOBAL] INDEXING PARTIAL
Local Partial Index
CREATE INDEX idx_part_local
ON t_part (name)
LOCAL INDEXING PARTIAL
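A small sketch to inspect a local partial index after it has been created. It assumes the table T_PART and the index IDX_PART_LOCAL from the slides, created with the LOCAL variant.

```sql
-- Partial or full index?
SELECT index_name, indexing
  FROM user_indexes
 WHERE table_name = 'T_PART';

-- Index partitions of table partitions with INDEXING OFF
-- are expected to be UNUSABLE, the others USABLE
SELECT partition_name, status
  FROM user_ind_partitions
 WHERE index_name = 'IDX_PART_LOCAL'
 ORDER BY partition_position;
```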
Partial Local Index
Diagram: table partitions P1 to P6, each with its own index partition P1 to P6, labeled INDEXING OFF / INDEXING ON
Partial Global Index
Diagram: table partitions P1 to P6 with one global index, labeled INDEXING OFF / INDEXING ON
Partial Indexes on Star Schema
Typical index strategy on Star Schema
(Local) bitmap index on each dimension key
No global index for primary key
What happens if the bitmap indexes are created as Partial Indexes?
Does Star Transformation work in combination with Partial Indexes?
See blog post „Partial Indexes Trilogy“
https://danischnider.wordpress.com/2016/06/28/partial-indexes-part-1-local-partial-indexes/
https://danischnider.wordpress.com/2016/07/08/partial-indexes-trilogy-part-2-global-partial-indexes/
https://danischnider.wordpress.com/2016/07/21/partial-indexes-trilogy-part-3-queries-on-partial-indexes/
Thank you.
Dani Schnider
Principal Consultant
Tel. +41 58 459 50 81