BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENEVA

HAMBURG COPENHAGEN LAUSANNE MUNICH STUTTGART VIENNA ZURICH

Useful Oracle 12c Features for Data Warehousing

TechEvent, 9 September 2016

Dani Schnider, Trivadis AG

Oracle 12c: Many new Features – not only for DWH

Useful Oracle 12c Features for Data Warehousing, 19.09.2016

In-Memory Option

Multitenant Databases

SQL Pattern Matching

Information Lifecycle Management

Adaptive Query Optimization

JSON Support

Out-of-Place Materialized Views

Data Redaction

Temporal Validity

SQL Query Row Limits

UTL_CALL_STACK

PL/SQL in SQL WITH Clause

Invisible Columns

Online Statistics Gathering

Default Values Enhancements

IDENTITY Columns

APPROX_COUNT_DISTINCT

Partial Indexing

Asynchronous Global Index Maintenance

Online Statistics Gathering

ETL Processes and Object Statistics

DBMS_STATS.gather_table_stats
(ownname => 'STG'
,tabname => 'T1');

DBMS_STATS.gather_table_stats
(ownname => 'STG'
,tabname => 'T2');

ETL-Mapping: T1 (2000 rows) and T2 (3000 rows) are joined and loaded into T3 (1500 rows)

| 1 | INSERT STATEMENT    |    | 1500 |
| 2 |  INSERT             | T3 | 1500 |
| 3 |   HASH JOIN         |    | 1500 |
| 4 |    TABLE ACCESS FULL| T1 | 2000 |
| 5 |    TABLE ACCESS FULL| T2 | 3000 |

See presentation „So beschleunigen Sie Ihre ETL-Prozesse“ (DOAG BI 2015)

Online Statistics Gathering

After loading a table, statistics should always be gathered

Since Oracle 12c, table statistics are gathered automatically for the following cases:

– CREATE TABLE AS SELECT

CREATE TABLE T3 AS
SELECT ... FROM T1 JOIN T2 ON ...

– Direct-Load INSERT in empty table (after TRUNCATE)

INSERT /*+ append */ INTO T3
SELECT ...
FROM T1 JOIN T2 ON ...
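
The direct-load case can be tried out with a small script. This is only a sketch: the column names (id, val, descr) and the generated test rows are assumptions made for illustration, since the slides do not show the structure of T1, T2 and T3. After the load, the data dictionary should show table statistics for T3, and the NOTES column of USER_TAB_COL_STATISTICS should mark them as gathered on load.

```sql
-- Hypothetical tables; the slides only name them T1, T2 and T3.
CREATE TABLE t1 (id NUMBER, val   VARCHAR2(30));
CREATE TABLE t2 (id NUMBER, descr VARCHAR2(30));
CREATE TABLE t3 (id NUMBER, val VARCHAR2(30), descr VARCHAR2(30));

-- Generated test data (2000 and 3000 rows).
INSERT INTO t1 SELECT LEVEL, 'val '  || LEVEL FROM dual CONNECT BY LEVEL <= 2000;
INSERT INTO t2 SELECT LEVEL, 'desc ' || LEVEL FROM dual CONNECT BY LEVEL <= 3000;
COMMIT;

-- Direct-load INSERT into an empty table (after TRUNCATE).
TRUNCATE TABLE t3;

INSERT /*+ append */ INTO t3 (id, val, descr)
SELECT t1.id, t1.val, t2.descr
  FROM t1 JOIN t2 ON t2.id = t1.id;

COMMIT;

-- Table statistics should be available without a DBMS_STATS call.
SELECT table_name, num_rows, last_analyzed
  FROM user_tables
 WHERE table_name = 'T3';

-- Online gathered column statistics are flagged in the NOTES column.
SELECT column_name, num_distinct, notes
  FROM user_tab_col_statistics
 WHERE table_name = 'T3';
```

If the table is not empty at load time, or the INSERT is a conventional one, no online statistics are expected and the usual DBMS_STATS call is still needed.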

Online Statistics Gathering

Use Cases:

Volatile tables in ETL job flow (Staging Area, Cleansing Area)

Auxiliary tables for intermediate results of load jobs

Restrictions:

No index statistics

No histograms

No statistics for partitions and subpartitions

See blog https://danischnider.wordpress.com/2015/12/23/online-statistics-gathering-in-oracle-12c/
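
One way to deal with these restrictions is to complete the statistics after the load. The following sketch assumes a table T3 in the current schema that was just filled by a direct-load INSERT. According to the Oracle documentation, the option GATHER AUTO should add the missing statistics (for example index statistics and histograms) without gathering the base statistics again. Whether this is fast enough inside an ETL job depends on the table and should be tested.

```sql
-- Assumption: T3 exists in the current schema and already has online gathered statistics.
BEGIN
  DBMS_STATS.gather_table_stats(
    ownname => USER,
    tabname => 'T3',
    options => 'GATHER AUTO');
END;
/

-- Check which columns have histograms now.
SELECT column_name, histogram, notes
  FROM user_tab_col_statistics
 WHERE table_name = 'T3';
```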

Default Values

Replace Missing Attributes with „Singletons“

INSERT INTO cls_products (product_code, product_desc)
SELECT product_code
     , NVL(product_desc, 'Unknown')
  FROM stg_products;

STG_PRODUCTS          CLS_PRODUCTS
--------------------  --------------------
Edradour 10 years     Edradour 10 years
Glenfarclas 105       Glenfarclas 105
Black Bowmore 1964    Black Bowmore 1964
NULL                  Unknown
Laphroaig 15 years    Laphroaig 15 years

See presentation „Fehlertolerante Ladeprozesse in Oracle“ (DOAG BI 2012)

DEFAULT Extensions in Oracle 12c

DEFAULT can be a sequence (finally!)

DEFAULT ON NULL: the default value is also used when NULL is inserted explicitly

CREATE TABLE dwh_whisky
(dwh_id      NUMBER(8)    DEFAULT seq_whisky.NEXTVAL
,whisky_code VARCHAR2(8)  NOT NULL
,whisky_name VARCHAR2(40) NOT NULL
,price       NUMBER(6,2)  DEFAULT ON NULL 0
,age         VARCHAR2(3)  DEFAULT ON NULL '< 7'
,distillery  VARCHAR2(30) DEFAULT ON NULL 'Unknown Distillery'
,region      VARCHAR2(30) DEFAULT ON NULL 'Unknown Region');
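
A short test shows the effect of both extensions. The script repeats the table definition from the slide so that it runs on its own. The sequence definition and the inserted rows (codes 'BOW64' and 'EDR10') are assumptions for illustration.

```sql
-- The sequence referenced in the DEFAULT clause must exist before the table is created.
CREATE SEQUENCE seq_whisky;

CREATE TABLE dwh_whisky
(dwh_id      NUMBER(8)    DEFAULT seq_whisky.NEXTVAL
,whisky_code VARCHAR2(8)  NOT NULL
,whisky_name VARCHAR2(40) NOT NULL
,price       NUMBER(6,2)  DEFAULT ON NULL 0
,age         VARCHAR2(3)  DEFAULT ON NULL '< 7'
,distillery  VARCHAR2(30) DEFAULT ON NULL 'Unknown Distillery'
,region      VARCHAR2(30) DEFAULT ON NULL 'Unknown Region');

-- Explicit NULLs: DEFAULT ON NULL replaces them with the default values.
INSERT INTO dwh_whisky (whisky_code, whisky_name, price, age, distillery, region)
VALUES ('BOW64', 'Black Bowmore 1964', NULL, NULL, NULL, NULL);

-- Omitted columns: the defaults are used as in earlier releases.
INSERT INTO dwh_whisky (whisky_code, whisky_name)
VALUES ('EDR10', 'Edradour 10 years');

COMMIT;

-- Both rows should show price 0, age '< 7', 'Unknown Distillery' and 'Unknown Region',
-- and a DWH_ID taken from the sequence.
SELECT dwh_id, whisky_name, price, age, distillery, region
  FROM dwh_whisky
 ORDER BY dwh_id;
```

With this approach, the NVL calls in the load statement are no longer needed for these columns, because the table definition takes care of the replacement.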

IDENTITY Columns

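Automatic generation of sequence numbers („auto increment column“)

A sequence is created in the background (ISEQ$$_nnnnn)

CREATE TABLE dwh_whisky
(dwh_id      NUMBER(8) GENERATED BY DEFAULT AS IDENTITY
,whisky_code VARCHAR2(8) NOT NULL
,whisky_name VARCHAR2(40) NOT NULL
,...)

Variants:

– GENERATED ALWAYS AS IDENTITY: the value is always generated, explicit values are rejected

– GENERATED BY DEFAULT AS IDENTITY: the value is generated only if the column is omitted

– GENERATED BY DEFAULT ON NULL AS IDENTITY: the value is also generated if NULL is inserted explicitly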
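
The following sketch shows an identity column in use. The table name dwh_whisky_id and the inserted row are assumptions for illustration (the slide elides the remaining columns of DWH_WHISKY). The dictionary view USER_TAB_IDENTITY_COLS lists the identity columns of the current schema.

```sql
-- Hypothetical table with a GENERATED ALWAYS identity column.
CREATE TABLE dwh_whisky_id
(dwh_id      NUMBER(8) GENERATED ALWAYS AS IDENTITY
,whisky_code VARCHAR2(8)  NOT NULL
,whisky_name VARCHAR2(40) NOT NULL);

-- DWH_ID is omitted and filled from the background sequence.
INSERT INTO dwh_whisky_id (whisky_code, whisky_name)
VALUES ('EDR10', 'Edradour 10 years');

COMMIT;

-- With GENERATED ALWAYS, an explicit value is rejected (ORA-32795), for example:
-- INSERT INTO dwh_whisky_id (dwh_id, whisky_code, whisky_name)
-- VALUES (99, 'GLF105', 'Glenfarclas 105');

SELECT dwh_id, whisky_code, whisky_name
  FROM dwh_whisky_id;

-- Identity columns and their generation type (ALWAYS or BY DEFAULT).
SELECT table_name, column_name, generation_type
  FROM user_tab_identity_cols
 WHERE table_name = 'DWH_WHISKY_ID';
```

For ETL processes that sometimes have to load predefined key values (for example singleton rows), GENERATED BY DEFAULT is probably the more practical variant.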

APPROX_COUNT_DISTINCT

Function APPROX_COUNT_DISTINCT

New function APPROX_COUNT_DISTINCT

Number of distinct customers in SALES table:

SELECT COUNT(DISTINCT cust_id) FROM sales

Approximate number of distinct customers in SALES table:

SELECT APPROX_COUNT_DISTINCT(cust_id) FROM sales

Same algorithm as for AUTO_SAMPLE_SIZE in DBMS_STATS (Oracle 11g)

Faster than COUNT(DISTINCT) for large data volumes

Result is only approximate (+/- 4%)
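
Both functions can be compared in a single query. The sketch below uses a generated test table instead of the SALES table from the slide. The table name sales_demo, the row count and the number of distinct customers are assumptions for illustration. On such a small table no performance difference is to be expected, so the script only shows how the results can be put side by side.

```sql
-- Hypothetical test table: 100000 rows with 5000 distinct customer IDs.
CREATE TABLE sales_demo AS
SELECT MOD(LEVEL, 5000) + 1 AS cust_id
     , LEVEL                AS amount
  FROM dual
CONNECT BY LEVEL <= 100000;

-- Exact and approximate number of distinct customers side by side.
SELECT COUNT(DISTINCT cust_id)        AS exact_cnt
     , APPROX_COUNT_DISTINCT(cust_id) AS approx_cnt
  FROM sales_demo;
```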

APPROX_COUNT_DISTINCT: Performance

See blog https://antognini.ch/2014/10/the-approx_count_distinct-function-a-test-case/

APPROX_COUNT_DISTINCT: Accuracy

See blog https://antognini.ch/2014/10/the-approx_count_distinct-function-a-test-case/

Asynchronous Global Index Maintenance

Global Indexes in Data Warehouse

DROP / TRUNCATE PARTITION sets global indexes to UNUSABLE

Index rebuild required

Problematic for rolling history windows (dropping of outdated partitions)

Whenever possible, avoid global indexes in Data Warehouses

Asynchronous Update of Global Indexes

In Oracle 12c, global indexes can be updated asynchronously:

Step 1: DROP or TRUNCATE PARTITION

ALTER TABLE sales DROP PARTITION p_2016_02 UPDATE INDEXES

Index is still usable, but contains „orphaned entries“

SELECT index_name, status, orphaned_entries
  FROM user_indexes
 WHERE index_name = 'SALES_PK'

INDEX_NAME     STATUS   ORPHANED_ENTRIES
-------------- -------- ----------------
SALES_PK       VALID    YES

Asynchronous Update of Global Indexes

Step 2: Delete orphaned entries from index with one of these alternatives:

– Scheduler Job SYS.PMO_DEFERRED_GIDX_MAINT_JOB

– dbms_part.cleanup_gidx

– ALTER INDEX REBUILD [PARTITION]

– ALTER INDEX [PARTITION] COALESCE CLEANUP

dbms_part.cleanup_gidx(schema_name_in => USER,
                       table_name_in  => 'SALES')
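
Both steps can be reproduced with a small test case. The table sales_demo, its columns and the generated rows are assumptions for illustration and are not part of the slides. The primary key creates a global (non-partitioned) unique index, which is the kind of index that is affected.

```sql
-- Hypothetical range-partitioned table with a global primary key index.
CREATE TABLE sales_demo
(sale_id   NUMBER CONSTRAINT sales_demo_pk PRIMARY KEY
,sale_date DATE NOT NULL
,amount    NUMBER(10,2))
PARTITION BY RANGE (sale_date)
(PARTITION p_2016_01 VALUES LESS THAN (DATE '2016-02-01')
,PARTITION p_2016_02 VALUES LESS THAN (DATE '2016-03-01')
,PARTITION p_2016_03 VALUES LESS THAN (DATE '2016-04-01'));

-- Generated rows spread over January to March 2016.
INSERT INTO sales_demo
SELECT LEVEL, DATE '2016-01-01' + MOD(LEVEL, 90), LEVEL
  FROM dual
CONNECT BY LEVEL <= 9000;

COMMIT;

-- Step 1: drop a partition; the global index stays usable.
ALTER TABLE sales_demo DROP PARTITION p_2016_02 UPDATE INDEXES;

-- ORPHANED_ENTRIES should now show YES for the global index.
SELECT index_name, status, orphaned_entries
  FROM user_indexes
 WHERE index_name = 'SALES_DEMO_PK';

-- Step 2: remove the orphaned entries manually instead of waiting for the scheduler job.
BEGIN
  dbms_part.cleanup_gidx(schema_name_in => USER,
                         table_name_in  => 'SALES_DEMO');
END;
/

-- Alternative for a single index:
-- ALTER INDEX sales_demo_pk COALESCE CLEANUP;

-- After the cleanup, ORPHANED_ENTRIES should be NO again.
SELECT index_name, status, orphaned_entries
  FROM user_indexes
 WHERE index_name = 'SALES_DEMO_PK';
```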

Do Orphans Have to Be Deleted at All?

Yes

Orphans stay in index and are not overwritten

Index size increases because space is not released

Exception: unique indexes, if same value is inserted again

See Richard Foote's Oracle Blog

https://richardfoote.wordpress.com/2013/08/02/12c-asynchronous-global-index-maintenance-part-i-where-are-we-now/

https://richardfoote.wordpress.com/2013/08/06/12c-asynchronous-global-index-maintenance-part-ii-the-space-between/

https://richardfoote.wordpress.com/2013/08/07/12c-asynchronous-global-index-maintenance-part-iii-re-makere-model/

Partial Indexing

Partial Indexing – Use Cases

Selective queries are more frequent on current partitions

Index on newest partition should be created after loading data

Partitioning on status values with very different cardinalities

Creating Table with Partial Indexing

Default on table level: INDEXING ON / OFF

Can be redefined for individual partitions

CREATE TABLE t_part (n NUMBER, name VARCHAR2(40))
INDEXING OFF
PARTITION BY RANGE (n)
(PARTITION p1 VALUES LESS THAN (100)
,PARTITION p2 VALUES LESS THAN (200) INDEXING OFF
,PARTITION p3 VALUES LESS THAN (300)
,PARTITION p4 VALUES LESS THAN (400)
,PARTITION p5 VALUES LESS THAN (500) INDEXING ON
,PARTITION p6 VALUES LESS THAN (600) INDEXING ON
);
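
The indexing attributes can be checked in the data dictionary and changed later. The following sketch assumes that the table T_PART from the slide exists in the current schema. Switching partition P4 to INDEXING ON is only an example of a later change.

```sql
-- Default indexing attribute on table level.
SELECT table_name, def_indexing
  FROM user_part_tables
 WHERE table_name = 'T_PART';

-- Indexing attribute per partition (P1 to P4 should show OFF, P5 and P6 ON).
SELECT partition_name, indexing
  FROM user_tab_partitions
 WHERE table_name = 'T_PART'
 ORDER BY partition_position;

-- The attribute can be changed for an individual partition.
ALTER TABLE t_part MODIFY PARTITION p4 INDEXING ON;
```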

Creating Partial Indexes

Global Partial Index

CREATE INDEX idx_part_global
ON t_part (name)
[GLOBAL] INDEXING PARTIAL

Local Partial Index

CREATE INDEX idx_part_local
ON t_part (name)
LOCAL INDEXING PARTIAL
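
A small test case shows how partial indexes appear in the data dictionary. The table t_part_demo and the index names are assumptions for illustration. The two indexes are created on different columns here, so that both can exist as visible indexes at the same time.

```sql
-- Hypothetical table: indexing is switched on for partition P3 only.
CREATE TABLE t_part_demo (n NUMBER, name VARCHAR2(40))
INDEXING OFF
PARTITION BY RANGE (n)
(PARTITION p1 VALUES LESS THAN (100)
,PARTITION p2 VALUES LESS THAN (200)
,PARTITION p3 VALUES LESS THAN (300) INDEXING ON);

-- Local partial index on NAME, global partial index on N.
CREATE INDEX idx_demo_local  ON t_part_demo (name) LOCAL  INDEXING PARTIAL;
CREATE INDEX idx_demo_global ON t_part_demo (n)    GLOBAL INDEXING PARTIAL;

-- Both indexes should be listed with INDEXING = PARTIAL.
SELECT index_name, partitioned, indexing
  FROM user_indexes
 WHERE table_name = 'T_PART_DEMO';

-- Local index partitions of INDEXING OFF table partitions should be UNUSABLE,
-- the index partition for P3 should be USABLE.
SELECT index_name, partition_name, status
  FROM user_ind_partitions
 WHERE index_name = 'IDX_DEMO_LOCAL'
 ORDER BY partition_position;
```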

Partial Local Index

[Figure: table partitions P1 to P6, each with its own local index partition P1 to P6; legend: INDEXING OFF / INDEXING ON]

Partial Global Index

[Figure: table partitions P1 to P6 with one global index; legend: INDEXING OFF / INDEXING ON]

Partial Indexes on Star Schema

Typical index strategy on Star Schema

(Local) bitmap index on each dimension key

No global index for primary key

What happens if the bitmap indexes are created as Partial Indexes?

Does Star Transformation work in combination with Partial Indexes?

See blog post „Partial Indexes Trilogy“

https://danischnider.wordpress.com/2016/06/28/partial-indexes-part-1-local-partial-indexes/

https://danischnider.wordpress.com/2016/07/08/partial-indexes-trilogy-part-2-global-partial-indexes/

https://danischnider.wordpress.com/2016/07/21/partial-indexes-trilogy-part-3-queries-on-partial-indexes/

Thank you.

Dani Schnider

Principal Consultant

Tel. +41 58 459 50 81
