Query Rewrite: The Supreme League of Materialized Views

Dani Schnider, Trivadis AG
danischnider.wordpress.com · @dani_schnider


Post on 19-Jun-2020


Dani Schnider

• Senior Principal Consultant at Trivadis AG in Glattbrugg/Zurich
• Trainer of several Trivadis courses
• Co-author of the books “Data Warehousing mit Oracle” and “Data Warehouse Blueprints”
• Oracle ACE Director

@dani_schnider · danischnider.wordpress.com

Usage of Materialized Views

1. Who uses Materialized Views with Query Rewrite?
2. Who uses Materialized Views, but accesses them directly?
3. Who does not use Materialized Views at all?

My Expectation

Concept of Query Rewrite

[Diagram: SQL Query, Optimizer, Query Rewrite; base tables SALES, TIMES, PRODUCTS; materialized view MV_PRODUCT_MONTH_SALES]

Example: Create Materialized View

CREATE MATERIALIZED VIEW mv_product_month_sales
ENABLE QUERY REWRITE
AS
SELECT t.calendar_month_desc
     , p.prod_name
     , SUM(s.amount_sold)
  FROM sales s, times t, products p
 WHERE t.time_id = s.time_id
   AND p.prod_id = s.prod_id
 GROUP BY t.calendar_month_desc, p.prod_name

[Diagram: base tables SALES, TIMES, PRODUCTS]

Example: Query Rewrite
Basic Query Rewrite with Full Text Match

SELECT t.calendar_month_desc, p.prod_name, SUM(s.amount_sold)
  FROM sales s, times t, products p
 WHERE t.time_id = s.time_id
   AND p.prod_id = s.prod_id
 GROUP BY t.calendar_month_desc, p.prod_name

--------------------------------------------------------------
| Id | Operation                    | Name                   |
--------------------------------------------------------------
|  0 | SELECT STATEMENT             |                        |
|  1 |  MAT_VIEW REWRITE ACCESS FULL| MV_PRODUCT_MONTH_SALES |
--------------------------------------------------------------

[Diagram: base tables SALES, TIMES, PRODUCTS; materialized view MV_PRODUCT_MONTH_SALES]
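One way to check whether a query is rewritten is to display its execution plan. The following is a minimal sketch, assuming the example tables and the materialized view above exist in the current schema and that QUERY_REWRITE_ENABLED is not set to FALSE. A rewritten query shows a MAT_VIEW REWRITE ACCESS operation in the plan.

```sql
-- Generate the plan without executing the query
EXPLAIN PLAN FOR
SELECT t.calendar_month_desc, p.prod_name, SUM(s.amount_sold)
  FROM sales s, times t, products p
 WHERE t.time_id = s.time_id
   AND p.prod_id = s.prod_id
 GROUP BY t.calendar_month_desc, p.prod_name;

-- Show only Id, Operation and Name, as on the slide
SELECT * FROM TABLE(dbms_xplan.display(format => 'BASIC'));
```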

Configuration Parameters for Query Rewrite

QUERY_REWRITE_ENABLED

  FALSE            No query rewrite is used
  TRUE             Query rewrite is used if costs are lower (default since Oracle 10g)
  FORCE            Query rewrite is used regardless of costs

QUERY_REWRITE_INTEGRITY

  ENFORCED         Oracle enforces and guarantees consistency and integrity.
  TRUSTED          Oracle allows rewrites using relationships that have been declared, but that are not enforced by Oracle.
  STALE_TOLERATED  Oracle allows rewrites using unenforced relationships. Materialized views are eligible for rewrite even if they are stale.
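Both parameters can be set per session, which is convenient for testing. A short sketch follows; reading V$PARAMETER assumes the user has the required privilege on that view.

```sql
-- Show the current settings
SELECT name, value
  FROM v$parameter
 WHERE name IN ('query_rewrite_enabled', 'query_rewrite_integrity');

-- Enable cost-based query rewrite for this session
ALTER SESSION SET query_rewrite_enabled = TRUE;

-- Allow rewrites based on declared but not enforced relationships
ALTER SESSION SET query_rewrite_integrity = TRUSTED;
```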

Types of Query Rewrite

Text Match Rewrite
• Full Text Match
• Partial Text Match

General Rewrite
• Join Back
• Aggregate Computability
• Aggregate Rollup
• Rollup Using a Dimension
• Materialized Views with Only a Subset of Data
• Partition Change Tracking (PCT) Rewrite
• Rewrite Using Multiple Materialized Views

Advanced Query Rewrite

Materialized View on 3 Tables

CREATE MATERIALIZED VIEW mv_product_month_sales
ENABLE QUERY REWRITE
AS
SELECT t.calendar_month_desc
     , p.prod_id
     , s.channel_id
     , SUM(s.amount_sold)
     , COUNT(s.amount_sold)
     , COUNT(*)
  FROM sales s, times t, products p
 WHERE t.time_id = s.time_id
   AND p.prod_id = s.prod_id
 GROUP BY t.calendar_month_desc, p.prod_id, s.channel_id

[Diagram: base tables SALES, TIMES, PRODUCTS]

Join Back

• Additional columns in query (not part of materialized view)
• Join between materialized view and dimension required

SELECT t.calendar_month_desc, p.prod_name, p.prod_subcategory, p.prod_category
     , SUM(s.amount_sold) AS amount_sold
  FROM sales s
  JOIN times t ON t.time_id = s.time_id
  JOIN products p ON p.prod_id = s.prod_id
 GROUP BY t.calendar_month_desc, p.prod_name, p.prod_subcategory, p.prod_category

[Diagram: PRODUCTS joined to MV_PRODUCT_MONTH_SALES]
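Conceptually, a join back is what one would write by hand when querying the materialized view directly. The sketch below is an illustration, not the optimizer's actual rewritten text. It assumes the aggregate column of the materialized view was given the alias sum_amount, which is a hypothetical name (Oracle requires a column alias for expressions in the select list of a materialized view).

```sql
-- Hand-written equivalent of the join back:
-- the product attributes are fetched from PRODUCTS via prod_id,
-- and the monthly sums are re-aggregated across channel_id.
SELECT mv.calendar_month_desc
     , p.prod_name
     , p.prod_subcategory
     , p.prod_category
     , SUM(mv.sum_amount) AS amount_sold   -- sum_amount: assumed alias
  FROM mv_product_month_sales mv
  JOIN products p ON p.prod_id = mv.prod_id
 GROUP BY mv.calendar_month_desc
        , p.prod_name
        , p.prod_subcategory
        , p.prod_category;
```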

Aggregate Computability

CREATE MATERIALIZED VIEW mv_product_month_sales
ENABLE QUERY REWRITE
AS
SELECT t.calendar_month_desc
     , p.prod_id
     , s.channel_id
     , SUM(s.amount_sold)
     , COUNT(s.amount_sold)
     , COUNT(*)
  FROM sales s, times t, products p
 WHERE t.time_id = s.time_id
   AND p.prod_id = s.prod_id
 GROUP BY t.calendar_month_desc, p.prod_id, s.channel_id

SELECT t.calendar_month_desc, p.prod_name
     , AVG(s.amount_sold) AS average_amount
  FROM sales s
  JOIN times t ON t.time_id = s.time_id
  JOIN products p ON p.prod_id = s.prod_id
 GROUP BY t.calendar_month_desc, p.prod_name

• The materialized view contains SUM(n) and COUNT(n), the query needs AVG(n)
• The average can be calculated: AVG(n) = SUM(n) / COUNT(n)
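Because the materialized view is stored at a finer level (per channel) than the query needs, the average has to be computed from the summed sums and the summed counts, not by averaging stored averages. A sketch of the equivalent hand-written query follows; the aliases sum_amount and cnt_amount are hypothetical names for the SUM(s.amount_sold) and COUNT(s.amount_sold) columns of the materialized view.

```sql
-- AVG(n) derived from the pre-aggregated SUM(n) and COUNT(n)
SELECT mv.calendar_month_desc
     , p.prod_name
     , SUM(mv.sum_amount)
       / NULLIF(SUM(mv.cnt_amount), 0) AS average_amount  -- NULL instead of division by zero
  FROM mv_product_month_sales mv
  JOIN products p ON p.prod_id = mv.prod_id
 GROUP BY mv.calendar_month_desc, p.prod_name;
```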

Materialized View Delta Join

• Not all joins of materialized view are part of the query
• Only possible for «lossless» joins (no data is lost through join)

SELECT p.prod_name, SUM(s.amount_sold) AS amount_sold
  FROM sales s
  JOIN products p ON p.prod_id = s.prod_id
 GROUP BY p.prod_name

[Diagram: MV_PRODUCT_MONTH_SALES with base tables SALES, TIMES, PRODUCTS]

Query Delta Join

• Join with additional table that does not appear in the materialized view
• Materialized view must contain the join key

SELECT t.calendar_month_desc, p.prod_name, c.channel_desc
     , SUM(s.amount_sold) AS amount_sold
  FROM sales s
  JOIN times t ON t.time_id = s.time_id
  JOIN products p ON p.prod_id = s.prod_id
  JOIN channels c ON c.channel_id = s.channel_id
 GROUP BY t.calendar_month_desc, p.prod_name, c.channel_desc

[Diagram: CHANNELS joined to MV_PRODUCT_MONTH_SALES]
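The query delta join works here because the materialized view keeps channel_id as a grouping column, so CHANNELS can be joined afterwards. As a rough illustration (again with the hypothetical alias sum_amount for the aggregated column), the query corresponds to:

```sql
-- CHANNELS is not part of the materialized view,
-- but the join key channel_id is, so the join can be added on top.
SELECT mv.calendar_month_desc
     , p.prod_name
     , c.channel_desc
     , SUM(mv.sum_amount) AS amount_sold   -- sum_amount: assumed alias
  FROM mv_product_month_sales mv
  JOIN products p ON p.prod_id = mv.prod_id
  JOIN channels c ON c.channel_id = mv.channel_id
 GROUP BY mv.calendar_month_desc, p.prod_name, c.channel_desc;
```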

Query Rewrite with Dimensions

Rollup Using a Dimension

CREATE MATERIALIZED VIEW mv_product_month_sales
ENABLE QUERY REWRITE
AS
SELECT t.calendar_month_desc
     , p.prod_id
     , s.channel_id
     , SUM(s.amount_sold)
     , COUNT(s.amount_sold)
     , COUNT(*)
  FROM sales s, times t, products p
 WHERE t.time_id = s.time_id
   AND p.prod_id = s.prod_id
 GROUP BY t.calendar_month_desc, p.prod_id, s.channel_id

SELECT t.calendar_year, p.prod_id, SUM(s.amount_sold)
  FROM sales s
  JOIN times t ON t.time_id = s.time_id
  JOIN products p ON p.prod_id = s.prod_id
 GROUP BY t.calendar_year, p.prod_id

• The materialized view aggregates data at the monthly level
• The query aggregates data per quarter or per year

What are Dimensions?

DIMENSION ≠ Dimension Table

A DIMENSION is
• A kind of “Hierarchy Constraint” (only declarative, data is not checked)
• Additional metadata for the optimizer about hierarchical relationships within a dimension table

A DIMENSION contains
• All hierarchy levels of a dimension
• One or more hierarchy definitions
• List of all attributes assigned to each level

Example: Create Dimension

CREATE DIMENSION times_dim
  LEVEL day     IS TIMES.TIME_ID
  LEVEL month   IS TIMES.CALENDAR_MONTH_DESC
  LEVEL quarter IS TIMES.CALENDAR_QUARTER_DESC
  LEVEL year    IS TIMES.CALENDAR_YEAR
  HIERARCHY cal_rollup (
    day     CHILD OF
    month   CHILD OF
    quarter CHILD OF
    year)
  ATTRIBUTE day DETERMINES
    (day_number_in_week, day_name, day_number_in_month, calendar_week_number)
  ATTRIBUTE month DETERMINES
    (calendar_month_desc, calendar_month_number, calendar_month_name, …)
  ATTRIBUTE quarter DETERMINES
    (calendar_quarter_desc, calendar_quarter_number, days_in_cal_quarter, …)
  ATTRIBUTE year DETERMINES
    (calendar_year, days_in_cal_year, end_of_cal_year)
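Since a dimension is purely declarative, it can be worth checking that the data in TIMES really follows the declared hierarchy. One possible approach is DBMS_DIMENSION.VALIDATE_DIMENSION. The sketch below assumes the exceptions table DIMENSION_EXCEPTIONS has already been created (Oracle ships the script utldim.sql for this); the statement id 'times_dim_check' is an arbitrary label chosen for this example.

```sql
BEGIN
  dbms_dimension.validate_dimension(
    dimension    => 'TIMES_DIM',
    incremental  => FALSE,             -- check all rows, not only new ones
    check_nulls  => TRUE,              -- also report NULLs in level columns
    statement_id => 'times_dim_check'  -- arbitrary label for this run
  );
END;
/

-- No rows returned means no violations of the declared hierarchy were found
SELECT *
  FROM dimension_exceptions
 WHERE statement_id = 'times_dim_check';
```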

Using Dimensions for Query Rewrite

• The optimizer can use dimensions to derive hierarchical relationships (e.g. “a month belongs to a quarter, and a quarter belongs to a year”)
• Precondition: QUERY_REWRITE_INTEGRITY = TRUSTED

------------------------------------------------------------------
| Id | Operation                        | Name                   |
------------------------------------------------------------------
|  0 | SELECT STATEMENT                 |                        |
|  1 |  HASH GROUP BY                   |                        |
|* 2 |   HASH JOIN                      |                        |
|  3 |    VIEW                          |                        |
|  4 |     HASH UNIQUE                  |                        |
|  5 |      TABLE ACCESS FULL           | TIMES                  |
|  6 |    MAT_VIEW REWRITE ACCESS FULL  | MV_PRODUCT_MONTH_SALES |
------------------------------------------------------------------

2 - access("from$_subquery$_008"."CALENDAR_MONTH_DESC"="MV_PRODUCT_MONTH_SALES"."CALENDAR_MONTH_DESC")
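The plan can be read as follows: the distinct month-to-year mapping is derived from TIMES (HASH UNIQUE), joined to the materialized view on the month, and then aggregated per year. A hand-written query with the same shape might look like the sketch below. It is an illustration, not the optimizer's rewritten text, and sum_amount is a hypothetical alias for the aggregated column of the materialized view.

```sql
SELECT t.calendar_year
     , mv.prod_id
     , SUM(mv.sum_amount) AS amount_sold   -- sum_amount: assumed alias
  FROM mv_product_month_sales mv
  JOIN (SELECT DISTINCT calendar_month_desc, calendar_year
          FROM times) t                    -- one row per month
    ON t.calendar_month_desc = mv.calendar_month_desc
 GROUP BY t.calendar_year, mv.prod_id;
```

This is only correct if every month belongs to exactly one year, which is precisely what the dimension declares and why TRUSTED integrity is required.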

Query Rewrite on Stale Materialized Views

Stale Materialized Views

What happens after data changes?
• If data is changed in at least one base table, all related Materialized Views become STALE
• By default, Query Rewrite does not work on stale Materialized Views

Solutions:
• Refresh Materialized Views after data changes
    dbms_mview.refresh('MV_PRODUCT_MONTH_SALES');
• Allow Query Rewrite on stale Materialized Views
    ALTER SESSION SET QUERY_REWRITE_INTEGRITY = STALE_TOLERATED;
• Use Real-time Materialized Views (≥ Oracle 12.2)
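Whether a materialized view is currently stale can be looked up in the data dictionary. The following sketch checks the state and then runs a complete refresh; whether a complete ('C') or fast ('F') refresh is appropriate depends on the situation.

```sql
-- STALENESS shows e.g. FRESH, STALE or NEEDS_COMPILE
SELECT mview_name, staleness, last_refresh_type, last_refresh_date
  FROM user_mviews
 WHERE mview_name = 'MV_PRODUCT_MONTH_SALES';

-- Complete refresh of the materialized view
BEGIN
  dbms_mview.refresh(list => 'MV_PRODUCT_MONTH_SALES', method => 'C');
END;
/
```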

Real-Time Materialized Views (Oracle 12.2)

• Using materialized view for query rewrite even though not fully synchronized
• Possible with materialized view logs on queried tables
• Requirement: fast refresh must be possible
• Not possible when using ON COMMIT refresh logic
• Queries directly addressing the materialized view need the FRESH_MV hint to get the current data

CREATE MATERIALIZED VIEW mv_rt_prod_sales
REFRESH FAST ON DEMAND
ENABLE QUERY REWRITE
ENABLE ON QUERY COMPUTATION
AS
SELECT p.prod_name
     , SUM(s.amount_sold)
     , COUNT(s.amount_sold)
     , COUNT(*)
  FROM sales s, products p
 WHERE p.prod_id = s.prod_id
 GROUP BY p.prod_name
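The materialized view logs mentioned above have to exist before the real-time materialized view is created. A sketch of what they might look like for this example follows; the column lists are an assumption derived from the columns used in the materialized view, and the exact log options needed for fast refresh can be verified with DBMS_MVIEW.EXPLAIN_MVIEW.

```sql
-- Materialized view logs on both base tables (create these first)
CREATE MATERIALIZED VIEW LOG ON sales
  WITH ROWID, SEQUENCE (prod_id, amount_sold)
  INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW LOG ON products
  WITH ROWID, SEQUENCE (prod_id, prod_name)
  INCLUDING NEW VALUES;

-- Direct access: without the hint the stored (possibly stale) rows are returned,
-- with the hint the pending changes from the logs are applied on the fly.
SELECT /*+ FRESH_MV */ *
  FROM mv_rt_prod_sales;
```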

Query Rewrite with COUNT(DISTINCT)

COUNT(DISTINCT) Challenge

COUNT(DISTINCT) on different hierarchy levels or different dimensions
• COUNT(DISTINCT) cannot be aggregated
• A separate materialized view is required for each level

SELECT t.calendar_month_desc, COUNT(DISTINCT s.prod_id)
  FROM sales s, times t
 WHERE t.time_id = s.time_id
 GROUP BY t.calendar_month_desc
 ORDER BY t.calendar_month_desc

SELECT t.calendar_quarter_desc, COUNT(DISTINCT s.prod_id)
  FROM sales s, times t
 WHERE t.time_id = s.time_id
 GROUP BY t.calendar_quarter_desc
 ORDER BY t.calendar_quarter_desc

SELECT t.calendar_year, COUNT(DISTINCT s.prod_id)
  FROM sales s, times t
 WHERE t.time_id = s.time_id
 GROUP BY t.calendar_year
 ORDER BY t.calendar_year

[Diagram: hierarchy levels Year, Quarter, Month]

Bitmap-based COUNT(DISTINCT) Functions (Oracle 19c)

  BITMAP_BIT_POSITION   Mapping of a number to the absolute bit position in a bitmap.
  BITMAP_BUCKET_NUMBER  Bucket number within the bitmap. Each bucket contains 16000 numbers.
  BITMAP_CONSTRUCT_AGG  Bitmap array of aggregated numbers. For each existing number, the bit at the corresponding position is set.
  BITMAP_COUNT          Counts the bits set to “1” in a bitmap array. The input parameter is the result of the function BITMAP_CONSTRUCT_AGG.
  BITMAP_OR_AGG         Aggregation of bitmap arrays of multiple rows. The input parameter is the result of the function BITMAP_CONSTRUCT_AGG. The result is a bitmap array that combines all bitmaps with an OR predicate.

Materialized View for COUNT(DISTINCT) (Oracle 19c)

CREATE MATERIALIZED VIEW mv_sales_distinct
ENABLE QUERY REWRITE
AS
SELECT t.calendar_month_desc
     , t.calendar_quarter_desc
     , t.calendar_year
     , BITMAP_BUCKET_NUMBER(s.prod_id)
     , BITMAP_CONSTRUCT_AGG(BITMAP_BIT_POSITION(s.prod_id))
  FROM sales s, times t
 WHERE t.time_id = s.time_id
 GROUP BY t.calendar_month_desc
        , t.calendar_quarter_desc
        , t.calendar_year
        , BITMAP_BUCKET_NUMBER(s.prod_id)
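With such a materialized view, the three COUNT(DISTINCT) queries on month, quarter and year level can all be answered from one source. To illustrate how the bitmaps are combined, here is a sketch of a manual query on year level. It assumes the two bitmap expressions were given the hypothetical aliases bm_bucket and bm_details in the materialized view.

```sql
-- Step 1: OR the monthly bitmaps together per year and bucket, count the set bits.
-- Step 2: add up the counts of all buckets of a year.
SELECT calendar_year
     , SUM(cnt) AS distinct_products
  FROM (SELECT calendar_year
             , bm_bucket                                    -- assumed alias
             , BITMAP_COUNT(BITMAP_OR_AGG(bm_details)) AS cnt  -- bm_details: assumed alias
          FROM mv_sales_distinct
         GROUP BY calendar_year, bm_bucket)
 GROUP BY calendar_year
 ORDER BY calendar_year;
```

Summing across buckets is safe because each number falls into exactly one bucket, so no product is counted twice.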

Verify Query Rewrite Capabilities

DBMS_MVIEW.EXPLAIN_MVIEW

Explains what is possible with a specific materialized view
• Fast Refresh
• Query Rewrite
• Partition Change Tracking (PCT)

Results are written to table MV_CAPABILITIES_TABLE
• Script $ORACLE_HOME/rdbms/admin/utlxmv.sql to create table

dbms_mview.explain_mview('MV_PRODUCT_MONTH_SALES');
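A typical call sequence might look like the following sketch. It assumes MV_CAPABILITIES_TABLE has been created in the current schema with the script named above.

```sql
-- Remove results of earlier runs for this materialized view
DELETE FROM mv_capabilities_table
 WHERE mvname = 'MV_PRODUCT_MONTH_SALES';

BEGIN
  dbms_mview.explain_mview('MV_PRODUCT_MONTH_SALES');
END;
/

-- POSSIBLE = 'Y' or 'N'; MSGTXT explains why a capability is not available
SELECT capability_name, possible, related_text, msgtxt
  FROM mv_capabilities_table
 WHERE mvname = 'MV_PRODUCT_MONTH_SALES'
 ORDER BY seq;
```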

DBMS_MVIEW.EXPLAIN_REWRITE

Explains whether query rewrite is possible for a specific query
• If query rewrite failed, what is the reason
• If query rewrite works, which materialized views will be used

Results are written to table REWRITE_TABLE
• Script $ORACLE_HOME/rdbms/admin/utlxrw.sql to create table

dbms_mview.explain_rewrite(v_query);
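The call above needs the query text in a variable. A possible complete example is sketched below, assuming REWRITE_TABLE exists in the current schema; the statement id 'rw_demo' is an arbitrary label chosen here to find the result rows again.

```sql
DECLARE
  v_query VARCHAR2(4000) :=
    'SELECT t.calendar_month_desc, p.prod_name, SUM(s.amount_sold)
       FROM sales s, times t, products p
      WHERE t.time_id = s.time_id
        AND p.prod_id = s.prod_id
      GROUP BY t.calendar_month_desc, p.prod_name';
BEGIN
  dbms_mview.explain_rewrite(
    query        => v_query,
    mv           => 'MV_PRODUCT_MONTH_SALES',  -- restrict the check to this materialized view
    statement_id => 'rw_demo');
END;
/

-- One or more messages telling whether and why (not) the query was rewritten
SELECT message
  FROM rewrite_table
 WHERE statement_id = 'rw_demo'
 ORDER BY sequence;
```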

Design Tips for Query Rewrite

Design Tips for Query Rewrite (1)

Always use Oracle join syntax in materialized views
• User queries can use ANSI or Oracle join syntax
  https://danischnider.wordpress.com/2016/11/30/ansi-join-syntax-and-query-rewrite/

Use constraints on base tables
• Define referential integrity with primary/foreign key constraints
• Define foreign key columns as NOT NULL whenever possible

Additional tips for data warehouses
• Define foreign key constraints with RELY DISABLE NOVALIDATE
  https://danischnider.wordpress.com/2015/12/01/foreign-key-constraints-in-an-oracle-data-warehouse/
• Define dimension objects for dimension tables with hierarchies
• Set QUERY_REWRITE_INTEGRITY = TRUSTED
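As an illustration of the constraint tips, a declared but not enforced foreign key between SALES and PRODUCTS could be defined as sketched below. The constraint names products_pk and sales_products_fk are hypothetical; the primary key has to be marked RELY before a RELY foreign key can reference it.

```sql
-- The referenced primary key must be RELY (constraint name is an assumption)
ALTER TABLE products MODIFY CONSTRAINT products_pk RELY;

-- Declared relationship: not checked by Oracle, but usable for query rewrite
ALTER TABLE sales
  ADD CONSTRAINT sales_products_fk
  FOREIGN KEY (prod_id) REFERENCES products (prod_id)
  RELY DISABLE NOVALIDATE;

-- The optimizer only uses the unenforced constraint with TRUSTED integrity
ALTER SESSION SET query_rewrite_integrity = TRUSTED;
```

Together with a NOT NULL foreign key column, this lets the optimizer treat the join between SALES and PRODUCTS as lossless.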

Design Tips for Query Rewrite (2)

Refresh materialized views after data changes in a suitable way
• Complete Refresh or Fast Refresh – depending on the situation
  https://danischnider.wordpress.com/2019/02/18/materialized-view-refresh-for-dummies/
• Avoid QUERY_REWRITE_INTEGRITY = STALE_TOLERATED
• Use real-time materialized views, if required

Try to reduce the number of materialized views
• Create flexible materialized views that can be used for different queries
• As many materialized views as required, but as few as possible
• For queries with COUNT(DISTINCT), use the bitmap-based functions of Oracle 19c
  https://danischnider.wordpress.com/2019/04/20/bitmap-based-countdistinct-functions-in-oracle-19c/

Verify capabilities with DBMS_MVIEW procedures
• dbms_mview.explain_mview, dbms_mview.explain_rewrite

Design Tips for Query Rewrite (3)

Read the Oracle documentation
• Data Warehousing Guide, Part II Optimizing Data Warehouses
  https://docs.oracle.com/en/database/oracle/oracle-database/19/dwhsg/

RTFM: «read the fucking manual»