www.semantec.de Oracle 8i/9i features which support Data Warehousing Author: Krasen Paskalev Certified Oracle DBA Semantec GmbH. D-71083 Herrenberg

www.semantec.de

Oracle 8i/9i features which support Data Warehousing

Author: Krasen Paskalev

Certified Oracle DBA

Semantec GmbH.

D-71083 Herrenberg

Agenda

• ETL Features

• Data Warehouse Management

• Data Warehouse Querying

• Parallel Operations

Agenda

• ETL (Extraction, Transformation, Transportation and Loading)

– Transportable Tablespaces
– External Tables
– Table Functions
– MERGE Statement

• Data Warehouse Management
• Data Warehouse Querying
• Parallel Operations

Transportable tablespaces

• The fastest method for moving data between databases

• The tablespaces, with all their data, are plugged into the data warehouse database

[Figure: a tablespace is copied by ftp from the Production database to the Data Warehouse database]
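The plug-in procedure can be sketched roughly as follows. This is an illustration under stated assumptions, not taken from the slides: the tablespace name SALES_TS, the file names and the paths are hypothetical, and the exp/imp commands are shown only as comments because they run at the operating system prompt. EXECUTE is a SQL*Plus command. Restrictions on platform, block size and character set depend on the release.

```sql
-- 1. On the production database: check that the tablespace is self-contained.
EXECUTE DBMS_TTS.TRANSPORT_SET_CHECK('SALES_TS', TRUE);
SELECT * FROM transport_set_violations;   -- no rows: the set can be transported

-- 2. Make the tablespace read-only so that its datafiles are consistent.
ALTER TABLESPACE sales_ts READ ONLY;

-- 3. Export only the metadata (OS command, connected AS SYSDBA):
--      exp transport_tablespace=y tablespaces=sales_ts file=sales_ts.dmp
-- 4. Copy sales_ts.dmp and the datafile to the data warehouse host,
--    for example by ftp in binary mode.
-- 5. Plug the tablespace into the data warehouse database (OS command):
--      imp transport_tablespace=y datafiles='/u01/oradata/dwh/sales_ts01.dbf' file=sales_ts.dmp

-- 6. If the data has to be changed again, switch back to read-write.
ALTER TABLESPACE sales_ts READ WRITE;
```

Only the dictionary metadata passes through exp/imp; the data itself travels as a plain file copy, which is where the speed comes from.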

External Tables

• Can be directly queried and joined in SQL, PL/SQL and Java

• Avoid data staging
• One step loading and transformation
• Save DB space

[Figure: external files (ASCII file, Excel sheet) are exposed as read-only virtual tables]
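A minimal sketch of an external table over a comma-separated text file, assuming a hypothetical directory /data/incoming, a file new_sales.csv with two columns, and a target table sales (id, amount). None of these names come from the slides. The ORACLE_LOADER driver reads text files, so a spreadsheet would presumably have to be saved as delimited text first.

```sql
-- Directory object pointing at the location of the flat file (hypothetical path).
CREATE OR REPLACE DIRECTORY ext_dir AS '/data/incoming';

-- The table definition is metadata only; the rows stay in the file.
CREATE TABLE sales_ext (
  id      NUMBER,
  amount  NUMBER
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY ext_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
    MISSING FIELD VALUES ARE NULL
  )
  LOCATION ('new_sales.csv')
)
REJECT LIMIT UNLIMITED;

-- Load and transform in one step, without a staging table.
INSERT /*+ APPEND */ INTO sales (id, amount)
SELECT id, ROUND(amount)
FROM   sales_ext
WHERE  amount IS NOT NULL;

COMMIT;
```

Because the external table is read-only, it can appear in queries and joins but not as the target of DML.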

Table Functions

• Can take a set of rows as input
• Can return a set of rows as output
• Can be used in the FROM clause
• Can be parallelized
• Can be pipelined
• User-defined in PL/SQL, Java or C

[Figure: the Sales table is passed through a table function, which returns rows of (Region, %): West 30, Central 50, East 20]
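A table function of the kind pictured might look like the sketch below. It assumes a hypothetical sales table with columns region and amount, and a non-zero overall total; the type and function names are invented for illustration. The function has to read all input rows before it can compute a percentage, so only the output side is streamed here.

```sql
-- Row and collection types describing the function's output.
CREATE TYPE region_share_t AS OBJECT (region VARCHAR2(20), pct NUMBER);
/
CREATE TYPE region_share_tab AS TABLE OF region_share_t;
/

-- Takes a set of rows (region, amount) as input and returns a set of rows.
CREATE OR REPLACE FUNCTION region_shares (p_sales IN SYS_REFCURSOR)
  RETURN region_share_tab PIPELINED
IS
  TYPE name_list   IS TABLE OF VARCHAR2(20);
  TYPE amount_list IS TABLE OF NUMBER;
  v_regions name_list;
  v_amounts amount_list;
  v_total   NUMBER := 0;
BEGIN
  FETCH p_sales BULK COLLECT INTO v_regions, v_amounts;
  CLOSE p_sales;

  -- First pass: overall total (assumed to be non-zero).
  FOR i IN 1 .. v_regions.COUNT LOOP
    v_total := v_total + v_amounts(i);
  END LOOP;

  -- Second pass: emit one output row per region.
  FOR i IN 1 .. v_regions.COUNT LOOP
    PIPE ROW (region_share_t(v_regions(i), ROUND(100 * v_amounts(i) / v_total)));
  END LOOP;
  RETURN;
END;
/

-- Used in the FROM clause like an ordinary table.
SELECT region, pct
FROM   TABLE(region_shares(CURSOR(
         SELECT region, SUM(amount) FROM sales GROUP BY region)));
```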

Table Functions

• Pipelining Data Transformation

[Figure: rows flow from Source through two table functions (Step 1, Step 2) to Target; a Log table is also shown]

MERGE statement

new_sales (source)
id amount
4 2000
7 3000
8 5000

sales (before MERGE)
id amount
4 3000
8 1000
9 2000

sales (after MERGE)
id amount
4 5000 (UPDATE)
7 3000 (INSERT)
8 6000 (UPDATE)
9 2000

MERGE INTO sales s
USING new_sales n
ON (s.id = n.id)
WHEN MATCHED THEN
  UPDATE SET s.amount = s.amount + n.amount
WHEN NOT MATCHED THEN
  INSERT (s.id, s.amount)
  VALUES (n.id, n.amount)

MERGE Advantages

• Single simple SQL statement

• Can be parallelized

• Can use Bulk DML

• Fewer scans of the base table
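For comparison, here is a sketch of what the same upsert looks like without MERGE, using the sales and new_sales tables (id, amount) from the MERGE slide and assuming that id is unique in new_sales. The degree of parallelism in the last statement is an arbitrary example value.

```sql
-- Without MERGE: two statements, each probing both tables.
-- The UPDATE must run first; otherwise freshly inserted rows would be updated too.
UPDATE sales s
SET    s.amount = s.amount +
                  (SELECT n.amount FROM new_sales n WHERE n.id = s.id)
WHERE  EXISTS (SELECT 1 FROM new_sales n WHERE n.id = s.id);

INSERT INTO sales (id, amount)
SELECT n.id, n.amount
FROM   new_sales n
WHERE  NOT EXISTS (SELECT 1 FROM sales s WHERE s.id = n.id);

ROLLBACK;  -- undo the two-statement variant before showing the MERGE form

-- With MERGE: one statement, which can also run as parallel DML.
ALTER SESSION ENABLE PARALLEL DML;

MERGE /*+ PARALLEL(s, 4) */ INTO sales s
USING new_sales n
ON (s.id = n.id)
WHEN MATCHED THEN
  UPDATE SET s.amount = s.amount + n.amount
WHEN NOT MATCHED THEN
  INSERT (s.id, s.amount)
  VALUES (n.id, n.amount);

COMMIT;
```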

More ETL Features

• Direct-path Interface
– SQL*Loader
– CREATE TABLE ... AS SELECT
– INSERT
– Oracle Call Interface

• Multi-table INSERTs
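Both features can be sketched in a few statements. The example assumes the new_sales (id, amount) table from the MERGE slide; the target tables big_sales, small_sales, rejected_sales and sales_history are hypothetical, are assumed to exist with matching columns, and the 5000 threshold is arbitrary.

```sql
-- Multi-table INSERT: one scan of the source routes each row to one target.
-- INSERT FIRST stops at the first WHEN clause that matches.
INSERT FIRST
  WHEN amount >= 5000 THEN
    INTO big_sales      (id, amount) VALUES (id, amount)
  WHEN amount > 0 THEN
    INTO small_sales    (id, amount) VALUES (id, amount)
  ELSE
    INTO rejected_sales (id, amount) VALUES (id, amount)
SELECT id, amount
FROM   new_sales;

COMMIT;

-- Direct-path INSERT: the APPEND hint writes formatted blocks above the
-- high-water mark instead of going through the buffer cache.
INSERT /*+ APPEND */ INTO sales_history (id, amount)
SELECT id, amount
FROM   new_sales;

COMMIT;  -- required before the session can read sales_history again

-- CREATE TABLE ... AS SELECT also loads through the direct path.
CREATE TABLE sales_copy NOLOGGING AS
SELECT id, amount FROM new_sales;
```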

Agenda

• ETL Features

• Data Warehouse Management
– Partitioning
– Materialized Views
– DBMS_STATS

• Data Warehouse Querying

• Parallel Operations

Partitioning

[Figure: table Sales partitioned by month, each partition in its own tablespace: Jan 2002 in Tablespace 0102, Feb 2002 in Tablespace 0202, ..., Dec 2002 in Tablespace 1202]

Advantages of Partitioning

• Partition independence
– LOAD, MOVE, Purge and DROP partitions
– MERGE, SPLIT, EXCHANGE partitions
– BACKUP, RESTORE, SET READ ONLY

• Partition elimination
– SELECT or JOIN only the partitions needed

• Parallel Operations
– SELECT, UPDATE, DELETE, MERGE
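A short sketch of partition-level maintenance and partition elimination. The table sales_p, its columns and the partition names are hypothetical; the staging row stands in for a real monthly load.

```sql
-- Hypothetical monthly range-partitioned fact table.
CREATE TABLE sales_p (
  id        NUMBER,
  sale_date DATE,
  amount    NUMBER
)
PARTITION BY RANGE (sale_date) (
  PARTITION p_2002_01 VALUES LESS THAN (TO_DATE('2002-02-01', 'YYYY-MM-DD')),
  PARTITION p_2002_02 VALUES LESS THAN (TO_DATE('2002-03-01', 'YYYY-MM-DD'))
);

-- Add an empty partition for the next month.
ALTER TABLE sales_p ADD PARTITION p_2002_03
  VALUES LESS THAN (TO_DATE('2002-04-01', 'YYYY-MM-DD'));

-- Load the new month into a staging table, then swap it in.
-- The exchange is a dictionary operation; no rows are copied.
CREATE TABLE sales_stage AS SELECT * FROM sales_p WHERE 1 = 0;
INSERT INTO sales_stage VALUES (1, TO_DATE('2002-03-15', 'YYYY-MM-DD'), 1000);
COMMIT;
ALTER TABLE sales_p EXCHANGE PARTITION p_2002_03 WITH TABLE sales_stage;

-- Purge the oldest month without touching the other partitions.
ALTER TABLE sales_p DROP PARTITION p_2002_01;

-- Partition elimination: the date predicate restricts the scan to p_2002_02.
SELECT SUM(amount)
FROM   sales_p
WHERE  sale_date >= TO_DATE('2002-02-01', 'YYYY-MM-DD')
AND    sale_date <  TO_DATE('2002-03-01', 'YYYY-MM-DD');
```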

Partitioning Methods

• Hash Partitioning
– Even row distribution by hash function

• Range Partitioning
– <01.01.2002 | <01.02.2002 | ... | <01.01.2003

• List Partitioning
– Stuttgart, Munich | Mannheim, Frankfurt | ...
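The three methods correspond to three PARTITION BY clauses. The sketch below uses hypothetical table and column names; the boundary dates and city lists follow the slide, and the number of hash partitions is arbitrary.

```sql
-- Range partitioning: each partition holds the rows below its upper bound.
CREATE TABLE sales_by_range (id NUMBER, sale_date DATE, amount NUMBER)
PARTITION BY RANGE (sale_date) (
  PARTITION p_before_2002 VALUES LESS THAN (TO_DATE('01.01.2002', 'DD.MM.YYYY')),
  PARTITION p_2002_01     VALUES LESS THAN (TO_DATE('01.02.2002', 'DD.MM.YYYY')),
  PARTITION p_rest        VALUES LESS THAN (MAXVALUE)
);

-- List partitioning: each partition holds an explicit list of values.
-- A row with a city that is not listed is rejected.
CREATE TABLE sales_by_list (id NUMBER, city VARCHAR2(30), amount NUMBER)
PARTITION BY LIST (city) (
  PARTITION p_south VALUES ('Stuttgart', 'Munich'),
  PARTITION p_west  VALUES ('Mannheim', 'Frankfurt')
);

-- Hash partitioning: rows are spread evenly by a hash of the key.
CREATE TABLE sales_by_hash (id NUMBER, cust_id NUMBER, amount NUMBER)
PARTITION BY HASH (cust_id) PARTITIONS 4;
```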

Table Compression

• Stores tables or partitions in compressed format

• Reduces disk space requirements
• Reduces memory requirements
• Speeds up query execution
• Speeds up backup and recovery
• Very efficient for highly redundant data – the FACT table
• 2 to 4 times compression is usual
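As a rough sketch, compression is requested with the COMPRESS keyword. The table names below are hypothetical, and the feature is assumed to be available in the release in use (it arrived with Oracle9i Release 2). Blocks are compressed only when they are written by bulk operations such as CREATE TABLE ... AS SELECT, direct-path loads or ALTER TABLE ... MOVE.

```sql
-- Build a compressed copy of an existing table.
CREATE TABLE sales_history COMPRESS AS
SELECT * FROM sales;

-- Compress an existing table in place by rebuilding its segment.
-- Indexes on the table become unusable and have to be rebuilt afterwards.
ALTER TABLE sales MOVE COMPRESS;
```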

Materialized Views

[Figure: materialized view revenue_sum (region, month, revenue) derived from table sales (region, month, invc_sum, ...)]

SELECT region, month,
       SUM(invc_sum) revenue
FROM sales
GROUP BY region, month

Advantages of Materialized Views

• Improved query/reporting performance for:
– Summaries
– Aggregates
– Joins

• Fast Refresh
– Data change tracking
– Partition change tracking

• No application change needed – their usage is automatic
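A sketch of how the revenue_sum example could be defined with fast refresh and query rewrite. The sales table and its columns come from the Materialized Views slide; the log options and the extra COUNT columns are what fast refresh of an aggregate view is generally understood to require, so treat them as assumptions to check against the documentation for the release in use.

```sql
-- Data change tracking: the log records changes to the base table.
CREATE MATERIALIZED VIEW LOG ON sales
  WITH ROWID, SEQUENCE (region, month, invc_sum)
  INCLUDING NEW VALUES;

-- The summary itself, refreshed incrementally and eligible for query rewrite.
CREATE MATERIALIZED VIEW revenue_sum
  BUILD IMMEDIATE
  REFRESH FAST ON DEMAND
  ENABLE QUERY REWRITE
AS
SELECT region, month,
       SUM(invc_sum)   AS revenue,
       COUNT(invc_sum) AS cnt_invc_sum,
       COUNT(*)        AS cnt_rows
FROM   sales
GROUP  BY region, month;

-- Apply only the logged changes ('F' = fast refresh). EXECUTE is a SQL*Plus command.
EXECUTE DBMS_MVIEW.REFRESH('REVENUE_SUM', 'F');

-- With rewrite enabled, a query against sales can be answered from revenue_sum
-- without the application naming the materialized view.
ALTER SESSION SET query_rewrite_enabled = TRUE;

SELECT region, SUM(invc_sum)
FROM   sales
GROUP  BY region;
```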

DBMS_STATS

• New package for gathering table and index statistics

• Gathers statistics in parallel

• Can export and import statistics

[Figure: Statistics transferred between the Production Data Warehouse and the Development Data Warehouse]
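A sketch of the three capabilities listed above, assuming a table SALES in the current schema. The statistics table name DW_STATS, the 10 percent sample and the degree of 4 are illustrative values only.

```sql
-- Gather table and index statistics in parallel.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => USER,
    tabname          => 'SALES',
    estimate_percent => 10,
    degree           => 4,
    cascade          => TRUE);   -- include the table's indexes
END;
/

-- Export: copy the statistics from the dictionary into a user table.
BEGIN
  DBMS_STATS.CREATE_STAT_TABLE(ownname => USER, stattab => 'DW_STATS');
  DBMS_STATS.EXPORT_TABLE_STATS(ownname => USER, tabname => 'SALES',
                                stattab => 'DW_STATS');
END;
/

-- Move DW_STATS to the other database (for example with exp/imp), then
-- import: load the statistics from the user table into that dictionary.
BEGIN
  DBMS_STATS.IMPORT_TABLE_STATS(ownname => USER, tabname => 'SALES',
                                stattab => 'DW_STATS');
END;
/
```

With imported statistics, the optimizer on a small development database can produce the execution plans it would choose against production-sized data.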

More Data Warehouse Management Features

• Index-organized tables

• Online index rebuild

• Online table rebuild

Agenda

• ETL Features
• Data Warehouse Management
• Data Warehouse Querying

– Bitmap Indexing

– Star Query Transformation

– Aggregation – ROLLUP, CUBE, Grouping Sets

– Analytic functions

• Parallel Operations

Bitmap Indexes

Region  east  central  west  NULL
rowid   1     0        0     0
rowid   0     0        1     0
...     0     0        0     1
rowid   0     1        0     0

[Figure: bitmaps combined with Boolean operators: 1000 OR 0100 AND NOT(0010) = 1100]

Advantages of Bitmap Indexes

• Reduced response time for ad hoc queries
• Use much less space than a B-tree index
• Dramatic performance gains for a large class of queries:
– Multiple AND, OR and NOT conditions
– IS NULL conditions
– COUNT
– NOT IN – Bitmap MINUS
– BETWEEN – Bitmap UNION
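The query classes above can be illustrated with a small sketch. It assumes a hypothetical customers table with two low-cardinality columns, region and gender; neither the table layout nor the index names come from the slides.

```sql
-- One bitmap per distinct value; NULLs are indexed as well.
CREATE BITMAP INDEX cust_region_bix ON customers (region);
CREATE BITMAP INDEX cust_gender_bix ON customers (gender);

-- AND / OR conditions: the bitmaps of both indexes can be combined
-- before any table row is visited, and COUNT needs no table access at all.
SELECT COUNT(*)
FROM   customers
WHERE  region IN ('east', 'west')
AND    gender = 'F';

-- IS NULL can use the bitmap index, unlike a single-column B-tree index.
SELECT COUNT(*)
FROM   customers
WHERE  region IS NULL;
```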

Star Query Transformation

• The query is rewritten for efficient execution

[Figure: star schema with fact table sales (cust_id, prod_id, q_id, amount) and dimension tables customers (cust_id, name), products (prod_id, name), quarters (q_id, name)]

• Steps:
1. Filter all dimensions
2. Combine the bitmap indexes of the fact table's foreign keys
3. Retrieve the fact rows and the remaining dimension rows
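The steps above can be sketched against the star schema from the slide. The index names and the filter values 'Widget' and 'Q1' are hypothetical, and the second query only illustrates the idea of the rewrite; it is not the optimizer's literal output.

```sql
-- Bitmap indexes on the foreign key columns of the fact table.
CREATE BITMAP INDEX sales_cust_bix ON sales (cust_id);
CREATE BITMAP INDEX sales_prod_bix ON sales (prod_id);
CREATE BITMAP INDEX sales_q_bix    ON sales (q_id);

-- Allow the cost-based optimizer to consider the transformation.
ALTER SESSION SET star_transformation_enabled = TRUE;

-- A typical star query: filters on dimensions, aggregation over the fact table.
SELECT c.name AS customer, SUM(s.amount) AS total
FROM   sales s, customers c, products p, quarters q
WHERE  s.cust_id = c.cust_id
AND    s.prod_id = p.prod_id
AND    s.q_id    = q.q_id
AND    p.name    = 'Widget'
AND    q.name    = 'Q1'
GROUP  BY c.name;

-- Conceptually, steps 1 and 2 restrict the fact table as if it were queried
-- like this: each subquery filters one dimension, and the resulting key sets
-- are matched through the bitmap indexes and combined with AND.
SELECT s.cust_id, s.amount
FROM   sales s
WHERE  s.prod_id IN (SELECT prod_id FROM products WHERE name = 'Widget')
AND    s.q_id    IN (SELECT q_id    FROM quarters WHERE name = 'Q1');
```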

Aggregation Operators

• Oracle extends the GROUP BY clause with:
– ROLLUP
– CUBE
– Grouping Sets

SELECT SUM(amount)
FROM sales
GROUP BY country, quarter

        UK     US     Total
Q1      1000   3000   4000
Q2      1500   5000   6500
Total   2500   8000   10500

ROLLUP and CUBE

ROLLUP(country, department, quarter)

(country, department, quarter)

(country, department)

(country)

() - Grand Total

CUBE(country, department, quarter)

(country, department, quarter)

(country, department)

(country, quarter)

(department, quarter)

(country)

(department)

(quarter)

() - Grand Total

ROLLUP – subtotals at increasing levels of aggregation, from right to left: n+1 groupings for n columns

CUBE – subtotals on all combinations: 2^n groupings for n columns
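A sketch of the three GROUP BY extensions on a two-column case, assuming a sales table with columns country, quarter and amount as in the aggregation example. With n = 2, ROLLUP produces 3 groupings and CUBE produces 4.

```sql
-- ROLLUP: (country, quarter), (country), () = n+1 groupings.
SELECT country, quarter, SUM(amount) AS total
FROM   sales
GROUP  BY ROLLUP (country, quarter);

-- CUBE: adds (quarter) as well = 2^n groupings, i.e. the full cross-tab
-- with row totals, column totals and the grand total.
-- GROUPING() returns 1 where a column has been aggregated away, which
-- distinguishes subtotal rows from rows whose value is really NULL.
SELECT country, quarter, SUM(amount) AS total,
       GROUPING(country) AS g_country,
       GROUPING(quarter) AS g_quarter
FROM   sales
GROUP  BY CUBE (country, quarter);

-- GROUPING SETS: only the groupings that are explicitly listed,
-- here the per-country and per-quarter totals without the detail rows.
SELECT country, quarter, SUM(amount) AS total
FROM   sales
GROUP  BY GROUPING SETS ((country), (quarter));
```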

Aggregation Operators Advantages

• Applicable to many aggregate functions:
– SUM, AVG, COUNT
– MIN, MAX
– STDDEV, VARIANCE

• Flexible aggregation groups and levels

• Runs in parallel

Analytic functions

• Significantly improved performance for complex reports such as:
– Ranking – Find the top 10 sales in each region
– Moving aggregates – What is the 90-day moving sales average?
– Period-over-period comparison – What are the revenues for January 2002 compared to January 2001?
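The ranking and period-over-period questions might be written as in the sketch below. It assumes a hypothetical sales table with columns region, id and amount, and a monthly summary revenue_sum (region, month, revenue) with exactly one row per region and month and no missing months.

```sql
-- Ranking: top 10 sales in each region.
-- RANK leaves gaps after ties, so more than 10 rows can come back per region.
SELECT region, id, amount
FROM  (SELECT region, id, amount,
              RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk
       FROM   sales)
WHERE  rnk <= 10;

-- Period-over-period: each month next to the same month one year earlier.
-- LAG(revenue, 12) looks 12 rows back within the region, which equals
-- 12 months only under the no-gaps assumption stated above.
SELECT region, month, revenue,
       LAG(revenue, 12) OVER (PARTITION BY region ORDER BY month) AS revenue_prev_year
FROM   revenue_sum;
```

Without analytic functions, both reports would need self-joins or correlated subqueries over the same data.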

Example – Moving Window

SELECT c.cust_id, t.month,
       SUM(amount_sold) SALES,
       AVG(SUM(amount_sold))
         OVER (ORDER BY c.cust_id, t.month ROWS 2 PRECEDING) MOV_3_MONTH
FROM sales s, times t, customers c
WHERE s.time_id = t.time_id AND
      s.cust_id = c.cust_id AND
      t.year = 1999 AND
      c.cust_id IN (6380)
GROUP BY c.cust_id, t.month
ORDER BY c.cust_id, t.month;

CUST_ID MONTH   SALES   MOV_3_MONTH
------- ------- ------- -----------
   6380 1999-01  19,642      19,642
   6380 1999-02  19,324      19,483
   6380 1999-03  21,655      20,207
   6380 1999-04  27,091      22,690
   6380 1999-05  16,367      21,704
   6380 1999-06  24,755      22,738

More Data Warehouse Querying Features

• Function-based Indexes
• Optimizer Plan Stability
• Statistics for Long Running Operations
• Resumable Statements
• Full Outer Join
• WITH Operator
• Oracle Text

“Advanced Searching with Oracle Text”

14.11.2002, 2nd conference day

11:50-12:30, conference room EG (ground floor)

Agenda

• ETL Features

• Data Warehouse Management

• Data Warehouse Querying

• Parallel Operations

Parallel Operations

• Dramatically reduce execution time of data intensive operations

• Loading
– Direct Path Load

• DDL Statements
– CREATE TABLE ... AS SELECT, CREATE INDEX
– REBUILD INDEX, REBUILD INDEX PARTITION
– MOVE, SPLIT, COALESCE PARTITION

• DML Statements
– INSERT ... SELECT
– UPDATE, DELETE and MERGE
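A few of the operations above, sketched with an arbitrary degree of parallelism of 4. The sales table with an id column is assumed from the MERGE slide; sales_copy and the index name are hypothetical, and the SQL*Loader command appears only as a comment because it runs at the operating system prompt.

```sql
-- Loading (OS command): parallel direct-path load with SQL*Loader.
--   sqlldr control=sales.ctl direct=true parallel=true

-- DDL: parallel CREATE TABLE ... AS SELECT and CREATE INDEX.
CREATE TABLE sales_copy PARALLEL 4 NOLOGGING AS
SELECT * FROM sales;

CREATE INDEX sales_copy_id_idx ON sales_copy (id) PARALLEL 4 NOLOGGING;

ALTER INDEX sales_copy_id_idx REBUILD PARALLEL 4;

-- DML: must be enabled per session before a parallel INSERT, UPDATE,
-- DELETE or MERGE is attempted.
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND PARALLEL(sales_copy, 4) */ INTO sales_copy
SELECT /*+ PARALLEL(s, 4) */ * FROM sales s;

COMMIT;  -- required before sales_copy can be queried in this session

-- Query: parallel full scan and aggregation.
SELECT /*+ PARALLEL(s, 4) */ COUNT(*)
FROM   sales s;
```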

Parallel Operations

• Access methods
– Table and index range and full scans

• Join methods
– Nested loops, Sort merge, Hash, Star transformation

• SQL operations
– GROUP BY, ROLLUP, CUBE
– DISTINCT, UNION, UNION ALL
– Aggregate functions

Parallel System Requirements

• Symmetric Multiprocessor Systems, Clusters or Massively Parallel Systems

• Sufficient I/O Bandwidth

• Sufficient (Underutilized) CPUs

• Sufficient Memory

Summary

• Effective handling of multi-terabyte Data Warehouses

• Rich feature set for all Data Warehouse operations

• Flexible aggregation and analytical features for high performance queries

• Effective parallelism

Want to know more?

Company: Semantec GmbH.
Name: Krasen Paskalev, Armin Singer, Peter Kopecki
Address: Benzstr. 32, D-71083 Herrenberg, Germany
Telephone: +49(7032)9130-0
Telephone: +49(7032)9130-12
Fax: +49(7032)9130-22
E-Mail: [email protected], [email protected]
Internet: www.semantec.de

Meet us here -> booth 2C on the ground floor