Upload: reed-mobbs

Post on 14-Dec-2015


Oracle Database 11g for Data Warehousing
Presenter’s Name
Presenter’s Title

Agenda

• Technology
• Monitoring
• Information Life-cycle Management (ILM)
• Oracle Optimized Warehouse Initiative
• Market

Technology

Parallel Execution

select c.cust_last_name, sum(s.amount_sold)
from customers c, sales s
where c.cust_id = s.cust_id
group by c.cust_last_name ;

Diagram: Data on Disk feeds three sets of Parallel Servers (Scanners, Joiners, Aggregators), which perform the scan, join, and aggregate steps under a Coordinator.
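One way to request parallel execution for the query on this slide is an optimizer hint. This is only a sketch: the degree of parallelism (4) is an assumption chosen for illustration, and the customers and sales tables are the ones named in the slide's query.

```sql
-- Hypothetical degree of parallelism (4); tune it to the hardware and workload.
-- Alternative (also an assumption): ALTER TABLE sales PARALLEL 4;
SELECT /*+ PARALLEL(c, 4) PARALLEL(s, 4) */
       c.cust_last_name,
       SUM(s.amount_sold) AS total_sold
FROM   customers c, sales s
WHERE  c.cust_id = s.cust_id
GROUP  BY c.cust_last_name;
```

With the hint in place, the optimizer may split the scan, join, and aggregation steps across parallel servers; the degree actually used can be lower than requested.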

Partitioning – Benefits

• Large Table: Difficult to Manage
• Partition: Divide and Conquer
  – Easier to Manage
  – Improve Performance
• Composite Partition: Better Performance
  – More flexibility to match business needs
• Transparent to applications

Diagram: the ORDERS table shown unpartitioned, partitioned by month (JAN, FEB), and composite-partitioned by month (JAN, FEB) and region (USA, EUROPE).

Partitioning in Oracle Database 11g – Interval Partitioning

Diagram: ORDERS and INVENTORY tables with monthly partitions (JAN, FEB, MAR, APR).

• Partitions are created automatically as data arrives
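A minimal sketch of how interval partitioning might be declared. The table name orders_iv, its columns, and the starting boundary are hypothetical; only the idea of monthly partitions created on demand comes from the slide.

```sql
-- Hypothetical table. One initial range partition is required;
-- after that, Oracle creates monthly partitions as rows arrive.
CREATE TABLE orders_iv (
  order_id    NUMBER       NOT NULL,
  order_date  DATE         NOT NULL,
  customer_id NUMBER,
  order_value NUMBER(10,2)
)
PARTITION BY RANGE (order_date)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
(
  PARTITION p_before_feb_2007 VALUES LESS THAN (DATE '2007-02-01')
);

-- A March row falls beyond the last defined partition,
-- so a partition covering March 2007 is created automatically.
INSERT INTO orders_iv VALUES (1, DATE '2007-03-15', 42, 99.50);
```

No partition has to be pre-created for future months, which removes a routine maintenance task from the load process.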

Partitioning in Oracle Database 11g – Complete Composite Partitioning

• Range – range
• List – list
• List – hash
• List – range

Examples on the ORDERS table:

RANGE-RANGE: Order Date (JAN, FEB) by Order Value (1000-5000, >5000)
LIST-RANGE: Region (USA, EUROPE) by Order Value (1000-5000, >5000)
LIST-LIST: Region (USA, EUROPE) by Customer Type (Gold, Silver)
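The RANGE-RANGE case (order date by order value) could be sketched as follows. The table name orders_rr, the column definitions, and the exact boundary values are assumptions made for illustration; the slide only names the two partitioning dimensions.

```sql
-- Hypothetical table: range partitions by month, range subpartitions by order value.
CREATE TABLE orders_rr (
  order_id    NUMBER       NOT NULL,
  order_date  DATE         NOT NULL,
  order_value NUMBER(10,2) NOT NULL
)
PARTITION BY RANGE (order_date)
SUBPARTITION BY RANGE (order_value)
SUBPARTITION TEMPLATE (
  -- Boundary is exclusive, so 5000.01 keeps values up to 5000.00 in this subpartition.
  SUBPARTITION sp_upto_5000 VALUES LESS THAN (5000.01),
  SUBPARTITION sp_over_5000 VALUES LESS THAN (MAXVALUE)
)
(
  PARTITION p_jan VALUES LESS THAN (DATE '2007-02-01'),
  PARTITION p_feb VALUES LESS THAN (DATE '2007-03-01')
);
```

The subpartition template applies the same value ranges to every month, so each new monthly partition is split the same way.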

Partitioning in Oracle Database 11g – Reference Partitioning

Diagram: ORDERS is partitioned by date (JAN, FEB, MAR, APR); the related tables Line Items, Pick Lists, Stock Holds, and Back Orders are shown partitioned the same way.

• Inherit partitioning strategy
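A minimal sketch of a child table inheriting its parent's partitioning through a foreign key. The table names (orders_ref, line_items_ref), columns, and constraint name are hypothetical.

```sql
-- Hypothetical parent table, range-partitioned by order date.
CREATE TABLE orders_ref (
  order_id   NUMBER NOT NULL PRIMARY KEY,
  order_date DATE   NOT NULL
)
PARTITION BY RANGE (order_date)
(
  PARTITION p_jan VALUES LESS THAN (DATE '2007-02-01'),
  PARTITION p_feb VALUES LESS THAN (DATE '2007-03-01')
);

-- Hypothetical child table. It has no date column of its own;
-- each row is placed in the partition that matches its parent order.
-- The foreign key column must be NOT NULL for reference partitioning.
CREATE TABLE line_items_ref (
  line_id  NUMBER NOT NULL PRIMARY KEY,
  order_id NUMBER NOT NULL,
  quantity NUMBER,
  CONSTRAINT line_items_ref_fk
    FOREIGN KEY (order_id) REFERENCES orders_ref (order_id)
)
PARTITION BY REFERENCE (line_items_ref_fk);
```

Because the child follows the parent, the partitioning key does not need to be duplicated into every child table.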

Partitioning in Oracle Database 11g – Virtual Column-Based Partitioning

ORDERS

ORDER_ID   ORDER_DATE  CUSTOMER_ID ...  REGION AS (SUBSTR(ORDER_ID,6,2))
---------- ----------- ----------- --   ------
9834-US-14 12-JAN-2007       65920      US
8300-EU-97 14-FEB-2007       39654      EU
3886-EU-02 16-JAN-2007        4529      EU
2566-US-94 19-JAN-2007       15327      US
3699-US-63 02-FEB-2007       18733      US

Diagram: ORDERS partitioned by month (JAN, FEB) and region (USA, EUROPE).

• REGION requires no storage
• Partition by ORDER_DATE, REGION
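The REGION expression on the slide could be declared as a virtual column and used as a subpartitioning key roughly as follows. The table name orders_vc, the column sizes, and the partition names are assumptions; the SUBSTR expression is the one shown on the slide.

```sql
-- Hypothetical table. REGION is derived from ORDER_ID and takes no storage.
CREATE TABLE orders_vc (
  order_id    VARCHAR2(10) NOT NULL,
  order_date  DATE         NOT NULL,
  customer_id NUMBER,
  region      AS (SUBSTR(order_id, 6, 2))
)
PARTITION BY RANGE (order_date)
SUBPARTITION BY LIST (region)
SUBPARTITION TEMPLATE (
  SUBPARTITION sp_usa    VALUES ('US'),
  SUBPARTITION sp_europe VALUES ('EU')
)
(
  PARTITION p_jan VALUES LESS THAN (DATE '2007-02-01'),
  PARTITION p_feb VALUES LESS THAN (DATE '2007-03-01')
);

-- The virtual column is never supplied on insert; list the real columns only.
INSERT INTO orders_vc (order_id, order_date, customer_id)
VALUES ('9834-US-14', DATE '2007-01-12', 65920);
```

The inserted row lands in the January partition's US subpartition because characters 6 and 7 of its ORDER_ID are 'US'.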

Compression

• Tables and indexes can be compressed
• Can be specified on a per-partition basis
• Typical compression ratio 3:1

• Requires more CPU to load data
• Decompression hardly costs resources
• Compress for all DML operations

• Less data on disk
• Requires less time to read

• Completely transparent

Up to 3X compression
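Per-partition compression might look like the sketch below. The table sales_hist and its columns are hypothetical. The plain COMPRESS clause shown here applies to direct-path loads; compression for all DML is requested with an extended clause whose spelling differs between 11g releases (COMPRESS FOR ALL OPERATIONS in 11.1, COMPRESS FOR OLTP in 11.2), so check the documentation for the release in use.

```sql
-- Hypothetical table: the older partition is compressed, the current one is not.
CREATE TABLE sales_hist (
  sale_id     NUMBER NOT NULL,
  sale_date   DATE   NOT NULL,
  amount_sold NUMBER(10,2)
)
PARTITION BY RANGE (sale_date)
(
  PARTITION p_2006 VALUES LESS THAN (DATE '2007-01-01') COMPRESS,
  PARTITION p_2007 VALUES LESS THAN (DATE '2008-01-01') NOCOMPRESS
);

-- Later, once the 2007 data stops changing, rebuild that partition compressed.
ALTER TABLE sales_hist MOVE PARTITION p_2007 COMPRESS;
```

Queries against sales_hist are unchanged in either state, which is what the slide means by "completely transparent".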

SQL Query Result Cache

• Store query results in cache
• Repetitive executions can use cached result

• Data Warehouse queries
• Long-running, IO-intensive
• Expensive computations
• Return few rows
• Excellent opportunity for SQL Query Result Cache

------------------------------------------------------------------
| Id  | Operation            | Name                       |
------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                            |
|   1 | RESULT CACHE         | fz6cm4jbpcwh48wcyk60m7qypu |
|   2 | SORT GROUP BY ROLLUP |                            |
|*  3 | HASH JOIN            |                            |
etc.
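A query shaped like the plan above (a hash join under a GROUP BY ROLLUP, with a RESULT CACHE step on top) could be written as the sketch below. It reuses the customers and sales tables from the parallel execution slide as an assumption; the slide does not show the statement behind the plan.

```sql
-- The RESULT_CACHE hint asks Oracle to store this query's result set
-- so that repeated executions can be answered from the cache.
SELECT /*+ RESULT_CACHE */
       c.cust_last_name,
       SUM(s.amount_sold) AS total_sold
FROM   customers c, sales s
WHERE  c.cust_id = s.cust_id
GROUP  BY ROLLUP (c.cust_last_name);
```

Whether the result is actually cached also depends on instance settings such as the memory assigned to the result cache, and cached results are invalidated when the underlying tables change.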

SQL Query Result Cache – Opportunity

• Retail customer data (~50 GB)
• Concurrent users submitting queries randomly
• Executive dashboard with 12 heavy analytical queries
• Cache results only at in-line view level
• 12 queries run in random, different order – 4 queries cached
• Measure average, total response time for all users

# Users  No cache  Cache  Improvement
-------  --------  -----  -----------
      8     447 s  334 s          25%
      4     267 s  201 s          25%
      2     186 s  141 s          24%
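Caching "at in-line view level" means the hint sits inside a subquery, so only that expensive block is cached and the outer query can vary. A sketch under stated assumptions: the customers and sales tables and the 10000 threshold are hypothetical and are not the benchmark's actual queries.

```sql
-- Only the aggregation in the in-line view is cached.
-- Different outer predicates can reuse the same cached block.
SELECT c.cust_last_name, v.total_sold
FROM   customers c,
       (SELECT /*+ RESULT_CACHE */
               cust_id,
               SUM(amount_sold) AS total_sold
        FROM   sales
        GROUP  BY cust_id) v
WHERE  c.cust_id = v.cust_id
AND    v.total_sold > 10000;
```

This fits a dashboard workload, where several queries share the same heavy aggregation but filter or join it differently.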

Other Performance Features – Transparent to Your Application

• Materialized Views
  – Transparent rewrites of expensive queries
  – Including rewrites on remote objects
  – Incremental automatic refresh

• Bitmap Indexes
  – Optimal storage
  – Ideal for star or star look-a-like schemas

• SQL Access Advisor – based on workload
  – Materialized view advice
  – Index advice
  – Partition advice
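The first two features could be sketched as follows, assuming a hypothetical non-partitioned SALES table with cust_id and amount_sold columns. All object names are illustrative.

```sql
-- A materialized view log records changes so the view can be refreshed incrementally.
CREATE MATERIALIZED VIEW LOG ON sales
  WITH ROWID, SEQUENCE (cust_id, amount_sold)
  INCLUDING NEW VALUES;

-- With query rewrite enabled, a query that sums amount_sold by cust_id
-- can be answered from this view without the application naming it.
-- The COUNT columns are included because fast refresh of SUM needs them.
CREATE MATERIALIZED VIEW sales_by_cust_mv
  BUILD IMMEDIATE
  REFRESH FAST ON DEMAND
  ENABLE QUERY REWRITE
AS
SELECT cust_id,
       SUM(amount_sold)   AS total_sold,
       COUNT(amount_sold) AS cnt_sold,
       COUNT(*)           AS cnt_rows
FROM   sales
GROUP  BY cust_id;

-- Bitmap index on a dimension key of the fact table.
-- Add the LOCAL keyword if SALES is partitioned.
CREATE BITMAP INDEX sales_cust_bix ON sales (cust_id);
```

Application SQL keeps referring to SALES; the optimizer decides when the materialized view or the bitmap index is the cheaper access path.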

SQL analytics

Bring Algorithms to the Data, Not Data to the Algorithms

• Analytic computations done in the database
  – SQL Analytics
  – OLAP
  – Data Mining
  – Statistics

• Scalability
• Security
• Backup & Recovery
• Simplicity
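As an illustration of an analytic computation done inside the database, the sketch below computes a running total and a ranking with SQL analytic functions. The table rep_quarter_revenue (one row per salesrep and quarter) is hypothetical.

```sql
-- Running revenue per sales rep and rank of each rep within a quarter,
-- computed in one pass in the database rather than in application code.
SELECT salesrep,
       quarter,
       revenue,
       SUM(revenue) OVER (PARTITION BY salesrep ORDER BY quarter) AS running_revenue,
       RANK()       OVER (PARTITION BY quarter  ORDER BY revenue DESC) AS rank_in_quarter
FROM   rep_quarter_revenue
ORDER  BY salesrep, quarter;
```

Only the finished result leaves the database, so the detail rows never have to be shipped to a separate analysis tool.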

Native Support for Pivot and Unpivot

select * from quarterly_sales
unpivot include nulls
(revenue for quarter in (q1,q2,q3,q4))
order by salesrep, quarter ;

QUARTERLY_SALES

  SALESREP    Q1    Q2    Q3    Q4
---------- ----- ----- ----- -----
       100   230   240   260   300
       101   200   220   250   260
       102   260   280   265   310

Result:

  SALESREP QU    REVENUE
---------- -- ----------
       100 Q1        230
       100 Q2        240
       100 Q3        260
       100 Q4        300
       101 Q1        200
       101 Q2        220
       101 Q3        250
       101 Q4        260
       102 Q1        260
       102 Q2        280
       102 Q3        265
       102 Q4        310

Native Support for Pivot and Unpivot

select * from sales_by_quarter
pivot (sum(revenue)
for quarter in ('Q1','Q2','Q3','Q4'))
order by salesrep ;

SALES_BY_QUARTER

  SALESREP QU    REVENUE
---------- -- ----------
       100 Q1        230
       100 Q2        240
       100 Q3        160
       100 Q4         90
       100 Q3        100
       100 Q4        140
       100 Q4         70
       101 Q1        200
       101 Q2        220
       101 Q3        250
       101 Q4        260
       102 Q1        260

Result:

  SALESREP  'Q1'  'Q2'  'Q3'  'Q4'
---------- ----- ----- ----- -----
       100   230   240   260   300
       101   200   220   250   260
       102   260   280   265   310

Transform Data Where Data Resides – In-database ETL technology

Stages: Extract, Load, Transform, Insert

• Change Data Capture
• External Tables
• SQL*Loader
• Data Pump
• Transportable Tablespaces
• Multi-Table Insert
• MERGE
• Distributed Queries
• Table Functions
• Partition Exchange Loading
• DML error logging
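Three of the features listed above (MERGE, DML error logging, and partition exchange loading) could be combined as in the sketch below. The tables sales_fact, sales_staging, and sales_feb_stage, the partition p_feb, and the tag 'nightly_load' are all hypothetical.

```sql
-- One-time setup: creates the error table ERR$_SALES_FACT for SALES_FACT.
BEGIN
  DBMS_ERRLOG.CREATE_ERROR_LOG(dml_table_name => 'SALES_FACT');
END;
/

-- Upsert staged rows. Rows that violate constraints are written to the
-- error table instead of failing the whole statement.
MERGE INTO sales_fact t
USING sales_staging s
ON (t.sale_id = s.sale_id)
WHEN MATCHED THEN
  UPDATE SET t.amount_sold = s.amount_sold
WHEN NOT MATCHED THEN
  INSERT (sale_id, sale_date, amount_sold)
  VALUES (s.sale_id, s.sale_date, s.amount_sold)
LOG ERRORS INTO err$_sales_fact ('nightly_load')
REJECT LIMIT UNLIMITED;

-- Partition exchange loading: swap a fully prepared staging table
-- into the fact table as a dictionary operation, without moving rows.
ALTER TABLE sales_fact
  EXCHANGE PARTITION p_feb WITH TABLE sales_feb_stage
  INCLUDING INDEXES WITHOUT VALIDATION;
```

WITHOUT VALIDATION skips the check that every staged row belongs in the partition, so it is only appropriate when the load process already guarantees that.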

Asynchronous Change Data Capture

• Capture changes from [redo | archive] logs
• No changes to source applications
• Minimal performance impact on source applications

• Store changes in change tables
• Provide (bulk) SQL interface to change data

Diagram labels: OLTP DB, Logfiles, Log Miner and Streams, Change Data, Transform (SQL, PL/SQL, Java), DW Tables, Read-consistent subscription, Capture PMOPs, Time-based subscription windows.

Oracle Database 11g

RAC – Scale Incrementally

Chart: Workload (100% to 300%) plotted against Months (3 to 24).

Automatic Storage Management

• Storage pool for database files
• Load-balanced across disks
• Capacity on demand
• Add/remove storage on-line
• Automatic IO load balancing
• Fault tolerant, high performance
• Automatically mirrors and stripes
• Low cost
• No IO tuning required
• No volume manager or file system needed
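Adding storage on-line could look like the sketch below, which would be run in the ASM instance. The disk group name dw_data, the device paths, and the rebalance power are placeholders chosen for illustration.

```sql
-- Hypothetical disk group with two-way mirroring across two disks.
CREATE DISKGROUP dw_data NORMAL REDUNDANCY
  DISK '/dev/raw/raw1', '/dev/raw/raw2';

-- Add a third disk while the database stays open.
-- ASM redistributes existing data across all disks in the background.
ALTER DISKGROUP dw_data
  ADD DISK '/dev/raw/raw3'
  REBALANCE POWER 4;
```

A higher REBALANCE POWER finishes the redistribution sooner at the cost of more I/O while it runs.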

Mixed Workloads

• Concurrent small data loads and queries
• Looks like... OLTP

• Oracle's read consistency
• Readers never block writers
• Writers never block readers
• Queries are always consistent and auditable
• No deadlocks
• Introduced in Oracle V4 (1982)
  – major improvements in V6 (1988)

Diagram: a report reads the Budget table while two updates change it; the Before Image held in the Rollback Segment gives an accurate report.

Database Resource Manager

• Protect the system pro-actively
• Maximum number of concurrent operations
• Priority-dependent maximum Degree Of Parallelism (DOP)

Priority  Workload        Users  Max DOP
--------  --------------  -----  -------
High      Sales Analysis     20       10
Medium    Ad Hoc Reports    200        4
Low       ETL Jobs          200        4
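The three priority classes above could be expressed as a resource plan along the lines of the sketch below. The plan and group names, the CPU percentages, and the active session limits are assumptions; only the DOP limits (10, 4, 4) come from the slide.

```sql
BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();

  DBMS_RESOURCE_MANAGER.CREATE_PLAN(
    plan    => 'DW_PLAN',
    comment => 'Hypothetical data warehouse plan');

  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(
    consumer_group => 'SALES_ANALYSIS', comment => 'High priority');
  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(
    consumer_group => 'ADHOC_REPORTS',  comment => 'Medium priority');
  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(
    consumer_group => 'ETL_JOBS',       comment => 'Low priority');

  -- mgmt_p1 shares and active session limits are illustrative values.
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    plan => 'DW_PLAN', group_or_subplan => 'SALES_ANALYSIS',
    comment => 'High priority, DOP up to 10',
    mgmt_p1 => 50, parallel_degree_limit_p1 => 10, active_sess_pool_p1 => 10);
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    plan => 'DW_PLAN', group_or_subplan => 'ADHOC_REPORTS',
    comment => 'Medium priority, DOP up to 4',
    mgmt_p1 => 30, parallel_degree_limit_p1 => 4, active_sess_pool_p1 => 20);
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    plan => 'DW_PLAN', group_or_subplan => 'ETL_JOBS',
    comment => 'Low priority, DOP up to 4',
    mgmt_p1 => 10, parallel_degree_limit_p1 => 4, active_sess_pool_p1 => 20);

  -- Every plan needs a directive for sessions not mapped to another group.
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    plan => 'DW_PLAN', group_or_subplan => 'OTHER_GROUPS',
    comment => 'Everything else', mgmt_p1 => 10);

  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/
```

The plan takes effect only after it is activated (for example through the RESOURCE_MANAGER_PLAN parameter) and users are mapped to the consumer groups.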

Oracle Database Security

Diagram labels: user groups Marketing, Finance, and Sales; security functions:

• Identity Management
• Authenticate
• Authorize
• Access Control
• Audit
• Protect data in transit
• Protect stored data

Feature Usage for Large-Scale Data Warehouses

Chart: share of surveyed warehouses using each feature (axis 0% to 100%). Features shown: DB Res Mgr, RMAN, ASM, Read Only, VPD, MV Use, Compression, Parallel Exec, Partitioning.

Source: TB Club Report: A survey of 30 multi-TB Oracle DW’s – data July 2006

Partitioning, parallelism, and compression are the foundation for large-scale data warehousing

Monitoring

I/O Monitoring – Database Control

Parallel Execution Monitoring – Database Control

Near Real-Time SQL Monitoring – Coming in Grid Control

Parallel SQL Monitoring – Coming in Grid Control

Information Life-cycle Management (ILM)

Information Lifecycle Management

“The policies, processes, practices, and tools used to align the business value of information with the most appropriate and cost effective IT infrastructure from the time information is conceived through its final disposition.”

Storage Networking Industry Association (SNIA) Data Management Forum

Diagram: data classes Active Data, Less Active Data, Historical Data.

Diagram: the Orders table is partitioned into Q1 Orders, Q2 Orders, Q3 Orders, Q4 Orders, and Older Orders, spread over three storage tiers:

• Active: High Performance Storage Tier
• Less Active: Low Cost Storage Tier
• Historical: Online Archive Storage Tier

Traditional Storage Approach – All data resides on a single storage tier

High Performance Storage Tier = $72 per GB

All data on active = $972,000!

Partitioning is the Foundation for ILM

• Partition data onto appropriate storage tier
• Move data onto appropriate storage tier
• Reduce storage costs accordingly

Data class   Share  Storage tier                   Cost per GB  Cost
-----------  -----  -----------------------------  -----------  -------
Active          5%  High Performance Storage Tier          $72  $49,800
Less Active    35%  Low cost Storage Tier                  $14  $67,700
Historical     60%  Read only Storage Tier                  $7  $58,000
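Moving partitions between tiers could be done with statements like the sketch below. The table orders_ilm, the partition names, and the tablespaces low_cost_ts and archive_ts (each assumed to sit on the matching storage tier) are hypothetical.

```sql
-- Move a less active quarter to the low-cost tier.
ALTER TABLE orders_ilm
  MOVE PARTITION orders_q1 TABLESPACE low_cost_ts
  UPDATE INDEXES;

-- Move historical data to the archive tier, then make that tier read only.
ALTER TABLE orders_ilm
  MOVE PARTITION orders_older TABLESPACE archive_ts
  UPDATE INDEXES;

ALTER TABLESPACE archive_ts READ ONLY;
```

Queries against orders_ilm are unaffected by where each partition is stored; only the storage cost and access speed change.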

Introduce Compression – Reduce storage costs across all tiers

Let’s use a compression factor of 3.

Cost Savings by Storage Tier

Data class   Share  Uncompressed  Compressed
-----------  -----  ------------  ----------
Active          5%       $49,800     $16,600
Less Active    35%       $67,700     $22,600
Historical     60%       $58,000     $19,400
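The totals implied by these slides can be checked with a small query. This is only arithmetic on the per-tier figures as printed on the slides; the figures themselves are not recomputed here.

```sql
-- Totals across the three tiers, compared with the single-tier cost of $972,000.
SELECT 49800 + 67700 + 58000                                   AS tiered_total,      -- 175500
       16600 + 22600 + 19400                                   AS compressed_total,  -- 58600
       ROUND(100 * (1 - (49800 + 67700 + 58000) / 972000), 1)  AS tiered_saving_pct,     -- about 81.9
       ROUND(100 * (1 - (16600 + 22600 + 19400) / 972000), 1)  AS compressed_saving_pct  -- about 94
FROM   dual;
```

Tiering alone brings the cost from $972,000 to $175,500, and tiering plus compression brings it to $58,600.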


Oracle Optimized Warehouse Initiative

Goals for Oracle data warehouse solutions:

• Provide superior system performance
• Provide a superior customer experience

Full Range of DW Solution Options

Diagram axis: from Flexibility to Pre-configured, Pre-installed, Validated.

Reference Configuration

• Documented best-practice configurations for data warehousing
• Benefits:
  – High performance
  – Simple to scale; modular building blocks
  – Industry-leading database and hardware
  – Available today with HP, IBM, Sun, EMC/Dell
• Database Options
• Management Packs

Custom

• Flexibility for the most demanding data warehouse
• Benefits:
  – High performance
  – Unlimited scalability
  – Completely customizable
  – Industry-leading database and hardware
• Database Options
• Management Packs

Optimized Warehouse

• Scalable systems pre-installed and pre-configured: ready to run out-of-the-box
• Benefits:
  – High performance
  – Simple to buy
  – Fast to implement
  – Easy to maintain
  – Competitively priced
• Partitioning
• RAC

Market

Data Warehouse Market

Vendor     Share
---------  -----
Oracle     39.8%
IBM        22.7%
Microsoft  16.0%
Teradata   11.4%
Other      10.1%

Source: IDC, 2006 - Worldwide Data Warehousing Tools 2005 Vendor Shares

Oracle is the Data Warehousing DBMS Market Leader

Leading Scalability – Wintercorp VLDB Survey

Source: http://www.wintercorp.com

2005 Survey

Company          DBMS        Size (TB)
---------------  ----------  ---------
Yahoo!           Oracle         100.39
AT&T             Daytona         93.88
KT-IT Group      DB2             49.40
AT&T             Daytona         26.71
LGR - Cingular   Oracle          25.20
Amazon.com       Oracle          24.77
Anonymous        DB2             19.65
UPSS             Microsoft       19.47
Amazon.com       Oracle          18.56
Nielsen Media    Sybase IQ       17.69

2003 Survey

Company          DBMS         Size (TB)
---------------  -----------  ---------
France Telecom   Oracle           29.23
AT&T             Proprietary      26.27
SBC              Teradata         24.81
Anonymous        DB2              16.19
Amazon.com       Oracle           13.00
Kmart            Teradata         12.59
Claria           Oracle           12.10
HIRA             Sybase IQ        11.94
FedEx            Teradata          9.98
Vodafone Gmbh    Teradata          9.91

1998 Survey

Company          DBMS        Size (TB)
---------------  ----------  ---------
Sears            Teradata         4.63
HCIA             Informix         4.50
Wal-Mart         Teradata         4.42
Tele Danmark     DB2              2.84
Citicorp         DB2              2.47
MCI              Informix         1.88
NDC Health       Oracle           1.85
Sprint           Teradata         1.30
Ford             Oracle           1.20
Acxiom           Oracle           1.13

Oracle DW 10+TB Customers (3/2006) – Various Platforms and Architectures

Customer       Size    Platform
-------------  ------  -----------
Acxiom         16 TB   HP
Allstate       15 TB   Sun (RAC)
Amazon         61 TB   HP (RAC)
Cellcom        14 TB   HP
CenturyTel     10 TB   HP
Chase          30 TB   IBM (RAC)
Choicepoint    14 TB   Sun
Claria         38 TB   Sun
Experian       14 TB   Sun
KTF            14 TB   HP
Cingular       25 TB   HP
Mastercard     20 TB   IBM (RAC)
NASDAQ         35 TB   Sun
NexTel         28 TB   HP
NYSE Group     15 TB   HP (RAC)
Reliance Ltd   13 TB   Sun
Starwood       12 TB   HP
TIM (Italy)    12 TB   HP (RAC)
Turkcell       14 TB   Sun (RAC)
UBS AG         15 TB   Sun
UPS            10 TB   HP
Yahoo!         130 TB  Fujitsu

Hundreds of Terabyte+ DW Customers!

Summary

• Technology
• Monitoring
• Information Life-cycle Management (ILM)
• Oracle Optimized Warehouse Initiative
• Market

For More Information

http://search.oracle.com

or

oracle.com

BI & Data Warehousing