KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

www.rittmanmead.com [email protected] @rittmanmead Real-Time Data Warehouse Upgrade - Success Stories Nick Hurt - IFPI Michael Rainey - Rittman Mead KScope 2014 - Seattle, WA

Upload: michael-rainey

Post on 27-Jan-2015

DESCRIPTION

Providing real-time data to its global customers is a necessity for IFPI (International Federation of the Phonographic Industry), a not-for-profit organization with a mission to safeguard the rights of record producers and promote the value of recorded music. With Oracle Streams and Oracle Warehouse Builder (OWB) handling real-time data replication and integration, meeting this goal was becoming a challenge: the solution was difficult to maintain and overall throughput was degrading as data volume increased. The need for greater stability and performance led IFPI to implement Oracle GoldenGate and Oracle Data Integrator (ODI). This session will describe the innovative approach taken to complete the migration from a Streams and OWB implementation to a more robust, maintainable, and performant GoldenGate and ODI integrated solution.

TRANSCRIPT

Page 1: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

www.rittmanmead.com [email protected] @rittmanmead

Real-Time Data Warehouse Upgrade - Success Stories
Nick Hurt - IFPI
Michael Rainey - Rittman Mead
KScope 2014 - Seattle, WA

Page 2: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Introduction

•Michael Rainey (Rittman Mead)
‣Principal Consultant
‣Oracle Data Integration expert
-GoldenGate and Oracle Data Integrator
‣Oracle ACE

@mRainey

Page 3: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India)

E : [email protected] W : www.rittmanmead.com

About Rittman Mead

•Oracle Gold partner with offices in US (Atlanta), Europe, Australia, India and South Africa

•World leading specialist partner for technical excellence, solutions delivery and innovation in Oracle BI

•Provide consulting, training, global managed services for customers around the world

•120+ consultants including 1 Oracle ACE Director, 3 Oracle ACEs and 1 Oracle ACE Associate

•All experts in Oracle BI, DW, EPM and Analytics tech

•Skills in a broad range of supporting Oracle tools: OBIEE, OBIA, ODIEE, Essbase, Oracle OLAP, GoldenGate, Exadata, Endeca

•Blog : http://www.rittmanmead.com/blog/
•Twitter : @rittmanmead

Page 4: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Introduction

•Nick Hurt (IFPI)
‣Solutions Developer at IFPI - using Oracle since 2002

@nicholas_hurt

•IFPI = International Federation of the Phonographic Industry
‣represents the interests of the recording industry worldwide
‣green light for OBIEE in 2010
‣required “near” real-time anti-piracy analytics
‣joined in 2011 to work on delivery

Page 5: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Agenda

•IFPI data - the good, the challenging, the ugly
•Pre-upgrade
‣Environment
‣Challenges
•Overview of GoldenGate and Oracle Data Integrator
•Upgrade - planning, migration steps
•Post-upgrade results
•Closing remarks on real-time warehousing

Page 9: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Challenges of IFPI data

•The good
‣Seek & destroy infringing URLs
•The challenging
‣Velocity - 1 mil+ upserts per day
‣Volatility depth - indefinite retrospective updates
‣Large wide product dimension - 12 million rows
•The ugly
‣Multiple redundant updates
‣Back-dated corrections
‣Multiple sources of information (data consistency & quality)
-heavy data cleansing - identifying duplicates
-inconsistencies (error-tolerant/error-correction ETL)

Page 14: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Link Lifecycle

[Timeline diagram: events ordered along a Time axis]

Primary event
‣t0: Infringing URL Detected - Link Found

Optional events
‣t0+tn: Deleted / Matching - Link Correction
‣t1 = t0+tn: Cease & Desist - Link Actioned
‣t2 = t1+tn: Take-down - Link Removed

Page 15: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Process Flow / Dataset

[Flow diagram along a Time axis: Event Detected -> ETL (Cleansing, De-duping) -> Summaries -> Dashboards]

Fact table representation

Time Found | Link | New Unique File | Unique Link | Actioned | Taken-down
4/10/14 2:50 PM | www.4shared.com/rar/-6ebvl89/Justin_Bieber_-_All_Around_The.html | 1 | 1 | 0 | 0
4/15/14 11:44 AM | www.4shared.com/mp3/-2J4lahU/Nickel_Back_-_If_Everyone_Care.htm | 1 | 1 | 0 | 0
4/15/14 2:50 PM | www.4shared.com/rar/-6ebvl89 | 0 | 1 | 0 | 0

Page 16: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Process Flow / Dataset

[Flow diagram along a Time axis: Event Detected -> ETL upserts -> Summaries -> Dashboards]

Fact table representation

Time Found | Link | New Unique File | Unique Link | Actioned | Taken-down
4/10/14 2:50 PM | www.4shared.com/rar/-6ebvl89/Justin_Bieber_-_All_Around_The.html | 1 | 1 | 1 | 1
4/15/14 11:44 AM | www.4shared.com/mp3/-2J4lahU/Nickel_Back_-_If_Everyone_Care.htm | 1 | 1 | 1 | 0
4/15/14 2:50 PM | www.4shared.com/rar/-6ebvl89 | 0 | 1 | 1 | 1
4/15/14 11:01 PM | www.4shared.com/mp3/-qXkFru8/Kanye_West__Jay-Z_Bingo_Player.html | 1
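The upsert behaviour shown in the fact table can be sketched in a few lines of Python. This is only an illustration, not IFPI's ETL: the field names echo the slide's columns, while `FactRow`, `apply_event`, the event names and the in-memory dict are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FactRow:
    # Field names follow the slide's columns; the class itself is illustrative.
    time_found: str
    link: str
    actioned: int = 0
    taken_down: int = 0


def apply_event(facts, link, event, time_found=""):
    """Upsert one lifecycle event into `facts`, a dict keyed by link."""
    row = facts.get(link)
    if row is None:
        # Insert path: first time this link has been seen.
        row = FactRow(time_found=time_found, link=link)
        facts[link] = row
    # Update path: later events flip flags on the existing row.
    if event == "actioned":
        row.actioned = 1
    elif event == "taken_down":
        row.taken_down = 1
    return row


facts = {}
link = "www.4shared.com/rar/-6ebvl89"
apply_event(facts, link, "found", "4/15/14 2:50 PM")
apply_event(facts, link, "actioned")
apply_event(facts, link, "taken_down")
# A redundant update arrives again and leaves the row unchanged.
apply_event(facts, link, "taken_down")
```

Because every later event updates the row created at "found" time, a redundant or back-dated update never produces a second fact row for the same link.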

Page 17: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Pre-Upgrade Architecture

[Architecture diagram. Labels: OLTP (source); Streams & CDC; ODS; Subscriber Views; ETL; OWB Mappings; EDW; Star Schema; OBIEE; Dashboards]

Page 18: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Pre-Upgrade Architecture - Streams and Oracle CDC

Asynchronous Distributed HotLog Configuration

Page 19: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Pre-upgrade Challenges

•Throughput
•Complex views
•Recovery after VM/DB crash
•Maintenance and development
•Purging auditing information
•Volume of redo
•Oracle’s Statement of direction

Page 20: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Over to you…

•Michael Rainey

@mRainey

Page 21: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Oracle GoldenGate

Page 22: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Oracle Data Integrator 11g

•Oracle’s strategic product for data integration
•Uses ELT (Extract, Load, Transform) approach
‣No middle ETL engine necessary
‣Uses the power of the target database to perform transformations
•Supports heterogeneous data sources
•Declarative design - separation of business and technical integration
•Data integrity controls create a “data firewall”
•Extensible through “Knowledge Modules”

Page 23: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

ODI 11g Journalizing (CDC)

•Oracle Data Integrator Change Data Capture (CDC) delivered via Journalizing
‣Identify, capture, and deliver changes made to source data
‣Journalizing Knowledge Module (JKM) performs setup and creates infrastructure
•ODI CDC Framework
‣Capture Process - mechanism for capturing changed data from the source database (e.g. Oracle GoldenGate)
‣Journals - tables (J$) hold references to changed records and the change type (insert / update / delete)
‣Journalizing Views - views (JV$, JV$D) provide access to changed data, used by IKM / LKM in mappings
‣Subscribers - used to allow consumption of changed data at different intervals, for multiple applications, etc.
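The journal-and-subscriber idea can be modelled with a toy example. The sketch below is not ODI code and does not reproduce the real J$ table layout: the `Journal` class, its methods and the subscriber names are all hypothetical. It only shows how one journal of changes can be read by several subscribers, each at its own pace.

```python
from collections import defaultdict


class Journal:
    """Toy in-memory stand-in for a journal of changed rows (not ODI code)."""

    def __init__(self):
        self.entries = []                  # (sequence, key, change_type)
        self.consumed = defaultdict(int)   # subscriber -> last sequence read

    def record(self, key, change_type):
        # change_type: "I", "U" or "D" for insert / update / delete
        self.entries.append((len(self.entries) + 1, key, change_type))

    def consume(self, subscriber):
        """Return the changes this subscriber has not yet seen."""
        start = self.consumed[subscriber]
        pending = [(key, change) for seq, key, change in self.entries
                   if seq > start]
        if self.entries:
            # Move this subscriber's watermark to the newest entry.
            self.consumed[subscriber] = self.entries[-1][0]
        return pending


journal = Journal()
journal.record(101, "I")
journal.record(101, "U")
first = journal.consume("DASHBOARD")    # both changes to key 101
journal.record(102, "D")
second = journal.consume("DASHBOARD")   # only the delete of key 102
other = journal.consume("AUDIT")        # all three changes, read later
```

Keeping a separate watermark per subscriber is what lets a frequently refreshed consumer and an occasional one share the same captured changes.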

Page 24: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

GoldenGate and ODI Integration

•JKM Oracle to Oracle Consistent (OGG) Knowledge Module
‣ODI Metadata used to generate GoldenGate parameter files (extract, pump, replicat) and configuration files
‣Delivered with ODI
•ODI CDC Framework generated
‣Staging table - replica of source
‣J$ (journal) table - change rows
•Journalized data used in transformations (via JV$ views)

Page 26: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Migration Decisions / Upgrade Planning

•ODI Master repository location
•GoldenGate considerations
‣Installation and configuration (RAC is trickier)
‣Classic vs Integrated capture (requires EE for both source & target)
‣How to use it? Product built for migration and/or replication
‣Naming conventions
•OWB mappings to ODI interfaces
‣Various migration approaches
•Control, Monitoring & Alerting (no free lunch)
•Testing & Go-live approach

Page 27: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Migration Steps

•Migrate OLTP applications to RAC
‣GoldenGate RAC target kept in-sync during application migration
•Performance tuning & ODI KM Modifications
‣Retain existing CDC framework objects when adding new tables
‣Update column mapping in replicat
‣Remove unnecessary code in Integration Knowledge Module
•Generate GoldenGate extract, pump and replicat
‣ODI Journalizing Knowledge Module
‣Source definitions file recommended
•Migrate OWB mappings to ODI interfaces
‣3Rs: re-assess, replicate and refine existing mappings
•Test the migration
‣Run both systems in parallel and compare results
‣Trends, aggregates, row counts
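The parallel-run comparison in the last step could look something like the following. It is a minimal sketch under stated assumptions: the rows are plain dicts standing in for query results from the old and new warehouses, and `summarize`, `compare` and the sample data are invented for illustration.

```python
def summarize(rows, measure):
    """Row count and total of one measure for a list of dict rows."""
    return {"row_count": len(rows),
            "total": sum(row[measure] for row in rows)}


def compare(old_rows, new_rows, measure, tolerance=0):
    """Return the metrics whose values differ by more than the tolerance."""
    old = summarize(old_rows, measure)
    new = summarize(new_rows, measure)
    return {name: (old[name], new[name])
            for name in old
            if abs(old[name] - new[name]) > tolerance}


# Hypothetical extracts of the same fact table from each system.
owb_fact = [{"link_id": 1, "actioned": 1}, {"link_id": 2, "actioned": 0}]
odi_fact = [{"link_id": 1, "actioned": 1}, {"link_id": 2, "actioned": 1}]

differences = compare(owb_fact, odi_fact, "actioned")
```

Here the row counts match but the aggregate does not, which is the kind of discrepancy that row counts alone would miss.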

Page 29: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Micro-batch ETL

Variables to track execution status

Error handling

Recursive execution

Execute Load Plan
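The four callouts above describe a control loop that can be sketched generically. The code below is not the ODI package from the slide: a plain `for` loop stands in for the recursive execution, `load_plan` is any callable, and `run_micro_batches` and `demo_load_plan` are hypothetical names.

```python
import time


def run_micro_batches(load_plan, max_runs, pause_seconds=0.0):
    """Run a load plan repeatedly, tracking status in a variable."""
    status = "IDLE"    # stands in for the status-tracking variables
    history = []
    for _ in range(max_runs):
        status = "RUNNING"
        try:
            load_plan()
            status = "SUCCESS"
        except Exception as exc:
            # Error handling: record the failure and stop re-running.
            status = "FAILED"
            history.append((status, str(exc)))
            break
        history.append((status, ""))
        time.sleep(pause_seconds)   # short gap before the next batch
    return status, history


calls = []


def demo_load_plan():
    """Pretend load plan that fails on its third execution."""
    calls.append(len(calls) + 1)
    if len(calls) == 3:
        raise RuntimeError("source unavailable")


final_status, runs = run_micro_batches(demo_load_plan, max_runs=5)
```

Stopping on the first failure is a design choice made for the example; a production loop might instead alert and retry after a delay.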

Page 30: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Post-Upgrade Architecture

[Architecture diagram. Labels: 2-node RAC; OLTP (source); GoldenGate Replication; ODS; J$ Tables; ETL; ODI Interfaces (CDC); EDW; Star Schema; OBIEE; Dashboards; Control, Alerting, Monitoring]

Page 31: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Control, Alerting and Monitoring

•GoldenGate status and lag
•ODI Agent monitoring
•ETL throughput / health: ODI session tables
•Enterprise Manager job scheduler to control ETL process
•Monitoring dashboard
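A status-and-lag check of the kind listed above might be sketched as follows. The process names, the sample values and the 60-second threshold are assumptions for illustration, and how the status and lag are collected from GoldenGate is deliberately left out.

```python
def lag_seconds(hhmmss):
    """Convert an 'HH:MM:SS' lag string to seconds."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def find_alerts(processes, max_lag_seconds=60):
    """Return alert messages for stopped or lagging processes."""
    alerts = []
    for name, status, lag in processes:
        if status != "RUNNING":
            alerts.append(f"{name}: status {status}")
        elif lag_seconds(lag) > max_lag_seconds:
            alerts.append(f"{name}: lag {lag}")
    return alerts


# Hypothetical snapshot: (process name, status, lag).
sample = [
    ("EXT1", "RUNNING", "00:00:04"),
    ("PMP1", "RUNNING", "00:02:30"),
    ("REP1", "ABENDED", "00:00:00"),
]
alerts = find_alerts(sample)
```

Checking status before lag matters because a stopped process can report a stale lag value that looks healthy.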

Page 32: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

GoldenGate Status and Lag

Page 35: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

ODI Agent Monitoring and Healthcheck

Page 39: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Monitoring Dashboard

Fact Table Load - Volume and Duration

ETL currently running and duration

Scheduled Job Duration

OLTP -> BI Latency

Page 40: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Upgrade Results

•Reduced lag
‣From 5-15 minutes to <1 minute
•Stabilised fact mapping with equivalent load volumes
‣Pre-upgrade: 2 mins to hours
‣Post-upgrade: 10 - 25 seconds
•Reduced ETL downtime
‣From 2+ days per month to minutes per month
•Simpler to extend tables under CDC
•Purging audit information takes <1 hour rather than days

Page 41: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Upgrade Effects

•Faster troubleshooting & diagnosis times
•Shorter maintenance & development times
•Focus on performance and streamlining processes
•Investigation into excessive redo volumes
‣Understanding incremental statistics
•MDM project kick-off
•Contemplation of The Reference Architecture…

Page 44: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Reference Architecture & Realtime DW

•Staging Data Layer
‣Buffers reception for right-time distribution
‣Apply business rules to make the data clean, consistent and complete
‣Retain rejected data for manual/automatic correction
•Performance Layer
‣Dimensional model - star schema
‣Permanent & non-volatile data (traditionally speaking)
•Something in-between…
‣Caters for deeply volatile data by persisting historic and real-time facts
‣Combines elements of staging and performance layers
‣Facilitates agile de-coupled ETL processes

Page 45: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Real-time DW/BI - Blogged by Stewart Bryson 2011

http://www.rittmanmead.com/2011/05/real-time-bi-an-introduction/

Page 51: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Reference Architecture - Mashup

[Diagram: Reference DW Architecture Mashup]
‣Layers: Sources (OLTP); Staging Data Layer; Foundation Layer; Hybrid Layer; Performance Layer
‣Options shown: Federated OLTP+EDW; Federated EDW+rolling hot partition; Extreme Real-time EDW; Standard EDW (Star); In-memory Materialized Views; OLAP; Temporary Structures
‣Axes and annotations: Time; timeframe; latency; refresh interval; ETL interval; volatility depth; Query performance

Page 52: KScope14 - Real-Time Data Warehouse Upgrade - Success Stories

Conclusion

•This was not a sales pitch!
•Real-time DW/BI inevitable
•Upgrade now
•Share your thoughts & experiences:
•[email protected]
•[email protected]