informatica agile virtualization apr17 2012
TRANSCRIPT
Informatica Data Virtualization: The “Foundation” for AGILITY & PRODUCTIVITY
Kerry Holton
Informatica Senior Sales Engineer
Informatica Corporation Confidential – Do Not Distribute
Let’s Win Something!!!
Take some good notes!
Answer questions along the way…
Tell me which box is the ONLY thing that data virtualization built on data federation does – and why???
A copy of “Lean Integration.”
To Learn More…
Sign-Up: Expert Roundtables
Data Virtualization Corner: http://vip.informatica.com/?elqPURLPage=8668
JOIN & DISCUSS: “Data Virtualization & Data Services Architecture” Group (2000+ Strong)
Informatica.com > Products > PowerCenter > Data Virtualization Edition
Informatica.com > Products > Data Virtualization
Agenda
• “2012” – The Year of “BI” Agility
• Data Virtualization – Overview, Problem & Need
• Key Use Cases
• Customer Examples
• Data Virtualization in Action
• Why Informatica?
• Next Steps & Q&A
“I’m writing you a million dollar check, but you’re not solving my big problem. My big problem isn’t getting the data into the data warehouse. My big problem is … getting the data out!”
– ICC Director (VP of IM) to Dave Lyle (VP Product Strategy), end of Q3, 2009
“2012”
BI will be the top priority
for the CIO, in 2012!
“Demands by users of business intelligence
(BI) applications to "just get it done" are turning
typical BI relationships, such as business/IT
alignment and the roles that traditional and next-
generation BI technologies play, upside down. As
business users demand more control over BI
applications, IT is losing its once-exclusive
control over BI platforms, tools, and applications.”
– Boris Evelson, Forrester Research, Blog -
“Top 10 BI Predictions for 2012”
Business
/
BI
IT
Have any of you had this discussion?
• Need for a new BI infrastructure
• Replacing spreadsheets
• Faster data access & reporting
• Business-focused BI
• $100M Qtr. in 2011
• 10k+ customers
How Long Does it Take to Deliver New Critical Data or Reports to the Business?
The Business Can’t Wait 3-6 Months For a Single View of All Enterprise Data
[Diagram: every data source wired separately to every BI consumer]
Data sources: Applications, Partner Data (SWIFT, NACHA, HIPAA, …), Databases, Warehouses, Unstructured, Social, NoSQL, Cloud Computing
Integration approaches: SOA, ESB/EAI, ETL, EII, Hand Coding
Consumers: Business Intelligence (shown seven times in the diagram)
Overview
HealthNow’s Data Integration Challenges
• 16 Types of Data Sources
• Different Price Info in Each LOB
• To Add 1 Product Attribute to Existing Report – IT Estimated 1700 Hours
• NO REUSE
[Diagram, Business and IT: Product Config Mgmt (MS SQL Server); Facets [Benefits, Products] (Sybase ASE); Data Warehouse (DB2); 30,000 Data Marts (MS Access); BI (Cognos); Portal (WebSphere)]
So What Did the Business Do?
30,000 Data Marts Were Created by Shadow IT Teams
The Fundamental Problem(s)…
• It takes too long to explain requirements
• It takes months to change a DW / add new critical data
• It takes many iterations to get the right data / reports
• Changes can break existing integrations & impact apps.
Typical Data Integration Process:
1. Design
2. Change
3. Integrate
4. Unit Test
5. Validate
6. Deploy
Business is Involved Too Late
As-Is Value Stream Map (LOT OF WAIT & WASTE)
Trying to Solve it in BI Layer Just Won’t Scale… Why?
[Diagram: Applications, Unstructured Data, Spread Marts, DATA MART, EDW]
No Reuse
No Common Data Access Layer
No Easy Way to Handle Change
No Data Quality & No Data Consistency
What is Needed to Solve these Problems?
Think Virtual Machines for DATA!
Logical View of All Underlying Data
[Diagram: Data Consumers (Portal, BI, Composite Apps) on top of a Data Abstraction layer of Logical Data Objects (CUSTOMER, ORDER, PRODUCT, …) on top of Enterprise Data Sources]
• SUPPORT ALL USE CASES: BI / DW, MDM, SOA
• FAST, DIRECT ACCESS TO DATA THE BUSINESS TRUSTS
• DATA ABSTRACTION & REUSE OF SKILLS/LOGIC
• COMMON ACCESS LAYER ACROSS MANY DATA SOURCES
How is the Market Trying to Address the Problems?
Data Virtualization (Built-On Data Federation)
[Diagram: Virtual View (Access, Merge, Deliver) between DW and BI, with callouts marked X:]
• Limited or Data Source Profiling Only
• SQL/XQuery Only Transformations & No Data Quality
• Cannot Easily Move to Persistent Store or Reuse
Summary:
• Addresses specific use cases
• No data movement / no copies / only federation
• Code heavy / not model-based / no reuse
• Not tools for business self-service
• SQL/XQuery-only transformations
• No data profiling / no data quality
It’s like ONE step forward & TWO steps backward
Time GAINED by federation is nullified by Time SPENT on more processing
What Are the Top 3 Key Capabilities for a Project that Needs Data Virtualization?
Source – Informatica Data Virtualization Expert’s Forum, 2011
Dataset - 600
If Performance is a given…
Are We Talking About TWO Separate Tools?
What Does the Ideal Solution Look Like?
[Diagram: seven steps around a shared Virtual Table, linked by Common Metadata across Business (Business Manager; Analyst, Steward) and IT (Developer, Architect)]
1. MODEL (Customer: Name, Address, Category, Orders)
2. ACCESS & MERGE (CRM, Accounts)
3. PROFILE IN RT
4. TRANSFORM IN RT (Advanced Transformations, Data Quality, Data Masking)
5. REUSE INSTANTLY (Batch, Web Services)
6. MOVE OR FEDERATE (DW)
7. SCALE & PERFORM (Optimizations & Caching)
Other diagram labels: CRM, Accounts, Call Center, Query Engine, WS Server
How Does Informatica Deliver the Ideal Solution?
• Single environment for both data integration and data federation
• No data movement / no copies – but easily reuse virtual views for batch
• Early & iterative business (analyst) involvement – self-service
• Pre-built library of rich ETL-like advanced data transformations
• Integrated real-time, on-the-fly data profiling & data quality
[Diagram: Virtual View (Access, Merge, Deliver) between DW and BI, with callouts:]
• Prototype First
• Early Business Involvement
• Analyze & Profile Data & Logic Anytime
• Advanced Transformations & Data Quality
• Move to DW or Instantly Reuse as SQL / WS
Data Virtualization = (Data Integration + Data Federation) in ONE Tool
How Does It Work?
[Diagram: DM, ODS, DW, Cust DW, WEB and txt sources behind the logical data objects CUSTOMER, PRODUCT, INVOICE, SUPPORT]
SELECT *
FROM customer_table INNER JOIN
support_table ON
customer_table.customer_num =
support_table.customer_id
WHERE customer_name='ACME'
NEW QUERY
SELECT *
FROM customer_table
CUSTOMER
SELECT *
FROM SUPPORT
EXISTING QUERY
NEW REQUEST
• Change / add an attribute
• Join new data not in DW
• Create a new report
Steps shown:
• Retrieve historical customer data
• New query for report needing data not in DW
• Query is processed by virtualization layer
• Results retrieved in real-time without data movement
• Data quality rules applied on-the-fly against data
• Trusted blend of historical and operational data delivered
• On-boarding new data does not break integrations
• Virtual view can be physically materialized later into DW
• Complement data architecture with virtualization
NEW DATA & REPORTS THAT BUSINESS NEEDS & TRUSTS, DELIVERED IN DAYS vs. MONTHS
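The query flow on this slide can be sketched in a few lines. This is only an illustration, not Informatica's engine: it assumes two in-memory SQLite databases stand in for the warehouse (holding customer_table) and an operational system (holding support_table), and the sample rows, the `ops` schema name and the `query_customer_support` helper are all hypothetical.

```python
import sqlite3

# One connection plays the role of the virtualization layer.
conn = sqlite3.connect(":memory:")
# A second in-memory database stands in for an operational system
# whose data is NOT in the warehouse (assumption for this sketch).
conn.execute("ATTACH DATABASE ':memory:' AS ops")

conn.execute(
    "CREATE TABLE main.customer_table (customer_num INTEGER, customer_name TEXT)"
)
conn.execute("CREATE TABLE ops.support_table (customer_id INTEGER, ticket TEXT)")
conn.executemany(
    "INSERT INTO main.customer_table VALUES (?, ?)",
    [(1, "ACME"), (2, "Globex")],
)
conn.executemany(
    "INSERT INTO ops.support_table VALUES (?, ?)",
    [(1, "Login issue"), (1, "Billing question"), (2, "Outage")],
)

# The "new query" from the slide, joined across both sources at query time.
# No rows are copied into the warehouse.
NEW_QUERY = """
SELECT c.customer_num, c.customer_name, s.ticket
FROM main.customer_table AS c
INNER JOIN ops.support_table AS s
  ON c.customer_num = s.customer_id
WHERE c.customer_name = ?
ORDER BY s.ticket
"""


def query_customer_support(name):
    """Run the federated join for one customer name."""
    return conn.execute(NEW_QUERY, (name,)).fetchall()


rows = query_customer_support("ACME")
for row in rows:
    print(row)
```

The existing `SELECT * FROM customer_table` keeps working unchanged. The join is an additional definition on top of the same sources, which is the sense in which on-boarding new data does not break integrations.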
Informatica Data Virtualization at HealthNow
[Diagram, Business and IT: a “Virtual Table” Common Data Model (PRODUCT, ORDER, MEMBER, CLAIM) with a Shared Repository, connecting these systems:]
• Product Config Mgmt (MS SQL Server)
• Facets [Benefits, Products] (Sybase ASE)
• Data Warehouse (DB2)
• 30,000 Data Marts (MS Access)
• BI (Cognos)
• Portal (WebSphere)
NO REUSE becomes INSTANT REUSE: DW, BI, SOA & MDM (SQL, Web Services, Batch)
Fast, Direct Data Delivery: 1 week (vs. 3 months)
What Does Informatica’s Data Virtualization Solution Look Like?
PowerCenter Data Virtualization Edition
[Diagram components: Data Federation (Data Services), Developer Tool, Analyst Tool, Data Profiling, ETL (PC Standard Edition), Partitioning, 2 Adapters (PWX for Relational); marked NEW]
New PowerCenter Edition for AGILITY & PRODUCTIVITY
Combines:
• Data Integration (PowerCenter SE)
• Data Virtualization (IDS Full Use)
• Data Profiling (IDE Full Use)
• Business-IT Collaboration (Analyst)
Packaged for simplicity and attractively priced
Reuses existing skills and resources
What Use Cases Are Supported?
1. DW/Business Intelligence (BI): Prototype DW & accelerate new data & reports from months to days
[Diagram, Business and IT: Change Request to Deploy to Production in Weeks/Days instead of Months]
2. MDM: Deliver a complete view of master & transactional data in real-time
[Diagram: MDM HUB, TRANSACTIONAL SYSTEMS and DATA WAREHOUSE combined through a Virtual View, turning an INCOMPLETE VIEW OF CUSTOMER into a COMPLETE VIEW OF CUSTOMER]
3. SOA: Deliver the missing data services layer to SOA & applications
[Diagram: Applications, BPM, Biz. Services, ESB, Registry, Data Abstraction, Data Sources]
What are the Benefits of Informatica’s Solution?
• Provide fast, direct access to critical
new data & reports in days vs. months
• Enable rapid iterations to results with
instant Biz-IT collaboration
• Deliver flexibility, ensure reuse &
insulate applications from changes
COMPLETE, CURRENT & TRUSTED
View of All Data, On-Demand
Customer Examples
BI, MDM, SOA – HealthNow NY Improves Risk & Pricing Analysis With Data Services
The Challenge
• 16 enterprise databases and over 30,000 Access databases
• Took 1700 man hours to add a new product to portfolio
• Business had to go to 5 different sources for all information related to paid claims
• Continued data growth with over 30,000 claims processed per day
• Data proliferation leading to HIPAA compliance concerns
The Solution
• Logical data models and data services to represent their core data entities – MEMBER, CLAIMS, PROVIDER, ENCOUNTER, LAB RESULTS
• ‘Rate Letter’ project for determination of policy rates and discounts went live in May 2010
• Over 400 logical data objects and 2 web services being used by around 125 end users
The Benefits
• Speed of data delivery – Implemented first project in around 40 man hours. This would have taken an order of magnitude more in the past
• Complete view of the truth – Business users now access plan rate information from a single service
• Better governance – Centrally managed virtual views, as opposed to one-off data marts, are improving governance of data
[Diagram: BI (Cognos) and Portal (WebSphere) access an IDS Virtual Table via SQL, Web Service; sources: Product Config Mgmt (MS SQL Server), Facets [Benefits, Products] (Sybase ASE), Data Warehouse (DB2), Data Marts (MS Access)]
BI, SOA – Large Latin American Bank Improves Governance
The Challenge
• Lack of visibility for proper supervision and regulation of the national financial system
• Real-time analysis and joining of data (Adabas, DB2, SQL Server, Files)
• Persistent data replication even for one-time use
• Huge data volumes (Online 6 TB, DW 14 TB)
• Different reporting tools requesting different data combinations across heterogeneous data sources
The Solution
• Logical data models to represent core business entities (e.g. CUSTOMER)
• Mainframe virtualization (join data from Adabas, DW DB2, Apps., 3rd Party)
• Logical data models and Web services to deliver flexibility and agility to respond to changing business needs
• Creation of logical data objects and physical materialization of virtual views to familiar PowerCenter environment
The Benefits
• Speed of data delivery – implemented first project in around 60 man hours and delivered a new virtual view in < 1 hour
• Better risk/fraud governance (across more than 6000 financial institutions) and compliance with BASEL I, BASEL II and SOX
• Complete single view of the truth – business users can now access consistent customer and plan rate data
• Centralized management and administration of logical data objects
[Diagram: Microsoft Reporting Services and Customized Applications access a Data Virtualization Virtual Table via SQL, Web Service; sources: Financial Institutions (Flat Files and Messages); Credit Analysis, Applications, AML (SQL Server); Data Warehouse (DB2 LUW); Transactions Tables (Mainframe – Adabas, DB2)]
BI, MDM – VW Delivers a Complete View of Critical Data On-Demand
The Challenge
• CUSTOMER data in > 30 systems, MDM hub, transaction systems, DW
• Have 80% of data but missing critical 20% of transactions – WARRANTY, SERVICE
• No authoritative source of CUSTOMER, PRODUCT data, conflicting relationships
• Lack of a complete, on-demand view of CUSTOMER data is affecting service
• Without complete view of data, can’t meet goal to sell 3x more cars by 2018
The Solution
• Create a common data model for VW owners, prospects, & partners
• Federate data in real-time from > 30 systems & transactional systems
• Provide easy-to-use, browser-based tools for business & IT to collaborate
• Apply reusable DQ rules on-the-fly to CUSTOMER, PRODUCT data
• Instantly reuse data services for SQL or Web services
The Benefits
• Completed DI, DQ, & data services production pilot in < 1 month
• Can leverage operational efficiency & real-time decisions to differentiate
• Delivered accurate, complete view of CUSTOMER data, on-demand
• Lowered costs by increasing productivity & reuse of data services
• Supported strategy to triple sales to 1M vehicles annually, by 2018
[Diagram: BI and Portal reuse an IDS Virtual Table (with IDQ) via SQL, Web Service; sources: Transactional Systems (Warranty, Service) (Varied); PRD [Campaign History] (SAGA/Win); DW (Service History) (Teradata); MDM Hub (Customer, Purchase, Case) (IBM)]
Data Virtualization in Action
The “Keystone” – Business Owns the Data While IT Retains Control
• Role-based tools for Analysts (Web) & IT developers (Eclipse)
• Common metadata lets Analysts & IT collaborate in RT
• Empower business analysts to:
  • Define entities & directly access & merge data to create virtual views
  • Rapidly profile data sources & logic without more processing
  • Quickly find data & rules via business glossary
  • Collaborate, test, validate & share results
• Cuts the wait & the waste in the process
[Diagram: Analyst Tool (Web Browser) and Developer Tool (Eclipse) share Common Metadata around a VIRTUAL TABLE, delivered via SQL or Web Service to a BI Report and Portal, and via Batch ETL to the Data Warehouse]
The 7 Steps to AGILITY & PRODUCTIVITY
[Diagram: seven steps around a shared Virtual Table with Common Metadata, across Business and IT: 1. Model, 2. Access & Merge, 3. Profile in RT, 4. Transform in RT, 5. Reuse Instantly, 6. Move or Federate, 7. Scale & Perform]
1. Model
• Represent underlying data as business entities (CUSTOMER)
• Provide a common logical view or abstraction of all data
• Import logical model from 200+ modeling tools (ERWIN)
• Use visual and metadata-based mapping language
• Instantly reuse logical data object for all applications
[Diagram: Common Data Access Layer – Logical Data Object (CUSTOMER, ORDER, PRODUCT, INVOICE) over Applications, Unstructured Data, Spread Marts, Data marts, EDW]
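To make the idea of a logical data object concrete, here is a minimal sketch. It assumes two hypothetical source layouts (a CRM row and a warehouse row) whose column names are invented for illustration. The `Customer` class and `MAPPINGS` table are not part of any Informatica API; they only show one business entity defined once and mapped to differing physical sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Logical data object: the business view of a customer."""
    customer_id: int
    name: str
    category: str


# Hypothetical physical rows; column names differ per source system.
crm_row = {"CUST_NO": 7, "CUST_NM": "ACME", "SEGMENT": "Gold"}
dw_row = {"customer_key": 7, "full_name": "ACME", "tier": "Gold"}

# Metadata-style mappings from each physical layout to the logical object.
MAPPINGS = {
    "crm": {"customer_id": "CUST_NO", "name": "CUST_NM", "category": "SEGMENT"},
    "dw": {"customer_id": "customer_key", "name": "full_name", "category": "tier"},
}


def to_customer(source, row):
    """Translate a physical row into the logical CUSTOMER object."""
    mapping = MAPPINGS[source]
    return Customer(**{field: row[column] for field, column in mapping.items()})


print(to_customer("crm", crm_row))
print(to_customer("dw", dw_row))
```

Consumers only ever see `Customer`. Adding a third source means adding one mapping entry, not changing the consumers.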
2. Access and Merge
Turn many data sources into ONE with Data Virtualization
[Diagram: logical data objects CUSTOMER, PRODUCT, INVOICE, SUPPORT over Analytical, Interactional, Transactional, Archived and Master Data; sources: Application, Partner Data (SWIFT, NACHA, HIPAA, …), Database, Warehouses, Unstructured, Social, NoSQL, Cloud Computing]
3. Profile in RT
Rich set of integrated profiling capabilities to find data anomalies and to discover keys and hidden relationships:
• Column & Rule Profiling
• Midstream or Comparative Profiling
• Join & Overlap Analysis
• Primary Key / Foreign Key Profiling
• Dependency Profiling
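Three of the profiling types listed above can be sketched in plain Python: column profiling, primary-key candidate detection, and join overlap analysis. The function names and sample rows are assumptions made for illustration and do not reflect how the Informatica profiler is implemented.

```python
def column_profile(rows, column):
    """Column profiling: row count, null count and distinct non-null values."""
    values = [row[column] for row in rows]
    non_null = [value for value in values if value is not None]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
    }


def is_key_candidate(rows, column):
    """Primary-key profiling: no nulls and every value unique."""
    profile = column_profile(rows, column)
    return profile["nulls"] == 0 and profile["distinct"] == profile["rows"]


def join_overlap(left_rows, left_col, right_rows, right_col):
    """Join & overlap analysis: how many key values match on each side."""
    left = {row[left_col] for row in left_rows if row[left_col] is not None}
    right = {row[right_col] for row in right_rows if row[right_col] is not None}
    return {
        "matched": len(left & right),
        "left_only": len(left - right),
        "right_only": len(right - left),
    }


# Hypothetical sample data.
customers = [
    {"customer_num": 1, "name": "ACME"},
    {"customer_num": 2, "name": None},
    {"customer_num": 3, "name": "Initech"},
]
support = [{"customer_id": 1}, {"customer_id": 1}, {"customer_id": 4}]

print(column_profile(customers, "name"))
print(is_key_candidate(customers, "customer_num"))
print(join_overlap(customers, "customer_num", support, "customer_id"))
```

Running the overlap check before building a join shows up front that support rows for customer 4 have no matching customer, which is the kind of anomaly the slide refers to.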
4. Transform in RT
• Metadata-driven, codeless, graphical environment
• Rich, pre-built library of advanced transformations
• Integrated Data Quality transformations
• Define policies to mask sensitive data in real time
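A masking policy applied on the fly can be sketched as a per-column rule run on each row as it is read. The product itself is described as codeless and graphical, so this is only a conceptual illustration. The `ssn` column, the keep-last-four rule and the helper names are hypothetical.

```python
def mask_keep_last(value, visible=4, fill="*"):
    """Mask all but the last `visible` characters of a value."""
    text = str(value)
    if len(text) <= visible:
        return fill * len(text)
    return fill * (len(text) - visible) + text[-visible:]


# Policy: which columns are sensitive and how each one is masked.
MASKING_POLICY = {"ssn": mask_keep_last}


def apply_policy(row, policy):
    """Return a copy of the row with sensitive columns masked."""
    return {
        column: policy[column](value)
        if column in policy and value is not None
        else value
        for column, value in row.items()
    }


masked = apply_policy({"name": "Pat", "ssn": "123456789"}, MASKING_POLICY)
print(masked)
```

The stored data is never changed. Masking happens in the read path, so different consumers could be given different policies over the same source.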
5. Reuse Instantly
[Diagram: METADATA REPOSITORY deployed as SQL, Web services, Batch]
• Instantly reuse LDOs for any mode/protocol (SQL, WS)
• Single click deployment to batch
• Execution & optimization separate from design-time
• No re-development & re-building of LDOs
6. Move or Federate
Data Integration (Extract, Advanced Transform & Quality, Load into DW for BI)
• Majority of use cases
• Physical data movement
• Bulk/batch, near real-time, real-time
• Advanced transformations
• Built-in data quality
Data Federation (Virtual View between DW and BI: Access, Merge, Deliver)
• Specific use cases
• No data movement / no copies
• Real-time federation
• SQL/XQuery-only transformations
• No data quality / business validation
Single-click deployment to PowerCenter (batch)
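The difference between federating and moving can be shown with SQLite standing in for both sides. A view answers from live source data, while a materialized table is a physical copy taken at load time. The table and view names below are hypothetical, and the sketch says nothing about how PowerCenter deploys a mapping to batch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "ACME", 100.0), (2, "ACME", 50.0)],
)

# Federate: a virtual view, evaluated against the source on every query.
conn.execute(
    "CREATE VIEW customer_totals AS "
    "SELECT customer, SUM(total) AS order_total FROM orders GROUP BY customer"
)

# Move: the same definition materialized as a physical table (a batch load).
conn.execute("CREATE TABLE dw_customer_totals AS SELECT * FROM customer_totals")

# A new order arrives in the source after the batch load.
conn.execute("INSERT INTO orders VALUES (3, 'ACME', 25.0)")

virtual = conn.execute(
    "SELECT order_total FROM customer_totals WHERE customer = 'ACME'"
).fetchone()[0]
materialized = conn.execute(
    "SELECT order_total FROM dw_customer_totals WHERE customer = 'ACME'"
).fetchone()[0]

print("virtual view:", virtual)
print("materialized copy:", materialized)
```

Both results come from one definition (`customer_totals`). That is the point of prototyping virtually first and materializing later without rewriting the logic.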
7. Scale & Perform
• Leverage the proven, high-performance Informatica engine
• Optimized SQL Query engine & graphical Query Plan
• High-performance Web services server
• Rich set of optimizations & caching mechanisms
  • Rule Based, Cost Based, Push Down, Early Projection, Early Selection, Semi-Join, Virtual Table & Result Set Caching
• Fine grained access control, WS-Security & pass-through security
  • Database, Schema, Table, Column, Row-Level (v9.5) security
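Three of the optimizations named above (push down with early selection, early projection, and result set caching) can be illustrated with a small sketch. It assumes a SQLite database stands in for a remote source, and it counts rows "transferred" to show why pushing the filter to the source matters. The function names and the cache are hypothetical and are not the Informatica optimizer.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE customer_table "
    "(customer_num INTEGER, customer_name TEXT, status TEXT)"
)
source.executemany(
    "INSERT INTO customer_table VALUES (?, ?, ?)",
    [
        (i, f"Customer {i}", "ACTIVE" if i % 10 == 0 else "INACTIVE")
        for i in range(1, 101)
    ],
)


def without_pushdown():
    """Pull every row and column, then filter in the virtualization layer."""
    rows = source.execute("SELECT * FROM customer_table").fetchall()
    result = [(num, name) for num, name, status in rows if status == "ACTIVE"]
    return result, len(rows)


def with_pushdown():
    """Push the filter (early selection) and column list (early projection)
    down to the source, so only the needed rows and columns come back."""
    rows = source.execute(
        "SELECT customer_num, customer_name FROM customer_table "
        "WHERE status = 'ACTIVE'"
    ).fetchall()
    return rows, len(rows)


_result_cache = {}


def cached_active_customers():
    """Result set caching: hit the source once, then reuse the result."""
    if "active" not in _result_cache:
        _result_cache["active"] = with_pushdown()[0]
    return _result_cache["active"]


naive_result, naive_transferred = without_pushdown()
pushed_result, pushed_transferred = with_pushdown()
print("rows transferred without push down:", naive_transferred)
print("rows transferred with push down:", pushed_transferred)
```

Both strategies return the same answer. The difference is how much data crosses the wire, which is what a cost-based optimizer tries to minimize.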
Data Virtualization Built On Data Federation Does 1 Box – Which 1?
[Diagram: seven steps around a shared Virtual Table with Common Metadata, across Business and IT: 1. Model, 2. Access & Merge, 3. Profile in RT, 4. Transform in RT, 5. Reuse Instantly, 6. Move or Federate, 7. Scale & Perform]
Do it Right – Avoid Costly Mistakes!
Enabling Rapid Development
  Avoid: 1000s of lines of code; Maintenance Nightmare (TIME, COST)
  Do: Model & metadata-driven environment; Sustain & Maintain (TIME, COST)
Analyzing & Profiling
  Avoid: Only source profiling, need extra processing; Many Iterations & Mistakes (TIME, COST, RISK)
  Do: Profile data AND logic anywhere; Get it Right 1st Time (TIME, COST, RISK)
Integrating with Quality
  Avoid: Hand-coding can’t do advanced transforms (SQL, XQuery, Simple Cleansing, Web Service); Limited Rules, No Data Quality (TIME, COST, RISK)
  Do: Leverage pre-built logic including quality (Virtual Table); Bake-in Quality (TIME, COST, RISK)
Leveraging Investments
  Avoid: Re-work, re-deploy & re-train every time; Re-invent the Wheel (TIME, COST)
  Do: Naturally extend your infrastructure; Re-purpose Logic & Skills (TIME, COST)
Scaling with Flexibility
  Avoid: Non-integrated technologies (EII, Optimizations); Overburden Data Virtualization (TIME, COST, RISK)
  Do: Virtualize or physically materialize in 1 tool; Prototype First & Then Scale (TIME, COST)
Data Virtualization in Action
Scenario – Big Company
ISSUES
Call center talk times increasing = scattered data + many screens
Time wasted in correcting inconsistent & inaccurate customer data
Agents can’t easily & quickly identify what products are owned
IMPACT
Can’t easily identify top customers to improve up-sell/cross-sell
Low customer satisfaction & growing customer attrition
High marketing costs without targeted campaigns
Demo – Big Company
Business needs a new report – NOW vs. months!
Quickly merge data from multiple systems & cleanse
Analysts know the data – want some self-service
Join CUSTOMER (Oracle CRM) & ORDER (file)
Get ORDER TOTAL for ACTIVE customers
Roles: Analyst; IT Architect / Developer
• Analyst defines business entity, profiles, defines rules & hands over to IT
• Integrate missing data, do data cleansing “on-the-fly,” validate
• IT enriches the business entity & publishes for BI tool, portal or batch
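The demo scenario (join CUSTOMER from a CRM database with ORDER from a file, then get ORDER TOTAL for ACTIVE customers) can be sketched as follows. SQLite stands in for Oracle CRM and an in-memory CSV stands in for the order file. The column names, sample rows and the `order_totals_for_active` helper are assumptions for illustration only.

```python
import csv
import io
import sqlite3
from collections import defaultdict

# Stand-in for the CRM database holding CUSTOMER.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customer (customer_id INTEGER, name TEXT, status TEXT)")
crm.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "ACME", "ACTIVE"), (2, "Globex", "INACTIVE"), (3, "Initech", "ACTIVE")],
)

# Stand-in for the ORDER flat file.
ORDER_FILE = io.StringIO(
    "customer_id,amount\n"
    "1,100.00\n"
    "1,50.50\n"
    "2,75.00\n"
    "3,20.00\n"
)


def order_totals_for_active(crm_conn, order_file):
    """Join CUSTOMER (database) with ORDER (file); total per ACTIVE customer."""
    totals = defaultdict(float)
    for row in csv.DictReader(order_file):
        totals[int(row["customer_id"])] += float(row["amount"])
    active = crm_conn.execute(
        "SELECT customer_id, name FROM customer "
        "WHERE status = 'ACTIVE' ORDER BY customer_id"
    ).fetchall()
    return [(name, round(totals.get(cid, 0.0), 2)) for cid, name in active]


report = order_totals_for_active(crm, ORDER_FILE)
for name, total in report:
    print(name, total)
```

The inactive customer's orders are read from the file but never reach the report, because the ACTIVE filter is applied on the CUSTOMER side of the join.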
Why Informatica?
Why Informatica?
Gartner Magic Quadrant for Data Integration Tools, 2011
“The ability to switch seamlessly and transparently between delivery modes (bulk / batch vs. granular real-time vs. federation) with minimal rework will be key for IT organizations seeking to develop a successful data integration strategy.”
– Ted Friedman, VP Distinguished Analyst, Gartner
Forrester Wave: Data Virtualization, Q1 ‘12
“With v9, Informatica advanced its capabilities with on-the-fly data quality and profiling, a model-driven approach to provisioning data services, performance enhancements, cloud integration, common metadata, and role-specific tools.”
– The Forrester Wave: Data Virtualization, Q1 2012
Power of The Platform
ONLY INFORMATICA COMBINES…
THE BEST OF “DATA INTEGRATION” (SOPHISTICATION)
THE BEST OF “DATA VIRTUALIZATION” (AGILITY)
…INTO ONE SOLUTION THAT REUSES SKILLS
Only Informatica Provides ONE Solution for Data Integration and Federation
[Diagram: Virtual View (Access, Transform, Deliver) between DW and BI, with callouts: Prototype First; Early Business Involvement; Analyze & Profile Data & Logic Anytime; Advanced Transformations & Data Quality; Move to DW or Instantly Reuse as SQL/WS]
• Single environment for both data integration and data federation
• No data movement / no copies – but can easily reuse virtual views for batch
• Early & iterative business (analyst) involvement, efficient collaboration
• Pre-built library of rich ETL-like advanced data transformations
• Integrated real-time, on-the-fly data profiling & data quality
Next Steps & Q&A
Have the Conversation with the Business!
Business IT
1. Identify a Critical Project in Your Company
2. Involve the Business Early & Often
3. Bake-In Quality & Support Advanced Logic
4. Demonstrate Business Value Early
5. Self-Service + Data Virtualization = ROI
New data & reports take too long… “YOU” can now do it in DAYS!