using the information server toolset to deliver end to end traceability
Post on 14-Sep-2014
© 2014 IBM Corporation
Using the Information Server Toolset to deliver end to
end traceability
Tommie Hallin, Rob Cooper
Information Server User Group 2014
Introduction
Tommie Hallin, Senior Information Architect – IBM GBS, BAO
Rob Cooper, Senior Information Management Consultant
Abstract
Using the Information Server Toolset to deliver end to end traceability
Tommie and Rob have used the Information Server Toolset on a number of analytics and data
warehousing projects to deliver end to end traceability. The presentation describes why end to
end traceability is important, what it involves and how to deliver it, and shares experiences and
best practices from projects and from many years of consulting.
End to end traceability – in the context of this presentation
FRONT LINE APPLICATIONS
OLAP
DATA INTEGRATION / DATA QUALITY / ETL
SOURCE SYSTEMS, DATA MARTS, MASTER DATA
DATA WAREHOUSE
Analytics
Data Integration
Data Warehouse
Understanding how to create value from data has been the focus of IBM’s analytics studies for 5 years
http://www-935.ibm.com/services/us/gbs/thoughtleadership/
2009: The intelligent enterprise and Breaking away with BAO (Defining analytics as a strategic asset)
2010: Analytics: The new path to value (Operationalizing analytics in sophisticated organizations)
2011: Analytics: The widening divide (Mastering analytic competencies)
2012: Analytics: The real world use of big data (Fundamentals of big data)
2013: Analytics: A blueprint for value (Extracting value from data and analytics)
2014:
• The emerging role of the chief data officer
• The intersection of big data and innovation
• Power of analytics to transform business outcomes
Analytics correlates to performance
Source: Analytics: The New Path to Value, a joint MIT Sloan Management Review and IBM Institute for Business Value study. Copyright © Massachusetts Institute of Technology 2010.
Top Performers are 5.4x more likely to use an analytic approach over intuition*
Organizations that lead in analytics outperform those who are just beginning to adopt analytics by 3x
*within business processes
Top Performers are more sophisticated in handling information
Source: Analytics: The New Path to Value, a joint MIT Sloan Management Review and IBM Institute for Business Value study. © Massachusetts Institute of Technology
Activity rated very well               Transformed organizations   Aspirational organizations   Difference
Capture information                    36%                         9%                           4x more likely
Aggregate information                  28%                         3%                           9x more likely
Analyze information                    34%                         4%                           8.5x more likely
Disseminate information and insights   21%                         2%                           10x more likely
Chart reflects percentage of respondents who rated their organizations’ ability to perform these tasks as “very well”
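The "more likely" multipliers follow from the two percentages given for each activity. A minimal Python sketch of that arithmetic is shown here, assuming the larger value in each pair belongs to Transformed organizations and the smaller to Aspirational organizations; the variable and function names are illustrative and not part of the study.

```python
# Percentages of respondents rating each activity "very well" (from the chart).
# Tuple order is (Transformed, Aspirational); that pairing is an assumption
# based on the larger value belonging to Transformed organizations.
ratings = {
    "Capture information": (36, 9),
    "Aggregate information": (28, 3),
    "Analyze information": (34, 4),
    "Disseminate information and insights": (21, 2),
}


def likelihood_ratio(transformed_pct, aspirational_pct):
    """How many times more often Transformed organizations rate an activity 'very well'."""
    return transformed_pct / aspirational_pct


ratios = {activity: likelihood_ratio(t, a) for activity, (t, a) in ratings.items()}

for activity, ratio in ratios.items():
    print(f"{activity}: {ratio:.1f}x")
```

The unrounded ratios for Aggregate and Disseminate come out slightly above the 9x and 10x shown on the slide, which suggests the slide rounds those two down.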
Transformed organizations master three competencies to drive sustainable competitive advantage
Source: The New Intelligent Enterprise, a joint MIT Sloan Management Review and IBM Institute for Business Value analytics research partnership. Copyright © Massachusetts Institute of Technology 2011.
Manage The Data
Managing the Information Landscape
Sources: legacy apps, dbs, xls / xml / flat, warehouse, external, custom
Business Initiatives: BI, Analytics, Data Discovery, Predictive, Optimization
Roles: Business Analysts, Executives, Enterprise Architects, Data Analysts, Subject Matter Experts, Data Warehouse Manager, Developer, DBA, System Architect, Data Steward
Phases: Understand, Manage, Act
Transformed organizations need to resist the urge to perfect the data
Source: The New Intelligent Enterprise, a joint MIT Sloan Management Review and IBM Institute for Business Value analytics research partnership. Copyright © Massachusetts Institute of Technology 2011.
Understand The Data
Profiling using Information Analyzer
Cleanse
Master
Monitor: Monitor the quality of your data in any place (database / in a data flow) and across systems
Understand: Assess the quality of your data
Data and Integration Modeling
Common understanding of the design
Database development requires a “blueprint” or model of business requirements.
Data integration designers and developers need that “blueprint” to ensure that requirements (i.e., sources, transformations, and targets) have been clearly communicated in a common, consistent manner.
The Modeling Paradigm
Model Type          Data                        Data Integration
Conceptual Model    Conceptual Data Model       Conceptual Data Integration Model
Logical Model       Logical Data Model          Logical Data Integration Model
Physical Model      Physical Data Model         Physical Data Integration Model
Implementation      Database                    DataStage Projects
Development Tools   InfoSphere Data Architect   DataStage Designer, Blueprint Director
Act On The Data
Trust and traceability enable action
Information Integration: ETL, Data Quality, Data Profiling
Source Systems, Data Marts, Silos
Front Line / BI Applications / Predictive Analytics
Data Lineage, Impact Analysis, Operational Monitoring
Information Governance, Business Definitions
– Key Business End Users
– Program Manager / Project Lead
– Governance Steward (SME)
– Security & Privacy Teams
– Operations
– Developers
– Modelers / Architects
– QA / Testing Teams
– Data Analyst
BI Reports and
Dashboards
Source
Systems
Data
Warehouse
ETL Developer
Data Modeler
BI Developer
Accuracy in Reporting
Deliver Information Efficiently
Measures and Metrics
Complex Data at the Speed of Business
Data Analyst
Business User
Common Understanding
Common shared metadata
Aligning different actions for efficient delivery
Trust in data – there is still a long way to go
Two thirds of the leaders express confidence in data
Transformed organizations that have confidence in the quality of data and analytics
Source: Analytics: A blueprint for value – Converting big data and analytics into results, IBM Institute for Business Value © 2013 IBM
Trust in data
Three characteristics that distinguish Transformed organizations most
Source: The New Intelligent Enterprise, a joint MIT Sloan Management Review and IBM Institute for Business Value analytics research partnership. Copyright © Massachusetts Institute of Technology 2011.
Percentage indicates Transformed respondents who rated themselves as highly effective at each key characteristic
Over to Rob
What does Information Server help to achieve?
• Simplify Integration
• Increase trust and confidence in information
• Increase compliance to standards
• Facilitate change management & reuse
Unified Metadata Management
Design, Operational
Developers, Subject Matter Experts, Data Analysts, Business Users, Architects, DBAs
Information Server Metadata Components
Metadata Management
Analyze / Understand
Data Lineage
Impact Analysis
Object Merge
Import/Export
Create / Manage
Read/Write
Metadata Server
Information Analyzer
Information Services Director
Metadata Asset Manager
DataStage
FastTrack
Business Glossary & BGA
MetaBridges
Cognos
InfoSphere Data Architect
Metadata Workbench
Third Party Tools
Information Server
Common
Metadata Repository
InfoSphere
Data Architect
(Data Model)
Information Analyzer (IA)
Source Data Profiling (tool)
Cognos Framework
Manager
(tool)
EDW /DM Repository
Business Glossary
(part of the Information
Server Common Metadata
Repository)
DataStage
ETL (tool)
Manage and Execute
DDL
BI Data Lineage Meta Data
(Reports and FM Packages)
Export
Target Data Model
Export Data
Models
Validate
Discover and adjust source metadata
Uses and Creates
Fast Track
Mappings (tool)
Export
DDL / XML
Deploy and
Execute Scripts
Use Source and
Target meta data
To create mappings
CVS / ClearCase Repository
Metadata workflow and Tools Overview
The overall aim of the Metadata workflow is to:
- Ensure that the Cognos reports are linked to Business Definitions, the Data Model and the Data Integration design, i.e. to enable design traceability and lookup of definitions
- Improve change management analysis, i.e. to enable impact analysis
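Both aims come down to walking a graph of linked assets: downstream from a source for impact analysis, upstream from a report for lineage. The sketch below illustrates the idea in plain Python. It is not how Information Server stores or queries metadata, and every asset name in it is hypothetical.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets that consume it.
# The names are invented for illustration and do not come from a real repository.
LINEAGE = {
    "src.CUSTOMER": ["job.load_customer"],
    "job.load_customer": ["edw.DIM_CUSTOMER"],
    "edw.DIM_CUSTOMER": ["fm.SalesPackage"],
    "fm.SalesPackage": ["report.SalesByCustomer", "report.CustomerChurn"],
    # A business term linked to the table it describes.
    "term.Customer": ["edw.DIM_CUSTOMER"],
}


def downstream(graph, start):
    """Impact analysis: every asset reachable from `start`, in breadth-first order."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def invert(graph):
    """Reverse every edge so that a traversal walks from a report back to its sources."""
    reverse = {}
    for node, targets in graph.items():
        for target in targets:
            reverse.setdefault(target, []).append(node)
    return reverse


def upstream(graph, start):
    """Lineage: every asset that feeds or describes `start`, directly or indirectly."""
    return downstream(invert(graph), start)


impacted = downstream(LINEAGE, "src.CUSTOMER")
sources = upstream(LINEAGE, "report.SalesByCustomer")
print(impacted)
print(sources)
```

In this toy graph a change to `src.CUSTOMER` reaches both reports, and tracing `report.SalesByCustomer` upstream reaches the source table and the linked business term, which is the kind of lookup the workflow is meant to enable.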
Information Server Data
Stage Metadata Repository
IA Metadata Repository
(Source Table Definitions)
Updates Source Model
Generate
Meta Data to
Data Stage
Automatic publish of ETL/
Data Lineage Meta Data
Cognos Content Store
(Metadata Repository)
FM
Packages
Cognos Report Studio
(tool)Reports
Version Control
Version Control
Import Source
Models
Version Control
Version Handling
Roles: BA, DM, DBA, ETL, BI
Source Databases
(Regular and Migration)
Read Terms from
Business Glossary
DBA
InfoSphere Metadata Asset
Manager
InfoSphere Data Architect (Manage & Understand)
• Data Models
– Sources (Regular / Migration)
– Targets (EDW / DM)
• Management
– Logical Data Models
– Physical Data Models
– Attribute Groups
– Generate DDL
– Reverse Engineer
• Governance
– Business Terminology
– Naming Models
– Domain Models
• Integration
– InfoSphere Metadata Asset Manager (IMAM)
– Business Glossary
• Challenges
– Data Type inconsistencies with Oracle
– Reverse Engineering source models
– Implemented Data Resources
– Date / Timestamp
– Integer
InfoSphere Business Glossary (Manage & Understand)
• Common Terminology
• Connect business with IT
• Associate terminology with assets
• Data Rules
– Definitions
– Visibility
– Understanding
• Greater visibility increases understanding and trust in the underlying solutions, the data and information they provide
• Governance
– Stewardship
– Architects, Analysts, Business
• Integration
– Import from files
– IDA
– Metadata Workbench
– Information Server assets
– Cognos
– BG Workflow
– Business Glossary Anywhere
• Challenges
– Category structure
– Business Organisation Governance
Business Lineage
BG Anywhere
Taxonomy
Business Terms
InfoSphere Information Analyzer (Understand)
• Data Profiling tool
– Understand the source data
– Regular ETL Sources
– Migration ETL Sources
• Integration
– Input for the mapping specifications
– Define and validate business rules (Data Rules)
– Publish Data Rules for use in DataStage
• Standard Analysis
– Column Analysis
– Primary Key Analysis
– Foreign Key Analysis
– Cross-Domain Analysis
• Overview of results in Data Quality Console
• Challenges
– Consolidate and document findings / conclusions for Mapping generation
– Limitations of analysis
– Some drill through limitations
– SQL
Analyze Structure, Content, Quality + Relationships of Data
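To make the Column Analysis and Primary Key Analysis bullets concrete, here is a minimal Python sketch of what such profiling computes. It is an illustration over a hypothetical sample of rows, not Information Analyzer's implementation or API.

```python
from collections import Counter

# Hypothetical sample rows standing in for an extract of a source table.
rows = [
    {"customer_id": 1, "country": "SE", "email": "a@example.com"},
    {"customer_id": 2, "country": "SE", "email": None},
    {"customer_id": 3, "country": "UK", "email": "c@example.com"},
    {"customer_id": 4, "country": "UK", "email": "c@example.com"},
]


def column_analysis(rows, column):
    """Basic column profile: row count, null count and number of distinct non-null values."""
    values = [row[column] for row in rows]
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
    }


def primary_key_candidates(rows, columns):
    """Columns whose values are never null and never repeat within the sample."""
    candidates = []
    for column in columns:
        profile = column_analysis(rows, column)
        if profile["nulls"] == 0 and profile["distinct"] == profile["rows"]:
            candidates.append(column)
    return candidates


columns = ["customer_id", "country", "email"]
profiles = {c: column_analysis(rows, c) for c in columns}
keys = primary_key_candidates(rows, columns)
print(profiles)
print(keys)
```

A candidate found this way only holds for the sample that was profiled, which is one reason the findings need to be consolidated and documented before they feed the mapping specifications.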
InfoSphere FastTrack (Manage & Understand)
• Source to Target Mapping Specifications
• Metadata available from the IS Metadata Repository
• Connection between Business and IT
• Mapping (design) also stored in the IS Metadata Repository
• Audit
• Integration
– Metadata Repository
– Metadata Workbench
• Challenges
– Efficiency
– MS Excel
Specification
Auto-generates DataStage jobs
Flexible Reporting
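A source to target mapping specification is useful for traceability only if every row refers to columns that actually exist in the shared metadata. The Python sketch below shows one way such a check could look. The `Mapping` structure, the `validate` helper and all column names are hypothetical and are not FastTrack's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """One row of a source to target mapping specification (hypothetical structure)."""
    source: str          # fully qualified source column
    target: str          # fully qualified target column
    rule: str = "copy"   # free-text transformation rule


def validate(mappings, source_columns, target_columns):
    """Return a list of problems: unknown columns and target columns left unmapped."""
    problems = []
    for m in mappings:
        if m.source not in source_columns:
            problems.append(f"unknown source column: {m.source}")
        if m.target not in target_columns:
            problems.append(f"unknown target column: {m.target}")
    mapped = {m.target for m in mappings}
    for column in sorted(target_columns - mapped):
        problems.append(f"unmapped target column: {column}")
    return problems


# Hypothetical metadata, standing in for what a shared repository would supply.
source_columns = {"CRM.CUST.ID", "CRM.CUST.NAME"}
target_columns = {"EDW.DIM_CUSTOMER.CUSTOMER_ID", "EDW.DIM_CUSTOMER.CUSTOMER_NAME"}

spec = [
    Mapping("CRM.CUST.ID", "EDW.DIM_CUSTOMER.CUSTOMER_ID"),
    # Deliberate typo in the source column to show what the check reports.
    Mapping("CRM.CUST.NAM", "EDW.DIM_CUSTOMER.CUSTOMER_NAME", rule="trim"),
]

problems = validate(spec, source_columns, target_columns)
print(problems)
```

Checks of this kind are much harder when the specification lives only in a spreadsheet, detached from the metadata it refers to.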
InfoSphere Metadata Asset Manager (Manage)
• Managed Metadata Import
– Metadata Bridges
– InfoSphere Data Architect
– Cognos
– Staging area for comprehensive impact analysis
• Metadata Management
– Administration of Metadata Repository
– Manage
  • Duplicate and disconnected Metadata
  • Relationships (LDM / PDM / Implemented Data Resources)
• Integration
– Metadata Repository
– IDA
– Cognos
– Other 3rd Party tools (BO, ERwin)
• Challenges
– LDM / PDM relationships
– Remove models for certain changes
– Metadata Interchange Server (Client or Server)
InfoSphere DataStage (Manage)
• DataStage consists of three different components
– Administrator
– Designer
– Director
• Develop and Run ETL
• Environment Variables
• Integration
– Published Data Rules from IA
– Table Definitions
– Metadata from Metadata Repository originally defined in IDA and imported via IMAM
– Operations Console
– Data Quality Console
• Challenges
– Application of development standards and guidelines to ensure End To End Data Lineage
– Use of the correct metadata from Metadata Repository
– Metadata management issues
  • Date / Time
  • Integer
Hundreds of Built-in Transformation Functions
Visually Designed Logic
Transform, Aggregate Data in Batch or Real Time
InfoSphere Metadata Workbench (Manage, Understand & Act)
• Manage and Understand
– Implemented Data Resources
– DataStage Jobs
– FastTrack Mappings
– Cognos Data Models and Reports
– Extended Data Sources / Extended Mappings
– Lineage Services
• Who
– Metadata Administrators
– Architects, Analysts
• Custom Queries
– Adherence to standards
– Validation of Data Lineage
• Information governance
– End to End traceability of solutions
– Data Model Implementation
– Cognos BI
– Understand complex environments
– Visibility and understanding
– Data Rules
• Data Lineage
– Impact Analysis
– Faster time to market
• Challenges
– Data Lineage (some performance tuning)
– Browser! (Firefox, Chrome, IE)
Design + Operational + Extended lineage
InfoSphere Operations Console (Understand & Act)
• Operations Console
– Job runtime activity
– Logs
– System Resources (CPU, Memory)
– Identify jobs that have Failed or Finished with Warnings
– Automated integration with DataStage
– Execute jobs / sequences
– Analyse trends
• Operations Database
– ETL Audit Information – available to Jobs
• Challenges
– SLA / OLA measurement
Information Server Administrator
Information project team (developers, analysts, administrators, architects, etc.)
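As a rough illustration of the two operational questions raised on this slide (which jobs need attention, and which runs exceeded an agreed duration), the Python sketch below works over hypothetical run records. The field names and status values are invented for the example and are not the schema of the Operations Database.

```python
from datetime import datetime, timedelta

# Hypothetical job run records; field names and statuses are assumptions
# made for this sketch, not the layout of any real operations database.
runs = [
    {"job": "load_customer", "status": "FINISHED",
     "start": datetime(2014, 9, 1, 2, 0), "end": datetime(2014, 9, 1, 2, 20)},
    {"job": "load_orders", "status": "FINISHED_WITH_WARNINGS",
     "start": datetime(2014, 9, 1, 2, 0), "end": datetime(2014, 9, 1, 3, 30)},
    {"job": "load_products", "status": "FAILED",
     "start": datetime(2014, 9, 1, 2, 5), "end": datetime(2014, 9, 1, 2, 6)},
]

NEEDS_ATTENTION = {"FAILED", "FINISHED_WITH_WARNINGS"}


def jobs_needing_attention(runs):
    """Names of jobs in `runs` that failed or finished with warnings."""
    return [r["job"] for r in runs if r["status"] in NEEDS_ATTENTION]


def sla_breaches(runs, limit):
    """Names of jobs whose elapsed run time exceeded the agreed limit."""
    return [r["job"] for r in runs if r["end"] - r["start"] > limit]


attention = jobs_needing_attention(runs)
breaches = sla_breaches(runs, timedelta(hours=1))
print(attention)
print(breaches)
```

The one hour limit is an arbitrary placeholder; in practice the threshold would come from the agreed SLA or OLA for each job.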
Summary
• Information Server can provide a single repository for your BI solution
• Design and implementation enables End to End Lineage and Traceability
• Trust and confidence in data and information
• Organisation and Governance
– BICC
– Data Quality Forums
– Architecture Forums
• Impact Analysis – new and existing solutions
– Faster time to market
• Teams using the same tools with the same information, talking the same language
– Architects / Analysts / Application Management / Business
– Consistent communication between business and IT
• Run time analysis
– Operations console
– Identify and resolve issues in operations
End to End Traceability enables...
• Trust and Understanding in solutions
• Provides confidence to decision makers, enabling the business to act!
• Or just wing it…