data migration and mdm - dmm5

London - New York - Dubai - Mumbai - Hong Kong 2012 The Data Migration Challenge: Elements including MDM by Wael Elrifai

Upload: wael-elrifai

Post on 09-Dec-2014

DESCRIPTION

Presentation given by Wael Elrifai to Data Migration Matters 5 in London on 25 May, 2012.

TRANSCRIPT

Page 1: Data Migration and MDM - DMM5

London - New York - Dubai - Mumbai - Hong Kong 2012

The Data Migration Challenge: Elements including MDM

by Wael Elrifai

Page 2: Data Migration and MDM - DMM5

Confidential - not for redistribution

Understanding Migration

“Migration is not just about moving the data…

It’s about making the data work.”

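Assumptions versus TRUTH:

• Assumption: few source systems. Truth: many more source systems.
• Assumption: specific data formats. Truth: data in unknown formats.
• Assumption: all data available. Truth: needed data is missing.
• Assumption: documented system interfaces. Truth: unknown system interfaces.
• Assumption: valid data. Truth: poor data quality.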

Page 3: Data Migration and MDM - DMM5

These Application Projects have a Common Critical Requirement: Migrating Data

• Application Implementation: from legacy into new application
• Application Upgrade: from previous to new version
• Application Instance Consolidation: from multiple instances to fewer
• M&A Integration: from acquired systems
• Legacy Retirement: from legacy into new systems
• Outsourcing: from company to outsourcer

Page 4: Data Migration and MDM - DMM5

Project Overview: Data Migration to ERP

• 200+ source systems

• Operating in 14 languages

• Different sets of users working in different regions with different applications and languages

• Highly fragmented lines of business and regions

• No concept of Data Governance or Master Data Management

• No concept of Data Quality Analysis

Page 5: Data Migration and MDM - DMM5

Methodology: Practical Data Migration

[Diagram. Axis labels: Technical, Business Engagement. Modules:]

• Migration Strategy & Governance (MSG)
• Migration Design & Execution (MDE)
• System Retirement Plan (SRP)
• Key Data Stakeholder Management (KDSM)
• Legacy Decommissioning (LD)
• Gap Analysis & Mapping (GAM)
• Landscape Analysis (LA)
• Data Quality Rules (DQR)

Other labels: Profiling Tool, Data Quality Tool, Migration Controller, DMZ

Page 6: Data Migration and MDM - DMM5

Team Structure & Communications

• Primary Business Team located in Hong Kong
  • 6 Business Analysts
  • 2 Technical Coordinators

• Primary Development Team in Hong Kong
  • 8 Developers

• Offshore Development Team in Mumbai, India
  • 4 Developers

• Unique Aspects
  • Agile/Scrum meetings conducted via Video Conference
  • Email usage limited
  • Assigned secretary with output immediately posted on Wiki for comments
  • Team Lead makes final “closing comments” on each issue

Page 7: Data Migration and MDM - DMM5

Application Migration: The Anatomy of Failure

Long development times
• Often many months or even years without any ‘visible’ signs of progress
• CAUSE: failure to properly decompose development into practical, achievable and meaningful ‘phases’ and ‘sprints’

Long development times – for individual ETL flows
• Due to extensive and repeated re-working of ETL code
• Resulting from failures in unit testing and user acceptance testing
• CAUSE: poor and inadequate design

Considerable variations in quality & efficiency of code
• Increasing time for new/other developers to modify code
• CAUSE: failure to define and firmly enforce standards

Page 8: Data Migration and MDM - DMM5

Application Migration: The Anatomy of Failure

Minimal attention to data cleansing or standardisation
• Leading to longer report development times
• And greater inconsistencies in reporting
• Effectively pushing data quality management to report developers
• AND information consumers
• CAUSE: failure to recognise importance and impact of employing a systematic approach to managing data quality

Poor reliability
• Arising from ‘unexpected’ variations in structure or content of incoming source files
• CAUSE: failure to cater for Murphy’s Law – i.e. the most frequent and most obvious causes of

Page 9: Data Migration and MDM - DMM5

Application Migration: The Anatomy of Failure

Poor performance
• CAUSE: failure to give due consideration to scale and complexity of ETL processes – during the design stage
• CAUSE: failure to fully understand the underlying causes – when performance problems become evident
• CAUSE: failure to routinely monitor performance or undertake adequate capacity planning – to cater for gradual or step-change increases in data volumes

Page 10: Data Migration and MDM - DMM5

Application Migration: The Anatomy of Success

[Process diagram. Labels, in the order extracted:]

• Forensic Data Analysis
• Detailed Functional Design
• Detailed Technical Design
• Peer Review, Technical Authority
• Build, Unit Test
• Peer Review, Technical Authority
• UAT
• Including Master Schedule
• System Test
• Entity Level ‘MAPPING’
• Hosted
• Enforce Standards & Reusable Components
• Data Model Design & ETL Phasing
• Code Translations & Master Schedule
• Sprint Go Live
• Soft Go Live
• REUSABLE COMPONENTS
• TEMPLATES

Page 11: Data Migration and MDM - DMM5

Abstraction of Rules & Reusability

• Automated ETL mapping development based on source system metadata

• Automated data type verification for flat file data based on header information

• Consistent use of a single value mapping table abstracted to accommodate data migration rules

• Single generic “run script” which operates based on a simple dependency matrix

• This is more important in operational than in data migration situations, but it becomes important when dependencies are complex
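The generic “run script” driven by a dependency matrix could look roughly like the sketch below. It is an illustration only: the job names, the `DEPENDENCIES` mapping and the `run_all` helper are hypothetical, and Python's standard `graphlib` module stands in for whatever scheduler the project actually used.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency matrix: each job maps to the jobs that must finish first.
DEPENDENCIES = {
    "load_customers": set(),
    "load_products": set(),
    "load_orders": {"load_customers", "load_products"},
    "load_invoices": {"load_orders"},
}


def run_all(dependencies, run_job):
    """Run every job once, in an order that respects the dependency matrix."""
    completed = []
    # static_order() raises graphlib.CycleError if the matrix contains a cycle.
    for job in TopologicalSorter(dependencies).static_order():
        run_job(job)
        completed.append(job)
    return completed


# Usage: the callback is where a real ETL flow would be launched.
order = run_all(DEPENDENCIES, run_job=lambda job: print(f"running {job}"))
```

Keeping the dependencies in data rather than in code means a new ETL flow is added by editing the matrix, not the script.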

Page 12: Data Migration and MDM - DMM5

Data Migration Guiding Principles: Creating Data Standards to Reduce Complexity

Current State Environments
• Source Tables
• Source Attributes
• Upstream Sources
• Downstream Targets
• Create as-is Domain Model
• Create as-is Entity Model

Future State Environments
• Enterprise Apps Data Models
• ODS Data Models

Initial Common Data Standards and creation of:
• Initial DQ Program
• Initial Data Ownership Model
• Initial Data Management
• Governance Processes

Rationalize Domains and Entities across Current State and Future State Environments

Map in all Application Environments to the Enterprise Standard

Rationalize Attributes across Current State and Future State Environments

Common Data Standards Enterprise Representation
• Create Domain Model
• Create Entity Model
• Create Entity Relationship Model

Create Entity Attribute Model

[Diagram labels: ETC, ODS, DW, Customer]

Page 13: Data Migration and MDM - DMM5

Sample Architecture Diagram – Subset of Project

Page 14: Data Migration and MDM - DMM5

Data Governance - 14-step (sounds like a lot!) program

1. Review available documentation on process flow

2. Agree scope of work

3. Plan and schedule meetings

4. Produce initial definitions of DG framework

5. Assemble DG working group

6. Engage with Data Stewards

7. AS-IS business process analysis

8. AS-IS data analysis

9. Define TO-BE processes

10. Define TO-BE system requirements

11. Assemble business glossary

12. Introduce standardization of business-critical data items

13. Implement DG KPI tracking and DQ exception reporting

14. Conduct periodic audit of business processes

Page 15: Data Migration and MDM - DMM5

Master Data Management - Highlights

• DON’T FORGET! Your data migration tools may end up being the real-time MDM Hub communication logic/tools as well, so design appropriately
• Simplified load tools that can be used by analysts
• Custom match/merge algorithms
• Gray’s coding
• 14 languages including European, Middle Eastern (right-to-left), East Asian
• Some transliteration rules built using statistical regression on 30m customer records
• Match/merge algorithms with discrete variables and user interface
• Ability to allow users to target hotspots
• Variable “sliders” - Meshed variables for hotspot analysis allow for more merge sensitivity flexibility
• Data analysis for predicting why false positives and false negatives occur
• Role of each source
• Types of data that most often “fail”
• Google Maps/Address integration for matching (cloud), data enhancement, and more
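As a rough illustration of match/merge scoring with adjustable “sliders”, the sketch below weights per-field similarities and compares the result with a threshold. It is not the project's algorithm: the field names, weights and threshold are hypothetical, and `difflib.SequenceMatcher` from the Python standard library is used as a stand-in similarity measure.

```python
from difflib import SequenceMatcher

# Hypothetical per-field weights: the "sliders" a user could adjust.
WEIGHTS = {"name": 0.5, "address": 0.3, "phone": 0.2}


def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_score(rec_a, rec_b, weights=WEIGHTS):
    """Weighted average of the field similarities."""
    total = sum(weights.values())
    return sum(w * similarity(rec_a[f], rec_b[f]) for f, w in weights.items()) / total


def is_match(rec_a, rec_b, threshold=0.85, weights=WEIGHTS):
    """Treat two records as merge candidates when the score reaches the threshold."""
    return match_score(rec_a, rec_b, weights) >= threshold


a = {"name": "Acme Trading Ltd", "address": "1 High Street", "phone": "02070000000"}
b = {"name": "ACME TRADING LTD", "address": "1 high street", "phone": "02070000000"}
c = {"name": "Zenith Foods", "address": "99 Harbour Road", "phone": "85230000000"}
```

Raising the threshold reduces false positives at the cost of more false negatives, and changing a field's weight shifts how much that field influences the merge decision.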

Page 16: Data Migration and MDM - DMM5

Testing

• Custom “Black Box” testing tool designed
• Specialized for database tests
• Requires addition of some metadata columns to data model
  • S_ID
  • Batch_ID
  • LOAD_TIME
• Automatic storage of test cases
  • Test data
  • Documentation on test being run
  • User metadata
  • Test metadata
• Sets database into a known state
• Can generate test data
• Single unified interface
• Fault-Fix workflow management
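One way the metadata columns could support returning a database to a known state is sketched below with an in-memory SQLite database. This is an assumption-laden illustration, not the custom tool itself: the `customer` table, the helper functions, and the meanings given to `S_ID` (source identifier) and `Batch_ID` (load run) are guesses, since the slide does not define them.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
# Hypothetical target table carrying the three metadata columns named on the slide.
# Assumed meanings: S_ID = source system, Batch_ID = load run, LOAD_TIME = UTC load time.
conn.execute(
    "CREATE TABLE customer (name TEXT, S_ID TEXT, Batch_ID INTEGER, LOAD_TIME TEXT)"
)


def load_batch(conn, batch_id, source_id, names):
    """Insert rows and stamp each one with the batch metadata."""
    now = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "INSERT INTO customer (name, S_ID, Batch_ID, LOAD_TIME) VALUES (?, ?, ?, ?)",
        [(n, source_id, batch_id, now) for n in names],
    )


def rollback_batch(conn, batch_id):
    """Return the table to its state before the given batch was loaded."""
    conn.execute("DELETE FROM customer WHERE Batch_ID = ?", (batch_id,))


def row_count(conn):
    return conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]


load_batch(conn, 1, "CRM_HK", ["Alice", "Bob"])  # baseline data
load_batch(conn, 2, "CRM_HK", ["Carol"])         # batch under test
rollback_batch(conn, 2)                          # back to the known baseline
```

Because every row records which batch loaded it, a test run can be undone without restoring the whole database.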

Page 17: Data Migration and MDM - DMM5

Documentation

• Automated
• Driven by
  • Business requirements documented in
    • Custom testing tool
    • Wiki documentation
  • ETL tool metadata
  • Custom testing tool metadata

This is highly contingent on being able to enforce developer rules about documentation within tools.

Page 18: Data Migration and MDM - DMM5

Risk Mitigation

Extract data early
• Data should be seen immediately. We’ve seen problems come up because data didn’t conform to expectations.

Convert data early
• Our existing build will allow for the first conversion to take place within weeks for all objects.

Convert data often
• An iterative approach to both data quality and conversion allows for repeated analysis. This should be driven by development schedules rather than inversely by validation schedules that aren’t related to development time.

Use real data from the start
• Conversion team should have direct access to source systems, without a dependency on another team to create extracts.

Seek to incorporate external and up-to-date information about your Master Data
• Tools like Google’s business services, D&B, Bloomberg and others can help
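Checking an early extract against expectations, as the first point above recommends, can start very simply. The sketch below is illustrative only: the column names, the `EXPECTED` mapping and the `check_extract` helper are hypothetical, and it parses a CSV extract with the Python standard library.

```python
import csv
import io
from datetime import date

# Hypothetical expectations for one extract: column name -> parser that raises on bad input.
EXPECTED = {
    "customer_id": int,
    "signup_date": date.fromisoformat,
    "balance": float,
}


def check_extract(text, expected=EXPECTED):
    """Return (line_number, column, value) for every value that fails to parse."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    missing = set(expected) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        for column, parse in expected.items():
            try:
                parse(row[column])
            except (ValueError, TypeError):
                problems.append((line_number, column, row[column]))
    return problems


sample = "customer_id,signup_date,balance\n1,2012-05-25,10.50\nx2,25/05/2012,7\n"
problems = check_extract(sample)
```

Running a check like this on the first real extract surfaces format surprises while there is still time to adjust the mappings.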

Page 19: Data Migration and MDM - DMM5

Data Migration through Information Development

Lessons Learned

Prioritise Planning
• Define business priorities and start with quick wins
• Don't do everything at once – Deliver complex projects through an incremental programme
• “Chunks” need to be appropriate, based on elements like homogeneity of front-end, single sets of business users across geographies, language usage, etc.

Focus on the Areas of High Complexity
• Don't wait until the 11th hour to deal with Data Quality issues – Fix them early
• Follow the 80/20 rule for fixing data – Do this iteratively through multiple cycles
• Understand the sophistication required for Application Co-Existence and that in the short term your systems will get more complex

Keep the Business Engaged
• Communicate continuously on the planned approach defined in the strategy – The overall Blueprint is the communications document for the life of the programme
• Try not to be completely infrastructure-focused for long-running releases – Always deliver some form of new business functionality
• Align the migration programme with analytical initiatives to give business users more access to data
• Ensure that the Data Governance program has “teeth”

Page 20: Data Migration and MDM - DMM5

Peak Consulting UK Headquarters

90 Long Acre, Covent Garden, London WC2E 9RZ

T: +44 (0)20 7849 3422 F: +44 (0)20 7990 9478 www.peakconsulting.eu

Questions?
