Master Data Summit 2014: Data Quality Management with SAP Information Steward
Talk by Richard Follmann, Boehringer Ingelheim Pharma GmbH & Co. KG, at the Master Data Summit on 18.09.2014 in Walldorf
Master Data and Data Quality Management with Information Steward
Master Data Summit 2014
Walldorf, September 18th 2014
Agenda
• Introduction – Boehringer Ingelheim
• Master Data Management & Data Quality
• Data Migration & Data Quality Processes
• Schema Matching Support - Data Profiling
• Data Deduplication - Match Review
• System Consolidation - Match Review
• Data Quality Reporting - Scorecards and Rules
• Summary
• Architecture Overview
• Lessons Learned / Improvement Suggestions
Master Data Summit 2014 - 18.09.2014 – Richard Follmann / Nils Schweikhard
Boehringer Ingelheim in brief (based on 2013)
• Family-owned global corporation
• Founded 1885 in Ingelheim, Germany
• Employees worldwide: 47 492
• R&D worldwide at 7 sites
• Expenses for R&D: EUR 2 743 million
• Net Sales: EUR 14 065 million
• Net Sales per employee: 296 K€
• 20 production facilities in 13 countries
• Affiliated companies: 145 worldwide
Boehringer Ingelheim Center: our headquarters in Ingelheim, Germany
Net Sales is driven by Prescription Medicine
Master Data Management: Quality processes require quality data
Business processes require high-quality data. This means we need to make sure
• to migrate only quality data into MDM (e.g. with ERP rollouts) and
• to keep up the high data quality for daily business.
Data Quality Management: In Data Migration and in Daily Business
DQM in Data Migration
Overview of the (Master) Data Migration Process
[Figure: Select Data, Data Cleansing, Preparation, Loading Cycle 1, Loading Cycle 2, Loading Cycle 3, Load to Prod.; cycle steps: Map, Extract, Transform, Load, Test]
Challenges
• Completeness and correctness of schema mapping and matching
• Data quality of source system
• Cleansing of duplicate records in source system
• Consolidation of duplicate records in source and target system
• Validation and verification of the data migration success
DQM in Daily Business
Quality Checks implemented in SAP MDM
• Checks single records with ~200 business rules
• Checks for duplicates on single records
• Quality checks are triggered by create and change processes
• Not all checks are performed with bulk data loads
Challenges
• Identification & fixing of data quality issues
• View on Data Quality levels over time
Data Migration & Data Quality Processes
[Figure: process diagram. Deduplication Process: 1. Legacy Connection, 2. Profiling, 3. Deduplication, 5. Data Quality, 6. Consolidation. Data Migration Process: A. Transformation (Migration Team), B. Migration (Migration Team). Annotations: Schema Matching & Mapping; Deduplication within Source System; Deduplication between Source and Target; Data Quality Reporting]
Schema Matching Support: Information Steward - Data Profiling
Benefits
• Quick way to analyze data
• Shows patterns in formatting
• Eases identification of
  • Deviations
  • Relations between data
  • Completeness of mandatory columns
  • Unused columns
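The kind of column profile listed above can be approximated outside the tool. The following is a minimal sketch, not Information Steward's implementation: the function names, the pattern notation (digits as 9, letters as A), and the sample records are all assumptions made for illustration.

```python
import re
from collections import Counter


def value_pattern(value):
    """Abstract a value into a format pattern: digits -> 9, letters -> A."""
    pattern = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", pattern)


def profile_column(rows, column):
    """Return completeness, distinct count, and format patterns for one column."""
    values = [(row.get(column) or "").strip() for row in rows]
    filled = [v for v in values if v]
    return {
        "completeness": len(filled) / len(values) if values else 0.0,
        "distinct": len(set(filled)),
        "patterns": Counter(value_pattern(v) for v in filled),
    }


# Hypothetical sample records, not taken from the talk.
rows = [
    {"postal_code": "55216", "fax": ""},
    {"postal_code": "69190", "fax": ""},
    {"postal_code": "D-55216", "fax": ""},
    {"postal_code": "", "fax": ""},
]

postal = profile_column(rows, "postal_code")
fax = profile_column(rows, "fax")

print(postal["completeness"])    # share of filled values in a mandatory column
print(dict(postal["patterns"]))  # deviating formats show up as rare patterns
print(fax["completeness"])       # 0.0 marks a candidate unused column
```

A rare pattern next to a dominant one points to a formatting deviation, and a completeness of zero points to an unused column.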
Brief Step Overview with effort estimate (pie chart shares as extracted: 5%, 5%, 10%, 80%)
1. Replication of Data
2. Create IS Project and Run Profiling
3. Train Profile Users
4. Support Schema Matching
Data Deduplication: Information Steward – Match Review
Data Deduplication: Information Steward – Confirmation View
Before:
• Access Database
• No concurrent work possible
• High development effort
• No two step approval
Now:
• Information Steward Match Review
• Concurrent work possible through locking
• Low development effort
• Two step approvals
• Comparison of values
Data Deduplication: Step Overview
Benefits
• Deduplication of records within the same system
• Reduces amount of data to consolidate and migrate
• Result: IDa mapped to IDb
• Ignore IDa during data migration
• Map data referencing IDa to IDb
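How the result "IDa mapped to IDb" might be applied during migration can be sketched as follows. This is an illustration under assumptions, not the team's migration code: the record layout, the `customer_id` reference field, and the function names are hypothetical.

```python
def resolve(record_id, id_map):
    """Follow IDa -> IDb links until a surviving ID is reached."""
    seen = set()
    while record_id in id_map:
        if record_id in seen:
            raise ValueError(f"cyclic mapping at {record_id}")
        seen.add(record_id)
        record_id = id_map[record_id]
    return record_id


def apply_dedup(records, references, id_map):
    """Drop duplicate records (IDa) and repoint references to the survivor (IDb)."""
    survivors = [r for r in records if r["id"] not in id_map]
    remapped = [
        dict(ref, customer_id=resolve(ref["customer_id"], id_map))
        for ref in references
    ]
    return survivors, remapped


# Hypothetical match review result: A1 was confirmed as a duplicate of B7.
id_map = {"A1": "B7"}
records = [{"id": "A1"}, {"id": "B7"}, {"id": "C3"}]
orders = [
    {"order": 1, "customer_id": "A1"},
    {"order": 2, "customer_id": "C3"},
]

survivors, remapped_orders = apply_dedup(records, orders, id_map)
print([r["id"] for r in survivors])
print([o["customer_id"] for o in remapped_orders])
```

Resolving chains (A mapped to B, B mapped to C) and rejecting cycles is a design choice of this sketch; the talk does not say how such cases were handled.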
Brief Step Overview with effort estimate (pie chart shares as extracted: 30%, 5%, 55%, 10%)
1. Develop Matching Strategy
2. Train Match Review Users
3. Perform Match Review
4. Clean Duplicates in Legacy System
System Consolidation – Confirmation View
Benefits
• Deduplication of records between different systems
• Avoids duplicates in target system
• Transformation to same data structure required
• Result: IDa mapped to IDb
• Ignore IDa during data migration
• Map Data Referencing IDa to IDb
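The requirement that both systems be transformed to the same data structure before matching can be illustrated with a small sketch. All field names and mappings below are hypothetical; in practice they would come out of the schema matching step.

```python
# Hypothetical field mappings per system: source field -> target field.
FIELD_MAPS = {
    "legacy": {"cust_no": "id", "name_1": "name", "city_1": "city"},
    "target": {"customer_id": "id", "customer_name": "name", "town": "city"},
}


def to_target_structure(record, system):
    """Rename fields to the common structure and normalise the values."""
    mapped = {
        target_field: str(record.get(source_field, "")).strip().upper()
        for source_field, target_field in FIELD_MAPS[system].items()
    }
    mapped["system"] = system  # keep the origin for the later ID mapping
    return mapped


legacy_record = {"cust_no": "A1", "name_1": "Muster GmbH ", "city_1": "Ingelheim"}
target_record = {"customer_id": "B7", "customer_name": "MUSTER GMBH", "town": "ingelheim"}

a = to_target_structure(legacy_record, "legacy")
b = to_target_structure(target_record, "target")

# Only after the transformation are the two records comparable field by field.
print(a["name"] == b["name"] and a["city"] == b["city"])
```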
Brief Step Overview with effort estimate (pie chart shares as extracted: 15%, 25%, 5%, 45%, 10%)
1. Transform Data to Target Data Structure
2. Develop Matching Strategy
3. Train Match Review Users
4. Perform Match Review
5. Map Consolidated Data
Data Quality Reporting: Information Steward – Scorecards
General Idea
• Measure and report Data Quality
• Develop Business Rules
• Group Rules in Scorecards
• Data violating rules lowers score
• Export failed data to fix issues
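The general idea (rules, scores, failed data export) can be sketched in a few lines. This is an illustration only: the two rules, the unweighted mean used as the scorecard score, and the CSV layout are assumptions and do not reflect how Information Steward calculates or stores scores.

```python
import csv
import io


def is_five_digits(value):
    return len(value) == 5 and value.isdigit()


# Hypothetical business rules: name -> predicate, True when the record passes.
RULES = {
    "country_filled": lambda r: bool(r.get("country")),
    "postal_code_5_digits": lambda r: is_five_digits(r.get("postal_code") or ""),
}


def evaluate(records, rules):
    """Return the pass rate per rule and the list of failed (rule, id) pairs."""
    scores, failed = {}, []
    for name, rule in rules.items():
        passed = 0
        for record in records:
            if rule(record):
                passed += 1
            else:
                failed.append({"rule": name, "id": record["id"]})
        scores[name] = passed / len(records) if records else 1.0
    return scores, failed


def scorecard_score(scores):
    """Unweighted mean of the rule scores (an assumption of this sketch)."""
    return sum(scores.values()) / len(scores)


def export_failed(failed):
    """Serialise failed records as CSV so they can be handed over for fixing."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["rule", "id"])
    writer.writeheader()
    writer.writerows(failed)
    return buffer.getvalue()


records = [
    {"id": "1", "country": "DE", "postal_code": "55216"},
    {"id": "2", "country": "", "postal_code": "5521"},
    {"id": "3", "country": "DE", "postal_code": "ABCDE"},
    {"id": "4", "country": "FR", "postal_code": "75001"},
]

scores, failed = evaluate(records, RULES)
print(scores, scorecard_score(scores))
print(export_failed(failed))
```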
Brief Step Overview with effort estimate (pie chart shares as extracted: 20%, 40%, 10%, 30%)
1. Transform Data to Target Data Structure
2. Develop Rules and Scorecard
3. Develop Failed Data Export
4. Fix Problems using Failed Data
Scorecard Setup: Rule Binding
1. Setup Information Steward Project
2. Add Table View with Data Filter
3. Develop Data Quality Rules
4. Bind Rules to View
5. Add Rule Bindings to Scorecard
6. Run Rule Calculation Task
7. Use Failed Data to correct Problems
Scorecard Setup: Data Quality Reporting
Summary
• Information Steward and Data Services support data quality in both MDM data migration and daily business
• Reusable rules and modules for data migration and daily business
• Integrated approach compared to previously used solutions
Quality Management Tasks with required setup effort in relation (development only); pie chart shares as extracted: 50%, 5%, 20%, 25%
• Data Quality Reporting (Scorecards)
• Schema Matching Support (Data Profiling)
• Data Deduplication (Match Review)
• System Consolidation (Match Review)
Architecture Overview
[Figure: architecture diagram spanning Data Services and Information Steward. Labels: Source, Extract, Stage DB, Transform, Load, Matching Strategy, Rule View, Project View, Business Rules, Scorecard, Match Review, Match Groups, Failed Data, Match Results, Profiling]
Lessons Learned (Data Services and Information Steward 4.1)
• Information Steward can address most Data Quality Requirements
• Preparation work in Data Services required
• Getting actionable results is sometimes difficult
• Data Services and Information Steward work well together
• Information Steward requires a reliable ETL tool to reach its full potential
• Developing Matching Strategies requires Business and Data Knowledge
• Developers should be knowledgeable in Database Systems and SQL
Improvement Suggestions (Data Services and Information Steward 4.1)
• Implement Dynamic Data Filter in Data Insight Projects
  • Scorecards for different Data Areas require their own project
  • Currently about 20 very similar Projects in use; the only differences are filters in source data
  • New rules and views need to be bound to each project manually
• Improve User and Group Management
  • Match Review user requires Data-Insight-User Group, hardcoded and undocumented logic
• Improve Failed Data Handling
  • Within Scorecard drill-down, the number of failed records is limited to 500 (200 by default)
  • Failed Data connection and database required for more/all failed records
• Implement Transport System (Promotion Management) for projects
  • It is not possible to transport projects between environments through Promotion Management
  • Transport has to be done manually by exporting and importing XML files
Thank you
Richard Follmann, Boehringer Ingelheim Pharma GmbH & Co. KG
Nils Schweikhard, Boehringer Ingelheim Pharma GmbH & Co. KG
Backup
Matching Strategy Example
Screenshot of Data Services Matching Strategy for Deduplication
Steps in Overview:
• Filter relevant Data (exclude Records with Posting Block and Deletion Flag)
• Prepare Break Groups (group similar account groups)
• Concatenate related Fields (Name1, Name2 and Street1, Street2)
• Apply Matching Strategies (Match for different Criteria in different Strategies)
• Associate Matching Groups
• Insert Entry into Status Table (required for Match Review)
• Apply sort and filter non-relevant Records
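The steps above can be sketched in plain Python to show how they fit together. This is not the Data Services matching strategy from the screenshot: `difflib.SequenceMatcher` stands in for the match transform, the threshold of 0.8 and all field names are assumptions, and the status table entry required for Match Review is omitted.

```python
from difflib import SequenceMatcher
from itertools import combinations


def match_key(record):
    """Concatenate related fields (names and streets) into one comparison string."""
    parts = [record["name1"], record["name2"], record["street1"], record["street2"]]
    return " ".join(p.strip().lower() for p in parts if p.strip())


def find_match_groups(records, threshold=0.8):
    # Filter: exclude records with posting block or deletion flag.
    active = [r for r in records if not r["posting_block"] and not r["deleted"]]

    # Break groups: only records of the same account group are compared.
    break_groups = {}
    for r in active:
        break_groups.setdefault(r["account_group"], []).append(r)

    # Associate matches into groups with a small union-find structure.
    parent = {r["id"]: r["id"] for r in active}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for group in break_groups.values():
        for a, b in combinations(group, 2):
            ratio = SequenceMatcher(None, match_key(a), match_key(b)).ratio()
            if ratio >= threshold:
                parent[find(a["id"])] = find(b["id"])

    groups = {}
    for r in active:
        groups.setdefault(find(r["id"]), []).append(r["id"])
    # Non-relevant records (groups of one) are filtered out; the rest is sorted.
    return sorted(sorted(ids) for ids in groups.values() if len(ids) > 1)


def rec(id_, name, street, group="CUST", deleted=False):
    """Build a hypothetical sample record."""
    return {"id": id_, "name1": name, "name2": "", "street1": street, "street2": "",
            "account_group": group, "posting_block": False, "deleted": deleted}


records = [
    rec(1, "Muster GmbH", "Hauptstrasse 1"),
    rec(2, "Muster GmbH", "Hauptstr. 1"),
    rec(3, "Beispiel AG", "Bahnhofweg 12"),
    rec(4, "Muster GmbH", "Hauptstrasse 1", deleted=True),
    rec(5, "Muster GmbH", "Hauptstrasse 1", group="VEND"),
]

match_groups = find_match_groups(records)
print(match_groups)
```

Break groups keep the number of pairwise comparisons manageable, at the cost of missing duplicates that sit in different account groups, as record 5 does in the sample data.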