CASE STUDY: Dan Myers, Farmers Insurance Group
Data Governance Step 1 = Metadata, Business Terminology & IT Portfolio as Pre-MDM Phase
© Dan Myers 2012
Agenda
1. Typical MDM Hurdles
2. Using the Focus Funnel
3. Funnel Results
   a) Metadata Management
   b) Tiered Dictionary & Stewardship
   c) Systems Portfolio Documentation
   d) ABC & DQ Monitoring/Data Profiling
4. Conclusion
Typical MDM Hurdles
An MDM implementation faces many hurdles. The few listed here are ones where the work to overcome them yields success in other data management areas while also preparing the company for MDM implementation.
1. ??? 2. ?? 3. ?.?
4. ?? 5. ????
Using the Focus Funnel
So much to do in so little time. Where do you start?
Funnel Criteria:
1. Synergies with other existing data management needs (easier executive buy-in).
2. Will reduce time and cost of future MDM project.
3. Relates to an intangible infrastructure & culture change that will take significant time.
Funnel Results
Key to Success                        Mechanism to Overcome Hurdle
Ability to Understand Source Data     Metadata Management
Consensus of Business Terminology     Tiered Dictionary & Stewardship
Appropriate Sourcing                  Systems Portfolio Documentation
Data Completeness & Accuracy          ABC & DQ Monitoring/Profiling
Why these?
Mechanism to Overcome Hurdle, and why:

Metadata Management
• Synergies with existing needs
• Reduces future research time for integration effort
• Decreases complexity due to vast number of disparate systems
• Identify (dis)similarity of source data
• Storage of Master Metadata supports sustainability

Tiered Dictionary & Stewardship
• Improved project speed using business stewards to resolve business rules
• Mechanism to prepare org. for integration of processes
• Build ownership and buy-in of stakeholders

Systems Portfolio Documentation
• Decreases complexity due to vast number of disparate systems
• Establish roadmap of short-term & long-term sources
• Improved project speed using SME resources documented by system

ABC & DQ Monitoring/Profiling
• Synergies with existing DQ pain points
• Reduced financial costs due to write-offs
• Increased match rate of master data
• Manage expectations of outcomes
• Ensure standardization of business processes
Metadata Management for MDM
• All new projects have metadata guideline requirements embedded in project costs, so that all new fields have comprehensive business descriptions.
• Retroactive business metadata collection by data management team working hand-in-hand with functional data stewards.
• Increasing usage of enterprise metadata repository based on value-added functionality.
Tiered Dictionary & Stewardship for MDM
Customer Related Attribute Identification
Farmers Enterprise Dictionary (FED) [Stewarded by Enterprise Data Steward & Approved by the FED Committee]
Cross-functional Dictionary [Stewarded by EDM with review of terms by stakeholder functions & Approved by the Enterprise Data Steward]
Function Specific Dictionary [Stewarded by one representative per Product Line/Function]
• Specialty
• Property
• Claims
• Agency/Marketing
• Commercial Auto
• Commercial Multi-Peril
• Commercial Work Comp
• PL Auto
• PL Home
• PL Umbrella
• etc.
Systems Portfolio Documentation for MDM
MDM Project Questions
1. What data feeds currently exist?
2. What attributes are available from each source?
3. Is the source system legacy or modern?
4. At what frequency is the data available?
5. What is the storage type, format and data movement method of existing feeds?
6. Is the metadata repository connected to the source system and, if not, is it possible with the existing license?
7. Who is the business owner and IT point of contact for each system?
Answers Provided by Systems Portfolio Documentation
Systems Portfolio Documentation for MDM (graphical metadata)
Value-Added Diagramming Components to Consider:
• Database technology info, to know what types of data transfer are possible
• Data query/BI tool type
• Metadata repository connectability
• Modernization/retirement status based on border color
Visio Layers: Transactional Systems, Integration/Warehousing, BI/Data Delivery
Systems Portfolio Documentation for MDM (tabular metadata)
Systems Level Metadata Collected
• System ID (surrogate key)
• Primary Business Function
• Business Owner & IT point of contact
• System Name
• System Description
• Category (Operational, Integration, Delivery)
• Data Storage Technology
• Data Delivery/Query Tool
Data Transmission Level Metadata
• Source System Name
• Target System Name
• Method of Data Movement (e.g. FTP, MF File, ETL)
• Most Granular Frequency of Data Movement
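As an illustration only, the two metadata levels listed above could be held in a pair of simple record types and queried to answer some of the MDM project questions (which feeds exist, who to contact). This is a minimal sketch, not the tooling used at Farmers: the class names, field names, helper functions and sample entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    system_id: int                  # surrogate key
    system_name: str
    system_description: str
    primary_business_function: str
    business_owner: str
    it_contact: str
    category: str                   # "Operational", "Integration", or "Delivery"
    storage_technology: str
    query_tool: str


@dataclass(frozen=True)
class Transmission:
    source_system: str
    target_system: str
    movement_method: str            # e.g. "FTP", "MF File", "ETL"
    frequency: str                  # most granular frequency of data movement


def feeds_from(transmissions, source_name):
    """Return every documented feed that leaves the given source system."""
    return [t for t in transmissions if t.source_system == source_name]


def contacts_for(systems, system_name):
    """Return (business owner, IT contact), or None if the system is undocumented."""
    for s in systems:
        if s.system_name == system_name:
            return (s.business_owner, s.it_contact)
    return None


# Hypothetical portfolio entries, invented for illustration only.
systems = [
    System(1, "PolicyAdmin", "Policy administration system", "Underwriting",
           "Owner A", "Contact A", "Operational", "Mainframe files", "None"),
    System(2, "Warehouse", "Enterprise data warehouse", "Reporting",
           "Owner B", "Contact B", "Integration", "Relational database", "BI tool"),
]
transmissions = [
    Transmission("PolicyAdmin", "Warehouse", "ETL", "Daily"),
    Transmission("PolicyAdmin", "ClaimsMart", "FTP", "Weekly"),
    Transmission("Warehouse", "ClaimsMart", "ETL", "Daily"),
]

policy_feeds = feeds_from(transmissions, "PolicyAdmin")
```

Keeping system-level and transmission-level records separate mirrors the two lists above, so a feed can be documented even before its target system has a full entry.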
ABC & DQ Monitoring/ Data Profiling for MDM
1. Point-to-Point Audit Balance and Control: comparison of row counts and dollar amounts between source system and target system (aka file trailer with checksums).
2. DQ Monitoring: application of business-defined acceptable domains to data movements.
3. Data Profiling: profiles of data sets kept in line with attribute-level metadata in the repository, and/or ad hoc profiles of specific feeds.
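To make the third item concrete, a data profile for a single attribute might record counts of rows, nulls and distinct values plus the observed range. The sketch below is an assumption about what such a profile could contain; the function name, the chosen statistics and the sample values are hypothetical and not taken from the presentation.

```python
def profile_column(values):
    """Summarize one column: row count, null count, distinct values, min and max."""
    non_null = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }


# Hypothetical sample of a numeric code column with one missing value.
state_profile = profile_column([29, None, 88, 63, 88, 88])
```

A profile like this could be stored next to the attribute's business description in the repository, or run ad hoc against a single feed.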
1. Point-to-Point Audit Balance and Control

Source Table 1 (PK: PLCY_ID, COV, STATE)
PLCY_ID    COV   STATE   FT_PREM   FEES   CO_NAME
39004958         29      1600      100    Tom's Place
39004959   1     NULL    1000      200    Jack's Place
39004959   2     88      1000      NULL   Jack's Place
39004960         63      2000      300    Sue's Place
39004961         88      20000     100    Arnold's Place
39004962         88      24000     100    Nori's Place
Record Counts: 5    Record Amounts: FT_PREM 49600, FEES 800

Premium Target Table (PK: PLCY_ID, COV, STATE)
PLCY_ID    COV   STATE   FT_PREM   CO_NAME
39004958         29      1600      Tom's Place
39004959   2     88      1000      200
39004960         63      2000      Sue's Place
39004961         88      20000     Arnold's Place
39004962         88      24000     Nori's Place
Record Counts: 4    Record Amounts: FT_PREM 48600

Premium Target Error Table
PLCY_ID    COV   STATE   FT_PREM   FEES   CO_NAME
39004959         88      1000      200    Jack's Place
Record Counts: 1    Record Amounts: FT_PREM 1000, FEES 200

Fees Target Table (PK: PLCY_ID)
PLCY_ID    COV   STATE   FEES
39004958         29      100
39004959   1     NULL    200
39004959   2     88      200
39004960         63      300
39004961         88      100
39004962         88      100
Record Counts: 4    Record Amounts: FEES 1000

Fees Target Error Table
PLCY_ID    STATE   FEES
39004959   88      200
Record Counts: 1    Record Amounts: FEES 200

• When comparing two files that are the same granularity, record counts for the file must be compared as well as dollar amounts for any dollar columns.
• Is the sum of all dollar amounts in each column of the output file the same as the input file? In this case it is not, so we know that something happened. After researching the issue, we can see that a row for PLCY_ID 39004959 is missing.
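The comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the checks as implemented at Farmers: it assumes rows rejected during the load land in an error table, so the source totals should equal target plus error totals, and it treats NULL amounts as zero. The function names and the sample rows are hypothetical.

```python
def control_totals(rows, amount_columns):
    """Return the record count and per-column sums for a list of row dicts.

    NULL amounts (None) are treated as zero, which is an assumption of this sketch.
    """
    totals = {col: 0 for col in amount_columns}
    for row in rows:
        for col in amount_columns:
            totals[col] += row.get(col) or 0
    return len(rows), totals


def balance(source, target, errors, amount_columns):
    """Compare source control totals against target plus error-table totals.

    Returns a list of discrepancy messages; an empty list means the load balances.
    """
    src_count, src_sums = control_totals(source, amount_columns)
    tgt_count, tgt_sums = control_totals(target, amount_columns)
    err_count, err_sums = control_totals(errors, amount_columns)

    problems = []
    if src_count != tgt_count + err_count:
        problems.append(
            f"record count: source {src_count} vs target+error {tgt_count + err_count}"
        )
    for col in amount_columns:
        loaded = tgt_sums[col] + err_sums[col]
        if src_sums[col] != loaded:
            problems.append(f"{col}: source {src_sums[col]} vs target+error {loaded}")
    return problems


# Hypothetical rows: policy 3 is silently dropped between source and target.
source = [
    {"PLCY_ID": 1, "FT_PREM": 1600},
    {"PLCY_ID": 2, "FT_PREM": 1000},
    {"PLCY_ID": 3, "FT_PREM": 2000},
]
target = [{"PLCY_ID": 1, "FT_PREM": 1600}]
errors = [{"PLCY_ID": 2, "FT_PREM": 1000}]

problems = balance(source, target, errors, ["FT_PREM"])
```

Here `problems` holds one message for the record count and one for `FT_PREM`, which is the signal to go and find the missing row.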
2. DQ Monitoring
• Purpose: develop domains and ranges of acceptable values for each attribute to flag anomalies for further investigation by data stewards
• Classes of Monitoring Rules: Identifier, Date, Code, Amount, Rate, Indicator, Quantity
• These business rules are built directly in the ETL with reporting from a separate schema.
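A domain or range rule of the kind described above might look like the following. The presentation states that the real rules live in the ETL with reporting from a separate schema; this standalone Python sketch only illustrates the idea. The rule builders, the attribute names, the specific domains and ranges, and the sample rows are all hypothetical.

```python
def in_domain(allowed):
    """Build a rule that accepts only values from a fixed set (Code, Indicator)."""
    allowed = frozenset(allowed)
    return lambda value: value in allowed


def in_range(low, high):
    """Build a rule that accepts non-null values between low and high (Amount, Rate)."""
    return lambda value: value is not None and low <= value <= high


# Hypothetical rules: one Code-class attribute and one Amount-class attribute.
RULES = {
    "STATE": in_domain(range(1, 100)),
    "FT_PREM": in_range(0, 1_000_000),
}


def flag_anomalies(rows, rules):
    """Return (row index, attribute, value) for every value that breaks its rule."""
    flagged = []
    for index, row in enumerate(rows):
        for attribute, is_acceptable in rules.items():
            value = row.get(attribute)
            if not is_acceptable(value):
                flagged.append((index, attribute, value))
    return flagged


# Hypothetical data movement with a missing code and a negative amount.
rows = [
    {"STATE": 29, "FT_PREM": 1600},
    {"STATE": None, "FT_PREM": 1000},
    {"STATE": 88, "FT_PREM": -5},
]
anomalies = flag_anomalies(rows, RULES)
```

The flagged tuples are what a data steward would review; nothing here rejects or corrects a row on its own.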
Questions