cdmp study group · 6/24/2020 · it not an official, dama international authorized training...
TRANSCRIPT
CDMP Study Group
SESSION 11 – Chapter 10: Reference & Master Data
June 24, 2020
Mary Lynn Early, VP Marketing & Communications
Email: [email protected]
Welcome To CDMP Study Group!☺
WE WILL BEGIN MOMENTARILY
Agenda
• Facilitator
• Introductory Note
• Chapter 10 – Reference & Master Data
  • Introduction
  • Guiding Principles
  • Comparison of the Two
  • Concepts of Reference and Master Data
  • Key Processing Steps
  • Data Sharing Architecture
  • Reference and Master Data Activities
  • Implementation Guidelines
  • Important Metrics
• Q & A
• Next Session
New England Data Management Community
Facilitator
Mary Lynn Early
▪ I work for Accenture
▪ As a Data Governance Consultant
▪ Primarily in healthcare and life sciences
CONTACT INFO:
EMAIL: [email protected]
PHONE: 617-943-4371
: /IN//marylynnearly
Introductory Note
➢ This study group is offered as a service of DAMA New England for DAMA New England members. It is not an official, DAMA International authorized training course because DAMA-I has not yet created an authorized trainer program.
➢The purpose of this group is to help prepare members to take the CDMP. We will do so by reviewing the content of chapters of the DMBOK2.
➢The chapter makes no claims for the effectiveness of the sessions or the ability of participants to pass the CDMP exam after having attended. In fact, you should plan on doing a lot of individual study to pass the exam.
Chapter 10: Reference & Master Data
Reference and Master Data Management includes ongoing reconciliation and maintenance of core critical shared data to enable consistent use across systems of the most accurate, timely, and relevant version of truth about essential business entities.
Reference and Master Data Introduction
Reference Data Management
Provides control over domain values & definitions
• Codes & descriptions
• Classifications
• Mappings
• Hierarchies
“List of Values (LoV)”
“Taxonomy”
“Cross Reference”
Master Data Management
Entails control over master values & identifiers that enable consistent use across systems
Single version of:
• Customers, Accounts
• Materials
• Products
“Golden Record”
“Version of Truth”
“Master Values”
Enterprise HWY
Both provide critical context for transaction data
Goals and Guiding Principles
Goals:
• Ensure Reference and Master Data are complete, consistent, current and authoritative across the organization
• Enable sharing across enterprise functions and applications
• Lower cost and reduce complexity of data usage and integration through standards, common data models and integration patterns
Guiding Principles:
• Shared Data – Reference and Master Data are managed so that they are shareable across the organization.
• Ownership – Both belong to the organization and require a high level of stewardship.
• Quality – Both require ongoing quality monitoring and governance.
• Stewardship – Business Data Stewards are accountable for controlling and ensuring the quality of the data.
• Controlled Change – Both require defined processes for managing change.
  • Master Data is the best understanding of what is accurate and current; matching rules should be applied with caution and oversight, and changes should be reversible.
  • Changes to Reference Data should follow a defined process and be communicated before they are implemented.
• Authority – Master Data should be replicated only from the system of record. A system of reference may be required to enable sharing across an organization.
Differences Between Master and Reference Data
A. Master Data
B. Reference Data
C. Both
• This type of data could be a subset of the other. (B)
• This type of data provides context for transaction data. (C)
• This type of data usually has fewer rows and columns. (B)
• This type of data usually reduces the risk of ambiguous identifiers. (A)
• This type of data requires a trusted version of truth for each instance of conceptual entities. (A)
• This type of data is shared and should be managed at the Enterprise level. (C)
• This type of data often resides outside of the organization. (B)
Exploring Reference Data Concepts
Taxonomies
• Enable content classification and multi-faceted navigation to support BI
• Maintain hierarchical classifications using super and sub relations
• Ex: NAICS – North American Industry Classification System; Thesaurus
Code    Description       Parent Code
44000   Retail Trade      44000
44500   Food & Beverage   44000
…
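The parent-code column above is what makes a taxonomy navigable: each code points to the code one level up. A minimal sketch in Python follows, assuming the two rows shown on the slide and assuming that a top-level code lists itself as its own parent. The names `ROWS`, `build_index` and `path_to_root` are illustrative only and do not come from the DMBOK.

```python
# Hypothetical rows modeled on the slide: (code, description, parent_code).
# Assumption: a top-level code lists itself as its own parent.
ROWS = [
    ("44000", "Retail Trade", "44000"),
    ("44500", "Food & Beverage", "44000"),
]


def build_index(rows):
    """Map each code to its (description, parent_code) pair."""
    return {code: (desc, parent) for code, desc, parent in rows}


def path_to_root(index, code):
    """Return descriptions from the given code up to its top-level ancestor."""
    path = []
    seen = set()
    # Stop when a code repeats, which covers the self-parented top level
    # and also guards against accidental cycles in the data.
    while code not in seen:
        seen.add(code)
        desc, parent = index[code]
        path.append(desc)
        code = parent
    return path


index = build_index(ROWS)
print(path_to_root(index, "44500"))
```

Walking from a sub-classification to its ancestors is the kind of roll-up a BI tool performs when it aggregates by industry.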
Ontologies
• Characterize data; relate to information beyond boundaries of the organization
• Ex: Content Management; Topic Map (concepts in a domain and the relationships that hold them together)
Lists
• Simple cross-reference: US = United States
• Expanded list with definitions
• Languages may affect list structure
Proprietary/Internal Ref. Lists
• Common words or terms for a value that may be shared across systems
• Ex: Account Status
• “List of Values” (LOV)
Industry Ref Data
• Industry or Gov’t bodies that establish codes
• Ex: International Classification of Diseases – Diagnosis and Procedure codes (ICD-10)
Geographic or Geo-Statistical
• Ex: Census Bureau reports describe population demographics, etc., which are used for marketing purposes
• MSA (Metropolitan Statistical Area)
Computational
• Foreign exchange calculations rely on managed time-stamps
• Typically third-party provided
Metadata about Reference Data
• To ensure lineage and currency are understood and maintained, track the details about a reference data set, such as: Name, Data Provider, Provider Source, Version Number, Version date, etc.
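One way to picture this metadata is as a small record kept alongside each reference data set. The sketch below is an assumption-laden illustration: the field names mirror the list above, while the class name `ReferenceDataSetInfo`, the helper `is_stale`, and all sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ReferenceDataSetInfo:
    """Metadata tracked for one reference data set (fields from the slide)."""
    name: str
    data_provider: str
    provider_source: str
    version_number: str
    version_date: date


def is_stale(info, today, max_age_days):
    """Flag a data set whose current version is older than the allowed age."""
    return (today - info.version_date).days > max_age_days


# Hypothetical internal list; every value here is a placeholder.
account_status = ReferenceDataSetInfo(
    name="Account Status",
    data_provider="Internal (Finance)",
    provider_source="Stewardship workbook",
    version_number="3",
    version_date=date(2020, 1, 1),
)

print(is_stale(account_status, date(2020, 6, 24), max_age_days=90))
```

A currency check like `is_stale` is only as good as the refresh policy behind `max_age_days`; volatile sets such as exchange rates would need a far shorter window than an internal status list.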
Exploring Master Data Concepts
Data about business entities: a real-world object (person, organization, place or thing)
Common Types:
Party
• Individuals and organizations
• Roles: Customers, patients, vendors, etc.
• Challenges: complexity of roles, number of sources, customer engagement
Product
• Products / Services
• PLM – Product Lifecycle Mgmt.
• PDM – Product Data Mgmt.
• ERP – Enterprise Resource Planning
• MES – Manufacturing Execution Systems
• CRM – Product data in Customer Relationship Mgmt.
Legal or Financial Structure
• Contracts
• Centralized and combined – support negotiations / MSAs
• Typically in ERP:
  • Chart of Accounts
  • Cost Centers
  • Profit Centers
Location
• Ability to track and share location info, such as:
  • Addresses
  • Plants, facilities
• Location Reference Data can support this – geopolitical data: countries, states, counties, etc.
System of Record – authoritative system where data is created/captured and maintained
System of Reference – where reliable records are retained to support transactions and analysis; e.g. MDM, Data Sharing Hubs, Data Warehouses
Exploring Master Data, continued
An MDM System should be managed to provide:
• A “Trusted Source” - the best version of the truth
• A Single View or 360° View
The “Golden Record”
• Encompasses data from multiple source systems
• Rules are developed for the matching and merging processes and the formulation of the final “record”
Master Data Management (MDM) defined by Gartner: “…a technology-enabled discipline…business and IT together…to ensure uniformity, accuracy, stewardship, semantic consistency, and accountability of the enterprise’s shared Master Data assets…”
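To make the formulation of the final "record" concrete, here is a minimal sketch of one possible merge rule, assuming a simple "most recently updated non-empty value wins" policy per attribute. Real MDM tools support far richer rules; the function name `build_golden_record`, the record layout, and the sample data are all hypothetical.

```python
from datetime import date


def build_golden_record(source_records):
    """Pick, per attribute, the non-empty value from the most recently
    updated source, and remember which source supplied it."""
    golden = {}
    lineage = {}
    # Process oldest first so newer non-empty values overwrite older ones.
    for rec in sorted(source_records, key=lambda r: r["updated"]):
        for attr, value in rec["attributes"].items():
            if value:  # skip None and empty strings
                golden[attr] = value
                lineage[attr] = rec["source"]
    return golden, lineage


# Hypothetical records for the same person held in two source systems.
records = [
    {"source": "CRM", "updated": date(2020, 3, 1),
     "attributes": {"name": "Pat Lee", "phone": "", "email": "pat@example.com"}},
    {"source": "ERP", "updated": date(2020, 1, 15),
     "attributes": {"name": "Patricia Lee", "phone": "555-0100", "email": None}},
]

golden, lineage = build_golden_record(records)
print(golden)
print(lineage)
```

Keeping the `lineage` map next to the merged values is one way to show where each part of the golden record came from, which helps stewards review or reverse a merge.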
MDM Key Processing Steps
Data Model Management
• Terms and definitions
• Align these across the Enterprise
• Should be a “business” versus “technical” view
Data Acquisition
• Planning, evaluating and incorporating sources
• Receive, assess (profile), evaluate, pilot, DQ, Stewardship, integrate
Data Validation, Standardization and Enrichment
• Validation – i.e. cleansing
• Standardization – conforms to defined standards
• Enrichment – adding attributes to inform entity resolution, e.g. D&B DUNS numbers
Entity Resolution & Identifier Management
• Process of matching*
• Deterministic vs. Probabilistic
• False Negatives and False Positives
• Identity resolution – rules-based formation of the final record
• Essential to maintain history to “Undo” if needed
• ID Management – Global and x-Ref Info
Data Sharing & Stewardship
• Work processes that support the resolution of match fall-out or weak/bad matches
• Over time, lessons learned can improve rules in the MDM technology
Note – *Different match rules require different workflows:
- Duplicate identification – DS manual merging
- Match-link rules – identify and cross-reference matches without updating the content of the cross-referenced record
- Match-merge rules – match and merge into a single unified record; if rules apply across systems, update
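The difference between deterministic and probabilistic matching, and the way weak matches fall out to stewards, can be sketched as follows. This is an illustration under stated assumptions, not how any particular MDM product works: the `tax_id` field, the 0.90 and 0.70 thresholds, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all choices made for the example.

```python
from difflib import SequenceMatcher


def deterministic_match(a, b):
    """Exact match on a shared identifier (a hypothetical tax_id field)."""
    return bool(a.get("tax_id")) and a.get("tax_id") == b.get("tax_id")


def probabilistic_score(a, b):
    """Average string similarity across name and city, from 0.0 to 1.0."""
    fields = ("name", "city")
    ratios = [
        SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio()
        for f in fields
    ]
    return sum(ratios) / len(ratios)


def route(a, b, auto=0.90, review=0.70):
    """Decide the workflow: automatic match, steward review, or no match."""
    if deterministic_match(a, b):
        return "match"
    score = probabilistic_score(a, b)
    if score >= auto:
        return "match"
    if score >= review:
        return "steward_review"  # weak match: a Data Steward decides
    return "no_match"


def match_link(xref, master_id, source, source_id):
    """Match-link: record a cross-reference without changing source content."""
    xref.setdefault(master_id, set()).add((source, source_id))


# Hypothetical records from two systems.
rec_a = {"tax_id": "", "name": "Acme Corporation", "city": "Boston"}
rec_b = {"tax_id": "", "name": "Acme Corporation", "city": "Boston"}

xref = {}
if route(rec_a, rec_b) == "match":
    match_link(xref, "G-1", "crm", "C-17")
    match_link(xref, "G-1", "erp", "E-88")
print(xref)
```

Setting the `auto` threshold too low invites false positives (distinct entities linked), while setting `review` too high invites false negatives (true duplicates missed). Because `match_link` only adds entries to the cross-reference, removing an entry is enough to undo a bad link.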
Data Sharing Architecture
https://www.slideshare.net/inforacer/how-to-identify-the-correct-master-data-subject-areas-tooling-for-your-mdm-initiative1
Source/Adapted: How to identify the correct Master Data subject areas & tooling for your MDM initiative, Christopher Bradley
DW or DM
Example of hub-and-spoke architecture
• MD hub handles interactions with spoke items such as source systems, data stores, etc.
• Minimizing the number of integration points
• A local hub can extend and scale the MD hub
MDM Tools and Techniques
• Requires data integration, data remediation, Op Data Stores, data sharing hubs or specialized MDM applications
• Vendors offer subject area solutions or custom services/solutions
• Packaged solutions to jumpstart a program
Data Sharing Architecture – Three Approaches
Registry
• Description: Index points to Master Data in the various systems of record. Master Data is managed in systems of record.
• Pro: Easier to implement; fewer changes to systems of record.
• Con: Complex queries to assemble MD from multiple systems; more business rules needed.
Transaction Hub
• Description: Applications interface with the hub to access and update Master Data. Master Data exists only in the Transaction Hub, which is the System of Record for Master Data.
• Pro: Enables better governance and provides a consistent source of Master Data.
• Con: Costly to remove functionality to update Master Data from existing systems of record.
Consolidated
• Description: A hybrid of Registry and Transaction Hub. Systems of record manage Master Data local to their applications. Master Data is consolidated and made available from a data-sharing hub, which is the system of reference for Master Data.
• Pro: Eliminates the need to access directly from the systems of record. Provides an Enterprise view with limited impact on systems of record.
• Con: Entails replication of data; there will be latency between the hub and systems of record.
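The Registry approach can be illustrated with a small sketch: the registry holds only pointers, and a view of the entity is assembled at read time from each system of record. Everything named here (`CRM`, `BILLING`, `REGISTRY`, `assemble_view`, the identifiers and values) is hypothetical and stands in for real systems.

```python
# Hypothetical systems of record, each keyed by its own local identifier.
CRM = {"C-17": {"name": "Pat Lee", "email": "pat@example.com"}}
BILLING = {"B-204": {"name": "Patricia Lee", "balance": 120.50}}
SYSTEMS = {"crm": CRM, "billing": BILLING}

# The registry stores only pointers: global ID -> (system, local ID) pairs.
REGISTRY = {"G-1": [("crm", "C-17"), ("billing", "B-204")]}


def assemble_view(global_id):
    """Query each system of record at read time; no master data is copied."""
    view = {}
    for system, local_id in REGISTRY[global_id]:
        view[system] = SYSTEMS[system][local_id]
    return view


print(assemble_view("G-1"))
```

The two systems still disagree on the name, and the caller has to reconcile that on every read. That is the "complex queries, more business rules" cost listed for Registry; a Consolidated hub would instead store a reconciled copy and accept replication latency.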
Reference Data (RD) and Master Data (MD) Activities
Define Drivers and Requirements
• RD & MD – For both, business and technical drivers are usually to create efficiencies and save costs
Evaluate / Assess Sources
• RD – Identify internal and/or external sources
• MD – Assess structure and quality of source data
Define Architectural Approach
• RD – Consider volatility, frequency of updates & consumption model
• MD – Consider data consumption & sharing models; align with business strategy
Model the Data Sets
• RD – Model codes, descriptions and other info needed, such as metadata to track change history
• MD – Model the subject area; introduce canonical model
Define Stewardship and Maintenance
• RD – Ensure ref data remains current; facilitate maintenance across business units
• MD – Stewards resolve close/questionable matches and other match issues
Establish Policies
• RD – Roadmap and requirements for adoption
• MD – Deploy unidirectional loops to maintain consistency
Implementation Guidelines
• Both are forms of data integration – implementation principles for data integration and interoperability apply (Chapter 8)
• KEY – proper and strong Data Governance
• Adhere to Master Data Architecture
• Monitor data movement
• Manage Reference Data change
• Establish Data Sharing Agreements
Anticipate required organization change and utilize Data Governance and Data Stewardship.
Reference and Master Data Metrics
• Data quality and compliance
• Data change activity
• Data ingestion and consumption
• Service Level Agreements (SLAs)
• Data Steward coverage
• Total Cost of Ownership (TCO)
• Data sharing volume and usage
Tips:
• Vendor tools may provide canned reports and dashboards
• Consider the metrics when you define your processes
• Don’t forget about the operational metrics IT can provide
Q & A
STUDY GROUP MATERIALS
Study group presentations will be posted on the CDMP Study Group page of the DAMA New England website, in the Schedule & Agenda section.
NEXT SESSION
Date Topic Facilitator
February 19th Chapter 1: Data Management Tony Mazzarella
March 4th Chapter 2: Data Handling Ethics Lynn Noel
March 18th Chapter 3: Data Governance Sandi Perillo-Simmons
April 1st Chapter 4: Data Architecture Laura Sebastian-Coleman
April 15th Chapter 5: Data Modeling & Design Lynn Noel
April 29th Chapter 6: Data Storage & Operations Karen Sheridan
May 13th Chapter 7: Data Security Laura Sebastian-Coleman
May 27th Chapter 8: Data Integration & Interoperability Mary Early
June 10th Chapter 9: Document & Content Management Sandi Perillo-Simmons
June 24th Chapter 10: Reference & Master Data Mary Early
July 8th Chapter 11: Data Warehousing & Business Intelligence Tony Mazzarella
July 22nd Chapter 12: Metadata Management Karen Sheridan
August 5th Chapter 13: Data Quality Laura Sebastian-Coleman
August 19th Chapter 14: Big Data & Data Science Nupur Gandhi
September 2nd Chapter 15: Data Management Maturity Assessment Laura Sebastian-Coleman
September 16th Chapter 16: Data Management Organization & Role Expectations Agnes Vega
September 30th Chapter 17: Data Management & Organizational Change Management Tony Mazzarella
October 7th Final Review Tony Mazzarella
HOMEWORK
Next up: Chapter 11 - Data Warehousing and Business Intelligence
What are the key factors to consider when defining the population approach to your Data Warehouse and Business Intelligence solution?