Copyright © 2012, SAS Institute Inc. All rights reserved. MASTER DATA MANAGEMENT IN THE AGE OF BIG DATA PRESENTED TO IRMAC MAY 15, 2013 STEVE PAPAGIANNIS [email protected] 416 307 4620


Page 1: Master Data Management in the Age of Big Data Data Management in... · DataFlux Data Management Web Studio DataFlux Web Studio Filter: Object types: „Rules, Terms‟ Filter DataFlux

Copyright © 2012, SAS Institute Inc. All rights reserved.

MASTER DATA MANAGEMENT IN THE AGE

OF BIG DATA

PRESENTED TO IRMAC – MAY 15, 2013

STEVE PAPAGIANNIS

[email protected]

416 307 4620


DEFINITIONS WHAT ARE MASTER AND BIG DATA???

Master data is information that is key to the operation of a business. It is the primary focus of the Information Technology (IT) discipline of Master Data Management (MDM), and can include reference data. This key business information may include data about customers, products, employees, materials, suppliers, and the like.

Source: Wikipedia

Big Data. When the volume, velocity and variety of data exceed an organization's storage or compute capacity for accurate and timely decision making.

Source: SAS Website


DATA MANAGEMENT MASTER DATA AND BIG DATA AS SUB-SPECIALTIES

Master Data is the high value information that is used most frequently across an enterprise.

• Example: Customers, Members, Groups, Organizations, Patients, Vendors, Providers, Citizens, Employees, Services, Accounts, Materials, Locations, Assets, etc.

Big Data is the low value density information generated by systems.

• Example: Security Logs, Call Detail Records, Sensor Information


WHY NOW???

• Master Data is getting richer
• Big Data is being heavily explored
• Costs to achieve this are dropping
• Hype cycle


THRIVING IN THE BIG DATA ERA

[Figure: data size, today versus the future, along four dimensions: volume, variety, velocity, value.]


Budgeting Engineering

Scattered Data

Contract Center

Hwy 410, Brampton, Acme Construction

Highway 410, Peel Region

Hwy 410, January, Kilometer 10-12, resurfacing

Highway 410, $75MM

Hwy 410, Brampton, Peel Region, Kilometer 10, January, $75MM, Acme Construction, resurfacing

Data Integration

Data Quality

Data Model

Business Services

Stewardship Console

Data Governance

Identity Management

Reporting

Data Profiling

Metadata Discovery

Business Rule Definition

Entity Definition


SURVIVING CONTRIBUTORS INTO MASTER RECORD

Contributing records:

Source | Key name   | Key       | First Name | Middle | Last Name | DOB        | SSN         | Address
DW     | ID         | 3721B     | Willaim    | James  | Corp.     | April 12   | 56349123    | 3224 Pkwy G, Los Osos
SFA    | ID         | 30391-244 | William    | James  | Crown     | 04/12/1939 | 563-49-1234 | 123 Oak St., Eves, IL 30319
ONLINE | Person ID  | 14239     | Bubba      | J.     |           | April 12   |             | [email protected]
ERP    | Member ID  | 30391244  | William    | J.     | Crowne    | 4-12-39    | 563491234   | 123 Oak St., Eves, IL
CRM    | ConsumerID | 30391-244 | William    | James  | Crown     | 04/12/39   | 563-49-1234 | 123 Oak St., Eves, IL 30319

Master record:

EID: 1001
Source Keys: 3721B, 30391-244, 14239, 30391-244, 30391244
Survived Fields: William James Crowne, 04/12/1939, 563491234, 123 Oak Street, Eves, CA 91403
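
The survivorship step on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DataFlux implementation: the source trust order in `PRIORITY`, the helper names `survive` and `normalize_ssn`, and the rule itself (take the first non-empty value from the most trusted source) are all hypothetical. The records are abbreviated from the slide.

```python
# Contributing records, abbreviated from the slide (blank fields are empty strings).
contributors = [
    {"source": "DW", "key": "3721B", "first": "Willaim", "last": "Corp.", "ssn": "56349123"},
    {"source": "SFA", "key": "30391-244", "first": "William", "last": "Crown", "ssn": "563-49-1234"},
    {"source": "ONLINE", "key": "14239", "first": "Bubba", "last": "", "ssn": ""},
    {"source": "ERP", "key": "30391244", "first": "William", "last": "Crowne", "ssn": "563491234"},
    {"source": "CRM", "key": "30391-244", "first": "William", "last": "Crown", "ssn": "563-49-1234"},
]

# Assumed trust order of the source systems (not taken from the slide).
PRIORITY = ["ERP", "CRM", "SFA", "DW", "ONLINE"]


def normalize_ssn(value: str) -> str:
    """Keep digits only; reject values that are not nine digits long."""
    digits = "".join(ch for ch in value if ch.isdigit())
    return digits if len(digits) == 9 else ""


def survive(records, field):
    """Return the first non-empty value for `field`, walking sources by priority."""
    for source in PRIORITY:
        for record in records:
            if record["source"] == source and record[field]:
                return record[field]
    return ""


master = {
    "eid": 1001,
    "source_keys": [r["key"] for r in contributors],
    "first": survive(contributors, "first"),
    "last": survive(contributors, "last"),
    "ssn": normalize_ssn(survive(contributors, "ssn")),
}
```

With this assumed priority the surviving name and SSN agree with the master record on the slide; a different trust order or a frequency-based rule would pick different survivors.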


EVOLVING FROM DM TO MDM

Data Integration and/or Data Quality Initiative

Enrich: Data Augmentation and Enrichment

Survive: Entity Resolution & Surviving Record Analysis

Persist: Create a “Master” data record

Synchronize: Tie master data record into existing source systems

Surface: Provide access to other systems in real time via SOA

Full-blown MDM Initiative

Data Management

Master Data

Management


SAMPLE MDM HIGH LEVEL PROJECT PLAN

Phase 1: Define / Discover
• Project Definition, Requirements Assessment
• Installation
• Source Analysis/Profiling
• Define Entities, ETL, Relationships

Phase 2: Apply Data Quality
• Data Connectivity
• Define and Implement DQ, Matching, Survivorship Rules
• Data Enrichment, Verification, Validation

Phase 3: Perform Initial Load
• Stage Data, Load Hub
• Integrate with Source Systems
• Define and Implement Services
• Performance Testing

Phase 4: QA / Knowledge Transfer
• UAT
• Build Reports
• UI Enhancements
• Support/Knowledge Transfer


WHAT IS THE

INTERSECTION?

Big Data

MDM

1. Event Stream Processing

2. High Performance Analytics

3. In Database Processing

Governance


SAS® INFORMATION MANAGEMENT

SUPPORT FOR ENTIRE INFORMATION MANAGEMENT CONTINUUM

Provides unified data management capabilities that include data governance, data integration, data quality and MDM

Provides complete analytics management that includes model management, deployment, monitoring and governance of the analytics information asset

Provides decision services that include business rules and workflow that facilitate integration of the information services into the business systems

Capabilities: DATA MANAGEMENT, DECISION MANAGEMENT, ANALYTICS MANAGEMENT

Strategy: STRATEGY & VISION TO EXECUTE

Governance: INFORMATION GOVERNANCE


Data Management

Data Governance

Execution Process

PROGRAM OVERSIGHT

Corporate Drivers

Process & Policy

Business

Framework

DATA GOVERNANCE

FRAMEWORK IGNORE AT YOUR PERIL!


WHERE IS THE DATA MORE VARIETY THAN BEFORE


HOW IS IT

PROCESSED MORE VARIETY THAN BEFORE


BUSINESS DATA NETWORK


BUSINESS DATA NETWORK

• Enables data governance
   - Common language improves communication & supports compliance regulations
   - Represent and expose business relationships
   - Track history of changes

• Accountability and responsibility
   - Document and communicate ownership
   - Notify interested parties on changes

• Supports better collaboration
   - Capture and share annotations between team members
   - Greater understanding of the context of information
   - Use and reuse of trusted information


USE CASE: BUSINESS AND IT COLLABORATION

Database = ORACLE

Schema = NAACCT

Table = DLYTRANS

Column = TAXVL

data type = Decimal(14,2)

Derivation: SUM(TRNTXAMT)

Category: Costs

Term: Tax Value

Description: Tax to be paid on Gross Income. (John Walsh is responsible for updates. 90% reliable source)

Status: UNDER REVIEW

Achieve a common vocabulary between business & technical users

Business Data Network
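
One way to picture the link between the technical column and the business term on this slide is a small data structure. The sketch below is a hypothetical model, not the Business Data Network schema: the class names `TechnicalColumn` and `BusinessTerm`, their fields, and the helper `qualified_name` are assumptions chosen for illustration; the values come from the slide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TechnicalColumn:
    """IT view: where the data physically lives and how it is derived."""
    database: str
    schema: str
    table: str
    column: str
    data_type: str
    derivation: str


@dataclass
class BusinessTerm:
    """Business view: what the data means and who owns it."""
    category: str
    term: str
    description: str
    owner: str
    status: str
    columns: list = field(default_factory=list)  # linked technical columns


def qualified_name(col: TechnicalColumn) -> str:
    """Schema-qualified column name, e.g. for display next to the term."""
    return f"{col.schema}.{col.table}.{col.column}"


tax_value = BusinessTerm(
    category="Costs",
    term="Tax Value",
    description="Tax to be paid on Gross Income.",
    owner="John Walsh",
    status="UNDER REVIEW",
)
tax_value.columns.append(
    TechnicalColumn("ORACLE", "NAACCT", "DLYTRANS", "TAXVL", "Decimal(14,2)", "SUM(TRNTXAMT)")
)
```

Keeping the term and its columns in one record is what lets a business user start from "Tax Value" and an IT user start from `NAACCT.DLYTRANS.TAXVL` and arrive at the same definition.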


• Common language improves communication & supports compliance regulations

ENABLES DATA GOVERNANCE


ENABLES DATA GOVERNANCE

• Represent and expose business relationships


ENABLES DATA GOVERNANCE

• Track history of changes


ACCOUNTABILITY AND RESPONSIBILITY

• Document and communicate ownership

• Notify interested parties on changes


COLLABORATION

• Capture and share Notes between team members


COLLABORATION (CONT'D)

• Understand the context of information


[Screenshot: DataFlux Data Management Web Studio, Relationship Diagram view of the Business Data Network. Nodes shown: Parent Term, Synonymous Term, Child Term, Tag, User Account, Link, Task, Collection, Domain, Profile, Field, DataFlux Table, Data Job, Process Job, Rule, Filter, SAS Table, SAS Table 2, Report, Information Map, Transformation 1, Transformation 2. Legend: Dependency, Parent Child, Inclusion, Association, Synonymous, Equivalent.]


[Screenshot: DataFlux Data Management Web Studio, Relationship Diagram view of the Business Data Network, filtered to object types "Rules, Terms". Object type filter options: Collections, Contacts, Data jobs, Domains, Fields, Links, Process jobs, Profiles, Rules, Tables, Tags, Tasks, Terms, All object types. Relationship type filter options: All relationship types, Dependency, Parent Child, Inclusion, Association, Synonymous, Equivalent. Nodes shown: Grandparent Term, Parent Term, Child Term, Rule.]


EVENT STREAM PROCESSING


EVENT STREAM PROCESSING (ESP)

ESP is a subcategory of Complex Event Processing (CEP) focused on analyzing/processing "events in motion" called Event Streams.*

The SAS ESP is an embeddable engine that can be integrated into solutions or placed in front of them.

* This is the definition provided by the Event Processing Technical Society


TYPICAL CHARACTERISTICS OF EVENT STREAM PROCESSING APPLICATIONS:

• Continuous queries on data in motion (with incrementally updated results)
• Very low (max) event processing latencies (i.e., microseconds to milliseconds)
• High volumes (>100k events/sec)
• Derived event windows with retention policies
• Memory constrained for performance (i.e., bounded state)
• Predetermined data mining, decision making, alerting, position management, scoring, profiling, …
• Event out-of-order handling to ensure ordered source streams
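
Several of these characteristics (a continuous query, incrementally updated results, an event window with a retention policy, bounded state) fit in one small Python class. This is a generic sketch, not SAS ESP code: the class name `SlidingWindowSum`, the time-based retention rule, and the sample events are assumptions, and the sketch presumes events arrive in timestamp order.

```python
from collections import deque


class SlidingWindowSum:
    """Continuous query: sum of event values over the last `retention` seconds."""

    def __init__(self, retention: float):
        self.retention = retention
        self.events = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0       # incrementally maintained result

    def on_event(self, timestamp: float, value: float) -> float:
        # Fold the new event into the window and the running result.
        self.events.append((timestamp, value))
        self.total += value
        # Retention policy: expire events that fell out of the window,
        # which keeps the state held in memory bounded.
        while self.events and self.events[0][0] <= timestamp - self.retention:
            _, expired = self.events.popleft()
            self.total -= expired
        return self.total


window = SlidingWindowSum(retention=10.0)
results = [window.on_event(t, v) for t, v in [(0, 5.0), (4, 1.0), (9, 2.0), (12, 3.0)]]
```

Each call returns the updated result immediately; nothing is recomputed from scratch, and only events inside the retention window are kept.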


EVENT STREAM PROCESSING (ESP) VS. RELATIONAL DATABASE MANAGEMENT SYSTEM (RDBMS)

ESP: ESPs store the queries and continuously stream data through the queries. (EVENTS in, INCREMENTAL RESULTS out)

RDBMS: Databases store the data and periodically run queries against the stored data. (QUERIES in, RESULTS out)
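
The contrast on this slide can be made concrete with a toy example that runs the same question both ways. The names `rdbms_query` and `StandingQuery` and the sample events are hypothetical; a real database or ESP engine is of course far more involved.

```python
rows = []  # RDBMS style: the data is stored first


def rdbms_query():
    """Run on demand (periodically) against everything stored so far."""
    return sum(r["amount"] for r in rows if r["amount"] > 100)


class StandingQuery:
    """ESP style: the query is stored and each event streams through it."""

    def __init__(self, predicate):
        self.predicate = predicate
        self.result = 0

    def push(self, event):
        if self.predicate(event):
            self.result += event["amount"]  # incremental update
        return self.result


large_amounts = StandingQuery(lambda e: e["amount"] > 100)
incremental = []
for event in [{"amount": 50}, {"amount": 150}, {"amount": 200}]:
    rows.append(event)                             # database path: store now, query later
    incremental.append(large_amounts.push(event))  # ESP path: a result after every event

batch_result = rdbms_query()
```

Both paths end at the same answer, but the standing query has a current result after every event, while the stored-data query only knows the answer when it is run.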


EVENT STREAM PROCESSING

WHAT PROBLEMS ARE WE TRYING TO SOLVE?

• Capture value otherwise lost through information lag
• Enable new opportunities through producing actionable intelligence with lower latencies
• Continuously analyze events as they occur
• Eliminate storage latencies
• Incrementally update intelligence as new events occur
• Enable new analysis & processing models to be developed and modified quickly to increase the opportunity windows and reduce costs.


ESP SAMPLE USE CASES

• Portfolio Equity Position Management: Portfolio positions of interest, e.g., market equity positions by account, trader, department, location, region, and venue.

• Continuous Change Data Management: Get master data changes as they occur and provide a consolidated up-to-the-moment view across silo systems. Example is a consolidated customer view for telco services across home services, wireless, …

• Capital Markets Liquidity: Consolidated L1 & L2 order books for selected instruments across selected venues. This can be used for algorithmic trading, trade execution, dark pool analysis.

• Credit Card Fraud Prevention: Maintain account-based usage signatures and known fraud signatures to enable credit card purchase scoring for authorization requests.

• Telco Prepaid Call Authorization: Correlate call authorization requests to account & account call plans to determine the call duration based on current balance. Maintain account balances upon call completions.
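
The arithmetic behind the prepaid call authorization use case is simple enough to sketch. The per-minute rate model, the function names `authorized_seconds` and `settle`, and the sample figures are assumptions for illustration; real call plans are likely to have more rules than a single rate.

```python
from decimal import Decimal


def authorized_seconds(balance: Decimal, rate_per_minute: Decimal) -> int:
    """Whole seconds the current balance can pay for at the plan's rate."""
    if rate_per_minute <= 0:
        raise ValueError("rate_per_minute must be positive")
    if balance <= 0:
        return 0
    return int(balance / rate_per_minute * 60)


def settle(balance: Decimal, rate_per_minute: Decimal, seconds_used: int) -> Decimal:
    """Balance after call completion, never below zero."""
    cost = rate_per_minute * Decimal(seconds_used) / Decimal(60)
    return max(balance - cost, Decimal("0"))


balance = Decimal("5.00")
rate = Decimal("0.25")                       # hypothetical plan: 0.25 per minute
allowed = authorized_seconds(balance, rate)  # authorization request
remaining = settle(balance, rate, 600)       # call completed after 10 minutes
```

`Decimal` is used rather than floating point so that balances stay exact.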


CHANGE DATA MANAGEMENT USE CASE

CONTINUOUS DATA INTEGRATION FOR SINGLE CUSTOMER VIEW ACROSS SILO SYSTEMS

[Diagram: Event Stream Sources/Publishers (Sybase ASE, Oracle, MS SQL Server, MySQL) feed the Event Stream Processing Server. Source streams: Customer Broadband, Digital, Phone, TV Services; Customer Wireless Services; Customer Support Cases; Customer Account Map Codes. Derived windows: Normalized & Cleansed Home Services (join); Normalized & Cleansed Wireless Services (join); Normalized & Cleansed Support Cases (join); Holistic Customer Views (join). Output: Consolidated Customer Views, used by Customer Care.]

Data Flow Model:

1. Customer services information is captured from the various DBs of silo systems as a query snapshot followed by change log deltas.

2. Customer account maps are used to normalize the various account numbers across disparate systems.

3. The normalized service data is cleansed for accuracy and completeness.

4. All customer services are consolidated into one holistic view of each customer, which is used by Customer Care.
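
The four steps of this data flow model can be walked through with plain dictionaries. Everything concrete here is made up for illustration: the account numbers, the `account_map` contents, the trivial `cleanse` rule, and the function names are assumptions, and a real deployment would run these steps as continuous joins inside the ESP server rather than as batch functions.

```python
# Step 1: a query snapshot per silo, followed by change-log deltas (hypothetical data).
home_snapshot = {"H-1": {"service": "Broadband", "status": " active "}}
home_deltas = [("H-1", {"service": "Broadband", "status": "SUSPENDED"})]
wireless_snapshot = {"W-9": {"service": "Wireless", "status": "Active"}}

# Step 2: the account map ties each silo's account number to one customer id.
account_map = {"H-1": "CUST-1", "W-9": "CUST-1"}


def apply_deltas(snapshot, deltas):
    """Start from the snapshot and replay the change log; the latest change wins."""
    state = dict(snapshot)
    for account, row in deltas:
        state[account] = row
    return state


def cleanse(row):
    """Step 3: stand-in for cleansing: trim whitespace and standardize case."""
    return {key: value.strip().title() for key, value in row.items()}


def consolidate(*silos):
    """Step 4: join all services on the normalized customer id."""
    view = {}
    for silo in silos:
        for account, row in silo.items():
            customer = account_map[account]
            view.setdefault(customer, []).append(cleanse(row))
    return view


holistic = consolidate(apply_deltas(home_snapshot, home_deltas), wireless_snapshot)
```

The result holds one entry per customer with every service attached, which is the shape Customer Care would read.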


MDM WALK THROUGH


ENTITY TYPE SUPPORT

DataFlux qMDM provides multi-entity support. Many teams (admins, stewards, business users) can work together to define various elements related to each entity like attribute lists, match rules, data quality rules, security roles, and so on.

Entities are not “live” in the system until they have been published in Master Data Manager and can't accept data from other systems until the jobs/services to support each entity type have been generated using Master Data Manager.

Administrator Activity


ENTITY TYPE INHERITANCE

Entity types support inheritance. This allows entities derived from others to inherit attributes, clustering rules, roles, and other properties from their parent entity. Inheritance will in some cases render properties read-only on the child entity if the property was inherited from a parent. Modifications would need to be made at the higher level.

Entity types can also be designated as abstract, which means they can carry attributes, roles, and so on but they won't be used to store actual instance data.

Administrator Activity

This entity type inherits from the “Party” entity type.
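
The inheritance behaviour described on this slide can be modelled in a few lines. This is a conceptual sketch, not the qMDM metadata model: the `EntityType` class, its methods, and the "Customer" child type with its attributes are hypothetical, and the sketch treats every inherited attribute as read-only, whereas the slide says this happens only in some cases.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EntityType:
    name: str
    parent: Optional["EntityType"] = None
    abstract: bool = False
    own_attributes: list = field(default_factory=list)

    def attributes(self):
        """Attributes inherited from ancestors, followed by this type's own."""
        inherited = self.parent.attributes() if self.parent else []
        return inherited + self.own_attributes

    def is_read_only(self, attribute):
        """Inherited attributes must be changed on the type that defines them."""
        return attribute in self.attributes() and attribute not in self.own_attributes

    def new_instance(self, **values):
        """Abstract types carry definitions but never store instance data."""
        if self.abstract:
            raise TypeError(f"{self.name} is abstract and cannot store instance data")
        return {attr: values.get(attr) for attr in self.attributes()}


party = EntityType("Party", abstract=True, own_attributes=["name", "address"])
customer = EntityType("Customer", parent=party, own_attributes=["loyalty_tier"])
record = customer.new_instance(name="William Crowne", loyalty_tier="gold")
```

Changing `name` or `address` would have to happen on `party`, which is the "higher level" the slide refers to.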


CLUSTER CONDITIONS

In this administrative area, clustering conditions can be designed for each entity type. By virtue of inheritance, clustering rules can be accessed by entities that take others as their parent.

Job generation functionality will take these rules and other properties defined for each entity and will create ready-to-use data jobs for batch loads and updates, real-time queries, and real-time add, modify, and retire services.

Entities will not be active until published by an administrator.

Administrator Activity


DEFINING RELATIONSHIP TYPES

Entities can be related to other entities through defined relationships. Here a relationship between companies and parts has been established. Notice the labels and descriptions. These will appear throughout Master Data Manager to show the kind of relationship that is being displayed.

Relationships are defined by a set of match rules that link the entity instance data. Here, if the match code for the company name in one entity matches the supplier match code in the other, a relationship will be made.

Administrator Activity
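
The match-rule idea on this slide can be illustrated with a deliberately simple stand-in for a match code. The `match_code` function below is not the DataFlux match code algorithm; it is a hypothetical normalizer (lowercase, drop punctuation and a few noise words), and the sample companies, parts, and the "supplies" label are invented for the example.

```python
import re

# Hypothetical noise words ignored when comparing company names.
NOISE = {"inc", "corp", "co", "ltd", "llc", "the"}


def match_code(name: str) -> str:
    """Simplified stand-in for a match code: lowercase, strip punctuation and noise words."""
    words = re.sub(r"[^a-z0-9 ]", "", name.lower()).split()
    return "".join(word for word in words if word not in NOISE)


companies = [{"id": "C1", "company_name": "Acme Construction, Inc."}]
parts = [
    {"id": "P1", "supplier": "ACME CONSTRUCTION"},
    {"id": "P2", "supplier": "Globex Ltd"},
]

# Relationship rule: company name match code equals supplier match code.
relationships = [
    (company["id"], "supplies", part["id"])
    for company in companies
    for part in parts
    if match_code(company["company_name"]) == match_code(part["supplier"])
]
```

Comparing codes rather than raw strings is what lets "Acme Construction, Inc." and "ACME CONSTRUCTION" relate to each other despite the different spelling.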


SEARCH

Master Data Manager provides several ways to find the specific instance data you are looking for. This can be done through workflow tasks or reports but typically searching is done in the Master Data area.

Searches can be done on entities and hierarchies. You can use any number of search fields or use an advanced search feature for more granular control.

Entities that inherit from a common parent will have search results displayed together on the same screen.

Stewardship Activity

Fields available for searching are configured by an administrator and can differ across entity types.


AUTHORING ENTITIES

New entity instance data can be created directly from Master Data Manager. The fields available in the edit form are configured by an administrator for each entity type.

Fields can be marked as required or read-only and they can be constrained by regular expressions to enforce data standards.

Stewardship Activity
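
Field rules of the kind described here (required, read-only, constrained by a regular expression) can be checked with a short validator. The `FIELDS` configuration, the field names, and the `validate` function are hypothetical and only show the idea; they are not how Master Data Manager stores or enforces its configuration.

```python
import re

# Hypothetical field configuration for one entity type.
FIELDS = {
    "last_name": {"required": True},
    "ssn": {"required": True, "pattern": r"\d{3}-\d{2}-\d{4}"},
    "eid": {"read_only": True},
}


def validate(changes: dict) -> list:
    """Return a list of rule violations for a proposed set of field values."""
    errors = []
    for name, rules in FIELDS.items():
        value = changes.get(name)
        if rules.get("read_only") and name in changes:
            errors.append(f"{name} is read-only")
        elif rules.get("required") and not value:
            errors.append(f"{name} is required")
        elif value and "pattern" in rules and not re.fullmatch(rules["pattern"], value):
            errors.append(f"{name} does not match {rules['pattern']}")
    return errors


ok = validate({"last_name": "Crowne", "ssn": "563-49-1234"})
bad = validate({"ssn": "563491234", "eid": 1001})
```

The first call passes cleanly; the second reports a missing required field, an SSN that breaks the pattern, and an attempt to write a read-only field.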


CLUSTER MEMBERS

At the heart of DataFlux qMDM is the notion of match clusters. These represent the group of records from different sources that have been deemed to be representations of the same person, place or thing.

Every match cluster has a best record that is constructed through business rules from the values held by one or more contributing records.

Best records can be authored and edited using Master Data Manager.

Stewardship Activity

New Best Record (if saved)

Best Record (in bold)

Cluster Members (Best Record + Contributors)


CLUSTER COMPARE

Cluster compare functionality allows data stewards to look across contributing records in a cluster and quickly see, through highlighted fields, what is different.

Like the main Cluster Members view, you can tag this entity for a workflow, create a new best record, retire the cluster, or split off contributors into new clusters.

You can double-click field values on the right to populate your potential new best record on the left.

Stewardship Activity

Selecting this option will filter the entity attributes to just those that are different.

Use these controls to cycle through contributing records in the right column. The left column shows your potential new best record.


VIEWING HISTORICAL INFORMATION

Every change to entity data is captured in the qMDM database. Modifications to best records result in regenerated best records and older versions are retired. Entire match clusters can also be retired, making the data inactive but available for queries of the system's history. In this view, previous versions of the best record are shown in gray while active information is shown normally. Users can toggle the history view by choosing to show or hide retired records.

Stewardship Activity

Use the Show Retired menu item to view the history of changes to the match cluster.


ENTITY PROPERTIES WITH RELATIONSHIPS

Every relationship definition that involves an entity type will be shown as a new window in the entity Properties area. Master Data Manager will query for discovered relationships for the first relationship type but you'll have to expand each additional relationship window to see all of them.

Discovered relationships will appear automatically; however, you can also manually add and retire relationships here.

Stewardship Activity


ENTITY RELATIONSHIP DIAGRAM

The relationship diagram is a visual representation of an entity and all of its relationships with other entities. The area at the right sets the action for expanding related items (it's configurable since some queries can take more than a few seconds to run).

A double-click will expand the relationship diagram to the next set of related entities. The window at the bottom either shows attributes for the selected entity or, if a group-type node is selected, all of the entities that comprise the group. These can be selectively added to the diagram.

Stewardship Activity

Control+Click rotates the diagram.

Shift+Click pans the diagram.


ENTITY HIERARCHIES

DataFlux qMDM supports the concept of entity hierarchies. There is no limit on the number of unique hierarchies or entity types per hierarchy.

Hierarchies relate entities at the “best record” level and respond accordingly to splits and merges of match clusters.

Entities can participate in several hierarchies simultaneously and appear in the same hierarchy more than once outside of the same level.

Stewardship Activity


MASTER DATA DASHBOARD

The master data dashboard dynamically updates as new entity types are added to DataFlux qMDM. It shows summary statistics by source and entity type for:

• Record counts
• Volume growth
• Number of contributors and survivors
• Data analysis for sparsity and max/min values
• Source system contribution ratios
• Batch load history

Stewardship Activity


CUSTOM REPORTS

Dynamic and batch reports are supported through the Reports area. Both are specially designed DataFlux data jobs interpreted as reports by Master Data Manager.

Dynamic reports can accept user input as parameters and the output can be linked directly to entity data elsewhere in Master Data Manager for easy access.

Batch jobs can be initiated from this location and if their output contains an HTML file, that data can be displayed here.

Stewardship Activity


WORKFLOW TASKS

Master Data Manager supports the concept of workflows. Two are available with a basic installation:

• Tag: this is an ad hoc workflow that can be used to raise awareness of issues to other qMDM users. It is enabled by default.
• Entity Lifecycle: this can be used to route new records and edits to existing clusters to a group of users for review. It is available but not enabled by default.

Here an entity has been tagged for review.

Stewardship Activity

Use the Tag menu item to make a cluster ready for review.


WORKFLOW TASKS

Once a user chooses to act on a workflow task (they have already chosen to “accept” the task in the Actions item), the area below the Status information will contain entity attributes that they can view and modify.

The entity attribute values shown are what was captured at the time the workflow was created. The actual entity attributes in the system may have changed after the workflow task was initiated. If so, any submission of changes from this location will be integrated with changes made in the interim.

Stewardship Activity

The Notes field can be used to communicate issues to other team members.


WORKFLOW DASHBOARD

The workflows dashboard provides a snapshot of workflow activity generated in DataFlux qMDM. Among the statistics it reports are the following:

• Workflows by status
• Workflows by priority
• Workflows by entity type
• Workflows by workflow type
• Workflows by submitter
• Workflows by modification date
• Workflows by start date

Stewardship Activity


DATA EXPORT

Data from most panels in Master Data Manager can be exported to CSV, PDF, or Excel formats.

Stewardship Activity