revolution in data governance - transforming the customer experience

February 13, 2015 DAMA International Data Warehousing Data Governance, Metadata, & ETL

Upload: paul-dyksterhouse

Post on 18-Jul-2015


TRANSCRIPT

February 13, 2015

DAMA International

Data Warehousing

Data Governance, Metadata, & ETL

February 13, 2015 - Data Governance 2

Speakers Intro

Pamela Hulse

— Director of Data Governance & Compliance

— Wolters Kluwer Health (formerly NDC Health)

— Previous data management experience with Mayo Clinic, McKessonHBOC

Paul Dyksterhouse

— Acxiom

— Data Warehouse Technical Unit Leader

— Previous data management experience with BankOne, Schwab, Honeywell, American Express, UPS, NDCHealth

February 13, 2015 - Data Governance 3

Wolters Kluwer Health

Healthcare analytics provider for pharmaceutical companies

20 years of healthcare claims data warehousing and business intelligence

Serves pharmaceutical manufacturers including Pfizer and GSK

10 million transactions per week and a 50-terabyte data warehouse

Currently housed on DB2, Oracle, MSSQL and Netezza platforms with MicroStrategy BI interfaces

In process of being migrated to Acxiom’s scalable Linux Grid

February 13, 2015 - Data Governance 4

Agenda

Introduction

The Path Traveled

Data Governance

Data Access and Asset Management

Data Architecture

Data Tool Selection

Outcomes

February 13, 2015 - Data Governance 5

Introduction

A little more than a year ago, Wolters Kluwer Health was faced with two large, seemingly insurmountable challenges.

As newer members of the Wolters Kluwer Information Management team, Pamela Hulse & Paul Dyksterhouse faced technical, process, and people challenges in addressing data access and distribution requirements in a changing business environment.

This is the story of how, over the past year, we revolutionized data governance.

The revolution was in taking a data governance process that was out of control and getting it under control.

February 13, 2015 - Data Governance 6

Experience gained and lessons learned

Successes

— Large number of people involved reduced pushback and

propagated vision

— Experience level of external resources

— Package solution acquisition

— Vision is carried into new initiatives that will further the

impact

— Maintained external compliance certification

— Project came in under budget and within a 12 month period

— Further the maturity of the organization

February 13, 2015 - Data Governance 7

Experience gained and lessons learned

Things to do differently next time

— Proof of concept/vendor participation

— Further education of internal resources

Governance & Data Management

Technology vision

February 13, 2015 - Data Governance 8

Experience gained and lessons learned

Other considerations

—Immaturity of package solutions and available

consultants

—Progress slowed by new large initiatives

—Availability of key staff

Technical skills required

Data Management & Governance experience required

February 13, 2015 - Data Governance 9

Revolution in Data Governance

“Whether occurring spontaneously, which is rare, or through careful planning, revolutions depend for their success on crucial timing, the fostering of popular support, and the nucleus of a new governmental organization.” Encarta

Foundation of the Revolution

— Attributes

— Established Environment/Culture resistant to change

— People with a vision

Catalyst for Revolution

— External events that change perspectives

— A key event that consolidates the supporters

February 13, 2015 - Data Governance 10

Attributes of the Revolution

•Must be swift

•Must be strong

•Must be driven

•Require outside support

February 13, 2015 - Data Governance 11

Established Environment/Culture Resistant to Change

No management investment or priority on process improvement

Tactical approach to data management issues

Brittle legacy systems from too many short term fixes

Complex web of processes, systems, and platforms

Silos of departments and individuals with key knowledge of data assets

Established suite of products with a very established customer base

February 13, 2015 - Data Governance 12

People with a vision

Executive Sponsor – primary data & large project owner

Dedicated individuals to drive the project and own the future process

— Business Sponsor

— Technology Sponsor

February 13, 2015 - Data Governance 13

Catalysts for Change:

Regulatory requirements

Contractual agreements

Customer demand

Financial pressures

February 13, 2015 - Data Governance 14

Key Event: Business not able to meet challenges

Risk of non-compliance

Risk of not inventorying data assets, transforms and products in an accessible repository

Lack of organizational resource priority to manage risks

Product quality and service issues

Increased costs and missed opportunities

Inability to measure risks

Inability to secure sensitive data assets

February 13, 2015 - Data Governance 15

Role of the revolutionary

Deliver a message

That states the reality of the losses of not changing;

And provides a vision to people that foments support for transformation

February 13, 2015 - Data Governance 16

We are here to share with you the path we followed.

February 13, 2015 - Data Governance 18

Pharmaceuticals R Us

Compounds

Formulas

Pharmaceutical Products

February 13, 2015 - Data Governance 19

Data Warehouse

The ability to store and easily retrieve attribute-level information on data assets, access, transforms, and deliverables is essential for asset management, quality products and responsive customer service.

Compounds = Data Assets

Formulas = Business Rules & Transformations

Products = Information Deliverables
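The compound / formula / product analogy can be read as a small lineage model. The sketch below is one hypothetical way to express it in Python. The class names, field names, and sample values are illustrative assumptions and are not the schema of the repository described in the talk.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataAsset:
    """A 'compound': a single attribute-level data element."""
    name: str


@dataclass
class Transformation:
    """A 'formula': a business rule that consumes data assets."""
    name: str
    inputs: list[DataAsset] = field(default_factory=list)


@dataclass
class Deliverable:
    """A 'product': an information deliverable built from transformations."""
    name: str
    transformations: list[Transformation] = field(default_factory=list)


def deliverables_using(asset: DataAsset, deliverables: list[Deliverable]) -> list[str]:
    """Return the names of deliverables whose transformations read the asset."""
    return [
        d.name
        for d in deliverables
        if any(asset in t.inputs for t in d.transformations)
    ]


# Hypothetical sample data, for illustration only.
birth_date = DataAsset("patient_birth_date")
claim_amount = DataAsset("claim_amount")
age_band = Transformation("derive_age_band", [birth_date])
weekly_total = Transformation("sum_weekly_claims", [claim_amount])
catalog = [
    Deliverable("age_band_report", [age_band]),
    Deliverable("weekly_claims_report", [weekly_total]),
]
```

With a structure like this, a question such as "which deliverables depend on this data element?" becomes a single lookup, for example `deliverables_using(birth_date, catalog)`.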

February 13, 2015 - Data Governance 20

2 Resources

Obtain champion, funding, leadership team

— Essential that the business own defining the solution and

implementing it.

Assess internal capacity vs. resource needs

— Availability

— Skills, Experience, Knowledge

Procured professional resources to meet the need

— Business

— Technology

February 13, 2015 - Data Governance 21

3 Define parallel project work teams

(security, controls, HIPAA compliance, contractual obligations)

Architecture (Data and Metadata)

Metadata / ETL Tools and Processes

Governance

Data Asset & Access Management

February 13, 2015 - Data Governance 22

Launch

Resources

— Hired a Director of Data Access Management

— Procured experienced vendor – 5 vendors

Analysis

— Compiled requirements and use cases

— Evaluated available options

Build / Buy – existing solutions

— Enterprise Metadata Solutions

— Integrator Metadata Solutions

RFP process – 5 vendors

Proof of concept – 2 vendors

February 13, 2015 - Data Governance 23

Project Work Teams

Data Governance – Development of roles, responsibilities, communication strategies, policies, processes, and procedures, as well as assistance in implementing them.

Data Asset & Access Management – Definition of Data Flows, Common Data Model, and Metadata for information management and the documentation of these data assets. Identification and documentation of sensitive HIPAA & ArcLight Contractual data elements, including business process and business rules / requirements for a data integration tool.

Data Architecture – The validation and recommendation of an architecture that is aligned with business requirements.

Data Tool Selection – Evaluation of a short list of Data Integration / Metadata Tools, including a Proof-Of-Concept pilot, results collection, and the creation of a Wolters Kluwer Health Recommendations Document.

February 13, 2015 - Data Governance 24

New Vision

The old paradigm: “Just do it!”

The post-compliance paradigm:

“Do it. Control it. Document it. Prove it!”

Data Governance

February 13, 2015 - Data Governance 25

Data Governance Deliverables

Data Governance framework design

— Roles & responsibilities

— Policies

— Key procedures

Defined key roles & processes

— Governance steering committee

Plan for complete implementation

Data Governance

February 13, 2015 - Data Governance 26

Data Governance Groups

[Diagram: governance groups arranged by perspective. Recoverable labels follow.]

Perspectives: Staff perspective; Management perspective; Executive perspective

Levels: Staff; Managers and other influencers; Corporate Leadership

Groups: Stewards; Lead Stewards; Data Gov Mgmt Team; GRCS Board; Exec Council

Group descriptions shown in the diagram:

Small group that runs the Governance Program

Larger group of Subject Matter Experts, Super-users, Directors/Managers of Functional Areas

VPs in various Business and IT groups

Staff that works with data

Management or staff that communicates with or gives direction to stewards

Data Governance

February 13, 2015 - Data Governance 27

Scores: 0 – Non-existent; 1 – Initial / Ad Hoc; 2 – Repeatable but Intuitive; 3 – Defined Process; 4 – Managed and Measurable; 5 – Optimized
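The six-point scale above maps naturally onto an ordered enumeration. The following is a minimal sketch, assuming one score is recorded per governance area. The area names and scores in `example` are hypothetical and are not assessment results from the presentation.

```python
from enum import IntEnum


class MaturityLevel(IntEnum):
    """Maturity scores as listed on the slide."""
    NON_EXISTENT = 0
    INITIAL_AD_HOC = 1
    REPEATABLE_BUT_INTUITIVE = 2
    DEFINED_PROCESS = 3
    MANAGED_AND_MEASURABLE = 4
    OPTIMIZED = 5


def lowest_scoring(scores: dict[str, MaturityLevel]) -> list[str]:
    """Return the areas that share the lowest maturity score, sorted by name."""
    if not scores:
        return []
    floor = min(scores.values())
    return sorted(area for area, level in scores.items() if level == floor)


# Hypothetical assessment values, for illustration only.
example = {
    "policies": MaturityLevel.INITIAL_AD_HOC,
    "metadata": MaturityLevel.NON_EXISTENT,
    "access control": MaturityLevel.REPEATABLE_BUT_INTUITIVE,
}
```

Because `IntEnum` members compare as integers, the weakest areas can be found with an ordinary `min`, which is the only reason this sketch prefers it over a plain `Enum`.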

Data Governance

February 13, 2015 - Data Governance 28

Project Teams

Data Governance – Development of roles, responsibilities, communication strategies, policies, processes, and procedures, as well as assistance in implementing them.

Data Asset & Access Management – Definition of Data Flows, Common Data Model, and Metadata for information management and the documentation of these data assets. Identification and documentation of sensitive HIPAA & ArcLight Contractual data elements, including business process and business rules / requirements for a data integration tool.

Data Architecture – The validation and recommendation of an architecture that is aligned with Wolters Kluwer Health’s business requirements.

Data Tool Selection – Evaluation of a short list of Data Integration / Metadata Tools, including a Proof-Of-Concept pilot, results collection, and the creation of a Wolters Kluwer Health Recommendations Document.

February 13, 2015 - Data Governance 29

Data Asset & Access Management

Analysis of all Data Warehouse assets at all points in the lifecycle

Analysis of all Access Roles

Modeling of data access granting, data usage, and metadata management

Extension of metadata definitions to include the type and level of sensitivity

Data Asset & Access Management

February 13, 2015 - Data Governance 30

Data access and control requirements

Collection of business rules

Identification of key data elements (PHI, Contractual) with metadata

Documentation of key data flows

Identification of key control points

High-Level Business Process Model [UML]

Infrastructure / Systems Diagram

Team Deliverables

Data Asset & Access Management

February 13, 2015 - Data Governance 31

Sensitive data

Data Asset & Access Management

February 13, 2015 - Data Governance 32

Sensitive data

• Regulatory Sensitive Data Elements (HIPAA PHI/IIHI): Name, Birth Date, SSN, Demographics, other ID numbers

• Contractual Sensitive Data Elements (Vendor License Agreements): NCPDP Number, Vendor/Pharmacy Name, Demographics
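The two sensitivity types overlap (Demographics appears under both), so a data element may need more than one tag. The sketch below shows one way to record that with a flag enumeration. The element names follow the slide, while the `Sensitivity` type and the `classify` helper are hypothetical and not part of any tool named in the talk.

```python
from enum import Flag


class Sensitivity(Flag):
    NONE = 0
    REGULATORY = 1   # HIPAA (PHI/IIHI)
    CONTRACTUAL = 2  # vendor license agreements


# Element names are taken from the slide; the set-based lookup is an assumption.
REGULATORY_ELEMENTS = {"Name", "Birth Date", "SSN", "Demographics", "Other ID Numbers"}
CONTRACTUAL_ELEMENTS = {"NCPDP Number", "Vendor/Pharmacy Name", "Demographics"}


def classify(element: str) -> Sensitivity:
    """Return every sensitivity type that applies to a data element."""
    result = Sensitivity.NONE
    if element in REGULATORY_ELEMENTS:
        result |= Sensitivity.REGULATORY
    if element in CONTRACTUAL_ELEMENTS:
        result |= Sensitivity.CONTRACTUAL
    return result
```

A flag type keeps the overlap explicit: `classify("Demographics")` carries both tags, so a control that applies to either type will still see it.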

Data Asset & Access Management

February 13, 2015 - Data Governance 33

[Use Case Diagram: Metadata Repository System (ERStudio). Recoverable labels follow.]

Actors:

Business User (can include members of Data Services, Data Management and Client Services)

Technical User (can include members of Data Services, Data Management and Client Services)

E-Security Administrator

Data Access Manager / Data Analyst

Data Services / Client Services Workforce

Data Management Workforce

Client Services Workforce / Product Mgmt

Workforce

Systems: ETL Tool / ERStudio, MicroStrategy / BI tools / Scanners

Use cases:

1.0 Maintain Lists of Production Servers and Databases

1.1 Maintain Logical Data Descriptors and Sensitivity Rules

1.2 Audit Linkage of Logical Rules and Data to Business Definitions

1.3 Audit Linkage of Logical Rules and Data to Physical Entities

2.1 Maintain Physical Data Descriptors and Sensitivity Rules

2.2 Maintain Logical to Physical Maps

2.3 View Logical Descriptions, Business Definitions, Reports and Product Definitions

2.4 Maintain Business Definitions for Data, Rules and Processes

2.5 Link Logical Rules and Data to Business Definitions

2.6 Maintain Report Definitions

3.1 Maintain Product Definitions

3.2 Link Product Definitions to Business Process Definitions

3.3 Maintain Clients

3.4 Maintain Product Delivery Options

3.5 Link Clients to Product Delivery Options

4.1 Identify and Update Governors and Stewards

4.2 Maintain Governance Policies and Procedures

4.3 View Governance Policies, Procedures, Governors, Stewards

4.4 Analyze Access to Data Assets

4.5 Analyze Data Usage / Lineage

4.6 Analyze Repository Usage

4.7 Analyze Repository Data Quality

5.1 Update Repository

5.2 Update Metrics

6.1 Generate Data Asset Inventory Report

6.2 Set Inventory Security Levels

Unnumbered: Maintain Logical and Physical Data Element Descriptions and Rules

Color Key: Security; Logical View; Physical View; Business View; Governance

Use Case Line Key: Thick: In scope; Thick-dashed: Some Dev.; Thin-solid: Prototype; Thin-dashed: HL Arch.; None: Deferred

Data Asset & Access Management

February 13, 2015 - Data Governance 34

Data Asset & Access Management

Data Governance – Development of roles, responsibilities, communication strategies, policies, processes, and procedures, as well as assistance in implementing them.

Data Asset & Access Management – Definition of Data Flows, Common Data Model, and Metadata for information management and the documentation of these data assets. Identification and documentation of sensitive HIPAA & ArcLight Contractual data elements, including business process and business rules / requirements for a data integration tool.

Data Architecture – The validation and recommendation of an architecture that is aligned with Wolters Kluwer Health’s business requirements.

Data Tool Selection – Evaluation of a short list of Data Integration / Metadata Tools, including a Proof-Of-Concept pilot, results collection, and the creation of a Wolters Kluwer Health Recommendations Document.

February 13, 2015 - Data Governance 35

Team Deliverables

Metadata architecture

— Operational

— Governance

Industry-based best practice findings

Common Warehouse Metamodel

Data Architecture Design

Development Solution Diagram

Project Plan for Phase II

Data Architecture

February 13, 2015 - Data Governance 36

Example Metadata Architecture

[Diagram: example metadata architecture. Recoverable labels follow.]

Column headings: Data Sources; Business Applications; Data Warehouse Environment

Side labels: Context; Data; External Data Sources

Metadata (Business, Technical, Operational) & Security / Access Control (eTrust)

Data integration architecture – Data models

Components: Metadata Repository; Quality Control (QC); Master Reference Data; Collection and Standardization; ETL; ETL Engine; 3a; Client Profile; Pharma Data Mart; IHR Data Mart; Pharma Data Mart Products; IHR Data Mart Products; Integrated Repository; Consolidation / Aggregated Layer

Data Architecture

February 13, 2015 - Data Governance 37

Project Teams

Data Governance – Development of roles, responsibilities, communication strategies, policies, processes, and procedures, as well as assistance in implementing them.

Data Asset & Access Management – Definition of Data Flows, Common Data Model, and Metadata for information management and the documentation of these data assets. Identification and documentation of sensitive HIPAA & ArcLight Contractual data elements, including business process and business rules / requirements for a data integration tool.

Data Architecture – The validation and recommendation of an architecture that is aligned with business requirements.

Data Tool Selection – Evaluation of a short list of Data Integration / Metadata Tools, including a Proof-Of-Concept pilot, results collection, and the creation of a Wolters Kluwer Health Recommendations Document.

February 13, 2015 - Data Governance 38

Solution Requirements Matrix & Priorities

Tool Recommendation Document:

—Acceptance Criteria Matrix

—Proof of Concept Plan and Design

—Testbed Management Strategy

—Proof of Concept Test Result

Over 50 users for 4 weeks were required for definition of test cases, test execution, and review of results.

Team Deliverables

Tool

February 13, 2015 - Data Governance 39

Metadata ETL Proof of Concept

Three test cases that would validate the highest complexity/risk areas of functionality

Delivered requirements, test cases, test data and acceptance criteria 3 weeks in advance

Scheduled checkpoint progress meetings

Scheduled 1 week for each POC tool

February 13, 2015 - Data Governance 40

Metadata ETL Proof of Concept

Tool

February 13, 2015 - Data Governance 41

FALCON Metadata Project

NDCHealth, Phoenix, Arizona

Quad Analysis Of ETL Vendor Evaluation Positioning

Business Alignment - CIBER SME’s

Rev 1.2; Drawing Number 2005.03.23.1; Department: Information Management; DRAFT First Release; Pg 1 OF 5

[Quadrant chart: National Practice Experts Subjective Scores - CIBER]

Vertical axis (Low to High): Productivity: Ease Of Use, Integration, Change Mgmt, Reusability, Functionality

Horizontal axis (Low to High): Performance: Throughput, Scalability, Infrastructure Requirements, etc.

Legend:

Im Informatica Metadata Score

Ie Informatica ETL (Data Movement) Score

Am Ascential Metadata Score

Ae Ascential ETL (Data Movement) Score

Chart annotation: Informatica (Metadata) Wins

CIBER SME Analysis:

Ascential

• IBM Purchase Is Expected To Delay Release Of Integrated Product Suite And Functionality Improvements

• Ascential Infrastructure Requirements Lowers Metadata Scoring

• Ascential’s Lack Of Integration For Their Product Suite Negatively Affects Developer Productivity (ETL Score)

• Ascential’s Lack Of Architectural Integration Lowered The Metadata Score

Informatica

• Informatica’s SuperGlue Is Best Metadata Engine In The ETL Market

• ETL Tool Has Improved Their Parallel Performance Recently (Especially On SUN Servers)

• Informatica’s High Productivity Score Results From Integrated Toolsets And Powerful Reuse & CM Functions

• Informatica Parallel Technology Is Close But Not Equal To Ascential’s.

Tool

February 13, 2015 - Data Governance 42

Project Timeline

[Gantt chart, flattened in extraction. Recoverable labels follow.]

Metadata Project – Part One - Analysis (JANUARY, FEBRUARY, MARCH; weeks of 1/3/2005 through 3/21/2005)

Project Planning & Closure

Technical Assessment & Requirements Phase

Architect. Roadmap

Data Governance Framework

D.G. Implementation

Architectural High Level Design

Tool Recommendation/Testbed

Deliverable Review

Librarian Turnover

Metadata Project – Part Two - Implementation (MARCH, APRIL, MAY; weeks of 3/7/2005 through 5/16/2005)

DATA GOVERNANCE FRAMEWORK IMPLEMENTATION & WORKOUT

Knowledge Transfer & Training, Goal Setting Meetings & Deliverable Reviews

Metadata Capture

Data & Bus. Rules Validation & Testing

ETL Coding

ETL Debugging, Testing, Metadata & Tuning

Script Test & Validation

Test Scripts Support

Turnover

Project Closure Doc's

Production 5/13/2005

Tool

February 13, 2015 - Data Governance 43

Data Governance plus Metadata: Solution Facets

[Diagram with four facets: People, Process, Info, Tools. Bullets in extracted order:]

• Inventory of data assets, sensitivity, and data access

• Where-founds of data

• Identify controls and owners; Apply controls

• Complement existing Change Management with governance controls

• Ongoing management / measurement:

- Audit Project/SRE/Customer changes

- Audit access controls and asset inventory

- Assess impact of regulatory & compliance changes

- Measure data governance effectiveness

• Executive Council

• Data Governance Manager + Team

• GRCS Board (provides perspective on Governance, Risk, Compliance, and Security)

• Lead Stewards (serve as communication hubs)

• Formalize stewardship responsibilities for all staff

• Inventory of data owners

• Risk management focus – assessment, prioritization, controls

• Technology to facilitate harvesting, storing, and publishing data about Wolters Kluwer Health data

• Industry-standard frameworks for working with controls

February 13, 2015 - Data Governance 44

The future of the revolution

Foundation Laid - The Data Governance, Metadata and ETL work laid the foundation for managing data at the attribute level.

Continue the Transformation

— Wolters Kluwer has now engaged in a 2 year initiative to

convert all systems over to DataStage

— Goal is to be able to manage data and business rules in a more

transparent and flexible manner

— Further the automation and formalization of the Data

Governance, Metadata and ETL initiatives and gain the

additional value

— Wolters Kluwer is moving its data processes to Acxiom’s enterprise data grid to support the transformation.

February 13, 2015 - Data Governance 45

Experience gained and lessons learned

Successes

— Large number of people involved reduced pushback and propagated vision

— Experience level of external resources

— Package solution acquisition

— Vision is carried into new initiatives that will further the impact

— Maintained external compliance certification

— Project came in under budget and within a 12 month period

— Further the maturity of the organization

February 13, 2015 - Data Governance 46

Experience gained and lessons learned

Things to do differently next time

—Proof of concept/vendor participation

—Further education of internal resources

Governance & Data Management

Technology vision

February 13, 2015 - Data Governance 47

Experience gained and lessons learned

Other issues

— Immaturity of package solutions and available consultants

— Progress slowed by new large initiatives

— Availability of key staff

Technical skills required

Data Management & Governance experience required

February 13, 2015 - Data Governance 48

Questions

February 13, 2015 - Data Governance 49

Contributors

Wolters Kluwer Business and IT teams

Knightsbridge

Ciber

Informatica

IBM Ascential

www.SOXonline.com

February 13, 2015 - Data Governance 50

Additional Slides

February 13, 2015 - Data Governance 51

Proactive Data Governance

Change Management Process

• The Case for Data Governance

• Data Governance Groups

• Data Governance Processes

• What Data Governance Looks Like

• Next Steps

[Process diagram. Recoverable labels follow.]

Trigger: Change Request

1. Triage: Set Goals, Assess & Communicate Required Levels of Involvement

2. Conduct Due Diligence

3. Conduct Risk Analysis: Identify upstream and downstream impacts. Consider impacts of change on Governance, Risk, Compliance, and Security efforts.

4. Decide How to Proceed: Decide whether to approve the change, and whether adjustments are required for any other efforts or controls.

5. Communicate Status: Notify all stakeholders of decisions and required actions.

Impact is understood. Risks are identified and managed.

Administer Process

Participants: Exec Council; Data Governance Management Team; GRCS; GRCS Board, Project or Functional Teams, Lead Stewards, others as appropriate

optional loop-outs
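The five numbered steps on this slide form a simple linear workflow triggered by a change request. A minimal sketch of that sequence follows. It deliberately ignores the "optional loop-outs", and the `Step`, `next_step`, and `walk` names are hypothetical rather than taken from any tool used in the project.

```python
from enum import Enum
from typing import Optional


class Step(Enum):
    """The numbered steps of the change management process."""
    TRIAGE = 1
    DUE_DILIGENCE = 2
    RISK_ANALYSIS = 3
    DECIDE = 4
    COMMUNICATE = 5


def next_step(current: Step) -> Optional[Step]:
    """Return the step that follows `current`, or None after the last step."""
    if current is Step.COMMUNICATE:
        return None
    return Step(current.value + 1)


def walk(start: Step = Step.TRIAGE) -> list[str]:
    """List step names from `start` through the end of the process."""
    names = []
    step: Optional[Step] = start
    while step is not None:
        names.append(step.name)
        step = next_step(step)
    return names
```

Calling `walk()` lists all five steps in order. Modeling the loop-outs would need extra transitions back to earlier steps, which the slide does not specify.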

February 13, 2015 - Data Governance 52

Implementation Approach

Implement Best Practices

• Data Governance roles & policies rollout

• Tool Configuration

• extend the metadata model

• build ETL Connectors

• build user workflow and reports

• Repository population

• Testing and data validation

• Knowledge transfer

• User adoption training and execution

February 13, 2015 - Data Governance 53

Revolution in Data Governance Outcomes

Data Governance formally defined, trained, established and integrated into change management

Unified approach of Business and Technology

Recognition of Maturity Model

Executive level sponsorship and accountability

Complete assessment, procurement and implementation in under 12 months

Metadata – Daily update of metadata to repository for data sensitivity access assessments and audit
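A daily sensitivity access assessment of the kind mentioned in the last outcome could be as simple as joining two repository extracts: which assets are sensitive, and which roles can read them. The sketch below assumes both extracts are available as plain dictionaries. The function name, sensitivity labels, and sample data are hypothetical and do not reflect the actual repository.

```python
def sensitive_access_report(
    asset_sensitivity: dict[str, str],
    role_grants: dict[str, set[str]],
) -> dict[str, list[str]]:
    """Map each role to the sorted list of sensitive assets it can access.

    asset_sensitivity: asset name -> label, where "none" means not sensitive.
    role_grants: role name -> set of asset names the role may access.
    Assets missing from asset_sensitivity are treated as "none" in this sketch.
    """
    report = {}
    for role, assets in role_grants.items():
        flagged = sorted(
            a for a in assets if asset_sensitivity.get(a, "none") != "none"
        )
        if flagged:
            report[role] = flagged
    return report


# Hypothetical snapshot, for illustration only.
assets = {
    "patient_ssn": "regulatory",
    "ncpdp_number": "contractual",
    "drug_name": "none",
}
grants = {
    "analyst": {"drug_name"},
    "data_services": {"patient_ssn", "ncpdp_number", "drug_name"},
}
```

Treating unknown assets as non-sensitive keeps the example short. A real audit would more likely flag unclassified assets for review instead.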

February 13, 2015 - Data Governance 54

Sensitive data