Data Lake BUILDING AGILE BIGDATA ANALYTICS PLATFORM Rama Kattunga Systems Director Enterprise Analytics

Post on 28-Oct-2019

Page 1: Data Lake - CIO Summits · Big Data Strategies Analytics Playing with petabytes is passion Currently building a unified and unique data platform for healthcare “Culture eats Strategy

Data Lake
BUILDING AGILE BIGDATA ANALYTICS PLATFORM

Rama Kattunga
Systems Director
Enterprise Analytics

Page 2

About me

Experience Worked @

Systems Director | Enterprise Analytics

Big Data Strategies Analytics
Playing with petabytes is passion
Currently building a unified and unique data platform for healthcare

Page 3

“Culture eats Strategy for …..”

Culture is today’s major performance differentiator

Culture is the foundation for the strategy

Page 4

What is Data Lake?

Data Lake

Ecosystem

Page 5

“a place to store practically unlimited amounts of data of any format, schema and type that is relatively inexpensive and massively scalable”

"If you think of a datamart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state. The contents of the data lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples."

- James Dixon, Pentaho CTO

Page 6

Water packaging

CRM, ERP, Finance -> ETL -> Business Area 1
CRM, ERP, Finance -> ETL -> Business Area 2
CRM, ERP, Finance -> ETL -> Business Area 3

Single Source of Truth

Page 7

Today’s Model: Traditional Extract Transform Load

Diagram roles: Business; IT Supported; IT Pro; End Users

Diagram labels: Existing Data (LOB Applications, Files, Data Marts); Extract; Data Quality; Transform & Load; Data Warehouse; Data Marts; Analysis Cubes; Provision; Analysis, Reports, Dashboards & Scorecards; Spreadsheets; Specialized Tools

Diagram annotations: 6-9 Months; Change: $$$, 3-6 Months; Satisfaction Low; High cost of rework

Page 8

Water packaging

CRM, ERP, Finance -> ETL -> Business Area 1
CRM, ERP, Finance -> ETL -> Business Area 2
CRM, ERP, Finance -> ETL -> Business Area 3

Single Source of Truth

Additional diagram labels: Transactional Systems (CRM, ERP, EMR); LOB; CORPORATE; Local Data; LOB Mart; EDW

Page 9

Better approach: ETL -> EL, iterate then T

Data Lake

Diagram labels: Common Platform; Requirements; Governance & Data Stewardship; Managed; Self-Service; Production; Ad Hoc; End Users & Business; IT Pro / IT Supported; IT Support; Extract & Load; LOB Applications; Files; Data Marts; Data Quality; Transform; Data Warehouse; Analysis Cubes; Provision; Analysis, Reports, Dashboards & Scorecards; Spreadsheets; Specialized Tools; Rapid Experiment; POC; Pilot; Iterate; Transform; Keep; Kill
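
The "EL, iterate then T" idea on this slide can be illustrated with a minimal sketch. It is not taken from the talk: the file name, the sample records, and the functions `extract_and_load`, `read_raw`, `transform_v1`, and `transform_v2` are all hypothetical. The point it tries to show is that raw data is landed once, unchanged, and that each new transform iteration reads the same landed data without going back to the source system.

```python
import json
import tempfile
from pathlib import Path


def extract_and_load(records, raw_dir):
    """EL step: land source records unchanged as JSON lines in a raw zone."""
    raw_dir.mkdir(parents=True, exist_ok=True)
    path = raw_dir / "encounters.jsonl"  # hypothetical file name
    with path.open("w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")
    return path


def read_raw(path):
    """Read the landed records back exactly as they were stored."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f]


def transform_v1(raw):
    """First iteration of T: count visits per department."""
    counts = {}
    for rec in raw:
        counts[rec["dept"]] = counts.get(rec["dept"], 0) + 1
    return counts


def transform_v2(raw):
    """Second iteration of T: total charges per department, same raw data."""
    totals = {}
    for rec in raw:
        totals[rec["dept"]] = totals.get(rec["dept"], 0.0) + rec["charge"]
    return totals


# Made-up sample records standing in for a source system extract.
source = [
    {"dept": "cardiology", "charge": 120.0},
    {"dept": "cardiology", "charge": 80.0},
    {"dept": "oncology", "charge": 300.0},
]

with tempfile.TemporaryDirectory() as tmp:
    landed = extract_and_load(source, Path(tmp) / "raw")
    raw = read_raw(landed)
    visits = transform_v1(raw)   # first experiment
    charges = transform_v2(raw)  # later iteration, no re-extract needed

print(visits)
print(charges)
```

In this sketch a transform that does not pan out can simply be dropped ("Kill") while the landed data stays available for the next experiment.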

Page 10

Where can we use Data Lakes?

Ingestion challenges with Data sources like EMR systems

No data left behind

Schema on read

Scaling

Reduction of costs due to data movement
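
"Schema on read" and "No data left behind" can be made concrete with a small sketch. Everything in it is an assumption for illustration: the sample records, the field names, and the helper `read_with_schema` do not come from the talk. Raw records are stored as they arrive, with whatever fields they carry, and each consumer applies its own schema only when reading.

```python
import json

# Raw lines as they might sit in the lake; fields vary from record to record.
RAW_LINES = [
    '{"patient_id": "p1", "age": "42", "bp": "120/80"}',
    '{"patient_id": "p2", "age": "57", "device": "monitor-7"}',
    '{"patient_id": "p3"}',
]


def read_with_schema(lines, schema):
    """Apply a schema at read time: pick and cast only the requested fields.

    `schema` maps field name -> cast function. Fields missing from a
    record come back as None instead of causing the record to be rejected.
    """
    rows = []
    for line in lines:
        rec = json.loads(line)
        row = {}
        for name, cast in schema.items():
            value = rec.get(name)
            row[name] = cast(value) if value is not None else None
        rows.append(row)
    return rows


# Two consumers read the same raw data through different schemas.
demographics = read_with_schema(RAW_LINES, {"patient_id": str, "age": int})
devices = read_with_schema(RAW_LINES, {"patient_id": str, "device": str})

print(demographics)
print(devices)
```

Because nothing is filtered at load time, a field such as `bp` that no current schema asks for is still in the raw data if a later analysis needs it.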

Page 11

Challenges with Data Lakes?

Not a silver bullet

Not a replacement for Information Governance

Frustrating to business users if there is no schema or metadata

Size of the data

Security and data privacy

Page 12

ONESource Reference Architecture

Diagram labels, grouped in the order extracted:

EMR & Revenue Cycle; ERP; Decision Support; External and Benchmark Data; Quality Data; Patient Experience; ACO; Images

Systems Management; Data Governance

Dashboards; Standard Reporting; Ad Hoc Reporting; Predictive Analysis; Advanced Visualization

Business Intelligence Platform

FINANCE APPS; Clinical APPS; ACO; HIE

Hadoop Distributed File System; SQL in Hadoop; Compute & Storage

Data Lake Platform

Security Management; Data Quality

Project DUMP iT

Source Systems

Clinical; Operational; Financial; Supply Chain; HR; Accounting; Sales and Mktg.; IT Analytics; Population; PHP; PMG

DATA NORMALIZATION

Data Quality Management; Reference Data Management; Data Quality Rules Engine; Business Rules Engine; Data Policy Management; Business and Data Definitions; Business and Data Traceability; Hierarchy Management; Enterprise Data Model; Data Enrichment

Patient Portal; Physician Portal

FIREWALL

Information Hub; CDR

Devices; Da Vinci Robots; Portals

Page 13

Thank You!

Rama.Kattunga@gmail