ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)

April 10-12, Chicago, IL Ensuring Compliance of Patient Data with Big Data and BI Ayad Shammout & Denny Lee

Upload: denny-lee

Post on 26-Jan-2015

DESCRIPTION

Ayad Shammout and Denny Lee's PASS BA Conference session on our end-to-end Big Data to BI auditing project.

TRANSCRIPT

Page 1: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)

April 10-12, Chicago, IL

Ensuring Compliance of Patient Data with Big Data and BI

Ayad Shammout & Denny Lee

Page 3:

Agenda

A Quick Big Data Primer

Healthcare and Big Data

Compliance and Auditing: SQL Compliance Project

Compliance and Auditing with Big Data and BI
Big Data: Unstructured Volumes of Data
Analytics: PowerPivot, Power View

Page 4:

What is Big Data?

Volume: Exceeds physical limits of vertical scalability

Velocity: Decision window small compared to data change rate

Variety: Many different formats make integration expensive

Variability: Many options or variable interpretations confound analysis

Page 5:

10x increase every five years

85% from new data types

Data explosion

Easy Accessibility of External Data

Cheap, Distributed Storage & Processing

Volume, Velocity, Variety

Hadoop

Cloud

By 2015, organizations that build a modern information management system will outperform their peers financially by 20 percent.

– Gartner, Mark Beyer

“Information Management in the 21st Century”

Page 6:

Large Data Volumes

Non-traditional Data Types

New Technologies

New Data Sources

New Economics

Page 7:

Big Data Business Value

140,000-190,000 more deep analytical talent positions

1.5 million more data-savvy managers in the US alone

$300 billion: Potential annual value to US healthcare

15 out of 17 sectors in the US have more data stored per company than the US Library of Congress

€250 billion: Potential annual value to Europe’s public sector

50-60% increase in the number of Hadoop developers within organizations already using Hadoop within a year

Page 8:

Data becomes the new currency

Page 9:

Hadoop: The most visible face of Big Data

MapReduce Layer: Job tracker, Task tracker, Task tracker

HDFS Layer: Name node, Data node, Data node

Page 10:

HDInsight: Visit HadoopOnAzure.com

Page 11:

Healthcare and Big Data

Page 12:

Healthcare and IT

Often the laggard in technology

Yet application of IT to healthcare can radically change what we can do

Genomic Sequencing

Proteomic Sequencing

Incidence Prediction

Page 13:

Healthcare Big Data Example Scenarios

Clinical Trial Deviations: Originally Viagra was developed to lower blood pressure and treat angina. Now it's used to help with newborn pulmonary hypertension and altitude sickness.

Incidence Prediction: Patients who missed 4 or more visits are twice as likely to have an asthmatic incident. A particular cardiac monitor sine wave points to a high likelihood of heart attack.

Campaigns: Social media and advertising campaigns to understand user behavior and sentiment

Patient Satisfaction: Social media and advertising campaigns to understand user behavior and sentiment

Page 14:

BIDMC Auditing Scenario

Auditing is a critical component of HIPAA in ensuring patient privacy
1 billion+ rows of audit data
146 mission-critical clinical applications
Comprehensive audits yield 300-500k transactions/day
HIPAA requires an audit system with 20 years of data

Auditing Project
Available to community as part of Compliance SDK
Updating for SQL Server 2012, HDInsight, Power View, and MobileBI*

Creating an enterprise tool for consolidated storage, reporting and alerting of all application audit data - that's cool!

John Halamka’s Cool Technology of the Week (Wellsphere Top Health Blogger, Health Impact Award)
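
As a rough check on the scale implied by the figures on this slide, the sketch below multiplies the quoted daily transaction range by the 20-year retention window. It assumes a constant daily rate and 365-day years, neither of which comes from the slides, and the names are illustrative only.

```python
# Back-of-the-envelope estimate of audit rows retained over the full window.
# Assumptions (not from the slides): constant daily volume, 365-day years.
DAYS_PER_YEAR = 365
RETENTION_YEARS = 20


def retained_rows(transactions_per_day: int, years: int = RETENTION_YEARS) -> int:
    """Total audit rows accumulated over the retention window."""
    return transactions_per_day * DAYS_PER_YEAR * years


low = retained_rows(300_000)   # lower bound of the quoted daily range
high = retained_rows(500_000)  # upper bound of the quoted daily range

print(f"{low / 1e9:.2f} to {high / 1e9:.2f} billion rows over {RETENTION_YEARS} years")
```

Under these assumptions the store would hold roughly 2.2 to 3.7 billion rows at the end of the window, which is in line with the 1 billion+ rows already quoted.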

Page 15:

BIDMC Compliance Project

SSIS

HDInsight Windows

HDInsight Azure

SQL Server 2008/2012

Audit Logs

ETL Logs to HDFS

Use Excel 2013 PowerPivot and Power View

SSAS (tabular)

Page 16:

Auditing Sensitive Information

Querying Audit Information: Use PowerPivot / Power View / Analysis Services to query the data.

Security Information

Policy Information

Process Audit Information: Use SSIS to process SQL2008 All-Actions audit information and other CG application audit log data; potentially can use the Management Performance DW framework.

Caregroup Environment

File Server

SQL Audit

Connect/Logic

SSIS

CG Application Data

Intersystems Cache

SQL2005

Oracle

SQL2008 All-Actions Audit Data

SQL 2008 R2 / 2012

SSRS 2008 / Power View

Policy Analysis

Policy Reports

Policy Best Practices

Security Analysis

Security Reports

Compliance Reports

Feedback Action Loop: Update systems to keep them compliant and secure

Page 17:

Audit Logs

Storage Infrastructure

Transfer files to ASV via AzCopy, CloudExplorer, etc.

Page 18:

Storage Infrastructure

Hadoop on Azure: Compute Nodes (Medium VMs)

Azure Storage Vault (ASV): Azure Blob Storage

Azure Flat Network Storage

Page 19:

Storage Infrastructure

Hadoop on Azure: Compute Nodes (Medium VMs)

Azure Storage Vault (ASV): Azure Blob Storage

Azure Flat Network Storage

Stream data to compute

Push data back to storage

map sort shuffle reduce

http://dennyglee.com/2013/03/18/why-use-blob-storage-with-hdinsight-on-azure/
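
The map, sort, shuffle, and reduce stages named on this slide can be illustrated in a single Python process. This is only a teaching sketch under assumed inputs: the `user,action` record layout and all function names are hypothetical, and nothing here uses the Hadoop API or reads from blob storage.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(lines):
    """Map: emit (user, 1) for every audit record."""
    for line in lines:
        user = line.split(",")[0]
        yield user, 1


def shuffle_phase(pairs):
    """Sort and shuffle: order pairs by key, then group the values per key."""
    ordered = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield key, [value for _, value in group]


def reduce_phase(grouped):
    """Reduce: sum the grouped values for each key."""
    return {key: sum(values) for key, values in grouped}


# Hypothetical audit records in "user,action" form.
audit_lines = [
    "alice,SELECT",
    "bob,UPDATE",
    "alice,SELECT",
    "carol,DELETE",
    "alice,UPDATE",
]

counts = reduce_phase(shuffle_phase(map_phase(audit_lines)))
print(counts)  # {'alice': 3, 'bob': 1, 'carol': 1}
```

On a real cluster each phase runs in parallel across nodes, with input streamed from storage to the compute nodes and results pushed back; the sketch only shows the order of the data flow.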

Page 20:

SSIS to HDInsight

Page 21:

SSIS Processing

Page 22:

SSAS Tabular of HoA Audit Data

Page 23:

Hadoop / Auditing: File sizes

Currently testing gz vs. raw
E.g. 12MB raw text file vs. 633KB gz file (~20x compression)

20x smaller size, ~same query time
Approx. same map / reduce task utilization

File size is 250MB-1GB
SSIS package takes care of the size

Future testing: avro, protobuf

Query                                    Duration (s)
select count(*) from sql_audit_asv_raw   56.066
select count(*) from sql_audit_asv_gz    58.994
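
To get a feel for why audit text compresses so well, the sketch below gzips a block of synthetic, repetitive log lines and also recomputes the ratio from the slide's own file sizes. The line layout and all names are made up for illustration and are not the BIDMC data; the ratio on real logs will differ.

```python
import gzip


def compression_ratio(raw: bytes) -> float:
    """Return the raw size divided by the gzip-compressed size."""
    return len(raw) / len(gzip.compress(raw))


# Synthetic, repetitive audit-style lines (hypothetical format).
lines = [
    f"2013-04-10 12:{i % 60:02d}:00,user{i % 50},SELECT,select col1, col2 from tableA\n"
    for i in range(20_000)
]
raw = "".join(lines).encode("utf-8")

print(f"synthetic sample: {compression_ratio(raw):.1f}x smaller after gzip")

# Ratio implied by the quoted sizes (12 MB raw vs. 633 KB gz, taking 1 MB = 1024 KB).
slide_ratio = (12 * 1024) / 633
print(f"slide figures: {slide_ratio:.1f}x")
```

The quoted sizes work out to about 19.4x, which matches the "~20x" on the slide.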

Page 24:

Hadoop / Auditing: Formats

For ease of processing, replace carriage returns within embedded SQL statements, e.g. a statement split across lines such as

select col1, col2
from tableA

to

select col1, col2 from tableA

This allows you to create a Hive table using CR as row delimiter (i.e. Hive does not have things like SQL quoted identifiers)
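
A minimal sketch of that flattening step, assuming it is done in Python before the logs are loaded (the project itself uses SSIS for processing, so this is an illustration rather than the actual package logic). The function name is hypothetical.

```python
import re


def flatten_statement(statement: str) -> str:
    """Collapse each run of CR/LF, plus surrounding whitespace, into one space."""
    return re.sub(r"\s*[\r\n]+\s*", " ", statement).strip()


multi_line = "select col1, col2\r\nfrom tableA"
print(flatten_statement(multi_line))  # select col1, col2 from tableA
```

After this step every audit record occupies exactly one physical line, so a line break can safely serve as the row delimiter.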

Page 26:

SQOOP, HiveODBC, Templeton, CSV, etc

BI Connectivity

Page 27:

Big Data … Excel-lerated!

2 servers, 3 months: 110 GB of binary files

SSIS extraction: 1.2GB of text

120MB gz

Hadoop to PowerPivot: 6MB
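
The size reduction at each stage of this flow can be worked out directly from the quoted numbers. The sketch below assumes decimal units (1 GB = 1000 MB) and reads the final 6MB as the size of the PowerPivot workbook; both are assumptions, and the stage labels are illustrative.

```python
# Sizes quoted on the slide, in MB (assuming 1 GB = 1000 MB).
stages = [
    ("binary audit files", 110_000),
    ("SSIS-extracted text", 1_200),
    ("gzip-compressed text", 120),
    ("PowerPivot workbook", 6),  # assumed meaning of the final 6MB figure
]


def reduction_factors(stages):
    """Return (from_stage, to_stage, factor) for each consecutive pair of stages."""
    return [
        (src, dst, src_mb / dst_mb)
        for (src, src_mb), (dst, dst_mb) in zip(stages, stages[1:])
    ]


for src, dst, factor in reduction_factors(stages):
    print(f"{src} -> {dst}: {factor:.1f}x smaller")

overall = stages[0][1] / stages[-1][1]
print(f"overall: {overall:,.0f}x smaller")
```

Note that the last step is not pure compression: the workbook presumably holds only the data needed for analysis, so the factors describe the pipeline as a whole rather than a single encoding.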

Page 28:

PowerPivot workbook of HoA Audit data

Page 29:

Power View of HoA Audit Data

Page 31:

Thank you!