Predicting Risk of Re-hospitalization for Congestive Heart Failure Patients (in collaboration with ) Jayshree Agarwal, Senjuti Basu Roy, Ankur Teredesai, Si-Chi Chin, David Hazel, Kiyana, Mehrdad (UWT), Paul Amoroso, Yoshi Williams, Dr. Lester Reed, Sheila, Eric Johnson (MHS)

Upload: kaitlin-roland

Post on 14-Jan-2016


Predicting Risk of Re-hospitalization for Congestive Heart Failure Patients (in collaboration with )

Jayshree Agarwal

Senjuti Basu Roy, Ankur Teredesai, Si-Chi Chin, David Hazel, Kiyana, Mehrdad (UWT)

Paul Amoroso, Yoshi Williams, Dr. Lester Reed, Sheila, Eric Johnson (MHS)


Motivation

Congestive Heart Failure (CHF)

Many hospitalizations and readmissions

19.6% of patients readmitted within 30 days [Jencks et al. 2009]

31.1% of patients readmitted within 60 days [Jencks et al. 2009]

LOW readmission rate = HIGH quality of care by hospital

No reimbursement for readmission within 30 days

Cost: unplanned readmissions in 2004 = $17.4 billion [Jencks et al. 2009]


MHS - UWT Web and Data Science collaboration objectives

• Predict the risk of readmission for CHF patients
• Reduce the readmission rate and cost
• Improve patient satisfaction and quality of care
• Appropriate pre-discharge and post-discharge planning
• Proper resource utilization


Problem

Develop models that can predict the risk of readmission for CHF patients
• within 30 days after discharge
• within 60 days after discharge

The readmission may be for reasons other than CHF


Overall Approach

How to solve the problem?
– Apply predictive data mining techniques such as classification

What do these predictive mining techniques require?
– Data in homogeneous format
  • Information extraction, integration, and data preparation
  • Prepare a labeled dataset to train the model; used later on for testing


Our Challenges

Building domain knowledge
– Which variables to consider?
– How to merge and unify them in a homogeneous format (information extraction and integration)
– How to understand the relative importance of the variables in the prediction task?

How to prepare data?
– Class label generation
– Noisy real-world data (missing values, inconsistencies, etc.)
– Serious skew in the dataset


Solution


Building Predictive Classification Models

Data Understanding

Data Preprocessing

Modeling

Evaluation


Data Understanding

Collect initial data

Acquire domain knowledge

Describe and explore dataset

Create data visualization


Building Predictive Classification Models

Data Understanding

Data Preprocessing

Modeling

Evaluation


Data Preprocessing

Define class label

Attribute selection

Data Integration

Removal of incomplete data

Finding Eligible CHF admissions


Eligible CHF admissions and Generating Class Labels

All CHF Admissions
→ In-hospital deaths removed
→ Eligible CHF Admissions
→ Is there any readmission within x days of discharge? (x = 30 or x = 60)
   YES: the class label is assigned as 1
   NO: the class label is assigned as 0
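
The labeling flow on this slide could be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: the record layout (`patient_id`, `admit`, `discharge`, `died_in_hospital`) and the function name `label_admissions` are hypothetical, and every admission in the list is treated as a candidate index admission.

```python
from datetime import date

def label_admissions(admissions, x_days):
    """Return (admission, label) pairs for eligible admissions.

    `admissions` is a list of dicts with hypothetical keys:
    patient_id, admit (date), discharge (date), died_in_hospital (bool).
    """
    labeled = []
    for adm in admissions:
        # In-hospital deaths are removed: no readmission is possible.
        if adm["died_in_hospital"]:
            continue
        label = 0
        for other in admissions:
            if other is adm or other["patient_id"] != adm["patient_id"]:
                continue
            gap = (other["admit"] - adm["discharge"]).days
            # Any later admission within x days counts, whatever its cause.
            if 0 <= gap <= x_days:
                label = 1
                break
        labeled.append((adm, label))
    return labeled

# Made-up example records, for illustration only.
admissions = [
    {"patient_id": 1, "admit": date(2010, 1, 1), "discharge": date(2010, 1, 5),
     "died_in_hospital": False},
    {"patient_id": 1, "admit": date(2010, 2, 19), "discharge": date(2010, 2, 22),
     "died_in_hospital": False},
    {"patient_id": 2, "admit": date(2010, 3, 1), "discharge": date(2010, 3, 9),
     "died_in_hospital": True},
]
labels_30 = [label for _, label in label_admissions(admissions, 30)]
labels_60 = [label for _, label in label_admissions(admissions, 60)]
```

In this toy data the second admission of patient 1 comes 45 days after the first discharge, so the first admission is labeled 0 for x = 30 and 1 for x = 60.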


Attribute selection

“Baseline”: Yale Model [Krumholz et al.]
- Socio-demographic variables (2)
- Comorbidities (35)

“New”: additional predictor variables identified by us (14)

“All”

“Correlated”: chi-square correlation test
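
As an illustration of the chi-square test named on this slide, the sketch below computes the Pearson chi-square statistic between one binary attribute and the binary class label. It is an assumption-laden example: the function name `chi_square_2x2`, the toy data, and the 5% significance level are not from the slides, which do not state how the test was configured.

```python
def chi_square_2x2(attribute, label):
    """Pearson chi-square statistic for two binary (0/1) sequences."""
    n = len(attribute)
    # observed[a][c] = number of records with attribute value a and class c.
    observed = [[0, 0], [0, 0]]
    for a, c in zip(attribute, label):
        observed[a][c] += 1
    row = [sum(observed[0]), sum(observed[1])]
    col = [observed[0][0] + observed[1][0], observed[0][1] + observed[1][1]]
    stat = 0.0
    for a in (0, 1):
        for c in (0, 1):
            # Expected count if attribute and class were independent.
            expected = row[a] * col[c] / n
            if expected > 0:
                stat += (observed[a][c] - expected) ** 2 / expected
    return stat

# Critical value for 1 degree of freedom at alpha = 0.05 (assumed level).
CRITICAL_5PCT_1DF = 3.841

attribute = [1, 1, 1, 1, 0, 0, 0, 0]  # made-up binary attribute
readmit = [1, 1, 1, 0, 0, 0, 0, 1]    # made-up class labels
stat = chi_square_2x2(attribute, readmit)
keep_attribute = stat > CRITICAL_5PCT_1DF
```

An attribute would enter a "correlated" set only when its statistic exceeds the critical value; with eight toy records the evidence here is too weak.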


Data Extraction

Patient details, primary and secondary diagnoses, lab measurements, administrative data
→ Table joins
→ Data
→ Incomplete data removed
→ Labeled data
→ Data used for training the models
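
The join and clean-up steps named on this slide might look like the sketch below. It is only an illustration: the table contents, the field names (`age`, `creatinine`), and the helper `join_and_clean` are hypothetical, and the slides do not say which join type or completeness rule was used.

```python
def join_and_clean(patients, labs, required):
    """Inner-join two tables on patient_id and drop incomplete rows."""
    labs_by_id = {row["patient_id"]: row for row in labs}
    joined = []
    for p in patients:
        lab = labs_by_id.get(p["patient_id"])
        if lab is None:
            continue  # no matching lab record: the inner join drops the patient
        record = {**p, **lab}
        # Keep the record only if no required field is missing (None).
        if all(record.get(field) is not None for field in required):
            joined.append(record)
    return joined

# Made-up tables, for illustration only.
patients = [
    {"patient_id": 1, "age": 71},
    {"patient_id": 2, "age": None},
    {"patient_id": 3, "age": 64},
]
labs = [
    {"patient_id": 1, "creatinine": 1.4},
    {"patient_id": 2, "creatinine": 0.9},
]
clean = join_and_clean(patients, labs, required=["age", "creatinine"])
```

Patient 2 is dropped for a missing value and patient 3 for having no lab record, which shows how quickly removal of incomplete data can shrink a training set.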


Data Distribution

[Figure: two bar charts of admission counts, Readmit vs. No Readmit, one for the 30-day time frame and one for the 60-day time frame; y-axis from 0 to 12,000]


Building Predictive Classification Models

Data Understanding

Data Preprocessing

Modeling

Evaluation


Modeling

Selecting modeling technique for binary classification
• Logistic regression
• Naïve Bayes classifier
• Support Vector Machine

Balancing imbalanced data by under-sampling and over-sampling

Building prediction models
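
Random under-sampling and over-sampling, the two balancing strategies named on this slide, can be sketched as below. The function names, the 90/10 class split, and the fixed seed are assumptions for the example; the slides do not specify the sampling procedure, and both helpers assume the majority class is at least as large as the minority class.

```python
import random

def undersample(majority, minority, rng):
    """Randomly drop majority records until both classes have equal size."""
    return rng.sample(majority, len(minority)) + list(minority)

def oversample(majority, minority, rng):
    """Randomly repeat minority records until both classes have equal size."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority) + list(minority) + extra

rng = random.Random(0)  # fixed seed so the sketch is repeatable
no_readmit = [("no_readmit", i) for i in range(90)]  # made-up majority class
readmit = [("readmit", i) for i in range(10)]        # made-up minority class

under = undersample(no_readmit, readmit, rng)
over = oversample(no_readmit, readmit, rng)
```

Under-sampling discards information from the majority class, while over-sampling duplicates minority records; either way the classifier sees both classes equally often during training.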


Logistic Regression Model

[Figure: logistic regression curve; y-axis: P (probability of Y), x-axis: Z]
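
The relationship between Z and the probability of Y can be written as a short function. This is a sketch of the standard logistic function; the weights, bias, and feature values are made-up numbers, not fitted coefficients from the study.

```python
import math

def logistic(z):
    """Map the linear score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(weights, bias, features):
    """P(Y = 1 | features), with z = bias + sum of weight * feature."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return logistic(z)

# Hypothetical coefficients and one patient's binary features.
p = predict_probability(weights=[0.8, -0.5], bias=-1.0, features=[1, 0])
```

A patient would be predicted as a readmission when the probability exceeds a chosen threshold; 0.5 is a common default, and the threshold used in the study is not given in the slides.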


Naïve Bayesian Classification

A statistical classifier that performs probabilistic prediction based on Bayes' theorem

Assumes that the attributes are conditionally independent given the class

Given a data tuple $X$ and $m$ classes $C_1, \dots, C_m$, predicts that $X$ belongs to class $C_i$ only if $P(C_i \mid X)$ is the highest among all the $P(C_k \mid X)$ for all the $m$ classes
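
Written out, the prediction rule combines Bayes' theorem with the independence assumption. The notation below is an assumption chosen for this sketch: $X = (x_1, \dots, x_n)$ is a tuple of $n$ attribute values and $C_1, \dots, C_m$ are the classes.

```latex
P(C_i \mid X) = \frac{P(X \mid C_i)\, P(C_i)}{P(X)},
\qquad
P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i),
\qquad
\hat{C} = \arg\max_{1 \le i \le m} \; P(C_i) \prod_{k=1}^{n} P(x_k \mid C_i)
```

Because $P(X)$ is the same for every class, it can be dropped when comparing classes, which is why only the prior and the per-attribute likelihoods appear in the final rule.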


Support Vector Machine

A method of classification for both linear and nonlinear data

Searches for the optimal hyperplane separating the two classes
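
For linearly separable data, the search for the optimal hyperplane is usually stated as the textbook hard-margin problem below. This is given as background under that assumption; the slides do not say which SVM variant or kernel was used. Here $\mathbf{x}_i$ is a training tuple and $y_i \in \{-1, +1\}$ its class.

```latex
f(\mathbf{x}) = \operatorname{sign}(\mathbf{w} \cdot \mathbf{x} + b),
\qquad
\min_{\mathbf{w},\, b} \ \tfrac{1}{2} \lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i (\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1 \ \ \text{for all } i
```

Minimizing $\lVert \mathbf{w} \rVert$ maximizes the margin $2 / \lVert \mathbf{w} \rVert$ between the two classes. Noisy data generally calls for a soft-margin version, and nonlinear data for a kernel.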


Building Predictive Classification Models

Data Understanding

Data Preprocessing

Modeling

Evaluation


Performance Evaluation Metrics

Precision – percentage of tuples labeled as positive that are actually positive = TP/(TP+FP)

Recall – percentage of positive tuples that are labeled as positive = TP/(TP+FN)

Accuracy – percentage of tuples correctly classified = (TP+TN)/(P+N)

ROC curves and area under the curve (AUC) – show the trade-off between true positive rate and false positive rate
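
The first three metrics follow directly from the confusion counts, as the sketch below shows. The function name `evaluate` and the toy labels are assumptions for illustration; 1 stands for a readmission and 0 for no readmission.

```python
def evaluate(actual, predicted):
    """Return (precision, recall, accuracy) for binary 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs)
    return precision, recall, accuracy

# Made-up skewed example: 3 readmissions among 10 admissions.
actual = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
precision, recall, accuracy = evaluate(actual, predicted)
```

On this skewed toy data accuracy is 0.7 although only one of three readmissions is found, which is why recall and AUC are reported alongside accuracy.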


Evaluation

• Predictive models are assessed using 10-fold cross-validation

• Performance is compared using the evaluation metrics described previously
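
A minimal sketch of how 10-fold cross-validation partitions the data is given below. The generator name `k_fold_indices` is hypothetical, and the sketch neither shuffles nor stratifies the records, which a real evaluation on skewed data would normally do.

```python
def k_fold_indices(n_records, k=10):
    """Yield (train, test) index lists; each record is tested exactly once."""
    indices = list(range(n_records))
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_records // k + (1 if i < n_records % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(25, k=10))
tested = sorted(i for _, test in folds for i in test)
```

Each model is trained on nine folds and evaluated on the remaining one, and the metric values are then averaged over the ten runs.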


RESULTS


Logistic Regression for 30 days

[Figures: Area Under the Curve (AUC); Recall]


Logistic Regression for 60 days

[Figures: Area Under the Curve (AUC); Recall]


Naïve Bayes classifier for 30 days

[Figure: Area Under the Curve (AUC) by attribute set (Baseline, New, All, Correlated); y-axis from 0.56 to 0.64]


Support Vector Machine for 30 days

[Figure: Area Under the Curve (AUC) by attribute set (Baseline, New, All, Correlated); y-axis from 0.58 to 0.64]


Conclusion and Discussion

Predicting readmission is a difficult problem to solve

Feature selection gives the best results

With data balancing, the recall of the model improves


Future Work

Investigate other classification techniques such as ensemble methods and neural networks

Explore additional features and study their relevance

Employ other feature selection techniques

Devise a method to impute missing values

Deploy the predictive models


Acknowledgement

MultiCare Health System (MHS) and Dr. Lester Reed for giving us this opportunity

Data architects and domain experts at MHS for their input

Professors Dr. Ankur Teredesai and Dr. Senjuti Basu Roy for their guidance

Other team members in UWT for their support


References

S. F. Jencks, M. V. Williams, and E. A. Coleman, “Rehospitalizations among Patients in the Medicare Fee-for-Service Program,” New England Journal of Medicine, vol. 360, no. 14, pp. 1418–1428, 2009.

J. Han and M. Kamber, Data Mining: Concepts and Techniques. Morgan Kaufmann, 2006.

H. M. Krumholz, S. L. T. Normand, P. S. Keenan, Z. Q. Lin, E. E. Drye, K. R. Bhat, Y. F. Wang, J. S. Ross, J. D. Schuur, and B. D. Stauffer, Hospital 30-day heart failure readmission measure methodology. Report prepared for the Centers for Medicare & Medicaid Services.


Questions