intelligent clinical decision support with probabilistic



Post on 23-Oct-2021


UC Irvine Heart Disease dataset  

Intelligent Clinical Decision Support with Probabilistic and Temporal EHR Modeling

Simulations over 5,807 IN/TN patients: >50% improvement in cost effectiveness and >30% improvement in outcomes beyond the existing fee-for-service model [Bennett and Hauser, Artificial Intelligence in Medicine 2013]

[Diagram: Intelligent Clinical Decision Support System (ICDSS). Components: EHR, SRL, POMDP, disease progression model. Domain expert: provides rules & corrective feedback, identifies clinically relevant features. Primary clinician: observations, test results, imaging, interventions; recommended treatment plans w/ human-readable rationale.]

Kris Hauser* (PI), Sriraam Natarajan* (Co-PI), Shaun Grannis† (Co-PI) *Indiana University School of Informatics and Computing † Indiana University School of Medicine, Regenstrief Institute

With clinical partners:

Overview and Motivation

Progress & Expected Outcomes

Existing and Ongoing Work

POMDP models in chronic depression treatment

[Figure: AUC-ROC (y-axis from 0.45 to 0.95) for J48, SVM, AdaBoost, Bagging, Logistic, Naïve Bayes, Single Relational Tree, and Soft Margin boosting.]

SRL for medical prediction tasks

Cardiology domain

Circulation; 92(8), 2157-62, 1995

JACC; 43, 842-7, 2004

Plaque in the left Coronary artery

Given: age, sex, cholesterol, BMI, glucose, HDL level, LDL level, exercise, triglyceride level, systolic BP and diastolic BP for all years 0 and 20

Predict: high Coronary Artery Calcification (CAC) levels at year 20

Atherosclerosis is the cause of the majority of Acute Myocardial Infarctions (heart attacks)

Model: mixtures of relational probabilistic decision trees

Learning: Relational Functional Gradient Boosting (RFGB)

§  New Soft Margin relational learning algorithm improves recall

§  Tunable penalties for false positives vs false negatives

§  State-of-the-art performance on UC Irvine Heart Disease dataset (at right) and others  
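As a rough illustration of tunable penalties in gradient boosting: each boosting round fits a regression tree to pointwise gradients, and weighting those gradients by class makes missed positives cost more than false alarms. The sketch below uses a plain class-weighted log-likelihood on propositional (non-relational) examples. It is a generic stand-in, not the Soft Margin RFGB formula from the poster, and the names `fn_penalty` and `fp_penalty` are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Map a log-odds score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def weighted_gradients(scores, labels, fn_penalty=2.0, fp_penalty=1.0):
    """Pointwise gradients of a class-weighted log-likelihood.

    scores: current model outputs (log-odds), one per example
    labels: 1 for a positive example, 0 for a negative one
    fn_penalty / fp_penalty: hypothetical weights; a larger fn_penalty
    pushes the next tree harder toward recovering missed positives.
    """
    grads = []
    for score, label in zip(scores, labels):
        p = sigmoid(score)
        weight = fn_penalty if label == 1 else fp_penalty
        grads.append(weight * (label - p))
    return grads


# One positive and one negative example, both currently scored at p = 0.5.
grads = weighted_gradients([0.0, 0.0], [1, 0], fn_penalty=3.0, fp_penalty=1.0)
```

With these weights the positive example receives a gradient three times larger in magnitude than the negative one, so the next tree in the ensemble prioritizes recall.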

§  Predictive model: Dynamic Bayesian Network

§  Prediction horizon: 8

§  Decision variable: treat / not treat

§  Outcome variables: self-reported survey (CDOI), treatment cost

§  Objective function: tradeoff between CDOI and $$$ (clinician-selected weight)
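The tradeoff described above can be sketched as a finite-horizon lookahead that scores each treat / not treat choice by weighted expected outcome minus cost. This is a minimal sketch under strong assumptions: the patient state is treated as fully observed with two made-up levels, and every probability, cost, and name (`P_IMPROVED`, `COST`, `OUTCOME`, `plan`) is invented for illustration rather than taken from the poster's Dynamic Bayesian Network.

```python
from functools import lru_cache

HORIZON = 8  # prediction horizon from the poster
ACTIONS = ("treat", "not_treat")
COST = {"treat": 100.0, "not_treat": 0.0}  # hypothetical cost per step

# Hypothetical P(next state is "improved" | current state, action).
P_IMPROVED = {
    ("unwell", "treat"): 0.6,
    ("unwell", "not_treat"): 0.2,
    ("improved", "treat"): 0.9,
    ("improved", "not_treat"): 0.7,
}
OUTCOME = {"unwell": 0.0, "improved": 1.0}  # stand-in for a CDOI score


def plan(state, weight):
    """Return (expected objective, best first action) over HORIZON steps.

    weight: clinician-selected value of one outcome point, in cost units.
    """

    @lru_cache(maxsize=None)
    def value(s, steps_left):
        if steps_left == 0:
            return 0.0, None
        best = None
        for action in ACTIONS:
            p = P_IMPROVED[(s, action)]
            total = -COST[action]
            for nxt, prob in (("improved", p), ("unwell", 1.0 - p)):
                future, _ = value(nxt, steps_left - 1)
                total += prob * (weight * OUTCOME[nxt] + future)
            if best is None or total > best[0]:
                best = (total, action)
        return best

    return value(state, HORIZON)


cheap_value, cheap_action = plan("unwell", weight=0.0)
rich_value, rich_action = plan("unwell", weight=1000.0)
```

Setting the weight to zero makes cost the only concern, so the plan never treats; a large weight makes the outcome dominate and the first recommended action becomes treatment.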

[Yang et al, submitted to ICDM]

Overview of envisioned CDSS

§  Clinical decision-support systems (CDSS) have potential to exploit the wealth of clinical data in EHRs in addition to expert recommendations

§  New AI techniques needed to plan chronic and multi-stage treatments: compute patient-specific, temporal, statistically-justified treatment plans

§  Balance costs vs patient outcomes, reason with uncertainty

§  Hypothesis: CDSS can improve state of current clinical practice by providing outcome-driven and cost-driven optimized decisions

Two promising AI techniques: SRL and POMDP

Statistical Relational Learning (SRL): learning probabilistic models from datasets with relational structure

§  Handles linked datasets, incomplete/missing data, noise

§  Excellent for EHR data

Partially Observable Markov Decision Processes (POMDP): planning sequences of decisions under uncertainty when the patient's true state is only partially observed

§  Handles linked datasets, incomplete/missing data, noise

§  Excellent for EHR data
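A core operation in any POMDP is the belief update: after taking an action and seeing an observation, the probability distribution over hidden states is revised by Bayes' rule. The sketch below shows that step for a two-state toy model. The state names, action, observation labels, and all probabilities are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def belief_update(belief, action, observation, transition, emission):
    """Bayes filter step: predict with the transition model, then
    reweight by the observation likelihood and normalize."""
    states = list(belief)
    predicted = {
        s2: sum(belief[s1] * transition[(s1, action)][s2] for s1 in states)
        for s2 in states
    }
    unnormalized = {s: emission[s][observation] * predicted[s] for s in states}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: v / total for s, v in unnormalized.items()}


# Hypothetical model: hidden health state, one action, a noisy test.
transition = {
    ("healthy", "treat"): {"healthy": 0.9, "sick": 0.1},
    ("sick", "treat"): {"healthy": 0.5, "sick": 0.5},
}
emission = {
    "healthy": {"neg": 0.8, "pos": 0.2},
    "sick": {"neg": 0.3, "pos": 0.7},
}
prior = {"healthy": 0.5, "sick": 0.5}
posterior = belief_update(prior, "treat", "pos", transition, emission)
```

Here the prediction step gives 0.7 healthy and 0.3 sick, and the positive test shifts the belief to 0.4 healthy and 0.6 sick. A planner would then choose the next action based on this belief rather than on a known state.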

[Figure: log-likelihood (y-axis from -1800 to 0) by model: Trivial, Rand. Forest, Log. Reg., Handmade BN, Learned BN. Final outcome, probabilistic prediction, 10-fold cross validation. Higher is better (0 is perfect prediction).]
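The metric in this comparison appears to be a summed log-likelihood of held-out outcomes, which is why 0 corresponds to perfect prediction and more negative values are worse. A minimal sketch of that computation follows, assuming each prediction is a mapping from outcome label to probability; the function name and data layout are illustrative assumptions, not the poster's evaluation code.

```python
import math


def log_likelihood(predicted_probs, outcomes):
    """Sum of log P(observed outcome) over all cases.

    predicted_probs: one dict per case, mapping outcome label -> probability
    outcomes: the observed outcome label for each case
    Note: a predicted probability of 0 for the observed outcome makes
    math.log raise ValueError, so real models are usually smoothed.
    """
    total = 0.0
    for probs, outcome in zip(predicted_probs, outcomes):
        total += math.log(probs[outcome])
    return total


perfect = log_likelihood([{"alive": 1.0, "dead": 0.0}], ["alive"])
coin_flip = log_likelihood(
    [{"alive": 0.5, "dead": 0.5}, {"alive": 0.5, "dead": 0.5}],
    ["alive", "dead"],
)
```

In 10-fold cross validation this sum would be computed on each held-out fold and accumulated, so every case is scored by a model that never saw it during training.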

Developing SRL methods to learn likelihoods of future adverse medical events

•  CARDIA 20-year longitudinal dataset (N=5115)

•  Preliminary results suggest high predictive power @ year 20

Future: use POMDP to suggest lifestyle changes (smoking, exercise), medications

ER domain

Int’l Stroke Trial dataset

N=19,435 patients (1991-1996) with acute stroke symptoms

3 observation, 2 decision points:

•  Admission (t=0): demographics, symptoms observed. Medicine administered

•  Discharge (t=2 weeks): test results, diagnosis. Possible change of treatment, long-term prescription

•  Follow up (t=6 mo): death, long term dependency

•  Current: Learning probabilistic conditional models from partial & noisy data

•  Future: optimize patient-specific plans
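The three observation points and two decision points listed above suggest a simple per-patient record layout. The sketch below is one possible representation, with hypothetical class and field names, in which missing measurements are stored as `None` to reflect the partial and noisy data mentioned in the list.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TimePoint:
    name: str                      # "admission", "discharge", "follow_up"
    t: str                         # time label as given on the poster
    observations: Dict[str, Optional[object]] = field(default_factory=dict)
    decision: Optional[str] = None  # None when no decision is made here


def decision_points(trajectory: List[TimePoint]) -> List[TimePoint]:
    """Return the time points at which a treatment decision is recorded."""
    return [tp for tp in trajectory if tp.decision is not None]


# Hypothetical patient; values are placeholders, not trial data.
trajectory = [
    TimePoint("admission", "0", {"age": 70, "symptoms": None}, "medicine A"),
    TimePoint("discharge", "2 weeks", {"diagnosis": None}, "long-term rx"),
    TimePoint("follow_up", "6 mo", {"dead": False, "dependent": None}),
]
decisions = decision_points(trajectory)
```

Keeping the decision separate from the observations at each time point makes it straightforward to learn conditional models of the next observation given the history and the chosen treatment.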

Regenstrief PHESS: 65 million records across Indiana

Identify high utilizers: significant source of waste

Stroke domain