Using Machine Learning to Automate Clinical Pathways

David Sontag, PhD
Department of Computer Science, Courant Institute of Mathematical Sciences, NYU

Joint work with my student Yoni Halpern (NYU) and Steven Horng (Beth Israel Deaconess Medical Center)


Post on 16-Feb-2017



Health Information Technology is Rapidly Changing

•  Aided by the HITECH Act, hospital adoption of EHRs has increased 5-fold since 2008 [Charles et al., ONC Data Brief, May 2014]

•  Over $4 billion of investment in digital health startups in 2014

[Figure labels: Analytics / Big Data, Healthcare Consumer Engagement, EHR / Clinical Workflow, Digital Diagnostics, Population Health Management, Digital Medical Device. Source: Wang et al., “Digital health funding in Q1 2015 over $600M”, Rock Health, April 2015]

Wealth of digital health data available

[Weber et al. (2014). Finding the Missing Link for Big Biomedical Data. JAMA.]

Research in my clinical ML lab

•  Next-generation electronic health records (focus of today’s talk)

•  Population-level risk stratification

•  Better managing patients with chronic disease

clinicalml.org

Emergency Department:

•  Limited resources

•  Time sensitive

•  Critical decisions

[Figure: timeline of an ED visit with marks at T=0, 30 min, 2 hrs, and Disposition. Data sources shown: triage information (free text), lab results (continuous valued), MD comments (free text), specialist consults, physician documentation, repeated vital signs (continuous values, measured every 30 s)]

Next-Generation EHR for the Emergency Department

Built on Top of Real-time Prediction of Clinical State Variables

All patient observations: MD/nurse documentation, billing codes, vitals, orders, labs, history

Machine learning and natural language processing

Clinical state variables:

•  From nursing home?

•  Has altered mental status?

•  Has cardiac etiology?

•  Has infection?

•  Will die in next 30 days?

Action: alerts/reminders, decision support, cohort selection, QA review, contextual display

Examples of actions: advise fall precautions, suggested order sets, triggering cellulitis pathway, sepsis alert, panel management

Example: Triggering Clinical Pathways

•  Clinical Pathways project at Beth Israel Deaconess Medical Center (BIDMC)

•  Standardizing care in the Emergency Department

–  Reduce possibilities for error

–  Enforce established best practices

•  Pathways have been shown to reduce in-hospital complications, without increasing costs [Rotter et al 2010]

Cellulitis Pathway Flowchart

Automating triggers

•  Don’t rely on the user’s knowledge that the pathway exists!

Current triggering mechanism (Cellulitis pathway)

Trigger if chief complaint contains any of the following:

CELLULITIS, REDDENED HOT LIMB, ERYTHEMA, LEG SWELLING, INFECTION, HAND, LEG, FOOT, TOE, ARM, FACE, FINGER

Expert-constructed rule, built for sensitivity

Could we learn a better rule?
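As a rough illustration, the expert rule amounts to a keyword match on the chief complaint. The sketch below assumes plain case-insensitive substring matching, which the slides do not specify, and the function name is hypothetical.

```python
# Keyword list copied from the slide; the matching strategy is an assumption.
TRIGGER_TERMS = [
    "CELLULITIS", "REDDENED HOT LIMB", "ERYTHEMA", "LEG SWELLING",
    "INFECTION", "HAND", "LEG", "FOOT", "TOE", "ARM", "FACE", "FINGER",
]


def triggers_cellulitis_pathway(chief_complaint: str) -> bool:
    """Return True if any trigger term occurs in the chief complaint.

    Assumes case-insensitive substring matching (hypothetical; the real
    system's matching logic is not described in the talk).
    """
    text = chief_complaint.upper()
    return any(term in text for term in TRIGGER_TERMS)


if __name__ == "__main__":
    for complaint in ["R leg swelling, redness", "Hand laceration", "Headache"]:
        print(complaint, "->", triggers_cellulitis_pathway(complaint))
```

Under this assumption the rule fires on a hand laceration as readily as on a swollen, reddened leg, which is the sense in which a rule built for sensitivity gives up specificity.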

Supervised learning is a non-starter

•  Leverage large clinical databases to learn predictive rules.

•  Need labeled data

•  Classifiers often don’t generalize across institutions

[Figure labels: LOINC, UMLS CUID, RXnorm, ICD9, Unstructured Data]

Our contribution: Anchor & Learn Framework

•  Use a combination of domain expertise (simple rules) and vast amounts of data (machine learning).

•  Method does not require any manual labeling.

•  Anchors are highly transferable between institutions.

[Halpern et al., AMIA 2014]

What are anchors?

•  Rather than provide gold-standard labels, construct a simple rule that can catch some positive cases. Low sensitivity here is ok!

•  Examples:

Phenotype        Possible Anchor
Diabetic         gsn:016313 (insulin) in Medications
Cardiac          ICD9:428.X (heart failure) in Diagnoses
Nursing home     “from nursing home” in text
Social work      “social work consulted” in text
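The example anchors can be written as small predicates over a patient record. This is a minimal sketch: the record layout (a dict with "medications", "diagnoses", and "text" fields) and all function names are assumptions made for illustration, not the authors' software.

```python
from typing import Any, Callable, Dict

# Hypothetical record layout: "medications" and "diagnoses" hold lists of
# codes, "text" holds free-text notes.
Record = Dict[str, Any]
Anchor = Callable[[Record], bool]


def has_code(field: str, code: str) -> Anchor:
    """Anchor that fires when an exact code appears in a coded field."""
    return lambda rec: code in rec.get(field, [])


def has_code_prefix(field: str, prefix: str) -> Anchor:
    """Anchor that fires when any code in a field starts with a prefix."""
    return lambda rec: any(c.startswith(prefix) for c in rec.get(field, []))


def has_phrase(phrase: str) -> Anchor:
    """Anchor that fires when a phrase occurs in the free text."""
    return lambda rec: phrase in str(rec.get("text", "")).lower()


ANCHORS: Dict[str, Anchor] = {
    "diabetic": has_code("medications", "gsn:016313"),   # insulin
    "cardiac": has_code_prefix("diagnoses", "428."),     # ICD9 428.X
    "nursing_home": has_phrase("from nursing home"),
    "social_work": has_phrase("social work consulted"),
}


def anchor_flags(rec: Record) -> Dict[str, int]:
    """Evaluate every anchor on one record (1 = anchor present)."""
    return {name: int(rule(rec)) for name, rule in ANCHORS.items()}
```

A record with no anchor yields all zeros; that does not mean the patient is negative, only that the simple rule did not catch them.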

Learning with Anchors

[Figure: patient database combining LOINC, UMLS CUID, RXnorm, ICD9, and unstructured data; example feature vector 1011001]

•  Identify anchors

•  Learn to predict the anchors (anchor as pseudo-labels)

•  Account for the difference between anchors and labels

[Figure labels: Predict anchor, Transform, Predict label]
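The "anchor as pseudo-labels" step can be sketched as building a training set in which the anchor is the target. The sketch assumes each record is a set of observed tokens or codes, and that the anchor is removed from the features so the classifier has to predict it from everything else; both the representation and the function name are illustrative assumptions.

```python
from typing import FrozenSet, Iterable, List, Tuple


def make_pseudo_labeled(
    records: Iterable[FrozenSet[str]], anchor: str
) -> List[Tuple[FrozenSet[str], int]]:
    """Turn unlabeled records into (features, pseudo-label) pairs.

    The pseudo-label is 1 when the anchor is present in the record.
    The anchor itself is dropped from the features (an assumption here)
    so that it cannot trivially predict itself.
    """
    pairs = []
    for rec in records:
        label = int(anchor in rec)
        features = frozenset(rec) - {anchor}
        pairs.append((features, label))
    return pairs
```

Any standard classifier can then be trained on these pairs without a clinician labeling a single chart.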

Generalizability/Portability

[Figure: a new institution with different data types (LOINC, UMLS CUID, RXnorm, ICD9)]

Data may be very different:

•  Language

•  Representation

•  Population

As long as our anchors appear in the new data as well, we can learn a new model, specific to the new institution.

Only need to share anchor definitions. Each site trains models on its own data.

Theoretical basis for anchors

•  Unobserved variable: Y, Observation: A

•  A is an anchor for Y if conditioning on A=1 gives uniform samples from the set of positive cases.

•  Alternative formulation, two necessary conditions:

Positive condition: $P(Y = 1 \mid A = 1) = 1$
e.g. If patient is taking insulin, the patient is surely diabetic.

AND

Conditional independence: $A \perp X \mid Y$
e.g. If we know the patient had heart failure, knowing whether the diagnosis code appears does not inform us about the rest of the record.

X represents all other observations.

•  Theorem [Elkan & Noto 2008]: In the above setting, a function to predict A can be transformed to predict Y

•  Can also use more recent advances on learning with noisy labels (e.g., Natarajan et al., NIPS ‘13)
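A short derivation of why such a transformation exists, using only the positive condition and the conditional independence condition. The constant $c$ is notation introduced here for $P(A = 1 \mid Y = 1)$; this is a sketch of the standard argument rather than text from the slides.

```latex
\begin{align*}
P(A = 1 \mid X)
  &= P(A = 1, Y = 1 \mid X)
     && \text{positive condition: } A = 1 \Rightarrow Y = 1 \\
  &= P(A = 1 \mid Y = 1, X)\, P(Y = 1 \mid X) \\
  &= P(A = 1 \mid Y = 1)\, P(Y = 1 \mid X)
     && \text{conditional independence: } A \perp X \mid Y \\
  &= c \, P(Y = 1 \mid X),
\end{align*}
so that $P(Y = 1 \mid X) = P(A = 1 \mid X) / c$ with $c = P(A = 1 \mid Y = 1)$.
```

For any patient with the anchor, $P(Y = 1 \mid X) = 1$, so the anchor classifier's output on those patients equals $c$; averaging it over anchored patients therefore gives an estimate of $c$.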

Learning with anchors [Elkan & Noto 2008]

Input: anchor A, unlabeled patients
Output: prediction rule

1.  Learning: learn to predict A from the other variables. Learn a calibrated classifier (e.g. logistic regression) to predict:

$\Pr(A = 1 \mid \tilde{X})$

2.  Calibration: using a validation set, let P be the patients with A=1. Compute:

$C = \frac{1}{|P|} \sum_{k \in P} \Pr(A = 1 \mid \tilde{X}^{(k)})$

C is the average model prediction for patients with anchors.

3.  Transformation: for a previously unseen patient t, predict:

$\frac{1}{C} \Pr(A = 1 \mid \tilde{X}^{(t)})$ if $A^{(t)} = 0$

$1$ if $A^{(t)} = 1$

If no anchor is present, predict according to a scaled version of the anchor-prediction model.
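The calibration and transformation steps can be sketched in a few lines, given the outputs of any calibrated anchor classifier. The function names are hypothetical, and the clipping at 1.0 is an added safeguard that is not on the slide.

```python
from typing import Sequence


def calibration_constant(anchor_probs: Sequence[float]) -> float:
    """Step 2: average predicted Pr(A=1 | X) over validation patients with A=1."""
    if not anchor_probs:
        raise ValueError("need at least one validation patient with the anchor")
    return sum(anchor_probs) / len(anchor_probs)


def predict_label(p_anchor: float, has_anchor: bool, c: float) -> float:
    """Step 3: turn an anchor prediction into a label prediction.

    Clipping at 1.0 is an assumption added here, since p_anchor / c can
    exceed 1 when the classifier is imperfectly calibrated.
    """
    if has_anchor:
        return 1.0
    return min(1.0, p_anchor / c)


if __name__ == "__main__":
    # Hypothetical classifier outputs for three validation patients with A=1.
    c = calibration_constant([0.5, 0.25, 0.75])
    print(c)                               # 0.5
    print(predict_label(0.125, False, c))  # 0.25
    print(predict_label(0.75, False, c))   # 1.0 (clipped)
    print(predict_label(0.01, True, c))    # 1.0 (anchor present)
```

Keeping the classifier outside these functions means the same two steps apply whether step 1 uses logistic regression or another calibrated model.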

…  

…  

User interface to specify anchors

[Screenshot labels: Specified anchors, Automated suggestions, Detailed patient display, Ranked patient list, Patient filters]

Rapid iteration: ~30 min to add a new clinical state variable

Software freely available: clinicalml.org

Learned model: Cellulitis (using 200K patients’ data, 2008-2013)

Anchors
ICD9 680-686: Infections of skin and subcutaneous tissue

Highly weighted features (covariates)
Pyxis: cephalexin, vancomycin, clindamycin, cephazolin, amoxicillin, sulfameth/trimeth
Unstructured text: cellulitis, cellulitic, cellulits, paronychia, pilonidal, bite, cyst, boil, abcess, abscess, abcesses, red, redness, reddness, erythema, unasyn, vanco, finger, thumb, rle, lle, gluteal

Learned model: Cardiac Etiology

ICD9 codes 410.* acute MI

411.* other acute … 413.* angina pectoris 785.51 card. shock

Pyxis coron. vasodilators

loop diuretic

Anchors

cmed

Ages age=80-90 age=70-80 age=90+

nstemi stemi ntg lasix nitro

lasix furosemide

Medications aspirin

clopidogrel Heparin Sodium Metoprolol Tartrate

Morphine Sulfate Integrilin Labetalol

Pyxis

Unstructured text

cp chest pain edema cmed

chf exacerbation sob

pedal edema

Sex=M

Highly weighted features (covariates)

(using 200K patients’ data, 2008-2013)

Learned model: Nursing Home

nursing facility

nursing home nsg facility

nsg home nsg. home

from staff at

resident sent

reported

Ages age=90+ age=80-90 age=70-80

baseline changes nonverbal

ams unwitnessed_fall

confusion

senna colace

trazodone

dnr full code g tube foley nh

Medications

vancomycin levofloxacin

Pyxis

Unstructured text

Anchors

mirtazapine maalox tums

Highly weighted features (covariates)

(using 200K patients’ data, 2008-2013)

Evaluation: ED red flags

We gathered gold standard labels for these 9 variables by adding questions to EMR at time of ED disposition:

•  Active malignancy

•  Fall

•  Cardiac Etiology

•  Infection

•  From Nursing Home

•  Anticoagulated

•  Immunosuppressed

•  Septic Shock

•  Pneumonia

Comparison to Existing Approaches

•  (Rules) Predict just according to the anchors.

–  1 if anchor is present, 0 otherwise

•  (ML) Machine learning (logistic regression)

–  Using up to 3K labels

–  Improves with more labels, but labels are expensive!

[Figure: Accuracy of predictions]

Scaling this up

•  Currently making predictions for 40 clinical variables within the BIDMC patient display

–  e.g. allergic reaction, motor vehicle accident, hiv+

•  Only turned on for a small number of clinicians

Suggested tags: MD can accept/reject

Accepting a tag triggers events (pathway enrollment, specialized order sets, etc)

Our next steps

•  Shared library of anchored phenotypes

•  Real-time estimation of clinical states and actual use for decision support within ED

•  Test portability of anchors to other institutions

More info: clinicalml.org