Southeastern Diabetes Initiative and Electronic Health Records April 28, 2015 Michael Pencina, PhD Director of Biostatistics, Duke Clinical Research Institute

Page 1:

Southeastern Diabetes Initiative and Electronic Health Records

April 28, 2015

Michael Pencina, PhD
Director of Biostatistics, Duke Clinical Research Institute

Page 2:

Center for Predictive Medicine (CPM)

Mission Statement:

To advance personalized medicine through development and application of novel approaches to quantify and communicate risk.

Composition:

Faculty and staff statisticians, clinicians, and informaticists from DCRI; scientists from Electrical and Computer Engineering and the Information Initiative at Duke

Page 3:

Southeastern Diabetes Initiative

Supporting a platform for real-time monitoring and evaluation of population health through spatially enabled data architecture and analytics

Page 4:

CENTRAL OBJECTIVES

• Improve healthcare delivery at the individual- and population-level
• Improve health outcomes and quality of life
• Deploy multiple levels of intervention, stratified by risk factors of individual patients
• Reduce overall healthcare costs for populations by reducing hospital and ED admissions and major procedures

THEMES

• Use of electronic health record and publicly available data to monitor the health of a population
• Use of analytics to target interventions where they are most needed
• Longitudinal monitoring through data sharing with multiple health provider partners

[Diagram: Intervention Spectrum, from lower intensity to higher intensity. Diagram labels: Missing data; monthly data extract + risk algorithms]

Page 5:

Data Driven Approach to Implementation

Columns: LOW RISK, MODERATE RISK, HIGH RISK

Individual
• Access to community resources
• Educational materials
• Access to evidence-based primary and specialty care (provider education)
• Telephone-based interventions
• Mobile technology interventions
• Diabetes self-management classes
• Clinical practice QI
• In-home team-based care

Neighborhood
• Grocery store tours
• Faith-based access to education and screening
• Faith-based and neighborhood interventions
• Neighborhood canvassing

County
• Public awareness (communication, social media)
• Integration of resources
• Evaluate access to healthcare in community

Region
• Dissemination of best practices

Training, Research, Education

Page 6:

Data Driven Approach to Risk Prediction

Timeline: 2007, 2008, 2009, 2010, 2011, 2012

Use data from 2010 to predict serious outcomes in 2011

Validate with data from other years

Defining “serious outcome”
• inpatient with dx of MI, cardiac arrest, heart failure, ventricular fibrillation, ischemic cardiomyopathy, CAD, revascularization procedure, stroke, vascular disease, PVD, DM renal complications, kidney disease, dialysis, amputation, foot infection/ulcer
• death

• Clinical data
• Laboratory values
• Demographics
• Utilization
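The baseline-year / outcome-year design on this slide can be sketched in code. This is a minimal illustration only: the `Encounter` record, the `build_labels` helper, and the short diagnosis labels are hypothetical stand-ins, not the SEDI data model or real billing codes.

```python
from dataclasses import dataclass, field

# Illustrative subset of the slide's "serious outcome" diagnoses (labels, not real codes).
SERIOUS_DX = {"MI", "cardiac_arrest", "heart_failure", "stroke", "amputation"}

@dataclass
class Encounter:
    patient_id: str
    year: int
    inpatient: bool
    diagnoses: set = field(default_factory=set)

def serious_outcome(encounters, deaths, patient_id, outcome_year):
    """True if the patient died in outcome_year or had an inpatient stay
    that year carrying at least one serious diagnosis."""
    if deaths.get(patient_id) == outcome_year:
        return True
    return any(
        e.patient_id == patient_id
        and e.year == outcome_year
        and e.inpatient
        and bool(e.diagnoses & SERIOUS_DX)
        for e in encounters
    )

def build_labels(encounters, deaths, baseline_year):
    """Label every patient seen in baseline_year with their outcome in the next year."""
    cohort = sorted({e.patient_id for e in encounters if e.year == baseline_year})
    return {p: serious_outcome(encounters, deaths, p, baseline_year + 1) for p in cohort}

encounters = [
    Encounter("A", 2010, False, {"diabetes"}),
    Encounter("A", 2011, True, {"MI"}),
    Encounter("B", 2010, False, {"diabetes"}),
    Encounter("B", 2011, False, {"stroke"}),  # outpatient visit, so not counted
    Encounter("C", 2010, True, {"diabetes"}),
]
deaths = {"C": 2011}  # patient_id -> year of death

labels = build_labels(encounters, deaths, baseline_year=2010)
print(labels)
```

Validating with other years would amount to calling `build_labels` with a different `baseline_year` (for example 2008, with outcomes read from 2009).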

Page 7:

SEDI Intervention Flow Chart

EHR data
Provider Referrals
Risk Algorithm
Risk levels: High risk, Moderate risk, Low risk
Thresholds shown: ≥90%, 65-89%, ≤65%

Telephone Module Intervention
• 12 monthly calls by interventionist
• Mailers reinforcing/supplementing information covered during calls
Electronic data capture:
• Medication adherence
• Side effects
• Routine exam/test compliance
• Weight/Tobacco/Exercise goals

High-risk Intervention
• 5 sets of home visits by team, occurring every 6 months
• Assistance also provided between visits
• Teams include nurses, pharmacists, social workers, dietitians, patient navigators, community health workers
Electronic data capture:
• Demographics
• Vital signs/physical exam
• Laboratory results
• ED/Inpatient visits
• Medication adherence
• Patient activation
• Health literacy

Quitman County, MS; Durham County, NC; Cabarrus County, NC; Mingo County, WV

Community-based events, diabetes self-management classes, quality improvement with current sources of care, media activities, cooking demonstrations, etc.
Data collected: Participation and demographics
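One way to read the flow chart's thresholds is as percentile cut-offs on the risk algorithm's output. The sketch below assumes that reading: high risk at or above the 90th percentile, moderate from the 65th up to the 90th, low below the 65th. The slide does not say how the boundary at 65% is handled, so that choice, the function names, and the scores are all assumptions.

```python
def percentile_ranks(scores):
    """Percent of patients with a strictly lower score, for each patient."""
    n = len(scores)
    return [100.0 * sum(other < s for other in scores) / n for s in scores]

def assign_tier(percentile, high_cut=90.0, moderate_cut=65.0):
    """Map a risk percentile to an intervention tier (assumed cut-offs)."""
    if percentile >= high_cut:
        return "high"
    if percentile >= moderate_cut:
        return "moderate"
    return "low"

# Hypothetical predicted risks for ten patients.
scores = [0.02, 0.05, 0.08, 0.10, 0.12, 0.15, 0.20, 0.30, 0.45, 0.70]
tiers = [assign_tier(p) for p in percentile_ranks(scores)]
print(list(zip(scores, tiers)))
```

With these ten scores, the seven lowest fall in the low tier, the next two in moderate, and the highest in high.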

Page 8:

Ensuring our computable phenotypes are accurate and valid

Page 9:

Challenges to valid EHR-based computable phenotypes

Our goal is to implement algorithms within the EHR that accurately identify patients with a given phenotype and those without. But –

• Limitations of EHR data

• Variation in published authoritative source computable phenotype definitions/algorithms

Page 10:

Limitations of EHR Data Can Result in Inaccurate/biased Computable Phenotypes

• Frequent missingness (often non-random)

-- Data encounter-based: only recorded during healthcare episodes (bias: healthiest excluded)

-- Patients move around and/or go elsewhere for care (lost to follow-up)

• Inaccurate or uninterpretable

-- Errors are common (most data entered by busy providers during visit, or from recall)

-- Based on coding that is influenced by billing – systematic biases

-- Uninterpretable data (e.g., Units of measurement missing, qualitative assessments)

• Complexity and inconsistency

-- Clinical definitions, coding rules, data collection systems vary over time

-- Data collection can vary by providers at different locations

-- Much information is non-coded and stored in narrative notes

Page 11:

Variation in Published (Authoritative Source) Computable Phenotype Definitions

Page 12:

The CPM-SEDI Phenotype Development Process

Page 13:

The Data

Page 14:

Source Data for the Platform

Electronic Health Record Data

Core Clinical Domains

1. Patient Demographics

2. Encounters

3. Diagnoses

4. Procedures

5. Labs

6. Vital Signs

7. Medications

8. Social History

Research Data stored in electronic data capture systems

Publicly available census data – community resources, social data, resource and environmental data

Duke’s Decision Support Repository

Medicare and Medicaid Claims Data

NATIONAL CENTER FOR GEOSPATIAL MEDICINE

datamart

Page 15:
Page 16:

A Continuum of Informatics Services to Support the Analysis Pipeline

Page 17:

Phenotyping

Page 18:
Page 19:

Outcome Definition

Known Diabetic Patients >35 in 2010 = 14,330

With Any Encounter in 2011 = 11,548

With An Inpatient Encounter in 2011 = 2,488

With A Serious Outcome in 2011 = 1,742

80% Modeling Set

12% of 2010 patients; 15% of modeling set

Serious Outcome = EITHER: at least 1 inpatient stay in 2011 with diagnosis of at least one of:
• MI, cardiac arrest, ventricular fibrillation, ischemic cardiomyopathy, CAD, revascularization procedure
• stroke
• vascular disease, PVD
• diabetic renal complications, kidney disease, dialysis
• amputation, foot infection/ulcer

OR: death
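The percentages on this slide can be checked against the cohort counts. The sketch below assumes the "15% of modeling set" figure uses the 11,548 patients with any 2011 encounter as its denominator; that is an inference from the arithmetic, not something the slide states. Variable names are illustrative.

```python
# Cohort counts as reported on the slide.
known_diabetic_2010 = 14_330
any_encounter_2011 = 11_548
inpatient_2011 = 2_488
serious_outcome_2011 = 1_742

def pct(part, whole):
    """Percentage of whole represented by part, to one decimal place."""
    return round(100.0 * part / whole, 1)

share_with_encounter = pct(any_encounter_2011, known_diabetic_2010)   # compare with "80%"
share_of_2010 = pct(serious_outcome_2011, known_diabetic_2010)        # compare with "12%"
share_of_encounter = pct(serious_outcome_2011, any_encounter_2011)    # compare with "15%"

print(share_with_encounter, share_of_2010, share_of_encounter)
```

Under that assumption the three values come out at roughly 80.6, 12.2, and 15.1, consistent with the rounded figures shown.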

Page 20:

Preliminary Logistic Regression Model: Factors

• Potential factors were identified by their predictive power in the absence of other factors
• Factors included the following categories:
  • Clinical diagnoses
  • Demographics: age, sex
  • Insurance status
  • Hospital / clinic utilization
  • Diabetes medications
  • Labs:
    • Max random glucose
    • Min random glucose
    • Max HbA1c
    • Max systolic b/p
    • Max HDL
    • Min LDL
• Lab values had a significant number of missing values. In this preliminary analysis, we simply added an N/A indicator variable to the model. E.g.
  • Glu.max was the maximum random glucose if present, 0 if not
  • Glu.max.na was 0 if a random glucose value was present, 1 if not
• We also allowed interaction of the lab values and N/A indicator with other (fully present) variables. (Interaction terms implicitly create a non-constant imputation formula.)
• 52 distinct betas were estimated, including those for missing values
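The missing-indicator encoding described on this slide can be sketched as follows. The helper names `encode_lab` and `design_row`, the generic `age_term` argument, and the example values are hypothetical; only the Glu.max / Glu.max.na rule itself comes from the slide.

```python
def encode_lab(value):
    """Return (lab, lab_na): the value and 0 if observed, else 0 and 1."""
    if value is None:
        return 0.0, 1
    return float(value), 0

def design_row(glu_max, age_term):
    """Build the glucose-related model columns for one patient,
    including interactions with a fully observed age term."""
    glu, glu_na = encode_lab(glu_max)
    return {
        "glu.max": glu,
        "glu.max.na": glu_na,
        "glu.max * age": glu * age_term,
        "glu.max.na * age": glu_na * age_term,
    }

observed = design_row(180, age_term=60.0)   # glucose measured
missing = design_row(None, age_term=60.0)   # glucose never measured

print(observed)
print(missing)
```

For a patient with no glucose value, the only non-zero glucose columns are the indicator and its interaction, so the contribution to the linear predictor is beta_na + beta_na_age * age_term. Because that quantity changes with age, the model behaves as if it imputed a different value for each patient, which is one way to read the slide's "non-constant imputation formula" remark.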

Page 21:

Logistic Regression Model: Betas Grouped By Factor Type

Factor                          Estimate   z value   Pr(>|z|)   Signif.
kidney_disease                  1.0E+00    9.0       2.0E-16    ***
heart_failure                   6.7E-01    7.6       2.3E-14    ***
(Intercept)                     -1.3E+01   -7.3      4.1E-13    ***
cad                             6.8E-01    6.1       9.0E-10    ***
age1.5                          1.6E-02    5.4       5.5E-08    ***
metformin                       -4.7E-01   -5.2      2.1E-07    ***
sex.m                           1.4E+00    4.2       2.3E-05    ***
stroke                          5.9E-01    3.9       7.9E-05    ***
obesity                         -3.4E-01   -3.8      0.0001     ***
copd                            3.8E-01    3.8       0.0002     ***
alcohol                         -5.8E-01   -2.6      0.0083     **
depression                      1.6E-01    1.6       0.1012
dialysis                        3.3E-01    1.8       0.0742     .
foot                            4.2E-01    2.4       0.0165     *
pvd | vascular_disease          3.0E-01    3.2       0.0016     **
revasc_proc                     1.6E-01    1.5       0.1422
smoker                          2.3E-01    2.2       0.0247     *
substance                       2.5E-01    1.3       0.1833

medicaid                        5.0E-01    3.4       0.0008     ***
medicare & age < 63             5.7E-01    4.6       4.4E-06    ***
medicare                        5.9E-02    0.6       0.5335

cad:stroke                      -4.7E-01   -2.4      0.0159     *
cad:metformin                   5.0E-01    3.3       0.0011     **
cad * kidney_disease            -5.8E-01   -3.7      0.0002     ***
kidney_disease * stroke         -5.8E-01   -2.9      0.0037     **
kidney_disease * medicaid       8.0E-01    2.7       0.0079     **

log(1 + nEI)                    1.8E-01    1.6       0.1061
log(1 + nEonly)                 1.4E-01    2.3       0.0225     *

insulin                         1.8E-01    2.3       0.0231     *
pioglitazone | rosiglitazone    -2.4E-01   -1.8      0.0676     .

glu.max                         1.1E+00    4.2       2.2E-05    ***
glu.max * age1.5                -1.5E-03   -3.3      0.0008     ***
glu.max.na                      4.6E+00    3.1       0.0017     **
glu.max.na * age1.5             -6.4E-03   -2.6      0.0099     **
glu.max.na * hypertension       -3.9E-01   -1.7      0.0828     .

glu.min                         -8.2E-03   -3.9      9.1E-05    ***
glu.min * nI                    2.4E-03    4.4       1.3E-05    ***
glu.min^2                       2.2E-05    3.5       0.0004     ***
glu.min.na * kidney_disease     1.4E+00    2.7       0.0076     **

a1cmax                          5.9E-02    2.0       0.0445     *
a1cmax * sex.m                  -1.4E-01   -3.6      0.0004     ***
a1cmax.na                       6.2E-01    2.5       0.0127     *
a1cmax.na * sex.m               -1.0E+00   -2.9      0.0037     **

sys.max                         2.8E-02    3.8       0.0001     ***
sys.max * age1.5                -3.3E-05   -2.6      0.0086     **
sys.max.na                      4.4E+00    3.8       0.0001     ***
sys.max.na * age1.5             -5.0E-03   -2.5      0.0139     *

hdl.max                         -4.0E-01   -3.1      0.0019     **
hdl.max * sex.m                 -9.4E-02   -2.2      0.0304     *
hdl.max.na                      -2.0E+00   -3.3      0.0010     ***

(ldl.min - 100)^2               4.5E-05    2.5       0.0143     *
ldl.min.na                      1.7E-01    0.4       0.6949
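Logistic regression estimates are on the log-odds scale, so exponentiating a beta gives an odds ratio. The sketch below does this for four factors from the table that appear without interaction terms; for factors that do have interactions (for example cad or kidney_disease), the main-effect odds ratio alone would not describe the full association. The `logistic` helper is included only to show how a linear predictor maps to a probability; it does not reproduce the full 52-term model.

```python
import math

# Estimates copied from the table for factors listed without interactions.
estimates = {
    "heart_failure": 0.67,
    "copd": 0.38,
    "smoker": 0.23,
    "obesity": -0.34,
}

# Odds ratio = exp(beta): multiplicative change in the odds of a serious outcome.
odds_ratios = {name: round(math.exp(beta), 2) for name, beta in estimates.items()}

def logistic(linear_predictor):
    """Convert a log-odds value into a probability."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))

print(odds_ratios)
print(logistic(0.0))  # log-odds of 0 corresponds to a probability of 0.5
```

On this reading, heart failure roughly doubles the odds of a serious outcome (odds ratio near 1.95), holding the other factors fixed.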

Page 22:

Logistic Regression Model: ROC and Calibration

[Figure: Calibration Plot. x-axis: Predicted Fraction (quantile=20); y-axis: Observed Fraction]

[Figure: ROC. x-axis: 1-Specificity; y-axis: Sensitivity. AUC = 0.836]
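For readers unfamiliar with the two plots, here is a minimal sketch of how an AUC and quantile-based calibration points can be computed. The toy labels and scores are made up, the function names are illustrative, and the slide's plot uses 20 quantile groups where this example uses 2 to keep the data small.

```python
def auc(y_true, y_score):
    """Probability that a random case scores above a random non-case
    (ties count as half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def calibration_bins(y_true, y_score, n_bins):
    """Sort by predicted risk, split into n_bins groups of near-equal size,
    and return (mean predicted, observed fraction) for each group."""
    pairs = sorted(zip(y_score, y_true))
    size = len(pairs) // n_bins
    points = []
    for i in range(n_bins):
        # The last group absorbs any remainder.
        chunk = pairs[i * size:(i + 1) * size] if i < n_bins - 1 else pairs[i * size:]
        mean_pred = sum(s for s, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        points.append((mean_pred, observed))
    return points

# Hypothetical outcomes (1 = serious outcome) and predicted risks.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_score = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

model_auc = auc(y_true, y_score)
bins = calibration_bins(y_true, y_score, n_bins=2)
print(model_auc, bins)
```

A well-calibrated model gives points close to the diagonal, where mean predicted risk equals the observed fraction in each group.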

Next Steps

• Incorporate additional labs, particularly creatinine and triglycerides
  • But also look at hematocrit, hemoglobin and other standard lab values

• Add non-diabetes meds

Page 23:

Summary: Opportunities for Collaboration

• Data still curated and updated

• New phenotypes developed/validated

• Unlimited possibilities to expand

• Strong analytic team eager to take on cutting-edge questions and hypotheses