Observational Health Data Sciences and Informatics (OHDSI)

George Hripcsak, MD, MS, Columbia University Medical Center, NewYork-Presbyterian Hospital. Seattle Symposium on Health Care Data Analytics

TRANSCRIPT

Page 1

Observational Health Data Sciences and Informatics (OHDSI)

George Hripcsak, MD, MS

Columbia University Medical Center

NewYork-Presbyterian Hospital

Seattle Symposium on Health Care Data Analytics

Page 2

Observational Health Data Sciences and Informatics (OHDSI, as “Odyssey”)

A multi-stakeholder, interdisciplinary, international collaborative with a coordinating center at Columbia University

Mission: To improve health, by empowering a community to collaboratively generate the evidence that promotes better health decisions and better care

Aiming for 1,000,000,000 patient data network

http://ohdsi.org

Page 3

OHDSI’s global research community

•  >140 collaborators from 20 different countries
•  Experts in informatics, statistics, epidemiology, clinical sciences
•  Active participation from academia, government, industry, providers
•  Currently 600 million patient records in 52 databases

http://ohdsi.org/who-we-are/collaborators/

Page 4

Why large-scale analysis is needed in healthcare

[Figure labels: All drugs; All health outcomes of interest]

Page 5

Patient-level predictions for personalized evidence require big data

2 million patients seem excessive or unnecessary?

•  Imagine a provider wants to compare her patient with other patients with the same gender (50%), in the same 10-year age group (10%), and with the same comorbidity of Type 2 diabetes (5%)
•  Imagine the patient is concerned about the risk of ketoacidosis (0.5%) associated with two alternative treatments they are considering
•  With 2 million patients, you’d only expect to observe 25 similar patients with the event, and would only be powered to observe a relative risk > 2.0

Aggregated data across a health system of 1,000 providers may contain 2,000,000 patients
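The count of 25 follows from multiplying the population by each percentage in turn. The short sketch below reproduces that arithmetic; treating the filters as independent, so that the fractions can simply be multiplied, is an assumption implied by the slide rather than stated on it, and the variable names are illustrative.

```python
# Expected number of comparable patients, following the slide's arithmetic.
# Assumption: the filters are independent, so their fractions multiply.
population = 2_000_000  # patients in the aggregated health system

# Fraction of patients retained by each successive filter (from the slide).
filters = {
    "same gender": 0.50,
    "same 10-year age group": 0.10,
    "type 2 diabetes comorbidity": 0.05,
}
event_rate = 0.005  # risk of ketoacidosis (0.5%)

similar_patients = float(population)
for fraction in filters.values():
    similar_patients *= fraction

expected_events = similar_patients * event_rate

print(f"similar patients: {similar_patients:,.0f}")
print(f"similar patients with the event: {expected_events:.0f}")
```

Roughly 5,000 of the 2 million patients resemble the index patient, and only about 25 of those are expected to have the event, which is why the slide argues that 2 million patients is not excessive.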

Page 6

Evidence OHDSI seeks to generate from observational data

•  Clinical characterization
   –  Natural history: Who has diabetes, and who takes metformin?
   –  Quality improvement: What proportion of patients with diabetes experience complications?
•  Population-level estimation
   –  Safety surveillance: Does metformin cause lactic acidosis?
   –  Comparative effectiveness: Does metformin cause lactic acidosis more than glyburide?
•  Patient-level prediction
   –  Precision medicine: Given everything you know about me, if I take metformin, what is the chance I will get lactic acidosis?
   –  Disease interception: Given everything you know about me, what is the chance I will develop diabetes?

Page 7

OHDSI’s approach to open science

[Diagram labels: Open source software; Open science; Enable users to do something; Generate evidence; Data + Analytics + Domain expertise]

•  Open science is about sharing the journey to evidence generation
•  Open-source software can be part of the journey, but it’s not a final destination
•  Open processes can enhance the journey through improved reproducibility of research and expanded adoption of scientific best practices

Page 8

Standardizing workflows to enable transparent, reproducible research

[Diagram labels: Open science; Generate evidence]

Workflow steps:
•  Database summary
•  Cohort definition
•  Cohort summary
•  Compare cohorts
•  Exposure-outcome summary
•  Effect estimation & calibration
•  Compare databases

Population-level estimation for comparative effectiveness research:
Is <intervention X> better than <intervention Y> in reducing the risk of <condition Z>?

Defined inputs:
•  Target exposure
•  Comparator group
•  Outcome
•  Time-at-risk
•  Model specification

Consistent outputs:
•  analysis specifications for transparency and reproducibility (protocol + source code)
•  only aggregate summary statistics (no patient-level data)
•  model diagnostics to evaluate accuracy
•  results as evidence to be disseminated
   –  static for reporting (e.g. via publication)
   –  interactive for exploration (e.g. via app)
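The five defined inputs can be thought of as one small, shareable specification object. The sketch below is only an illustration of that idea: the class name, field names, and example values are hypothetical and do not come from any OHDSI package.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EstimationSpec:
    """Hypothetical container for the five defined inputs on the slide."""
    target_exposure: str
    comparator_group: str
    outcome: str
    time_at_risk_days: int
    model: str


def research_question(spec: EstimationSpec) -> str:
    """Render the slide's question template from a specification."""
    return (
        f"Is {spec.target_exposure} better than {spec.comparator_group} "
        f"in reducing the risk of {spec.outcome}?"
    )


# Example values are placeholders, not a real study design.
spec = EstimationSpec(
    target_exposure="intervention X",
    comparator_group="intervention Y",
    outcome="condition Z",
    time_at_risk_days=365,
    model="Cox proportional hazards",
)

print(research_question(spec))
# Serializing the specification lets it travel with the protocol and code.
print(json.dumps(asdict(spec), indent=2, sort_keys=True))
```

Keeping the specification immutable and serializable is one way to support the first consistent output, analysis specifications that others can inspect and rerun.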

Page 9

OHDSI Distinguishing Features

•  International effort (size & coverage)
   –  43 source terminologies from around the world
•  Open science (depth)
   –  Infrastructure serves the science
   –  Stack: Terminology, CDM, ETL, QA, Visualization, Novel analytic methods, Clinical research
•  Full information model

Page 10

How OHDSI Works

OHDSI Data Partners
•  Source data warehouse, with identifiable patient-level data
•  ETL
•  Standardized, de-identified patient-level database (OMOP CDM v5)
•  Standardized large-scale analytics
•  Analysis results

OHDSI Coordinating Center
•  Analytics development and testing
•  Research and education
•  Data network support
•  Summary statistics results repository (OHDSI.org)

[Diagram labels: Consistency; Temporality; Strength; Plausibility; Experiment; Coherence; Biological gradient; Specificity; Analogy; Comparative effectiveness; Predictive modeling]

Page 11

Deep information model: OMOP CDM v5.0.1

Standardized clinical data: Person, Observation_period, Specimen, Death, Visit_occurrence, Procedure_occurrence, Drug_exposure, Device_exposure, Condition_occurrence, Measurement, Note, Observation, Fact_relationship

Standardized health system data: Location, Care_site, Provider

Standardized health economics: Payer_plan_period, Cost

Standardized derived elements: Cohort, Cohort_attribute, Condition_era, Drug_era, Dose_era

Standardized meta-data: CDM_source

Standardized vocabularies: Concept, Vocabulary, Domain, Concept_class, Concept_relationship, Relationship, Concept_synonym, Concept_ancestor, Source_to_concept_map, Drug_strength, Cohort_definition, Attribute_definition
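To show how a common data model makes one query portable across sites, the sketch below builds a tiny in-memory stand-in for two CDM tables and asks who has a given condition and when it was first recorded. It is a minimal illustration, not an OHDSI tool: only a few columns of Person and Condition_occurrence are included, the rows are invented, and the concept IDs are illustrative values that should be checked against the vocabulary tables.

```python
import sqlite3

# Illustrative concept ID for the condition of interest; verify against the
# Concept table before using it in a real query.
T2DM_CONCEPT_ID = 201826

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        gender_concept_id INTEGER,
        year_of_birth INTEGER
    );
    CREATE TABLE condition_occurrence (
        condition_occurrence_id INTEGER PRIMARY KEY,
        person_id INTEGER REFERENCES person(person_id),
        condition_concept_id INTEGER,
        condition_start_date TEXT
    );
    """
)

# Invented example rows (gender concept IDs are illustrative as well).
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, 8507, 1950), (2, 8532, 1962), (3, 8532, 1975)],
)
conn.executemany(
    "INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)",
    [
        (10, 1, T2DM_CONCEPT_ID, "2010-03-01"),
        (11, 1, T2DM_CONCEPT_ID, "2012-06-15"),
        (12, 2, 999, "2011-09-09"),  # some other condition
        (13, 3, T2DM_CONCEPT_ID, "2015-01-20"),
    ],
)

# The same SQL would run unchanged at any site that uses these table names.
rows = conn.execute(
    """
    SELECT p.person_id, MIN(co.condition_start_date) AS first_diagnosis
    FROM person AS p
    JOIN condition_occurrence AS co ON co.person_id = p.person_id
    WHERE co.condition_concept_id = ?
    GROUP BY p.person_id
    ORDER BY p.person_id
    """,
    (T2DM_CONCEPT_ID,),
).fetchall()

print(rows)
```

Because every data partner stores conditions in the same table with the same standard concept, the query text does not need to change from site to site; only the database connection does.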

Page 12

Extensive vocabularies

Page 13

Preparing your data for analysis

Patient-level data in source system/schema, then ETL design, ETL implement, and ETL test, then patient-level data in OMOP CDM

OHDSI tools built to help:
•  WhiteRabbit: profile your source data
•  RabbitInAHat: map your source structure to CDM tables and fields
•  ATHENA: standardized vocabularies for all CDM domains
•  Usagi: map your source codes to CDM vocabulary
•  CDM: DDL, index, constraints for Oracle, SQL Server, PostgreSQL; Vocabulary tables with loading scripts
•  ACHILLES: profile your CDM data; review data quality assessment; explore population-level summaries

http://github.com/OHDSI

OHDSI Forums: Public discussions for OMOP CDM implementers/developers
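The code-mapping step of an ETL can be sketched as a lookup from local source codes to standard concept IDs. The example below is a simplified illustration and does not reproduce how Usagi or any OHDSI tool works: the local codes, the concept IDs, and the input row layout are all hypothetical, and only the output field names follow the Condition_occurrence table.

```python
# Hypothetical mapping from (source vocabulary, source code) to a standard
# concept ID. The IDs are placeholders, not real vocabulary entries.
SOURCE_TO_CONCEPT = {
    ("LOCAL_DX", "DM2"): 1001,
    ("LOCAL_DX", "HTN"): 1002,
}


def to_condition_occurrence(source_rows, mapping):
    """Split source rows into CDM-shaped records and rows needing review."""
    mapped, unmapped = [], []
    for row in source_rows:
        concept_id = mapping.get((row["vocabulary"], row["code"]))
        if concept_id is None:
            # Keep unmapped rows so a person can review them later.
            unmapped.append(row)
            continue
        mapped.append(
            {
                "person_id": row["patient"],
                "condition_concept_id": concept_id,
                "condition_start_date": row["date"],
                "condition_source_value": row["code"],
            }
        )
    return mapped, unmapped


# Invented source extract.
source_rows = [
    {"patient": 1, "vocabulary": "LOCAL_DX", "code": "DM2", "date": "2010-03-01"},
    {"patient": 2, "vocabulary": "LOCAL_DX", "code": "HTN", "date": "2011-09-09"},
    {"patient": 3, "vocabulary": "LOCAL_DX", "code": "XYZ", "date": "2012-01-15"},
]

mapped, unmapped = to_condition_occurrence(source_rows, SOURCE_TO_CONCEPT)
print(f"{len(mapped)} mapped, {len(unmapped)} awaiting review")
```

Retaining the original code in `condition_source_value` keeps the mapping auditable, and the unmapped list gives the ETL test step something concrete to check.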

Page 14

ACHILLES Heel Data Validation

Page 15

ATLAS to build, visualize, and analyze cohorts

Page 16

Characterize the cohorts of interest

Page 17

LAERTES: Knowledge base of what we know: literature, labeling, spontaneous reporting

Page 18

OHDSI in Action

•  Generate evidence
   –  Randomized trial is the gold standard
   –  Observational research is supporting
•  Can it become a partnership?

Page 19

Characterization

•  Today we carry out RCTs without clear knowledge of actual practice
•  There will be no RCTs without an observational precursor
   –  It will be required to characterize a population using large-scale observational data before designing an RCT
   –  Disease burden
   –  Actual treatment practice
   –  Time on therapy
   –  Course and complication rate
   –  Done now somewhat through literature and pilot studies

Page 20

Treatment Pathways

[Diagram labels]
Global stakeholders: Public, Industry, Regulator, Academics
Evidence: RCT, Obs
Conduits: Literature, Lay press, Social media, Guidelines, Formulary, Labels, Advertising
Local stakeholders: Clinician, Patient, Family, Consultant
Inputs: Indication, Feasibility, Cost, Preference

Page 21

Network process

1.  Join the collaborative
2.  Propose a study to the open collaborative
3.  Write protocol
    –  http://www.ohdsi.org/web/wiki/doku.php?id=research:studies
4.  Code it, run it locally, debug it (minimize others’ work)
5.  Publish it: https://github.com/ohdsi
6.  Each node voluntarily executes on their CDM
7.  Centrally share results
8.  Collaboratively explore results and jointly publish findings
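Steps 6 and 7 can be pictured as the same function running at every node, with only aggregate counts leaving each site. The sketch below is a toy illustration under that assumption: the function name, record layout, and data are invented, and summing counts across sites is a deliberate simplification of how results from several databases might be combined.

```python
def run_locally(patients):
    """Runs inside a data partner; returns aggregate counts only."""
    exposed = [p for p in patients if p["exposed"]]
    return {
        "n_exposed": len(exposed),
        "n_events": sum(p["event"] for p in exposed),
    }


# Invented patient-level data that never leaves each site.
sites = {
    "site_a": [
        {"exposed": True, "event": 1},
        {"exposed": True, "event": 0},
        {"exposed": True, "event": 0},
        {"exposed": False, "event": 0},
    ],
    "site_b": [
        {"exposed": True, "event": 1},
        {"exposed": True, "event": 0},
        {"exposed": False, "event": 1},
    ],
}

# Step 6: each node executes the shared code on its own data.
site_results = {name: run_locally(rows) for name, rows in sites.items()}

# Step 7: only the summaries are shared and combined centrally.
pooled = {
    key: sum(result[key] for result in site_results.values())
    for key in ("n_exposed", "n_events")
}
pooled_risk = pooled["n_events"] / pooled["n_exposed"]

print(site_results)
print(pooled, round(pooled_risk, 2))
```

The central step sees four numbers rather than seven patient records, which is the property that lets sites with different privacy rules take part.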

Page 22

OHDSI in action: Chronic disease treatment pathways

•  Conceived at AMIA: 15 Nov 2014
•  Protocol written, code written and tested at 2 sites: 30 Nov 2014
•  Analysis submitted to OHDSI network: 2 Dec 2014
•  Results submitted for 7 databases: 5 Dec 2014

Page 23

OHDSI participating data partners

Abbreviation | Name | Description | Population, millions
AUSOM | Ajou University School of Medicine | South Korea; inpatient hospital EHR | 2
CCAE | MarketScan Commercial Claims and Encounters | US; private-payer claims | 119
CPRD | UK Clinical Practice Research Datalink | UK; EHR from general practice | 11
CUMC | Columbia University Medical Center | US; inpatient EHR | 4
GE | GE Centricity | US; outpatient EHR | 33
INPC | Regenstrief Institute, Indiana Network for Patient Care | US; integrated health exchange | 15
JMDC | Japan Medical Data Center | Japan; private-payer claims | 3
MDCD | MarketScan Medicaid Multi-State | US; public-payer claims | 17
MDCR | MarketScan Medicare Supplemental and Coordination of Benefits | US; private and public-payer claims | 9
OPTUM | Optum ClinFormatics | US; private-payer claims | 40
STRIDE | Stanford Translational Research Integrated Database Environment | US; inpatient EHR | 2
HKU | Hong Kong University | Hong Kong; EHR | 1

Page 24

Treatment pathway event flow

Page 25

Proceedings of the National Academy of Sciences, 2016

Page 26

T2DM: All databases

Treatment pathways for diabetes

[Figure labels: First drug; Second drug; Only drug]

Page 27

[Figure: panels for Type 2 Diabetes Mellitus, Hypertension, and Depression across databases OPTUM, GE, MDCD, CUMC, INPC, MDCR, CPRD, JMDC, CCAE]

Population-level heterogeneity across systems, and patient-level heterogeneity within systems

Page 28

HTN: All databases

Patient-level heterogeneity

25% of HTN patients (10% of others) have a unique path despite 250M pop
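A "unique path" statistic of this kind can be computed by turning each patient's exposures into an ordered sequence of drugs and counting the patients whose sequence nobody else shares. The sketch below shows one simple way to do that; the pathway definition (distinct drugs ordered by first start date) and the data are assumptions for illustration and may differ from the definition used in the study.

```python
from collections import Counter


def pathway(exposures):
    """Distinct drugs in order of first start date, as a hashable tuple."""
    seen, path = set(), []
    for _start_date, drug in sorted(exposures):
        if drug not in seen:
            seen.add(drug)
            path.append(drug)
    return tuple(path)


# Invented exposures: patient id -> list of (ISO start date, drug).
exposures_by_patient = {
    1: [("2010-01-01", "metformin"), ("2011-05-01", "glyburide")],
    2: [("2009-03-10", "metformin")],
    3: [
        ("2012-07-04", "metformin"),
        ("2012-09-01", "metformin"),  # refill, not a new step
        ("2013-02-11", "glyburide"),
    ],
    4: [
        ("2010-10-10", "glyburide"),
        ("2011-01-01", "metformin"),
        ("2012-01-01", "insulin"),
    ],
}

paths = {pid: pathway(exp) for pid, exp in exposures_by_patient.items()}
path_counts = Counter(paths.values())

# A patient has a unique path if no other patient shares the same sequence.
unique_share = sum(
    1 for path in paths.values() if path_counts[path] == 1
) / len(paths)

print(path_counts.most_common())
print(f"share of patients with a unique path: {unique_share:.0%}")
```

In this toy data two of the four patients share the metformin-then-glyburide path and the other two are unique, giving a share of 50%.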

Page 29

Monotherapy – diabetes

[Figure: y-axis 0 to 1, x-axis 1989 to 2009. Series: AUSOM (S Korea*), CCAE (US#), CPRD (UK*), CUMC (US*), GE (US*), INPC (US*#), JMDC (Japan#), MDCD (US#), MDCR (US#), OPTUM (US#), STRIDE (US*)]

General upward trend in monotherapy

Page 30

Monotherapy – HTN

[Figure: y-axis 0 to 1, x-axis 1989 to 2009. Series: AUSOM (S Korea*), CCAE (US#), CPRD (UK*), CUMC (US*), GE (US*), INPC (US*#), JMDC (Japan#), MDCD (US#), MDCR (US#), OPTUM (US#), STRIDE (US*)]

Academic medical centers differ from general practices

Page 31

Monotherapy – diabetes

[Figure: y-axis 0 to 1, x-axis 1989 to 2009. Series: AUSOM (S Korea*), CCAE (US#), CPRD (UK*), CUMC (US*), GE (US*), INPC (US*#), JMDC (Japan#), MDCD (US#), MDCR (US#), OPTUM (US#), STRIDE (US*)]

General practices, whether EHR or claims, have similar profiles

Page 32

Conclusions: Network research

•  It is feasible to encode the world population in a single data model
   –  Over 600,000,000 records by voluntary effort (682,000,000)
•  Generating evidence is feasible
•  Stakeholders willing to share results
•  Able to accommodate vast differences in privacy and research regulation