
EDM Forum
EDM Forum Community

Webinars Events

8-29-2013

Cultivating Collaboration - Sharing Data, Code, and Tools to Accelerate the Science of Healthcare

Anthony D'Amico, Kaiser Family Foundation

Xiaoqian Jiang, UC San Diego

Daniella Meeker, RAND Corporation

Fred Trotter, DocGraph Journal and Not Only Development

Dave Clifford, Avicenna

Follow this and additional works at: http://repository.academyhealth.org/webinars

Part of the Health Services Research Commons, and the Social and Behavioral Sciences Commons

This Video/Media is brought to you for free and open access by the Events at EDM Forum Community. It has been accepted for inclusion in Webinars by an authorized administrator of EDM Forum Community.

Recommended Citation
D'Amico, Anthony; Jiang, Xiaoqian; Meeker, Daniella; Trotter, Fred; and Clifford, Dave, "Cultivating Collaboration - Sharing Data, Code, and Tools to Accelerate the Science of Healthcare" (2013). Webinars. Paper 12.
http://repository.academyhealth.org/webinars/12


Cultivating Collaboration – Sharing Data, Code, and Tools to Accelerate the Science of Healthcare

Anthony D’Amico, Kaiser Family Foundation; Xiaoqian Jiang, University of California-San Diego; Daniella Meeker, RAND Corporation; Fred Trotter, DocGraph Journal and Not Only Development; Dave Clifford, Avicenna

August 29, 2013


Welcome

Erin Holve, Ph.D., M.P.H., M.P.P.

– Senior Director of Research & Education, AcademyHealth

– Principal Investigator of the EDM Forum

– eGEMs Editor-in-Chief

Follow the conversation on Twitter!

#eGEMs @edm_ah @academyhealth


AcademyHealth: Improving Health & Health Care

AcademyHealth is a leading national organization serving the fields of health services and policy research and the professionals who produce and use this important work.

Together with our members, we offer programs and services that support the development and use of rigorous, relevant and timely evidence to:

1. Increase the quality, accessibility and value of health care,

2. Reduce disparities, and

3. Improve health.

A trusted broker of information, AcademyHealth brings stakeholders together to address the current and future needs of an evolving health system, inform health policy, and translate evidence into action.


The audio and slide presentation will be delivered directly to your computer

Speakers or headphones are required to hear the audio portion of the webinar.

If you do not hear any audio now, check your computer’s speaker settings and volume.

If you need an alternate method of accessing audio, please submit a question through the Q&A pod.


Technical Assistance

Live technical assistance:

– Call Adobe Connect at (800) 422-3623

Refer to the ‘Technical Assistance’ box in the bottom left corner for tips to resolve common technical difficulties.

Please turn off your pop-up blocker in order to take a survey


To submit a question:

1. Click in the Q&A box on the left side of your screen

2. Type your question into the dialog box and click the Send button

Questions may be submitted at any time during the presentation


Advancing the National Dialogue on Use of HIT for Research & Quality Improvement

Electronic Data Methods (EDM) Forum Goals

– Work with the community to identify cross-cutting

• Challenges

• Opportunities

• Research priorities

– Provide opportunities for collaborative learning

– Ensure widespread promotion of tools, techniques, and findings

Join the Discussion Sign up at [email protected]


Health Data Ecosystem

www.hhs.gov/open/datasets/communityhealthdata

*Researchers are innovators too….


The Landscape Electronic Health Data

Initiatives


The Data Quality Collaborative

Collaborative working group of leading experts

Developing a comprehensive data quality assessment framework and guidelines for the CER community

Seeks feedback from the community through the EDM Forum eRepository


New Brief! An Organizing Framework for New Informatics Tools and Approaches

AcademyHealth. “Informatics Tools and Approaches To Facilitate the Use of Electronic Data for CER, PCOR, and QI: Resources Developed by the PROSPECT, DRN, and Enhanced Registry Projects,” EDM Forum, August 2013.


New eJournal! eGEMs (Generating Evidence and Methods to improve patient outcomes)

Peer-reviewed and open access ejournal

Submissions must:

– Address use of electronic clinical data (e.g., EHRs) for research and quality improvement

– Highlight generalizable ‘lessons learned’ to accelerate translation, dissemination, and implementation of health science

– Explain why investigators’ work contributes to improving patient outcomes


Great Interest to Date

12 published manuscripts (since 1/17/13)

5,800+ publication downloads (as of 8/26/13)

20+ papers currently under review

Forthcoming Special Issues

– Ways Decision Makers Can Use Evidence to Improve Patient Outcomes in Learning Health Systems. Guest Editor: Wade Aubry, University of California, San Francisco

– Methods for CER, PCOR, and QI Using Electronic Clinical Data in a Learning Health System. Guest Editor: Michael Stoto, Georgetown University

For more information about eGEMs submission guidelines, visit http://repository.academyhealth.org/egems


Transforming the Research Enterprise

“Make the idea bigger”

How to sustainably link emerging data and tools in a marketplace of people and ideas committed to transforming patient care and outcomes?

Discovery

Implementation

Research

Care


Learning Objectives

Build awareness of opportunities to engage in open data and research communities

Learn about coding in R for federal surveys, techniques to facilitate distributed analyses, and the use of provider data for research

Improve users' experience with new tools and data by involving potential users in different stages of development

Explore opportunities to build your career by engaging in open source data and research activities


Today’s Faculty

Anthony D’Amico, Kaiser Family Foundation

Xiaoqian Jiang, University of California-San Diego

Daniella Meeker, RAND Corporation

Fred Trotter, DocGraph Journal and Not Only Development

Dave Clifford, Avicenna, LLC


How to Analyze Survey Data for Free with the R Language

Anthony Damico, Statistical Analyst, Kaiser Family Foundation


[Flowchart; branches are labeled "yeah", "nah", and "done"]

Questions:

– Do you analyze survey data for work or pleasure?
– Why are you here?
– Do you analyze survey data with SAS, SUDAAN, Stata, or SPSS?
– Do you mind the price tag?
– Are you concerned that proprietary software makes statistical research difficult to reproduce?
– Does it bother you that your analyses might all be wrong?
– Do you speak any R?

Side remarks:

– required by supervisor
– My sincerest apologies
– ...so you’re using Excel
– Hopefully you’ll never have to change jobs
– ..but Anthony, I hate the sound of your voice
– ..but I need something structured

Resources:

– Learn R by watching two-minute videos on http://twotorials.com
– Read the “Getting Started with R” Guide on http://flowingdata.com
– Enroll in the free “Computing for Data Analysis” on http://coursera.com
– Analyze Survey Data with the scripts on http://asdfree.com


Complex Sampling

Sample geographies first, then sample individuals within those

geographies.
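The two-stage idea on this slide can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: simple random sampling at both stages and a hypothetical toy population. The function name `two_stage_sample` and the county/person labels are made up here. Real federal surveys typically add stratification and unequal selection probabilities, and publish their own weights.

```python
import random


def two_stage_sample(population, n_geographies, n_per_geography, seed=0):
    """Draw a two-stage sample (illustrative sketch, not a survey-grade design).

    population: dict mapping a geography name to a list of individuals.
    Stage 1 draws geographies; stage 2 draws individuals within each
    drawn geography. Returns (geography, individual, weight) tuples,
    where weight is the inverse of the overall selection probability.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(population), n_geographies)
    p_stage1 = n_geographies / len(population)
    sample = []
    for geo in chosen:
        people = population[geo]
        k = min(n_per_geography, len(people))
        p_stage2 = k / len(people)
        weight = 1.0 / (p_stage1 * p_stage2)
        for person in rng.sample(people, k):
            sample.append((geo, person, weight))
    return sample


if __name__ == "__main__":
    # Hypothetical population: 10 counties with 50 residents each.
    toy = {f"county_{i}": [f"person_{i}_{j}" for j in range(50)]
           for i in range(10)}
    drawn = two_stage_sample(toy, n_geographies=4, n_per_geography=5)
    print(len(drawn), "records from", len({g for g, _, _ in drawn}), "counties")
```

Because individuals from the same geography tend to resemble each other, variance estimates have to account for the clustering, which is why complex-sample software is used instead of treating the records as a simple random sample.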


Complex Sample Survey Data Sets

American Community Survey (ACS)
IPUMS - American Community Survey (IPUMS-USA)
American Time Use Survey (ATUS)
Behavioral Risk Factor Surveillance System (BRFSS)
Consumer Assessment of Healthcare Providers and Systems (CAHPS)
Consumer Expenditure Survey (CE)
Current Population Survey (CPS)
IPUMS - Current Population Survey (IPUMS-CPS)
Employer Health Benefits Survey (EHBS)
General Social Survey (GSS)
Health and Retirement Study (HRS)
Medicare Current Beneficiary Survey (MCBS)
Medical Expenditure Panel Survey (MEPS)
National Health and Nutrition Examination Survey (NHANES)
National Health Interview Survey (NHIS)
National Longitudinal Study of Adolescent Health (AddHealth)
National Longitudinal Surveys (NLS)
National Survey on Drug Use and Health (NSDUH)
Panel Study of Income Dynamics (PSID)
Survey of Business Owners (SBO)
Survey of Consumer Finances (SCF)
Survey of Income and Program Participation (SIPP)
Youth Risk Behavior Surveillance System (YRBSS)


twotorials.com


asdfree.com

1) Download Automation

2) Replication Scripts

3) Current Analysis Examples



Accelerating Open Science in Healthcare Through Open Code, Data and Process

--Experience based on Grid Logistic Regression development

Xiaoqian Jiang, Ph.D.

Division of Biomedical Informatics, University of California San Diego


Open Code

Allow software to be freely used, modified, and shared.

License                                                Year
BSD 3-Clause "New" or "Revised" license                1988
BSD 2-Clause "Simplified" or "FreeBSD" license         1988
MIT license                                            1988
Apache License 2.0                                     2004
Eclipse Public License                                 2004
Common Development and Distribution License            2005
GNU General Public License (GPL)                       2007
GNU Library or "Lesser" General Public License (LGPL)  2007
Mozilla Public License 2.0                             2012


Open Code

Webservices (location)

– Privacy Preserving SVM: http://privacy.ucsd.edu:8080/ppsvm/
– Web Grid Logistic Regression: http://dbmi-engine.ucsd.edu/webglore/
– Interactive Matching Patients And randomized Clinical Trials: http://dbmi-engine.ucsd.edu/IMPACT/

Software deposits

– Distributed Cox backbone: https://code.google.com/p/distributed-cox/
– Randomized clinical trial matching backbone: https://code.google.com/p/grouprct/
– Grid Logistic Regression backbone: https://code.google.com/p/glore/
– Sequential minimal optimization based SVM: http://hwanjoyu.org/svm-java/
– Web-based model calibration framework: https://code.google.com/p/webcalibsis/
– Differential PCA algorithm: https://code.google.com/p/dpca/

More on http://idash.ucsd.edu/idash-softwaretools


Open tutorial


Open Data

• Data use agreements across institutions
o Limited and complicated
o Specific to a particular study
o Resources for sharing are limited
o Security/privacy constraints are hard for small institutions to follow

• Sharing data today
o Little incentive
o Only one model: users download data
o Yes/No decision on sharing

Thanks Dr. Ohno-Machado for this slide.


Accelerating Open Science

• Research is a process; sharing our experience may accelerate science

Open Science

Healthcare research

– Data collection
– Algorithm development
– Software implementation
– Results verification

Two-stage development

– Backbone development and verification
– UI prototype and soliciting UX advice
– Integrated system, leaving room for extension

Constantly checking users’ experience


My Experience in Developing Grid Logistic Regression

Two stage biomedical webservice development

Improving users' experience through involving potential users in different stages of the development


Motivation

• Traditional approaches to data sharing have limitations that undermine the ability of researchers and clinicians to access, aggregate, and meaningfully analyze patient records at the point of care.

• WebGLORE is a webservice for biomedical researchers to build a global predictive logistic regression model without sharing data.


Share model vs. Disseminate data

[Figure: two "Patient data" sources]

Aggregated information, e.g., marginal distribution, sufficient statistics, kernel matrix


Developers and expertise

[Figure: developers labeled with areas of expertise and programming skills]

Expertise: Machine learning; Statistics; Signal Processing; Predictive modeling

Programming skills: JAVA, PHP, JSP, UI, HTML, CSS


Google Code as Version Control


Foundation of GLORE

• Suppose m-1 features are consistent over k sites

• In each iteration, intermediary results of an m×m matrix and an m-dimensional vector are transmitted to k-1 sites
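One way to read the bullets above is as a Newton-Raphson fit in which each site computes an m×m matrix and an m-dimensional vector from its own records, and only those aggregates are combined. The sketch below illustrates that reading in plain Python; it is not the GLORE backbone. The function names (`site_summaries`, `fit_distributed`), the fixed iteration count, and the absence of any networking or security layer are all assumptions made for illustration. Each row is assumed to start with a constant 1.0 for the intercept, so m-1 features give m coefficients.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def site_summaries(X, y, beta):
    """Run locally at one site; only the returned aggregates leave the site.

    Returns the m x m matrix X^T W X and the m-vector X^T (y - p).
    """
    m = len(beta)
    H = [[0.0] * m for _ in range(m)]
    g = [0.0] * m
    for row, target in zip(X, y):
        p = sigmoid(sum(b * x for b, x in zip(beta, row)))
        w = p * (1.0 - p)
        for i in range(m):
            g[i] += row[i] * (target - p)
            for j in range(m):
                H[i][j] += w * row[i] * row[j]
    return H, g


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def fit_distributed(sites, m, iterations=25):
    """sites: list of (X, y) pairs, one per site; m: number of coefficients."""
    beta = [0.0] * m
    for _ in range(iterations):
        H = [[0.0] * m for _ in range(m)]
        g = [0.0] * m
        for X, y in sites:  # in a real deployment each call runs remotely
            Hs, gs = site_summaries(X, y, beta)
            for i in range(m):
                g[i] += gs[i]
                for j in range(m):
                    H[i][j] += Hs[i][j]
        step = solve(H, g)  # Newton-Raphson update from summed aggregates
        beta = [b + s for b, s in zip(beta, step)]
    return beta
```

Because the per-site matrices and vectors simply add up, fitting on several sites this way should give the same coefficients as fitting on the pooled records, up to floating-point rounding, while no patient-level rows are exchanged.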


Backbone implementation


Implementation

• R backbone
o https://www.dropbox.com/s/gmnrqgifdq9tjd7/glore_R.zip

• JAVA backbone
o https://code.google.com/p/glore/

Human factor and user experience are important!


Check Point 1:

Performance Validation

Check User Experience


A first thought about UX

[Screenshots of the client interface]

– Setup task parameters
– Setup task parameters -- filling task details
– Join a task
– Show result


Check Point 2:

Check Potential Users’ Satisfaction


Potential Users’ feedback

• Advantages
o Easy to implement
o Flexibility in developing complex interface
o Friendly to tools and packages that sit on local clients

• Disadvantages
o Healthcare environments are reluctant to install third party software
o Communication through pre-specified ports is a security concern
o Does not support all platforms unless implemented individually


First webservice development


WebGLORE 1.0

• An easy-to-use software as a service for healthcare should be:
o Plug-in ready (User protected)
o Deployable in a variety of hosting environments (Platform friendly)
o Security and firewall compatible (Security-enhanced network)


Applet-Servlet architecture


Check Point 3:

Check Potential Users’ Satisfaction


Critical advice from testers

• Pros
o Transparent model construction procedures, which allow participants to see the intermediary steps
o Visualization of the model helps users understand model performance and reveals important factors

• Cons
o Users cannot see their historical activities
o Users cannot change the user profile
o Repeated warnings from the JAVA applet in browsers are annoying


Second webservice development


WebGLORE 2.0


http://dbmi-engine.ucsd.edu/webglore2/


Generate reports


Check Point 4:

Check System Validity


Experiments

• CA-19 and CA-125 data

[Figure: running time (seconds) comparison]

           Estimate  Std. Error  Z-value  Pr(>|z|)
Intercept  -1.4645   0.3881      -3.7739  1.61E-04
CA19        0.0274   0.0085       3.2063  1.34E-03
CA125       0.0163   0.0077       2.1008  3.57E-02

H-L test p-value = 0.891

AUC = 0.891
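For readers less familiar with the columns in the table above, the sketch below shows the usual relationship between them, assuming the Z-value is a Wald statistic (estimate divided by standard error) and Pr(>|z|) is the two-sided tail probability of a standard normal. The function names are hypothetical, and small differences from the slide are expected because the printed estimates and standard errors are rounded.

```python
import math


def p_from_z(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def wald_test(estimate, std_error):
    """Return (z, p) for a single coefficient, assuming a Wald test."""
    z = estimate / std_error
    return z, p_from_z(z)


if __name__ == "__main__":
    # Values copied from the intercept row of the table.
    z, p = wald_test(-1.4645, 0.3881)
    print(f"z = {z:.4f}, p = {p:.2e}")
```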


Experiments

• Breast cancer biomarkers (CA-19, CA-125)

H-L test p-value = 0.891

AUC = 0.891

           Estimate  Std. Error  Z-value  Pr(>|z|)
Intercept  -1.4645   0.3881      -3.7739  1.61E-04
CA19        0.0274   0.0085       3.2063  1.34E-03
CA125       0.0163   0.0077       2.1008  3.57E-02

• Edinburgh myocardial infarction data

H-L test p-value = 0.430

AUC = 0.699

                   Estimate  Std. Error  Z-value   Pr(>|z|)
Intercept          -4.3485   0.2968      -14.6508  0.00E+00
Pain in left arm    0.1816   0.2680        0.6777  4.98E-01
Pain in right arm   0.1764   0.3061        0.5763  5.64E-01
Nausea              0.1323   0.3862        0.3426  7.32E-01
Hypoperfusion       2.2511   0.6590        3.4160  6.36E-04
ST elevation        5.5556   0.4404       12.6150  0.00E+00
New Q waves         4.1453   0.6747        6.1435  8.07E-10
ST depression       3.4173   0.2815       12.1392  0.00E+00
T wave inversion    1.2030   0.2635        4.5649  5.00E-06
Sweating            0.2721   0.2510        1.0837  2.79E-01
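The AUC values reported above summarize how well each model ranks cases above non-cases. As a rough illustration (not the code used for the slides), AUC can be computed as the fraction of (positive, negative) pairs in which the positive record gets the higher predicted score, counting ties as one half. The scores and labels in the usage lines are made-up toy values.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted probabilities or scores.
    labels: 1 for positive records, 0 for negative records.
    """
    pos = [s for s, label in zip(scores, labels) if label == 1]
    neg = [s for s, label in zip(scores, labels) if label == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data for illustration only.
    print(auc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0]))
```

The pairwise loop is quadratic in the number of records, which is fine for a sketch; rank-based formulas give the same number more efficiently on large data sets.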


Check Point 5:

External Validation


Experiments

• Cincinnati data (ImproveCareNow!)

A quality improvement and research collaborative focused on improving the care and outcomes of children with Inflammatory Bowel Disease

Site 1 - 245 observations on 5 patients.

Site 2 - 563 observations on 24 patients.



Experiments

• Cincinnati data (ImproveCareNow!)

Features

F1 - patient id
F2 - weeks to response
F3 - patient on biologics
F4 - days since diagnosis
F5 - gender
F6 - race
F7 - Age in years at start of treatment
F8 - Extent of disease
F9 - patient on thiopurine
F10 - patient on methotrexate
F11 - patient on salicylate
F12 - patient on steroids
F13 - days since diagnosis (recorded variable)
F14 - gender (recorded variable)
F15 - race (recorded variable)
F16 - race (factor variable)
F17 - patient on steroid (factor variable)
F18 - patient on salicylate (factor variable)
F19 - patient on thiopurine (factor variable)
F20 - patient on methotrexate (factor variable)
F21 - patient diagnosis
F22 - patient diagnosis (factor variable)

Target

Target = responded to treatment (i.e., improvement in condition)


Experiments • Cincinnati data (ImproveCareNow!)

60

Predictor Beta SE Z-statistics df p Odds ratio Intercept 4.8802 2581.989 0.0019 1 0.9985 N/A F1 0.0034 0.0016 2.1977 1 0.028 1.0035 F2 0.1143 0.0373 3.0652 1 0.0022 1.1211 F3 1.8766 0.9398 1.9969 1 0.0458 6.5311 F4 0.0027 0.0012 2.206 1 0.0274 1.0027 F5 -1.7232 1290.995 -0.0013 1 0.9989 0.1785 F6 -0.7147 0.4921 -1.4523 1 0.1464 0.4893 F7 -0.5522 0.1909 -2.8926 1 0.0038 0.5757 F8 0.0673 0.1231 0.5469 1 0.5845 1.0696 F9 -0.8537 2236.068 -0.0004 1 0.9997 0.4259 F10 0 3162.278 0 1 1 1 F11 0.5396 2236.068 0.0002 1 0.9998 1.7154 F12 0.3057 2236.068 0.0001 1 0.9999 1.3576 F13 0.0245 1.0657 0.023 1 0.9816 1.0248 F14 0.7519 1290.995 0.0006 1 0.9995 2.1211 F15 0.5949 2236.068 0.0003 1 0.9998 1.8128 F16 0.5949 2236.068 0.0003 1 0.9998 1.8128 F17 0.3057 2236.068 0.0001 1 0.9999 1.3576 F18 0.5396 2236.068 0.0002 1 0.9998 1.7154 F19 -0.8537 2236.068 -0.0004 1 0.9997 0.4259 F20 0 3162.278 0 1 1 1 F21 -0.3472 2236.068 -0.0002 1 0.9999 0.7066 F22 -0.3472 2236.068 -0.0002 1 0.9999 0.7066

Calibration Error = 0.05

AUC = 0.744

HL-C = 0.26

HL-H = 0.59
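For readers unfamiliar with the AUC figure above: it can be read as the probability that a randomly chosen responder receives a higher predicted score than a randomly chosen non-responder. The sketch below computes it by direct pairwise comparison. The labels and scores are made up for illustration and are not the Cincinnati data; the function name `auc` is hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p_score in pos:
        for n_score in neg:
            if p_score > n_score:
                wins += 1.0
            elif p_score == n_score:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: 1 = responded to treatment, 0 = did not (invented values).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = auc(toy_labels, toy_scores)
print(round(toy_auc, 3))
```

The pairwise form is quadratic in the number of observations, which is fine for a few hundred rows; a rank-based formula would be the usual choice for larger data.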


Acknowledgements

• We thank Dr. Hamish Fraser and Dr. Kelly Zou for providing the clinical data

• We thank Dr. Keith Marsolo for his helpful advice

• We thank EDM Forum and iDASH for supporting this research!


Discussion Questions

• What format of open software does the community most want?


AMIA’12: Privacy Preserving Support Vector Machine

AMIA’13, Bioinformatics: Grid Logistic Regression

Submitted to BMC: Distributed Cox Proportional Hazard Model


How do you like to share?


SaaS: Healthcare professionals, End-user services

PaaS: Researchers, Developers, Collaborators

IaaS: Operators, Developers, Collaborators

• What features do you envision to facilitate code, data, and process sharing?


Thanks


Cultivating Collaboration – Sharing Data, Code, and Tools to Accelerate the Science of Healthcare

29 August 2013, EDM Forum Webinar

Daniella Meeker, RAND Corporation


Research: Structured Data And Code

Academic healthcare science

• Text-based journals are the currency of continued funding

• A journal article eliminates structure and information from original data and puts it into a file cabinet

• Obscuring methods and data

• Slow

[Figure labels: dissemination, publication, value, infrastructure]


Academic healthcare science

• Methods cannot be exchanged and replicated

• Data is rarely exchanged and re-analyzed for robustness

• Redundant work

• Publication bias

• No infrastructure for efficient collaboration

• Code sharing

• Metadata standards

• No incentives for collaboration in the scientific community

• Journal articles are released slowly and without detail

• Data has greater utility to investigators if it is hoarded

• Academic funding model does not support sustainable infrastructure


Commercial health data science AKA business intelligence

• Environment – the “real” learning health system

• Health care practice is moving more quickly than health services research.

• Post-ACA, providers and plans are motivated to leverage their data to find efficiencies

• Despite regulation, healthcare is among the fastest growing segments of cloud computing: infrastructure as a service (IaaS) and software as a service (SaaS)

• Funding model for commercial healthcare data science supports creation of scalable tools and an efficient marketplace for tools and analysis

• Software engineers are part of staff

• Analytic services

• Incentives for dissemination are mixed


Collaboration Infrastructure models from other sciences

• Open Science Grid (physics, nanotechnology, structural biology)

• OSG: 1.4M CPU-hours/day, >90 sites, >3000 users, >260 pubs in 2010

• LIGO (physics/astrophysics)

• Established practices and metadata standards

• 1 PB data in last science run, distributed worldwide

• ESGF

• 1.2 PB climate data

• delivered to 23,000 users; 600+ pubs

• Collage – executable papers (computer science)


Incentivizing a Learning Health System


• Research and practice must become interoperable

• Requires commitment to a single standard across multiple agencies

• In the age of business intelligence (BI), relevant research must go beyond secondary analysis and link basic biology and biomedicine data with patient-reported data

• Repositories and clearinghouses are a good start, but not enough… a learning health system (LHS) requires searchable assets with high utility

• discoverable

• standards for metadata and coding practices

• computable artifacts

• application sandboxes with realistically simulated data

• Incentives for collaboration and sharing.

• Create a marketplace for reusable tools that links tool utility and reuse to research funding.

TOXNET


If you really love your data, you will set it free.

-ft


Pursuing Open Data in Healthcare

Why?

How?

Our Two Efforts

DocGraph

toEleven


Why Open Source your Data?

From Eric Raymond's “The Cathedral and the Bazaar”

The Tragedy of the Commons

vs

The Magic Cauldron


How?

Prepare to receive the secret recipe for successfully running an Open Source project:


In seriousness

Let your community connect with each other. You need a mailing list

Visit ours at DocGraph.org

Use either

Google Groups

Discourse


DocGraph

Is a graph data set of the healthcare system

It shows how doctors, hospitals, labs, etc. work together to provide care

Based on a FOIA request to CMS

~50 Million Edges

~2 Million Nodes

Crowdfunded: asked for $15k to develop the data set, and got $60k on Medstartr
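To make "graph data set" concrete, here is a minimal sketch of loading an edge list into an adjacency map and counting nodes and edges. The column names (`from_id`, `to_id`, `weight`), the identifiers, and the sample rows are assumptions for illustration only; they are not the documented DocGraph file layout.

```python
import csv
import io
from collections import defaultdict

# Hypothetical three-row edge list: one directed, weighted edge per row.
SAMPLE = """from_id,to_id,weight
A,B,12
A,C,30
B,C,11
"""

def load_edges(handle):
    """Build {source: {target: weight}} from a CSV edge list."""
    graph = defaultdict(dict)
    for row in csv.DictReader(handle):
        graph[row["from_id"]][row["to_id"]] = int(row["weight"])
    return graph

graph = load_edges(io.StringIO(SAMPLE))
# A node is anything that appears as a source or as a target.
nodes = set(graph) | {t for targets in graph.values() for t in targets}
edge_count = sum(len(targets) for targets in graph.values())
print(len(nodes), edge_count)
```

At the scale quoted on the slide (tens of millions of edges), an in-memory dictionary of dictionaries may be too heavy; a graph database or a sparse-matrix representation would be a more likely choice.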


Open Data set

Download the Open Data Set for $1

Open Source version requires research be contributed back

Join the mailing list

Do something amazing


toEleven

Part of a “grand plan” with Ian Eslick

Born out of an AcademyHealth collaboration

Goals:

Make research translation to digital interventions sustainable by dramatically lowering development and ongoing maintenance costs.


toEleven

Is the mobile app front end for Ian's n=1 server backend components

Developing with CCHMC around iMigraine applications

About to announce a Food Database Project


Dave Clifford


Submitting Questions

To submit a question:

1. Click in the Q&A box on the left side of your screen

2. Type your question into the dialog box and click the Send button


Thank You

Please take a moment to fill out the brief evaluation which will appear in your browser.