biomedical research as part of the digital enterprise

41
Biomedical Research as Part of the Digital Enterprise Philip E. Bourne Ph.D. Associate Director for Data Science National Institutes of Health

Upload: philip-bourne

Post on 06-May-2015

311 views

Category:

Education


0 download

DESCRIPTION

Presented at the Association of Biomedical Resource Facilities Annual Meeting in Albuquerque NM March 25, 2014.

TRANSCRIPT

Page 1: Biomedical Research as Part of the Digital Enterprise

Biomedical Research as Part of the Digital Enterprise

Philip E. Bourne Ph.D.Associate Director for Data Science

National Institutes of Health

Page 2: Biomedical Research as Part of the Digital Enterprise

Disclaimer: I only started March 3, 2014

…but I had been thinking about this prior to my appointment

Page 3: Biomedical Research as Part of the Digital Enterprise

Let me start with a few factoids to get the ball rolling…

Page 4: Biomedical Research as Part of the Digital Enterprise

The Story of Meredith

http://fora.tv/2012/04/20/Congress_Unplugged_Phil_Bourne

Page 5: Biomedical Research as Part of the Digital Enterprise

1. The Era of Open Has The Potential to Deinstitutionalize & Democratize

Daniel Hulshizer/Associated Press

Page 6: Biomedical Research as Part of the Digital Enterprise

1. The Era of Open Has The Potential to Deinstitutionalize & Democratize

Daniel Hulshizer/Associated Press

Page 7: Biomedical Research as Part of the Digital Enterprise

2. I can’t reproduce research from my own laboratory?

Daniel Garijo et al. 2013 Quantifying Reproducibility in Computational Biology: The Case of the Tuberculosis Drugome PLOS ONE 8(11) e80278 .

Page 8: Biomedical Research as Part of the Digital Enterprise

47/53 “landmark” publications could not be replicated

[Begley, Ellis Nature, 483, 2012] [Carole Goble]

Page 9: Biomedical Research as Part of the Digital Enterprise

Characteristics of the Original and Current Experiment

Original and Current:

– Purely in silico

– Uses a combination of public databases and open source software by us and others

Original:

– http://funsite.sdsc.edu/drugome/TB/

Current:

– Recast in the Wings workflow system

Daniel Garijo et al. 2013 Quantifying Reproducibility in Computational Biology: The Case of the Tuberculosis Drugome PLOS ONE 8(11) e80278 .

Page 10: Biomedical Research as Part of the Digital Enterprise

Considered the Ability to Reproduce by Four Classes of User

REP-AUTHOR – original author of the work

REP-EXPERT – domain expert – can reproduce even with incomplete methods described

REP-NOVICE – basic domain (bioinformatics) expertise

REP-MINIMAL – researcher with no domain expertise

Garijo et al 2013 PLOS ONE 8(11): e80278

Page 11: Biomedical Research as Part of the Digital Enterprise

A Conceptual Overview of the Method Should Be Mandatory

Garijo et al 2013 PLOS ONE 8(11): e80278

Page 12: Biomedical Research as Part of the Digital Enterprise

Time to Reproduce the Method

Garijo et al 2013 PLOS ONE 8(11): e80278

Page 13: Biomedical Research as Part of the Digital Enterprise

2. Its not that we could not reproduce the work, but the effort involved was

substantial

Any graduate student could tell you this and little has changed in 40 years

Perhaps it is time we did better?

Page 14: Biomedical Research as Part of the Digital Enterprise

3. Data are accumulating!

Page 15: Biomedical Research as Part of the Digital Enterprise

4. We don’t know enough about how

existing data are used

* http://www.cdc.gov/h1n1flu/estimates/April_March_13.htm

Jan. 2008 Jan. 2009 Jan. 2010Jul. 2009Jul. 2008 Jul. 2010

1RUZ: 1918 H1 Hemagglutinin

Structure Summary page activity forH1N1 Influenza related structures

3B7E: Neuraminidase of A/Brevig Mission/1/1918 H1N1 strain in complex with zanamivir

[Andreas Prlic]

Page 16: Biomedical Research as Part of the Digital Enterprise

We Need to Learn from Industries Whose Livelihood Addresses the Question of Use

Page 17: Biomedical Research as Part of the Digital Enterprise

5. Some would argue we are at an inflexion point for change

Evidence:– Google car

– 3D printers

– Waze

– Robotics

Page 18: Biomedical Research as Part of the Digital Enterprise

From the Second Machine Age

From: The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies by Erik Brynjolfsson & Andrew McAfee

Page 19: Biomedical Research as Part of the Digital Enterprise

6. Scholarship is broken

I have a paper with 16,000 citations that no one has ever read

I have papers in PLOS ONE that have more citations than ones in PNAS

I have data sets I am proud of few places to put them

I edited a journal but it did not count for much

Page 20: Biomedical Research as Part of the Digital Enterprise

7. The reward system is in need of repair

Page 21: Biomedical Research as Part of the Digital Enterprise

Okay… enough of the problems

What are some solutions?

Page 22: Biomedical Research as Part of the Digital Enterprise

I cast the solutions in a vision …

something I call the digital enterprise

Any institution is a candidate as a digital enterprise, but lets explore it in the context

of the academic medical center

Page 23: Biomedical Research as Part of the Digital Enterprise

Components of The Academic Digital Enterprise

Consists of digital assets– E.g. datasets, papers, software, lab notes

Each asset is uniquely identified and has provenance, including access control– E.g. publishing simply involves changing the access control

Digital assets are interoperable across the enterprise

Page 24: Biomedical Research as Part of the Digital Enterprise

Life in the Academic Digital Enterprise

Jane scores extremely well in parts of her graduate on-line neurology class. Neurology professors, whose research profiles are on-line and well described, are automatically notified of Jane’s potential based on a computer analysis of her scores against the background interests of the neuroscience professors. Consequently, professor Smith interviews Jane and offers her a research rotation. During the rotation she enters details of her experiments related to understanding a widespread neurodegenerative disease in an on-line laboratory notebook kept in a shared on-line research space – an institutional resource where stakeholders provide metadata, including access rights and provenance beyond that available in a commercial offering. According to Jane’s preferences, the underlying computer system may automatically bring to Jane’s attention Jack, a graduate student in the chemistry department whose notebook reveals he is working on using bacteria for purposes of toxic waste cleanup. Why the connection? They reference the same gene a number of times in their notes, which is of interest to two very different disciplines – neurology and environmental sciences. In the analog academic health center they would never have discovered each other, but thanks to the Digital Enterprise, pooled knowledge can lead to a distinct advantage. The collaboration results in the discovery of a homologous human gene product as a putative target in treating the neurodegenerative disorder. A new chemical entity is developed and patented. Accordingly, by automatically matching details of the innovation with biotech companies worldwide that might have potential interest, a licensee is found. The licensee hires Jack to continue working on the project. Jane joins Joe’s laboratory, and he hires another student using the revenue from the license. The research continues and leads to a federal grant award. The students are employed, further research is supported and in time societal benefit arises from the technology.

From What Big Data Means to Me JAMIA 2014 21:194

Page 25: Biomedical Research as Part of the Digital Enterprise

Solution: Break Down the Silos

New policies, regulations e.g. data sharing

Economic drivers

The promise of shared data

Page 26: Biomedical Research as Part of the Digital Enterprise

Solution: Sustainability The How of Data Sharing

More credit to the data scientists

Change to funding models

Public/Private partnerships

Interagency cooperation

International cooperation

Better evaluation and more informed decisions about existing and proposed resources – How are current data being used?

Role of institutional repositories – reward institutions rather than PIs

Page 27: Biomedical Research as Part of the Digital Enterprise

Solution: Discoverability

Calls for data and software registries (e.g., DDI)

Data commons (NIH drive?)

More clinical trial data in the public domain

Facilitate accessibility and hence access to clinical data

Page 28: Biomedical Research as Part of the Digital Enterprise

Solution: Training

Calls out for training grants – new and as supplements to existing training efforts

Regional training centers (cf Cold Spring Harbor)?

Page 29: Biomedical Research as Part of the Digital Enterprise

These problems and potential solutions have been around a

long time

The good news is that “Big Data” has bought more attention to the

problem

Page 30: Biomedical Research as Part of the Digital Enterprise

What Are Big Data?

Large datasets from high throughput experiments

Large numbers of small datasets

Data which are “ill-formed”

The why (causality) is replaced by the what

A signal that a fundamental change is taking place – a tipping point?

Page 31: Biomedical Research as Part of the Digital Enterprise

The NIH is Starting to Think About the Digital Enterprise, Witness…

You will hear all about BD2K from:

– Jennie Larkin

– Warren Kibbe

– Dawei Lin

bd2k.nih.gov

Page 32: Biomedical Research as Part of the Digital Enterprise

This is great, but what will the end product look like?

Page 33: Biomedical Research as Part of the Digital Enterprise

1. A link brings up figures from the paper

0. Full text of PLoS papers stored in a database

2. Clicking the paper figure retrievesdata from the PDB which is

analyzed

3. A composite view ofjournal and database

content results

One Possible End Point

1. User clicks on thumbnail

2. Metadata and a webservices call provide a renderable image that can be annotated

3. Selecting a features provides a database/literature mashup

4. That leads to new papers

4. The composite view haslinks to pertinent blocks

of literature text and back to the PDB

1.

2.

3.

4.

PLoS Comp. Biol. 2005 1(3) e34

Page 34: Biomedical Research as Part of the Digital Enterprise

To get to that end point we have to consider the complete research lifecycle

Page 35: Biomedical Research as Part of the Digital Enterprise

The Research Life Cycle will Persist

IDEAS – HYPOTHESES – EXPERIMENTS – DATA - ANALYSIS - COMPREHENSION - DISSEMINATION

Page 36: Biomedical Research as Part of the Digital Enterprise

Tools and Resources Will Continue To Be Developed

IDEAS – HYPOTHESES – EXPERIMENTS – DATA - ANALYSIS - COMPREHENSION - DISSEMINATION

AuthoringTools

Lab Notebooks

DataCapture

Software

Analysis Tools

Visualization

ScholarlyCommunication

Page 37: Biomedical Research as Part of the Digital Enterprise

Those Elements of the Research Life Cycle will Become More Interconnected Around a Common

Framework

IDEAS – HYPOTHESES – EXPERIMENTS – DATA - ANALYSIS - COMPREHENSION - DISSEMINATION

AuthoringTools

Lab Notebooks

DataCapture

Software

Analysis Tools

Visualization

ScholarlyCommunication

Page 38: Biomedical Research as Part of the Digital Enterprise

New/Extended Support Structures Will Emerge

IDEAS – HYPOTHESES – EXPERIMENTS – DATA - ANALYSIS - COMPREHENSION - DISSEMINATION

AuthoringTools

Lab Notebooks

DataCapture

Software

Analysis Tools

Visualization

ScholarlyCommunication

Commercial &Public Tools

Git-likeResources

By Discipline

Data JournalsDiscipline-

Based MetadataStandards

Community Portals

Institutional Repositories

New Reward Systems

Commercial Repositories

Training

Page 39: Biomedical Research as Part of the Digital Enterprise

We Have a Ways to Go

IDEAS – HYPOTHESES – EXPERIMENTS – DATA - ANALYSIS - COMPREHENSION - DISSEMINATION

AuthoringTools

Lab Notebooks

DataCapture

Software

Analysis Tools

Visualization

ScholarlyCommunication

Commercial &Public Tools

Git-likeResources

By Discipline

Data JournalsDiscipline-

Based MetadataStandards

Community Portals

Institutional Repositories

New Reward Systems

Commercial Repositories

Training

Page 40: Biomedical Research as Part of the Digital Enterprise

Thank You!Questions?

[email protected]

Page 41: Biomedical Research as Part of the Digital Enterprise

NIHNIH……Turning Discovery Into HealthTurning Discovery Into Health