open data in a global ecosystem
TRANSCRIPT
![Page 1: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/1.jpg)
Open Data in a Global Ecosystem
Philip E. Bourne Ph.D., FACMI
Associate Director for Data Science
National Institutes of Health
[email protected]
BioMedBridges, EBI, November 17, 2015
http://www.slideshare.net/pebourne
![Page 2: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/2.jpg)
Not a talking head … an ongoing conversation
![Page 3: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/3.jpg)
Some context to start that conversation …
![Page 4: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/4.jpg)
Perspective
Structural bioinformatics researcher
Former custodian of the RCSB PDB
Obsessive about open science e.g., PLOS
NIH-wide responsibility for developments in data science
![Page 5: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/5.jpg)
Consider this change from my own career experience ….
![Page 6: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/6.jpg)
The History of Computational Biomedicine According to Bourne
Timeline: 1980s, 1990s, 2000s, 2010s, 2020
Discipline: Unknown; Expt. Driven; Emergent; Over-sold; A Service; A Partner; A Driver
The Raw Material: Non-existent; Limited/Poor; More/Ontologies; Big Data/Siloed; Open/Integrated
The People: No name; Technicians; Industry recognition; Data scientists; Academics
Searls (ed.), The Roots of Bioinformatics Series, PLOS Comp Biol
![Page 7: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/7.jpg)
It Follows …
We are entering a period of disruption in biomedical research and we should all be thinking about what this means to bioinformatics & biomedicine
http://i1.wp.com/chisconsult.com/wp-content/uploads/2013/05/disruption-is-a-process.jpg http://cdn2.hubspot.net/hubfs/418817/disruption1.jpg
![Page 8: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/8.jpg)
Big Data in Biomedicine…
This speaks to something more fundamental than more data …
It speaks to new methodologies, new skills, new emphasis, new cultures, new modes of discovery …
![Page 9: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/9.jpg)
We are at a Point of Deception …
Evidence:
– Google car
– 3D printers
– Waze
– Robotics
– Sensors
From: The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies by Erik Brynjolfsson & Andrew McAfee
![Page 10: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/10.jpg)
Disruption: Example - Photography
Stages: Digitization, Deception, Disruption, Demonetization, Dematerialization, Democratization
Axes: Time; Volume, Velocity, Variety
Events along the curve:
– Digital camera invented by Kodak but shelved
– Megapixels & quality improve slowly; Kodak slow to react
– Film market collapses; Kodak goes bankrupt
– Phones replace cameras
– Instagram, Flickr become the value proposition
– Digital media becomes bona fide form of communication
![Page 11: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/11.jpg)
Disruption: Biomedical Research
Digitization of Basic & Clinical Research & EHRs
Deception
We Are Here
Disruption
Demonetization
Dematerialization
Democratization
Open science
Patient centered health care
![Page 12: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/12.jpg)
Disruptive Features: Sustainability
Source Michael Bell http://homepages.cs.ncl.ac.uk/m.j.bell1/blog/?p=830
![Page 13: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/13.jpg)
Disruptive Features: Reproducibility
Changing Value of Scholarship (?)
![Page 14: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/14.jpg)
“And that’s why we’re here today. Because something called precision medicine … gives us one of the greatest opportunities for new medical breakthroughs that we have ever seen.”
President Barack Obama, January 30, 2015
Disruptive Features – New Science
![Page 15: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/15.jpg)
Precision Medicine Initiative
National Research Cohort
– >1 million U.S. volunteers
– Numerous existing cohorts (many funded by NIH)
– New volunteers
Participants will be centrally involved in design and implementation of the cohort
They will be able to share genomic data, lifestyle information, biological samples – all linked to their electronic health records
![Page 16: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/16.jpg)
What Are Some General Implications of Such a Future?
Open collaborative science becomes increasingly important nationally and internationally
Data and associated analytics become increasingly valuable to scholarship
Opportunities exist to improve the efficiency of the research enterprise and hence fund more research
Global cooperation between funders will be needed to sustain the emergent digital enterprise
Current training content and modalities will not match supply to demand
Balancing accessibility vs security becomes more important yet more complex
![Page 18: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/18.jpg)
How Should We Respond as Funders?
Community:
– Encourage wherever possible a global cultural shift towards open science
– Encourage global exchanges
– Encourage global projects
Policies:
– Understand and map data sharing policies, standards etc.
– Understand ethical, legal and societal differences
Infrastructure:
– Share the burden and the reward
![Page 20: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/20.jpg)
https://www.openscienceprize.org/
![Page 21: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/21.jpg)
A Culture of Sharing
Timeline: 1999, 2003, 2004, 2007, 2008, 2012, 2014
– Research Tools Policy
– NIH Data Sharing Policy
– Model Organism Policy
– Genome-wide Association (GWAS) Policy
– NIH Public Access Policy (Publications)
– Big Data to Knowledge (BD2K) Initiative
– Genomic Data Sharing (GDS) Policy
– Modernization of NIH Clinical Trials
– White House Initiative (2013 “Holdren Memo”)
![Page 22: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/22.jpg)
The BD2K Program
BD2K Budget
![Page 23: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/23.jpg)
BD2K FY14 Awards supported by all NIH Institutes
![Page 24: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/24.jpg)
MD2K Applications – CHF and Smoking
![Page 26: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/26.jpg)
The Commons is a shared virtual space which is FAIR:
– Find
– Access (use effectively)
– Interoperate
– Reuse
An environment to find and catalyze the use of shared digital research objects
The Commons Concept
![Page 27: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/27.jpg)
The Developer or User Defines the Environment from the Appropriate Building Blocks
![Page 28: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/28.jpg)
The Commons Components
![Page 29: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/29.jpg)
Diagram labels: BD2K Center (×6), DDICC, Software, Standards, Infrastructure - The Commons, Labs (×4)
![Page 30: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/30.jpg)
Public Beacons
| Host | Content |
| --- | --- |
| AMPLab | 1000 Genomes Project |
| Broad Institute | ExAC |
| Curoverse | PGP, GA4GH Example Data |
| EBI | 1000 Genomes Project, UK10K, GoNL, EVS, GEUVADIS, UMCG Cardio GenePanel |
| Google | 1000 Genomes Project, Phase III, Illumina Platinum Genomes |
| ISB | Known VARiants |
| NCBI | NHLBI Exome Sequence Project |
| OICR | 55 cancer datasets |
| SolveBio | 56 public datasets |
| UCSC | ClinVar, LOVD, UniProt |
| University of Leicester | Cafe CardioKit, Cafe Variome Central |
| WTSI | IBD, Native American, Egyptian, UK10K |
Over 120 public datasets beaconized across 21 institutions
Tens of thousands of individuals
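For readers new to the term, a beacon can be thought of as a service that answers only yes or no to the question "does this dataset contain a given allele at a given position?". The following is a minimal Python sketch of that idea, assuming a simple in-memory store. The names `Variant`, `ToyBeacon`, and `query_all`, the host names, and the data are all hypothetical illustrations and are not part of any real Beacon API or of the datasets listed above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Variant:
    """A single allele observation; the field names are illustrative."""
    chromosome: str
    position: int
    allele: str


@dataclass
class ToyBeacon:
    """Hypothetical in-memory beacon; real beacons are web services."""
    host: str
    variants: set = field(default_factory=set)

    def has_allele(self, chromosome: str, position: int, allele: str) -> bool:
        # Answer only yes or no; no individual-level data is returned.
        return Variant(chromosome, position, allele.upper()) in self.variants


def query_all(beacons, chromosome, position, allele):
    """Ask every beacon the same question and collect the yes/no answers."""
    return {b.host: b.has_allele(chromosome, position, allele) for b in beacons}


# Made-up example data, not taken from any real dataset.
beacon_a = ToyBeacon("host-a", {Variant("1", 12345, "T")})
beacon_b = ToyBeacon("host-b", {Variant("2", 67890, "G")})
answers = query_all([beacon_a, beacon_b], "1", 12345, "t")
print(answers)  # {'host-a': True, 'host-b': False}
```

Returning only a boolean per host is the point of the sketch: a dataset can advertise what it holds without exposing the underlying records.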
![Page 31: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/31.jpg)
![Page 32: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/32.jpg)
Commons - Pilots
The Cloud Credits - business model
BD2K Centers
MODs (Model Organism Databases)
HMP Data and tools available in the cloud
NCI Cloud Pilots & Genomic Data Commons
![Page 33: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/33.jpg)
I not only use all the brains I have, but all I can borrow.
– Woodrow Wilson
![Page 34: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/34.jpg)
What Can We Do Now?
Extend the research pilots concept
Have TCC & TeSS work together
Global hackathons, competitions
Closer ties between NLM and EBI / Elixir
Student exchanges
Engage foundations, charities in more global initiatives
http://wwwdev.ebi.ac.uk/Tools/ddi/
![Page 35: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/35.jpg)
ADDS Team
BD2K Representatives
![Page 36: Open Data in a Global Ecosystem](https://reader036.vdocument.in/reader036/viewer/2022081604/58827c181a28ab24788b5993/html5/thumbnails/36.jpg)
NIH … Turning Discovery Into Health
[email protected]
datascience.nih.gov/
http://www.ncbi.nlm.nih.gov/research/staff/bourne/