stephen friend inspire2live discovery network 2011-10-29
DESCRIPTION
Stephen Friend, Oct 29, 2011. Inspire2Live Discovery Network, Cambridge, UK
TRANSCRIPT
Actionable Cancer Network Models And Open Medical Information Systems
Stephen Friend MD PhD
Sage Bionetworks (Non-Profit Organization) Seattle/ Beijing/ Amsterdam
Discovery Networks October 29th, 2011
Why not use data intensive science to build models of disease
Current Reward Structures
Organizational Structures and Tools
Pilots
Opportunities
What is the problem?
We need to rebuild the drug discovery process so that we better understand disease biology before testing proprietary compounds on sick patients
Personalized Medicine 101: Capturing single base pair mutations = ID of responders
Reality: Overlapping Pathways
The value of appropriate representations/ maps
Equipment capable of generating massive amounts of data
“Data Intensive” Science- Fourth Scientific Paradigm
Open Information System
IT Interoperability
Host evolving Models in a Compute Space- Knowledge Expert
WHY NOT USE “DATA INTENSIVE” SCIENCE
TO BUILD BETTER DISEASE MAPS?
what will it take to understand disease?
DNA RNA PROTEIN (dark matter)
MOVING BEYOND ALTERED COMPONENT LISTS
2002 Can one build a “causal” model?
db/db mouse (p~10E(-30))
AVANDIA in db/db mouse
= up regulated = down regulated
Our ability to integrate compound data into our network analyses
db/db mouse (p~10E(-20) p~10E(-100))
50 network papers http://sagebase.org/research/resources.php
List of Influential Papers in Network Modeling
(Eric Schadt)
“Data Intensive” Science- Fourth Scientific Paradigm
Score Card for Medical Sciences
Equipment capable of generating massive amounts of data: A-
Open Information System: D-
IT Interoperability: D
Host evolving Models in a Compute Space- Knowledge Expert: F
We still conduct much clinical research as if we were “hunter-gatherers”- not sharing
TENURE FEUDAL STATES
Clinical/genomic data are accessible but minimally usable
Little incentive to annotate and curate data for other scientists to use
Mathematical models of disease are not built to be reproduced or versioned by others
Lack of standard forms for sharing data and lack of forms for future rights and consents
Publication Bias- Where can we find the (negative) clinical data?
sharing as an adoption of common standards.. Clinical Genomics Privacy IP
Sage Mission
Sage Bionetworks is a non-profit organization with a vision to create a “commons” where integrative bionetworks are evolved by contributor scientists with a shared vision to accelerate the elimination of human disease
Sagebase.org
Data Repository
Discovery Platform
Building Disease Maps
Commons Pilots
Sage Bionetworks Collaborators
Pharma Partners Merck, Pfizer, Takeda, AstraZeneca, Amgen, Johnson & Johnson
Foundations Kauffman CHDI, Gates Foundation
Government NIH, LSDF
Academic Levy (Framingham) Rosengren (Lund) Krauss (CHORI)
Federation Ideker, Califano, Butte, Schadt
PLATFORM: Sage Platform and Infrastructure Builders (Academic, Biotech and Industry IT Partners...)
PILOTS = PROJECTS FOR COMMONS: Data Sharing Commons Pilots (Federation, CCSB, Inspire2Live...)
NEW TOOLS: Data Tool and Disease Map Generators (Global coherent data sets, Cytoscape, Clinical Trialists, Industrial Trialists, CROs…)
NEW MAPS: Disease Map and Tool Users (Scientists, Industry, Foundations, Regulators...)
RULES AND GOVERNANCE: Data Sharing Barrier Breakers (Patient Advocates, Governance and Policy Makers, Funders...)
Developing predictive models of genotype-specific sensitivity to compound treatment
Predictive Features (biomarkers)
Genetic Feature Matrix: expression, copy number, somatic mutations, etc.
Sensitive / Refractory (e.g. EC50)
Cancer samples with varying degrees of response to therapy
Elastic net regression
500 features, 100 features, 20 features, 1 feature
Bootstrapping retains robust predictive features
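The two slides above name elastic net regression and bootstrapping without showing the mechanics. The following is a minimal sketch, not the speaker's pipeline: it fits an elastic net by coordinate descent in plain Python and counts how often each feature keeps a non-zero coefficient across bootstrap resamples. The function names, the penalty settings (`alpha`, `l1_ratio`), and the toy data (six binary "mutation" features, with only feature 0 driving the response) are all assumptions made for illustration.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 part of the penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, alpha=0.1, l1_ratio=0.5, sweeps=50):
    """Coordinate descent for
    (1/2n)*||y - Xb||^2 + alpha*(l1_ratio*|b|_1 + 0.5*(1 - l1_ratio)*|b|_2^2).
    Columns and response are centered, so no explicit intercept is returned."""
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    residual = [yi - y_mean for yi in y]
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    coefs = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(Xc[i][j] * residual[i] for i in range(n)) / n
            rho += col_sq[j] * coefs[j]
            denom = col_sq[j] + alpha * (1.0 - l1_ratio)
            new = soft_threshold(rho, alpha * l1_ratio) / denom if denom > 0 else 0.0
            delta = new - coefs[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * Xc[i][j]
                coefs[j] = new
    return coefs


def bootstrap_selection_frequency(X, y, n_boot=30, seed=0, **fit_kwargs):
    """Fraction of bootstrap resamples in which each feature is non-zero."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        coefs = fit_elastic_net([X[i] for i in idx], [y[i] for i in idx], **fit_kwargs)
        for j, c in enumerate(coefs):
            if c != 0.0:
                counts[j] += 1
    return [c / n_boot for c in counts]


def make_toy_data(n=40, p=6, seed=1):
    """Hypothetical data: binary features, response driven by feature 0 only."""
    rng = random.Random(seed)
    X = [[float(rng.random() < 0.5) for _ in range(p)] for _ in range(n)]
    y = [2.0 * row[0] + rng.gauss(0.0, 0.1) for row in X]
    return X, y


X, y = make_toy_data()
freq = bootstrap_selection_frequency(X, y)
ranked = sorted(range(len(freq)), key=lambda j: -freq[j])
print("selection frequency per feature:", freq)
print("features ranked by robustness:", ranked)
```

Features that survive in most resamples are the "robust" ones in the sense of the slide; a larger `alpha` prunes the model toward fewer features, which is one way to read the 500, 100, 20, 1 feature progression.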
Our approach identifies mutations in genes upstream of MEK as top predictors of sensitivity to MEK inhibition
[Figure: PD-0325901 feature rankings. Labels: #1 Mut BRAF, #3 Mut NRAS, #9 Mut BRAF, #312 Mut NRAS]
Other top predictive features include expression levels of genes regulated by MEK
Pratilas et al. (2009), PNAS
PD-0325901: #1 BRAF mut, #2 SPRY2 expr, #3 NRAS mut, #5 ETV4 expr, #8 DUSP6 expr, #19 ETV5 expr
Model built excluding expression data identifies BRAF, NRAS, and KRAS as top predictive features for both MEK inhibitors
PD-0325901: #1 BRAF mut, #2 NRAS mut, #3 KRAS mut
AZD6244: #1 BRAF mut, #2 NRAS mut, #3 KRAS mut
[Figure: top-ranked predictive features per compound; compound names are not recoverable from the transcript]
#1 BRAF mut
#1 BRAF mut, #2 NRAS mut
#1 BRAF mut, #2 NRAS mut, #3 KRAS mut
#1 BRAF mut, #2 NRAS mut, #3 KRAS mut
#1 EGFR mut
#1 ERBB2 expr
#1 EGFR mut
#1 EGFR mut, #2 CML lineage
#1 CML lineage
#1 HGF expr
#1 MDM2 expr, #2 TP53 mut, #3 CDKN2A copy
How accurately would predictive models perform for diagnostics?
For 11/12 compounds, the #1 predictive feature in an unbiased analysis corresponds to the known stratifier of sensitivity
Why not share clinical/genomic data and model building in the ways currently used by the software industry (power of tracking workflows and versioning)?
Leveraging Existing Technologies
Taverna
Addama
tranSMART
INTEROPERABILITY
Genome Pattern CYTOSCAPE tranSMART I2B2
SYNAPSE
Watch What I Do, Not What I Say Reduce, Reuse, Recycle
Most of the People You Need to Work with Don’t Work with You
My Other Computer is Amazon
sage bionetworks synapse project
CTCAP, Arch2POCM, The Federation, Portable Legal Consent, Sage Congress Project
Select Six Pilots at Sage Bionetworks
[Diagram labels: Rules and Governance, Platform, New Maps]
Clinical Trial Comparator Arm Partnership “CTCAP” Strategic Opportunities For Regulatory Science
Leadership and Action
FDA September 27, 2011
CTCAP
Clinical Trial Comparator Arm Partnership (CTCAP)
Description: Collate, Annotate, Curate and Host Clinical Trial Data with Genomic Information from the Comparator Arms of Industry and Foundation Sponsored Clinical Trials: Building a Site for Sharing Data and Models to evolve better Disease Maps.
Public-Private Partnership of leading pharmaceutical companies, clinical trial groups and researchers.
Neutral Conveners: Sage Bionetworks and Genetic Alliance [nonprofits].
Initiative to share existing trial data (molecular and clinical) from non-proprietary comparator and placebo arms to create powerful new tool for drug development.
Started Sept 2010
Sharing clinical/genomic data and analysis will maximize clinical impact and enable discovery
• Graphic: curated data to QC'd data to models
Arch2POCM
Restructuring the Precompetitive Space for Drug Discovery
How to potentially De-Risk High-Risk Therapeutic Areas
What is the problem?
We need to rebuild the drug discovery process so that we better understand disease biology before testing proprietary compounds on sick patients
A PPP to generate novel chemical probes
Timeline: Jan 09, April 09, June 09, June 10
Contributors: Wellcome Trust (£4.1M), NCGC (20 HTSs), GSK (8 FTEs), Ontario ($5.0M), OICR (2 FTEs), UNC (3 FTEs), Pfizer (8 FTEs), Novartis (8 FTEs), Sweden ($3.0M), 15 academic labs, and now Lilly (8 FTEs)
More than £30M of resource
Academic, scientific, drug discovery & economic impact
Published Dec 23 2010 - already cited 30 times
Distributed to >100 labs/companies - profile in several therapeutic areas
Pharmas - started proprietary efforts
Harvard spin off - $15 M seed funding
Opened new area:
Zuber et al: BRD4/JQ1 in acute leukaemia. Nature, 2011 Aug 3
Delmore et al: BRD4/JQ1 in multiple myeloma. Cell, 2011 Volume 146, 904-917, 16
Dawson et al: BRD4/JQ1 in MLL. Nature 2011, in press
Floyd et al: BRD4 in DNA damage response. Cell, revised
Filippakopoulos et al: Bromodomains structure and function. Cell, revised
Natoli et al: BRD4 in T-cell differentiation. Manuscript in preparation
Bradner et al: BRDT in spermatogenesis. Submitted
collaborations with SGC
The Federation
2008 2009 2010 2011
How can we accelerate the pace of scientific discovery?
Ways to move beyond “traditional” collaborations?
Intra-lab vs Inter-lab Communication
Colrain/ Industrial PPPs Academic Unions
(Nolan and Haussler)
human aging: predicting bioage using whole blood methylation
[Figure: scatter plots of Biological Age against Chronological Age]
Training Cohort: San Diego (n=170), RMSE=3.35
Validation Cohort: Utah (n=123), RMSE=5.44
• Independent training (n=170) and validation (n=123) Caucasian cohorts
• 450k Illumina methylation array
• Exome sequencing
• Clinical phenotypes: Type II diabetes, BMI, gender…
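The slide reports one RMSE for the training cohort and one for the validation cohort. As a reminder of what that number measures, here is a small sketch of root-mean-square error between predicted (biological) and chronological ages. The function name and the three example ages are made up for illustration; they are not cohort data.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences of ages."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical ages in years, chosen only to show the arithmetic.
biological_age = [52.0, 61.0, 70.0]
chronological_age = [50.0, 60.0, 73.0]
error = rmse(biological_age, chronological_age)
print(f"RMSE = {error:.2f} years")
```

A validation RMSE that is higher than the training RMSE, as on the slide, is the expected pattern when a model is scored on a cohort it was not fitted to.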
sage federation: model of biological age
Faster Aging
Slower Aging
Clinical Association: Gender, BMI, Disease
Genotype Association
Gene Pathway Expression
Predicted Age (liver expression)
Chronological Age (years)
Age Differential
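A short sketch of how the labels on this slide could fit together, assuming that "Age Differential" means predicted age minus chronological age, with positive values read as faster aging and negative values as slower aging. That reading, the function names, and the example values are assumptions, not taken from the talk.

```python
def age_differentials(predicted_ages, chronological_ages):
    """Predicted minus chronological age for each individual (assumed definition)."""
    if len(predicted_ages) != len(chronological_ages):
        raise ValueError("inputs must have equal length")
    return [p - c for p, c in zip(predicted_ages, chronological_ages)]


def label_aging(differential, tolerance=0.0):
    """Map a differential to the slide's 'faster' / 'slower' aging labels."""
    if differential > tolerance:
        return "faster aging"
    if differential < -tolerance:
        return "slower aging"
    return "on pace"


# Hypothetical values in years.
diffs = age_differentials([55.0, 48.0, 60.0], [50.0, 52.0, 60.0])
labels = [label_aging(d) for d in diffs]
print(list(zip(diffs, labels)))
```

The differential could then be tested for association with the clinical, genotype, and expression variables listed on the slide.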
Reproducible science==shareable science
Sweave: combines programmatic analysis with narrative
Friedrich Leisch. Sweave: Dynamic generation of statistical reports using literate data analysis. In Wolfgang Härdle and Bernd Rönz, editors, Compstat 2002 – Proceedings in Computational Statistics, pages 575-580. Physica Verlag, Heidelberg, 2002. ISBN 3-7908-1517-9
Federated Aging Project: Combining analysis + narrative
=Sweave Vignette Sage Lab
Califano Lab Ideker Lab
Shared Data Repository
JIRA: Source code repository & wiki
R code + narrative
PDF(plots + text + code snippets)
Data objects
HTML
Submitted Paper
Portable Legal Consent
(Activating Patients)
John Wilbanks
Sage Congress Project April 20 2012
RA Parkinson’s Asthma
(Responders Competitions)
Why not use data intensive science to build models of disease
Current Reward Structures
Organizational Structures and Tools
Six Pilots
Opportunities