
Page 1: PharmacoGx: Data Sharing and Research …...Bioinformatics and Computational Genomics Laboratory June 25, 2016 PharmacoGx: Data Sharing and Research Reproducibility in Pharmacogenomics

Bioinformatics and Computational Genomics Laboratory

June 25, 2016

PharmacoGx: Data Sharing and Research Reproducibility in

Pharmacogenomics

Benjamin Haibe-Kains

Princess Margaret Cancer Centre
University Health Network
University of Toronto
Ontario Institute for Cancer Research

2 open postdoc positions: radiomics and single-cell RNA-seq

Page 2

Reproducibility crisis

▷ Reproducibility in biomedical sciences has attracted a lot of attention in the last 10 years

Page 3

Why data and code sharing?

▷ Data are precious due to limited
○ Number of samples
○ Resources
○ Budget

▷ Benefits of sharing data and code
○ Replicability
○ Reproducibility
○ Reusability
○ Post-publication peer review

“Anyone who believes in indefinite growth in anything physical, on a physically finite planet, is either mad or an economist.” ― Kenneth E. Boulding

Page 4

High-throughput in vitro drug screening

Page 5

Long history of data sharing in pharmacogenomics

▷ NCI60/DTP (since 1997): 59 cell lines, 60,000 drugs
▷ JFCR39 (since 1999): 39 cell lines, 557 drugs
▷ GSK (April 2010): 311 cell lines, 19 drugs
▷ PGP (Jan 2012): 87 cell lines, 2 drugs
▷ GRAY (Feb 2012): 54 cell lines, 74 drugs
▷ GDSC (Mar 2012): 727 cell lines, 140 drugs
▷ GRAY’ (Oct 2013): 70 cell lines, 90 drugs
▷ GNE (Dec 2014): 675 cell lines, 5 drugs
▷ CTRP (Aug 2013): 242 cell lines, 354 drugs
▷ CTRPv2 (Sep 2015): 860 cell lines, 481 drugs
▷ gCSI (May 2016): 59 cell lines, 16 drugs
▷ GDSC1000 (June 2016): 1124 cell lines, 256 drugs
▷ CCLE (Mar 2012): 1061 cell lines, 24 drugs

More to come...

Page 6

Predictors trained on one dataset rarely validate on an independent dataset

Page 7

Comparative studies: 2013, 2015, 2016

Page 8

Challenges in pharmacogenomic analyses

▷ Cell line and drug identifiers are not standardized
○ Difficult to assess overlap between studies
▷ Heterogeneous experimental protocols
▷ No consensus on the processing of pharmacological data
▷ Diverse molecular profiling technologies

▷ We sought to develop a computational platform for large-scale pharmacogenomic analyses

→ PharmacoGx R package: github.com/bhklab/PharmacoGx (available on CRAN, under review for BioC)

Page 9

PharmacoGx in a nutshell

Page 10

PharmacoSet S4 class

→ MultiAssayExperiment

Page 11

PharmacoGx enables meta-analysis

▷ Cellosaurus to uniquely identify and annotate cell lines and tissues: web.expasy.org/cellosaurus/
▷ Drugs annotated with PubChem ID, InChIKey and SMILES
○ Exact and fuzzy matching based on structure similarity
▷ Ensembl annotations for omics profiles
▷ Functions to download, intersect, subset, and summarize pharmacogenomic studies
○ downloadPSet()
○ intersectPSet()
○ subsetTo()
○ summarize*()

Datasets available today: CMAP, GDSC, CCLE and gCSI

In the oven: L1000, NCI60, GSK, GNE, CTRPv2, GRAY
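To make the idea of intersecting studies concrete, here is a minimal Python sketch. It is not the PharmacoGx API (which is an R package): the function names, the string-based normalizer, and the toy data are all assumptions made for illustration. Real curation would map names to Cellosaurus and PubChem identifiers rather than rely on string cleanup, which can merge distinct entities.

```python
import re


def normalize_id(name):
    """Hypothetical normalizer: uppercase and drop non-alphanumeric characters."""
    return re.sub(r"[^A-Z0-9]", "", name.upper())


def intersect_studies(studies):
    """Return the normalized cell lines and drugs shared by all studies."""
    common = {}
    for field in ("cell_lines", "drugs"):
        normalized = [
            {normalize_id(x) for x in study[field]} for study in studies.values()
        ]
        common[field] = set.intersection(*normalized)
    return common


# Toy data, invented for illustration only.
studies = {
    "study_a": {
        "cell_lines": {"MCF-7", "HT-29", "A549"},
        "drugs": {"Erlotinib", "Crizotinib"},
    },
    "study_b": {
        "cell_lines": {"MCF7", "HT29", "PC-3"},
        "drugs": {"erlotinib", "Paclitaxel"},
    },
}

common = intersect_studies(studies)
print(sorted(common["cell_lines"]))  # ['HT29', 'MCF7']
print(sorted(common["drugs"]))       # ['ERLOTINIB']
```

Without the normalization step, the two toy studies would appear to share no cell lines at all, which is the overlap problem that standardized identifiers are meant to solve.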

Page 12

Filtering of noisy dose-response curves
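The slide does not spell out the filtering rule, so the following Python sketch shows one plausible heuristic, not the method used in PharmacoGx. The function name and the thresholds `epsilon` and `max_fraction_rising` are hypothetical. The assumption is that viability should decrease, roughly monotonically, as dose increases, so curves with large upward jumps or mostly rising steps are flagged.

```python
def is_noisy(viability, epsilon=25.0, max_fraction_rising=0.5):
    """Flag a curve given viability (%) ordered by increasing dose.

    Hypothetical rule: the curve is noisy if any single step rises by more
    than `epsilon` percentage points, or if more than `max_fraction_rising`
    of the steps go up instead of down.
    """
    diffs = [b - a for a, b in zip(viability, viability[1:])]
    if not diffs:
        return False
    big_jump = any(d > epsilon for d in diffs)
    fraction_rising = sum(1 for d in diffs if d > 0) / len(diffs)
    return big_jump or fraction_rising > max_fraction_rising


# Toy curves, invented for illustration only.
clean_curve = [100, 95, 80, 55, 30, 12]
noisy_curve = [100, 60, 110, 40, 95, 20]

print(is_noisy(clean_curve))  # False
print(is_noisy(noisy_curve))  # True
```

The thresholds would need tuning per dataset, since assay noise and dose ranges differ between studies.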

Page 13

Fitting of drug dose-response curves

Highly consistent

Highly inconsistent
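As a rough illustration of curve fitting, the Python sketch below fits a three-parameter Hill (log-logistic) model by coarse grid search. The choice of model, the parameter names (`ec50`, `hs`, `e_inf`), the grids, and the synthetic data are assumptions; the slide does not state which model or optimizer is used, and a real analysis would use a proper nonlinear least-squares routine.

```python
from itertools import product


def hill(dose, ec50, hs, e_inf):
    """Fraction of viable cells at `dose` under a three-parameter Hill model."""
    return e_inf + (1.0 - e_inf) / (1.0 + (dose / ec50) ** hs)


def fit_hill(doses, viabilities, ec50_grid, hs_grid, e_inf_grid):
    """Coarse grid search that minimizes the sum of squared errors."""
    def sse(params):
        return sum((hill(d, *params) - v) ** 2 for d, v in zip(doses, viabilities))

    return min(product(ec50_grid, hs_grid, e_inf_grid), key=sse)


# Synthetic, noise-free data generated from known parameters (illustration only).
doses = [0.01, 0.1, 1.0, 10.0, 100.0]
observed = [hill(d, ec50=1.0, hs=1.0, e_inf=0.1) for d in doses]

best = fit_hill(
    doses,
    observed,
    ec50_grid=[0.01, 0.1, 1.0, 10.0],
    hs_grid=[0.5, 1.0, 2.0, 4.0],
    e_inf_grid=[i / 10 for i in range(10)],
)
print(best)  # (1.0, 1.0, 0.1)
```

Summary statistics such as IC50 or the area under the curve can then be computed from the fitted parameters, which is one place where processing choices can diverge between studies.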

Page 14

Correlations of drug sensitivity data

▷ 2013: Inconsistency in large pharmacogenomic studies
▷ 2015: Revisiting inconsistency in large pharmacogenomic studies; Pharmacogenomic agreement between two cancer cell line data sets
▷ 2016: Reproducible pharmacogenomic profiling of cancer cell line panels
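A consistency check of this kind can be sketched in Python as a rank correlation of sensitivity values over the cell lines two studies share. Spearman correlation is used here as an assumption, since the slide does not name the statistic; the helper names and the toy AUC values are hypothetical.

```python
def rank(values):
    """Ranks starting at 1, with ties receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


# Toy AUC values for one drug in two studies, invented for illustration only.
auc_a = {"MCF7": 0.30, "HT29": 0.10, "A549": 0.55, "PC3": 0.20}
auc_b = {"MCF7": 0.35, "HT29": 0.05, "A549": 0.25, "PC3": 0.15, "HELA": 0.40}

shared = sorted(set(auc_a) & set(auc_b))
rho = spearman([auc_a[c] for c in shared], [auc_b[c] for c in shared])
print(shared, round(rho, 3))  # ['A549', 'HT29', 'MCF7', 'PC3'] 0.8
```

Only the shared cell lines enter the comparison, which is why identifier standardization has to happen before any correlation is computed.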

Page 17

Robust biomarker discovery

[Figure: drug sensitivity vs. EGFR expression, with effect sizes, for erlotinib; drug sensitivity vs. HGF expression, with effect sizes, for crizotinib]

Page 18

Conclusions

▷ Pharmacogenomics is a hot field, with new datasets and new players every day
○ You can even stay in the game after pissing off the major league :-)
▷ Great need for standardization
○ Experimental protocols
○ Data processing
○ Annotations
▷ PharmacoGx provides a unified platform for meta-analysis of pharmacogenomic studies

Our curation is far from perfect; we need your feedback to make it better!

Page 19

Future directions

▷ MultiAssayExperiment (MAE) to replace the list of ExpressionSet objects and better integrate diverse molecular profiles -- Workshop session 3

▷ PharmacoDb: Companion web application to facilitate exploration of the large compendium of published pharmacogenomic datasets

▷ Development of statistical/machine learning methods to jointly analyze heterogeneous pharmacogenomics datasets

▷ Extension to drug combinations (AstraZeneca-Sanger DREAM Challenge)

Page 20

PharmacoGx can be safely used by

▷ Data vultures
▷ Data vampires
▷ And research parasites

#IAmAResearchParasite

Page 21

Research parasites

[...] concern held by some is that a new class of research person will emerge — people who had nothing to do with the design and execution of the study but use another group’s data for their own ends, possibly stealing from the research productivity planned by the data gatherers, or even use the data to try to disprove what the original investigators had posited. There is concern among some front-line researchers that the system will be taken over by what some researchers have characterized as “research parasites.”

Scientists?

Doing Science?

Page 22

Acknowledgements

BHK lab, Princess Margaret Cancer Centre
▷ Zhaleh Safikhani
▷ Petr Smirnov
▷ Nehme El-Hachem
▷ Mark Freeman
▷ Ali Madani

Collaborators
▷ John Quackenbush
▷ Christos Hatzis
▷ Christopher Mason
▷ Leming Shi
▷ Anna Goldenberg
▷ Nicolai Juul-Birkbak
▷ Andrew Beck
▷ Hugo Aerts

Page 23

Thank you for your attention!

Questions?