
Post on 13-Feb-2018


Integrating the Connectivity Map and The Cancer Genome Atlas
Ben Siranosian, Joseph Nasser, Rajiv Narayan, Aravind Subramanian, Todd Golub
The Broad Institute of MIT and Harvard, Cambridge, MA

The Connectivity Map (CMap)

mRNA expression database to connect genes to drugs to disease
Perturbation with gene knock-down, knock-out, overexpression or drug treatment in 9 core cell lines

[Figure: An input query signature (example: BRAF) is connected to signatures of gene OE, gene KD/KO, drug and disease. Connecting signatures are scored from +1 (similar) through 0 to -1 (opposing).]

The Cancer Genome Atlas (TCGA)

Database of key genomic changes across major types of cancer
  - mRNASeq, mutations, copy number alterations
Publicly accessible to researchers around the world
Patient-derived data
High sample numbers
  - 1093 sequenced breast cancer tumors

How can we integrate the two databases?

- Method development
  - Data processing, statistics, visualization of results
- Are cell-derived CMap signatures applicable to patient-derived data?
  - Different system, experimental methods
- Can a comparison tell us something new about cancer biology?
  - Sets of patients with alterations
  - Individuals with exceptional characteristics

[Figure: Methods workflow. Panels: TCGA mRNASeq Expression (TCGA samples x 7k best inferred genes); Z-score genes (within cohort, transcriptional subtype); Top 100 Z-scored genes and Bottom 100 Z-scored genes; Connectivity Scores (TCGA samples x CMap perts).]

Methods

1) Define a differential expression signature from each TCGA sample
2) CMap query gives a connectivity score to each perturbation experiment in each cell line
3) Define a sample set based on characteristic of interest [ outlier gene expression | mutation | copy number alteration ]
4) Test for enrichment at top/bottom of ranked connectivity scores
  - Sets with strong enrichment: alteration is similar to CMap signature

Z-score: $\frac{x - \tilde{\mu}}{\mathrm{MAD}(x)}$
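The Z-scoring and signature definition from step 1 can be sketched in a few lines. This is a minimal illustration, assuming that the center μ in the formula above is the per-gene median across the cohort and that the MAD is used unscaled. The function names and the toy expression values are hypothetical and are not taken from the poster's own code.

```python
from statistics import median

def robust_zscores(values):
    """Robust z-score of one gene across a cohort: (x - median) / MAD."""
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; robust z-score is undefined")
    return [(v - center) / mad for v in values]

def sample_signature(z_by_gene, n=100):
    """Split one sample's z-scores into its n highest and n lowest genes."""
    ranked = sorted(z_by_gene, key=z_by_gene.get, reverse=True)
    return ranked[:n], ranked[-n:]

# Toy cohort (hypothetical values): gene -> expression in five samples.
cohort = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 10.0],
    "GENE_B": [5.0, 5.5, 6.0, 6.5, 1.0],
    "GENE_C": [2.0, 2.5, 3.0, 3.5, 4.0],
}
z = {gene: robust_zscores(expr) for gene, expr in cohort.items()}

# Signature of the last sample, using n=1 instead of 100 for the toy data.
last_sample = {gene: scores[-1] for gene, scores in z.items()}
up, down = sample_signature(last_sample, n=1)
print("up:", up, "down:", down)
```

With real data, `n=100` would give the top 100 and bottom 100 Z-scored genes that form the CMap query for each sample.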

[Figure: For each TCGA sample, query CMap to obtain Connectivity Scores (TCGA samples x CMap perts, e.g. TP53, MYC), then rank them. Highlighted set: BRCA samples with mutant TP53.]

Significant enrichment at top of ranked list.
Conclusion: BRCA samples with mutant TP53 have gene expression signatures similar to TP53 knock-down in MCF7

Enrichment testing: Weighted Connectivity Score measurement
Significance: permutation-based p-value
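The poster names the enrichment test but does not give its formula. As an illustrative stand-in, the sketch below uses a GSEA-style weighted running-sum statistic over samples ranked by connectivity score, with a label-permutation p-value. The statistic, the function names, and the toy scores are all assumptions and may differ from the Weighted Connectivity Score actually used.

```python
import random

def enrichment_score(scores, in_set):
    """Weighted running-sum enrichment of a sample set in a ranked list.

    scores: connectivity score per sample; in_set: True if the sample
    belongs to the set of interest (same order as scores).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hit_total = sum(abs(scores[i]) for i in order if in_set[i])
    n_miss = len(scores) - sum(in_set)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("need set members with non-zero scores and non-members")
    running, best = 0.0, 0.0
    for i in order:
        if in_set[i]:
            running += abs(scores[i]) / hit_total  # step up, weighted by score
        else:
            running -= 1.0 / n_miss                # step down for non-members
        if abs(running) > abs(best):
            best = running
    return best

def permutation_p_value(scores, in_set, n_perm=1000, seed=0):
    """Two-sided p-value from shuffling set membership across samples."""
    rng = random.Random(seed)
    observed = enrichment_score(scores, in_set)
    labels = list(in_set)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if abs(enrichment_score(scores, labels)) >= abs(observed):
            extreme += 1
    return observed, (extreme + 1) / (n_perm + 1)

# Toy example (hypothetical): 20 samples, the set occupies the top 4 ranks.
scores = [float(s) for s in range(100, 0, -5)]
in_set = [i < 4 for i in range(20)]
es, p = permutation_p_value(scores, in_set)
print(f"enrichment score = {es:.3f}, permutation p = {p:.4f}")
```

A positive score means the set concentrates at the top of the ranked list (similar to the CMap signature); a negative score means it concentrates at the bottom (opposing).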

[Figure: TP53 knock-down in MCF7. Axes: Z-score, Samples.]

Validation of cell-derived CMap signatures
Expect sample sets defined by outlier gene expression to connect strongly to signatures of knock-down or overexpression of the same gene

CMap gives a clue into cancer biology

A signature of HDAC activation is predictive of survival in LUAD
Connectivity scores to aggregated CMap signatures tested for correlation with survival (Cox regression)
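To make the survival test concrete, here is a minimal sketch of a Cox proportional hazards fit with a single covariate (the connectivity score), solved by Newton-Raphson on the partial likelihood. It is not the authors' pipeline, which the acknowledgments suggest was built in R. The function names and the eight-patient toy dataset are hypothetical, and tied event times are handled in the simple Breslow manner.

```python
import math

def cox_terms(beta, times, events, x):
    """Log partial likelihood, score and information for one covariate."""
    loglik = score = info = 0.0
    for i, (t_i, died) in enumerate(zip(times, events)):
        if not died:
            continue  # censored samples contribute only through risk sets
        at_risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
        w = [math.exp(beta * x[j]) for j in at_risk]
        s0 = sum(w)
        s1 = sum(wj * x[j] for wj, j in zip(w, at_risk))
        s2 = sum(wj * x[j] ** 2 for wj, j in zip(w, at_risk))
        loglik += beta * x[i] - math.log(s0)
        score += x[i] - s1 / s0
        info += s2 / s0 - (s1 / s0) ** 2
    return loglik, score, info

def fit_cox(times, events, x, max_iter=50, tol=1e-10):
    """Return (coefficient, standard error) for a one-covariate Cox model."""
    beta = 0.0
    loglik, score, info = cox_terms(beta, times, events, x)
    for _ in range(max_iter):
        if info <= 0:
            raise ValueError("covariate carries no information")
        step = score / info
        new = cox_terms(beta + step, times, events, x)
        halvings = 0
        # Halve the Newton step while it would lower the likelihood.
        while new[0] < loglik - 1e-12 and halvings < 30:
            step /= 2
            halvings += 1
            new = cox_terms(beta + step, times, events, x)
        beta += step
        loglik, score, info = new
        if abs(step) < tol:
            break
    return beta, 1.0 / math.sqrt(info)

# Toy data (hypothetical): survival in months, death indicator, and each
# patient's connectivity score to one aggregated CMap signature.
months = [2, 4, 6, 8, 10, 12, 14, 16]
died = [1, 1, 1, 1, 1, 0, 1, 0]
connectivity = [-60, -80, 10, -30, 40, 20, -10, 70]

coef, se = fit_cox(months, died, connectivity)
wald_p = math.erfc(abs(coef / se) / math.sqrt(2))
print(f"coef = {coef:.4f}, se = {se:.4f}, Wald p = {wald_p:.3f}")
```

A negative coefficient means that a higher connectivity score goes with a lower hazard, that is, longer survival.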

Web apps accelerate discovery

Future directions

[Figure: ADAM10 Expression in BRCA. Histogram; x-axis: Z-score, y-axis: Number of samples. Annotation: 55 samples with aberrantly low ADAM10 expression.]

[Figure: Connectivity to ADAM10 knock-down in MCF7. x-axis: 1093 BRCA patients, y-axis: CMap connectivity score. Legend: Normal samples, Outlier samples.]

The 55 outlier samples are significantly enriched at the top of the list!

Connection recovery
Outlier sample sets with a matched knock-down or overexpression signature in at least one cell line
number significant at p=0.01 / total number

Cohort   Significant / total   Percent
BRCA     699/1212              58%
PRAD     613/1131              54%
LUAD     326/847               38%
COAD     78/562                14%
SKCM     131/1058              12%
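The recovery percentages follow directly from the counts. The short check below reproduces them; the counts are transcribed from the panel above, and the variable names are illustrative only.

```python
# Significant sample sets (p = 0.01) and total sample sets per cohort,
# as reported in the "Connection recovery" panel.
recovery = {
    "BRCA": (699, 1212),
    "PRAD": (613, 1131),
    "LUAD": (326, 847),
    "COAD": (78, 562),
    "SKCM": (131, 1058),
}

# Recovery rate per cohort, rounded to a whole percent.
percent = {
    cohort: round(100 * significant / total)
    for cohort, (significant, total) in recovery.items()
}

for cohort, (significant, total) in recovery.items():
    print(f"{cohort}: {significant}/{total} = {percent[cohort]}%")
```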

Low ADAM10 expression looks similar to ADAM10 knock-down

- Positive enrichment = similarity

- Validation of CMap signature

- Applicable to patient-derived data

Enrichment score: 0.849
P-value: 3.6 x 10^-4
Percentile rank: 99.6%
Mean connectivity score: 93.42

Three use cases

1) Explore pre-computed results

Users can explore the results of our investigation with a set of default sample sets.

2) Query with custom sample sets

Rapid hypothesis testing with a new sample set of interest. Useful for biologists to explore new ideas.

3) Use external expression data

Investigate novel data with our methods. Bring in your own expression data and sample sets.

- Pilot investigation handled 5 cohorts
  - Expand to the rest of TCGA
- Other large-scale expression databases
  - GEO data, TOX21
- Release and promote web apps
  - Open source soon!
- Dive further into biological findings
  - KEAP1/STK11 findings
- Compile list of best CMap perturbagens

Are there subsets of patients that will have a differential response to HDAC inhibitors?

Is there a link between KEAP1 and STK11 signaling? Collaborate and experiment to find out.

- Positive connection to HDAC inhibitors correlated with survival

- Negative connections to HDAC6 knock-down correlated with poor prognosis

Acknowledgments

CMap lab and data team

Golub lab members

Broad ACB resources

Kayleigh and Cancer program admins

cbioportal.org

R / shiny / plotly

NIH LINCS program

TCGA

[Figure: Survival curves for the "top" and "other" groups; x-axis: months, y-axis: proportion alive. Top 50 positive connections to belinostat aggregated signature, p=0.002177.]

[Figure: Survival curves for the "top" and "other" groups; x-axis: months, y-axis: proportion alive. Top 50 negative connections to HDAC6 knock-down aggregated signature, p=0.000191.]

compound         coef      p          fdr     annotation
belinostat       -0.0059   9.41E-06   0.012   HDAC inhibitor, cell cycle inhibitor
trichostatin-a   -0.0052   1.42E-05   0.013   HDAC inhibitor, CDK expression enhancer
HC-toxin         -0.0055   1.62E-05   0.013   HDAC inhibitor
panobinostat     -0.0050   2.88E-05   0.016   HDAC inhibitor, apoptosis stimulant
givinostat       -0.0063   6.24E-05   0.029   HDAC inhibitor, interleukin receptor antagonist
trichostatin-a   -0.0049   8.78E-05   0.030   HDAC inhibitor, CDK expression enhancer
vorinostat       -0.0046   1.24E-04   0.033   HDAC inhibitor, cell cycle inhibitor
THM-I-94         -0.0048   1.32E-04   0.035   HDAC inhibitor, apoptosis stimulant
acepromazine     -0.0108   1.81E-04   0.042   dopamine receptor antagonist
SB-202190        -0.0059   2.22E-04   0.049   p38 MAPK inhibitor, interleukin inhibitor
scriptaid        -0.0049   2.36E-04   0.051   HDAC inhibitor
ISOX             -0.0040   2.98E-04   0.057   HDAC inhibitor

10/12 best survival-correlated compound perturbagens are HDAC inhibitors

KEAP1: Altered in 20% of lung adenocarcinoma tumors

[Figure: KEAP1 alterations in lung adenocarcinoma tumors. Legend: Deep Deletion, Missense Mutation, Truncating Mutation, mRNA Upregulation, mRNA Downregulation.]

[Figure: Keap1/Nrf2 pathway diagram. Labels: Keap1, Nrf2, Ubiq, Degradation, Oxidative stress, Stabilized Nrf2, Antioxidant response element, Pol II.]

KEAP1 sequesters NRF2 in the cytoplasm. Mutations or CNA should disrupt this interaction. What does CMap say?

Comparing LUAD samples with KEAP1 mutations to CMap

Perturbagen           Enrichment   P-value
KEAP1 Knock-down       0.669       0.0014
NRF2 Knock-down       -0.616       0.0131
NRF2 Overexpression    0.736       0.0605
STK11 Knock-down       0.734       0.0004

Validation of signature: KEAP1 mutants are similar to KEAP1 knock-down
Confirmation of biological interaction: KEAP1 mutants are opposite to NRF2 knock-down, similar to NRF2 overexpression

The #1 connection in all of CMap aggregated signatures: STK11 knock-down. STK11 regulates AMPK signaling - effects on metabolism, energy homeostasis
