

Translational Science

Transcription Factor Activities Enhance Markers of Drug Sensitivity in Cancer

Luz Garcia-Alonso1,2, Francesco Iorio1,2, Angela Matchan2,3, Nuno Fonseca1, Patricia Jaaks3, Gareth Peat1,2, Miguel Pignatelli1,2, Fiammetta Falcone4, Cyril H. Benes5, Ian Dunham1,2, Graham Bignell2,3, Simon S. McDade4, Mathew J. Garnett2,3, and Julio Saez-Rodriguez1,2,6

Abstract

Transcriptional dysregulation induced by aberrant transcription factors (TF) is a key feature of cancer, but its global influence on drug sensitivity has not been examined. Here, we infer the transcriptional activity of 127 TFs through analysis of RNA-seq gene expression data newly generated for 448 cancer cell lines, combined with publicly available datasets to survey a total of 1,056 cancer cell lines and 9,250 primary tumors. Predicted TF activities are supported by their agreement with independent shRNA essentiality profiles and homozygous gene deletions, and recapitulate mutant-specific mechanisms of transcriptional dysregulation in cancer. By analyzing cell line responses to 265 compounds, we uncovered numerous TFs whose activity interacts with anticancer drugs. Importantly, combining existing pharmacogenomic markers with TF activities often improves the stratification of cell lines in response to drug treatment. Our results, which can be queried freely at dorothea.opentargets.io, offer a broad foundation for discovering opportunities to refine personalized cancer therapies.

Significance: Systematic analysis of transcriptional dysregulation in cancer cell lines and patient tumor specimens offers a publicly searchable foundation to discover new opportunities to refine personalized cancer therapies. Cancer Res; 78(3); 769–80. ©2017 AACR.

1European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, United Kingdom. 2OpenTargets, Wellcome Genome Campus, Cambridge, United Kingdom. 3Wellcome Trust Sanger Institute, Wellcome Genome Campus, Cambridge, United Kingdom. 4Centre for Cancer Research and Cell Biology, Queen's University Belfast, Belfast, United Kingdom. 5Massachusetts General Hospital, Boston, Massachusetts. 6Joint Research Centre for Computational Biomedicine (JRC-COMBINE), RWTH Aachen University, Faculty of Medicine, Aachen, Germany.

Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).

L. Garcia-Alonso and F. Iorio are co-first authors of this article.

Corresponding Author: Julio Saez-Rodriguez, RWTH Aachen University, Faculty of Medicine, Aachen 52074, Germany. Phone: 4924-1808-9347; Fax: 49241-80-82189; E-mail: [email protected]

doi: 10.1158/0008-5472.CAN-17-1679

©2017 American Association for Cancer Research.

Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1679. Cancer Res; 78(3) February 1, 2018.

Downloaded from cancerres.aacrjournals.org on August 4, 2021. © 2018 American Association for Cancer Research.

Introduction

Transcriptional dysregulation is required for tumor progression and drug resistance acquisition. Many cancer driver genes are transcription factors (TF). Notable examples include TP53, the most commonly mutated tumor suppressor that controls cell growth arrest (1), and HIF1A, a key regulator of the adaptive response to hypoxia and angiogenesis (2). TFs are commonly dysregulated due to genomic alterations or aberrations in their regulatory proteins. For example, TP53 activity can be suppressed through amplification of its repressor MDM2 (3) and HIF1A upregulation is often induced by loss-of-function mutations in VHL (4). Because of their role as downstream signaling effectors, aberrant activities of any pathway protein may dysregulate TF activities, altering the expression of its transcriptional targets or "regulon." Different from driver alterations in kinase-mediated signaling cascades, where redundancy provides compensatory mechanisms, aberrant transcriptional regulators have been argued to be harder to circumvent by secondary genomic alterations (5). Consequently, TFs have been proposed as key nodal oncogenic drivers and their activity patterns used to characterize genomic aberrations in cancer (6, 7) or their influence on a patient's prognosis (8).

Recently, the Genomics of Drug Sensitivity in Cancer (GDSC; refs. 9, 10), Cancer Therapeutics Response Portal (11), and Cancer Cell Line Encyclopedia (CCLE; ref. 12) have generated large-scale public pharmacogenomic datasets spanning multiple molecular data types across hundreds of cancer cell lines. These datasets enabled the identification of genomic, transcriptomic, and epigenomic markers of drug sensitivity (9, 10, 12) and have uncovered a complex network of genomic alterations interacting with sensitivity to hundreds of drugs. The challenge is now to dissect the underlying molecular mechanisms regulating drug response, for which novel and more systemic functional approaches are needed.

Here, we used TF regulatory activities as sensors of pathway dysregulation. Assuming that the activity of a TF can be estimated from the mRNA levels of its direct target genes, defined from prior TF-gene regulatory data, we derived single-sample TF activity profiles across 9,250 primary tumors from The Cancer Genome Atlas (TCGA) and 1,056 cancer cell lines, employing newly generated RNA sequencing (RNA-seq) data for 448 cancer models. We evaluated the prediction accuracy on independent genomics and gene essentiality screens. Then, we mined for statistical interactions between somatic mutations and TF activities. To discriminate mutant-specific effects, we functionally reannotated somatic mutations based on the affected protein feature (e.g., regulatory sites, protein interactions, truncation, etc.). Finally, we investigated TF activities alone or in combination with genomic markers as potential predictors of sensitivity to 265 compounds, performing a large-scale evaluation of TFs as markers of drug sensitivity in cancer. The collection of identified interactions is publicly available at http://dorothea.opentargets.io.

Materials and Methods

Cell lines and primary tumors' data

RNA from 448 cell lines was sequenced in-house (Supplementary Table S1A). Cell lines were sourced from collaborators or repositories and have been used for the GDSC (9, 10) and COSMIC cell line (13) projects. These have been cryopreserved in aliquots in liquid nitrogen for 7 years in the laboratory. A single cryovial was thawed for use and propagated for a maximum of 3 months before being discarded. All cell lines were mycoplasma negative. Cell line identity was compared, where possible, against the profiles provided by the repositories (ATCC, Riken, JCRB, and DSMZ) using a panel of 16 STRs (AmpFLSTR Identifiler KIT, ABI), and the corresponding genotype data are available at the COSMIC database (http://cancer.sanger.ac.uk/cell_lines). RNA libraries were made with the Stranded mRNA Library Kit from KAPA Biosystems according to the manual using the Agilent Bravo platform. Libraries were sequenced on an Illumina HiSeq 2000. Raw and processed data were deposited on the European Genome-phenome Archive (EGAS00001000828) and ExpressionAtlas (E-MTAB-3983). For the other cell lines, RNA-seq fastq files were downloaded from CCLE (12) (PRJNA169425) and Klijn and colleagues' work (14) (EGAS00001000610). To minimize technical biases, the 3 datasets were reanalyzed using iRAP (15) to obtain raw counts. TCGA samples' raw counts were downloaded from the Gene Expression Omnibus (GSE62944; ref. 16). Raw counts were normalized and processed into counts per million reads (Supplementary Methods).

For cell lines, whole-exome sequencing (WES), copy number alterations (CNA), methylation, and drug response data were retrieved from the GDSC1000 web portal (10), whereas gene essentiality scores were downloaded from the Project-Achilles web portal (17). For primary tumors, WES, CNA, and clinical data were retrieved from cBioPortal (Supplementary Methods). Supplementary Table S1A–S1D lists all samples and data types.
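As a rough illustration of the counts-per-million step mentioned above, the following sketch scales one sample's raw counts by its library size. It is not the authors' pipeline (which is described in the Supplementary Methods); the function names, the toy counts, and the `prior_count` used before taking logarithms are assumptions made for the example.

```python
import math

def counts_per_million(counts):
    """Scale one sample's raw read counts to counts per million (CPM)."""
    library_size = sum(counts.values())
    if library_size == 0:
        raise ValueError("sample has no mapped reads")
    return {gene: c * 1_000_000 / library_size for gene, c in counts.items()}

def log_cpm(counts, prior_count=1.0):
    """log2(CPM + prior_count); the prior is an assumed guard against log(0)."""
    cpm = counts_per_million(counts)
    return {gene: math.log2(value + prior_count) for gene, value in cpm.items()}

# Hypothetical raw counts for a single sample.
sample = {"TP53": 500, "MYC": 1500, "GAPDH": 8000}
cpm = counts_per_million(sample)
logged = log_cpm(sample)
```

Because each sample is divided by its own library size, CPM values are comparable across samples sequenced to different depths, which is the property the downstream gene-wise normalization relies on.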

Consensus TF regulons' data

We defined a set of high-confidence human TFs from the Supplementary Table S3 provided in Vaquerizas and colleagues' work (18), by excluding unlikely TFs noted as "x." Second, we retrieved TF–target regulatory interactions from public resources covering different types of TF-binding evidence, including TF-binding site (TFBS) predictions, chromatin immunoprecipitation coupled with high-throughput data (ChIP-X), text-mining derived and manually curated TF–target interactions (Supplementary Methods). For each TF, we defined a consensus TF regulon (CTFR) selecting TF–target interactions reported in more than one source (Supplementary Table S2). TF–target interactions are unsigned and unweighted.
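The consensus rule described above (keep a TF–target pair only if more than one source reports it) can be sketched as follows. This is an illustrative reconstruction, not the DoRothEA code: the tuple layout, source labels, and function name are hypothetical. The two size filters (targets under more than 10 TFs, TFs with fewer than 3 targets) follow the thresholds the article reports for the final CTFRs, but here they are applied without reference to the expression matrix.

```python
from collections import defaultdict

def consensus_regulons(interactions, min_sources=2,
                       max_tfs_per_target=10, min_targets=3):
    """Build unsigned, unweighted consensus regulons.

    interactions: iterable of (tf, target, source) tuples.
    Returns {tf: set of targets}.
    """
    # Collect the distinct sources that support each TF-target pair.
    sources = defaultdict(set)
    for tf, target, source in interactions:
        sources[(tf, target)].add(source)

    # Keep pairs reported by at least `min_sources` different sources.
    regulons = defaultdict(set)
    for (tf, target), supporting in sources.items():
        if len(supporting) >= min_sources:
            regulons[tf].add(target)

    # Count how many TFs regulate each target, then drop promiscuous targets.
    tfs_per_target = defaultdict(set)
    for tf, targets in regulons.items():
        for target in targets:
            tfs_per_target[target].add(tf)
    filtered = {
        tf: {t for t in targets if len(tfs_per_target[t]) <= max_tfs_per_target}
        for tf, targets in regulons.items()
    }

    # Drop TFs left with too few targets to give a usable signal.
    return {tf: t for tf, t in filtered.items() if len(t) >= min_targets}

# Hypothetical interaction records; source names are placeholders.
interactions = [
    ("TP53", "CDKN1A", "chipx"), ("TP53", "CDKN1A", "curated"),
    ("TP53", "MDM2", "tfbs"), ("TP53", "MDM2", "curated"),
    ("TP53", "BAX", "chipx"), ("TP53", "BAX", "textmining"),
    ("TP53", "GAPDH", "tfbs"),                      # single source: dropped
    ("MYC", "CCND2", "curated"), ("MYC", "CCND2", "chipx"),
]
regs = consensus_regulons(interactions)
```

In this toy input, MYC is removed because only one of its targets survives the consensus rule, which falls below the minimum regulon size.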

Scoring basal TF activities

Given a matrix of normalized gene expression values per sample, the first step consisted of a gene-wise normalization employing a kernel estimator of the cumulative density function (kcdf; ref. 19). Next, the level of activity of a TF regulon in a sample was approximated as a function of the collective mRNA levels of its targets using aREA (analytic rank-based enrichment analysis), a statistical method from the VIPER R package based on the mean ranks' comparisons (6). aREA's normalized enrichment scores (NES) were used as estimates of regulon relative activity. Positive and negative scores indicated, respectively, greater or weaker relative activity in a sample compared with the background population (cell lines or TCGA samples). The aREA algorithm was selected because it takes into account the effect (activation/repression) of the TF on each target, thus enabling comparisons with other types of regulatory networks. R code to compute TF activities is available at https://github.com/saezlab/DoRothEA.
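To make the two-step idea concrete (gene-wise normalization across samples, then a rank-based enrichment of each regulon within a sample), here is a deliberately simplified sketch. It is not the VIPER aREA implementation and not the kcdf estimator: an empirical CDF stands in for kcdf, and the score is a one-tailed mean of normal-transformed ranks for unsigned regulons. All names and the toy matrix are assumptions.

```python
import math
from statistics import NormalDist

def genewise_ecdf(expr):
    """expr: {gene: [value per sample]} -> per-gene empirical CDF values.

    Assumed stand-in for the kernel CDF estimator; outputs lie in (0, 1).
    """
    out = {}
    for gene, values in expr.items():
        n = len(values)
        out[gene] = [sum(v <= x for v in values) / (n + 1) for x in values]
    return out

def regulon_scores(norm, regulons):
    """Mean-rank style score per regulon and sample (simplified, unsigned)."""
    genes = sorted(norm)
    n_samples = len(next(iter(norm.values())))
    nd = NormalDist()
    scores = {tf: [] for tf in regulons}
    for s in range(n_samples):
        # Rank genes within the sample and map ranks to normal quantiles.
        ordered = sorted(genes, key=lambda g: norm[g][s])
        z = {g: nd.inv_cdf((rank + 1) / (len(genes) + 1))
             for rank, g in enumerate(ordered)}
        for tf, targets in regulons.items():
            present = [g for g in targets if g in z]
            if present:
                scores[tf].append(sum(z[g] for g in present) / math.sqrt(len(present)))
            else:
                scores[tf].append(float("nan"))
    return scores

# Toy expression matrix: 4 genes x 3 samples (hypothetical values).
expr = {"A": [10, 5, 1], "B": [9, 5, 2], "C": [1, 5, 9], "D": [2, 5, 10]}
scores = regulon_scores(genewise_ecdf(expr), {"TF1": {"A", "B"}})
```

A positive score means the regulon's targets sit near the top of the sample's ranking relative to the background population, and a negative score means they sit near the bottom, matching the interpretation of NES given in the text.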

Comparison between normal and primary tumors

Differential TF activities contrasting normal and tumoral TCGA samples were computed using the limma R package. The matrix of relative activities per sample was used to fit a linear model for each TF (lmFit) and the eBayes test used to obtain the corresponding moderated t statistics, nominal and adjusted P values.

Association analysis with TF and driver mutations

We grouped somatic variants in driver genes according to their potential implications at the protein level (Supplementary Methods). Next, we analyzed the association between each group of mutations and the activity of a TF with an ANOVA test as in ref. 10. To ensure comparable measurements among datasets and groups of mutations, we removed confounding effects by regressing out effects associated with tissue lineage from the TF activity profiles. Next, for each mutation group/TF pair, the corrected TF activities were modeled as a function of the mutation status. The change in TF activity between mutants and wild-type samples was defined by Cohen d effect size and significance was estimated with type-II ANOVA using the car R package. P values were adjusted for multiple testing (Benjamini–Hochberg method) on a gene basis.
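The two summary quantities named above can be sketched in a few lines. The example assumes the TF activities have already been corrected for tissue lineage, uses the pooled-standard-deviation form of Cohen's d, and implements the standard Benjamini–Hochberg step-up adjustment. Function names and input values are hypothetical, and the ANOVA itself is not reproduced here.

```python
import math
from statistics import mean, variance

def cohen_d(mutant, wild_type):
    """Cohen's d with a pooled standard deviation (mutant minus wild type)."""
    n1, n2 = len(mutant), len(wild_type)
    pooled_var = ((n1 - 1) * variance(mutant) +
                  (n2 - 1) * variance(wild_type)) / (n1 + n2 - 2)
    return (mean(mutant) - mean(wild_type)) / math.sqrt(pooled_var)

def benjamini_hochberg(p_values):
    """Return BH-adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical tissue-corrected activities and nominal P values.
d = cohen_d([1.0, 2.0, 3.0], [3.0, 4.0, 5.0])
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A negative d indicates lower TF activity in mutant samples than in wild-type samples.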

Association analysis with drug response

We used a linear model to correlate drug responses with TF activities. For each drug–TF pair, drug IC50s across all samples ($Y_{IC50}$) were modeled as a function of the covariates ($X_{covariates}$, including tissue-type in pan-cancer analyses, microsatellite instability status, and screening medium), TF-estimated activity ($X_{TF}$), and noise ($\epsilon$):

$$Y_{IC50} = \beta_{covariates} X_{covariates} + \beta_{TF} X_{TF} + \epsilon$$

The impact of the TF on drug response was defined by the regression coefficient ($\beta_{TF}$) estimated with a multiple linear least squares regression. Significance of the regressors was estimated with a type-II ANOVA using the car R package. Finally, for each cancer type, P values were adjusted for multiple testing using the Benjamini–Hochberg method.
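A minimal sketch of the least squares fit for one drug–TF pair is shown below, solving the normal equations in plain Python. It only illustrates how the TF coefficient is obtained; the intercept column, the single binary covariate, and the noise-free toy data are assumptions, and the type-II ANOVA step is not reproduced.

```python
def ols(X, y):
    """Least squares coefficients via the normal equations (X'X) b = X'y."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)] +
         [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Hypothetical samples: (MSI status as 0/1, TF activity).
rows = [(0, -1.0), (0, 0.5), (1, 1.2), (1, -0.3), (0, 2.0), (1, 0.1)]
# Design matrix with an assumed intercept column.
X = [[1.0, float(msi), tf] for msi, tf in rows]
# Simulated log IC50 values with a known TF effect of -1.5 and no noise.
y = [2.0 + 0.5 * msi - 1.5 * tf for msi, tf in rows]
intercept, beta_msi, beta_tf = ols(X, y)
```

In a real analysis the categorical covariates (tissue type, screening medium) would be dummy-coded into several columns of the design matrix before fitting.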

In the pan-cancer analysis, the tissue-type was defined by the GDSC_description2 due to the presence of several cell lines without a matching TCGA label. In cancer-specific analyses, we grouped the samples according to the TCGA labels, for consistency with the GDSC1000 study (10). We additionally tested Ewing sarcoma, leukemia, lymphoma, osteosarcoma, and rhabdomyosarcoma tumor types.

Figure 1.

Estimation of TF activities overview. A, RNA-seq basal gene expression data from cell lines and TCGA samples (normal and tumors together) were processed separately to obtain normalized log counts per million (CPM) that were then gene-wise normalized using a kernel estimation of the cumulative density function (kcdf). B, CTFRs were derived by selecting TF–target interactions observed in at least two sources among a collection of databases. In final CTFRs, targets under >10 TFs and TFs with <3 targets are removed. C, Estimation of single-sample TF activities from gene expression data and the CTFRs using the aREA algorithm from VIPER. Normal and tumor TCGA samples were analyzed together.

Association analysis between TF activities and known pharmacogenomic markers

Large-effect significant pharmacogenomic markers [P < 0.001, false discovery rate (FDR) < 20% and Glass Δs > 1] were extracted from GDSC1000 (10). For each pharmacogenomic marker (GM), we fit a null multiple regression model ($M_{null}$) where the response variable is the drug IC50 ($Y_{IC50}$) and the explanatory variables are the cell line covariates (tumor type, MSI, and screening media; $X_{covariates}$) and the genomic marker ($X_{GM}$), plus a noise term ($\epsilon$):

$$M_{null}: Y_{IC50} = \beta_{covariates} X_{covariates} + \beta_{GM} X_{GM} + \epsilon$$

Next, for each TF in our panel, we built a nested regression model ($M_{TF}$) containing the same variables as $M_{null}$ plus the activities of the TF ($X_{TF}$) and their interaction with GM ($X_{GM \times TF}$):

$$M_{TF}: Y_{IC50} = \beta_{covariates} X_{covariates} + \beta_{GM} X_{GM} + \beta_{TF} X_{TF} + \beta_{GM \times TF} X_{GM \times TF} + \epsilon$$

Every $M_{TF}$ was compared with the corresponding null model $M_{null}$ using a likelihood ratio (LR) test. Resulting P values were adjusted using the Benjamini–Hochberg method.
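For Gaussian linear models, a likelihood ratio comparison of two nested fits can be computed from their residual sums of squares alone. The sketch below assumes maximum likelihood variance estimates and the usual chi-square approximation. Because the TF model adds exactly two coefficients (the TF term and the GM×TF interaction), the reference distribution has 2 degrees of freedom, for which the survival function has the closed form exp(−x/2). Whether the authors used this exact formulation is not stated, and the input numbers are hypothetical.

```python
import math

def likelihood_ratio_test(rss_null, rss_full, n_samples):
    """LR test for nested Gaussian linear models differing by two parameters.

    statistic = n * ln(RSS_null / RSS_full); P value from chi-square (df = 2).
    """
    if rss_full <= 0 or rss_null < rss_full:
        raise ValueError("expected 0 < rss_full <= rss_null for nested models")
    statistic = n_samples * math.log(rss_null / rss_full)
    # Chi-square survival function with 2 degrees of freedom.
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value

# Hypothetical residual sums of squares from the null and TF models.
stat, p = likelihood_ratio_test(rss_null=120.0, rss_full=100.0, n_samples=50)
```

A small P value indicates that adding the TF activity and its interaction with the genomic marker explains drug response better than the marker alone.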

Results

TF activities' estimation

First, we assembled a collection of basal transcriptional profiles of immortalized human cancer cell lines and primary tumors (Fig. 1A). For cancer cell lines, we newly derived RNA-seq data from 448 samples, which we complemented with RNA-seq profiles from 934 and 622 cell lines, respectively, from the CCLE (12) and Klijn and colleagues' work (14). This yielded a total of 1,362 unique cancer cell lines, of which 1,056 are in COSMIC (13) (Supplementary Table S1A). To minimize technical biases, we derived raw counts using a common pipeline. For primary tumors, we downloaded RNA-seq raw counts for 9,250 TCGA primary tumors and 741 normal samples (16). Cell lines and TCGA samples were processed and normalized separately.

To define the TF regulons (i.e., sets of genes whose transcription is regulated by a given TF), we collected 15,211 TF–target interactions appearing in at least two publicly available resources (hereafter CTFR; Fig. 1B; Supplementary Fig. S1A and S1B; Supplementary Table S2). To ensure a minimum signal when we compute TF activities, we removed targets regulated by more than 10 TFs and TFs with fewer than 3 targets in the expression matrix. The final CTFRs consisted of 7,445 targets for 127 TFs, with 111 targets per TF on average (Supplementary Fig. S1C). Pairwise overlap between regulons was low (average Jaccard similarity coefficient = 0.0044; Supplementary Fig. S1D), indicating negligible levels of redundancy between CTFRs.
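The redundancy check reported above reduces to averaging the Jaccard coefficient over all pairs of regulons. A small sketch, with hypothetical regulons and function names:

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def mean_pairwise_jaccard(regulons):
    """Average Jaccard coefficient over all unordered pairs of regulons."""
    pairs = list(combinations(sorted(regulons), 2))
    if not pairs:
        raise ValueError("need at least two regulons")
    return sum(jaccard(regulons[x], regulons[y]) for x, y in pairs) / len(pairs)

# Hypothetical regulons (gene names are placeholders).
regulons = {
    "TF_A": {"g1", "g2", "g3"},
    "TF_B": {"g3", "g4"},
    "TF_C": {"g5", "g6"},
}
avg = mean_pairwise_jaccard(regulons)
```

Values close to zero, as reported for the CTFRs, mean that two regulons rarely share targets, so their activity estimates are largely driven by different genes.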

Next, we normalized the transcriptomic data gene-wise to estimate relative levels of basal activity of each CTFR in each sample using the aREA algorithm (6). Cell lines and TCGA samples (tumor + normal) were analyzed separately (Fig. 1C; Supplementary Table S3A and S3B). NESs (Supplementary Fig. S2A–S2D) were used as estimates of CTFR activity relative to the background population (hereafter simplified as "TF activities"). Subsampling analysis revealed that activity estimates were robust in populations with n ≥ 20 (Supplementary Fig. S2E).

We evaluated the TF activity estimations using independent benchmark data derived from an essentiality screening (17), and CNA and WES data in cell lines (Supplementary Methods). Moreover, we investigated the inclusion of methylation data as a means to refine CTFRs on a cell line basis, excluding from the regulons those targets with hypermethylated promoters, but did not observe significant performance improvements (Supplementary File, Supplementary Figs. S3 and S4). Finally, we compared the activities derived from the CTFRs against those derived from reverse-engineered regulons proposed in ref. 6, observing slightly better performances for CTFRs (Supplementary File; Supplementary Figs. S3–S6). Hence, we selected CTFR-based estimations (without including promoter methylation information) for our downstream analysis.

TF activities across primary tumors and cell lines

To obtain a global picture of TFs operating in primary tumors, we studied how TF activities distribute across TCGA samples. Differential activity analysis of normal versus tumor samples revealed groups of TFs consistently activated or repressed across the 14 tumor types with matched normal samples. Although most TF regulons decrease their activity, a small subset undergoes a recurrent increase across tumor types (Fig. 2A), including oncogenic regulators of cell cycle (MYC, MAX, E2F family members, FOS, and FOXM1), tumor invasion, and angiogenesis (ELK1 and ETS1; ref. 20).

Next, we compared the TF profiles between cancer types. First, we summarized sample-level activities into cancer-level activities. For each TF, we ranked the samples based on TF activity and quantified the enrichment of each cancer type at the top of the ranks using the aREA algorithm (Supplementary Fig. S7A and S7B; Supplementary Table S3C and S3D). Hierarchical clustering based on Euclidean distance highlighted similar activity profiles for primary tumors from the same tissue lineage, such as the diffuse gliomas glioblastoma multiforme (GBM) and lower grade glioma, hematopoietic and lymphoid acute myeloid leukemia (LAML) and diffuse large B-cell lymphoma, or squamous-like tumors bladder urothelial carcinoma, cervical squamous cell carcinoma, head-neck squamous cell carcinoma (HNSC), and lung squamous cell carcinoma (Fig. 2B). These clusters were also observed in the cell lines (Supplementary Fig. S7C). Correlation analysis revealed a significant agreement in the TF profiles between cell lines and primary tumors from the same tissue lineage (average Pearson correlation 0.5 and −0.035 within and between different tumor types, respectively; Fig. 2C).

Closer examination of well-established tissue-specific TFs (retrieved from the Human Cancer Protein Atlas v15; ref. 21) showed that our approach captures 11 of 12 TFs operating preferentially in specific tissues in primary tumors (Fig. 2D), such as ESR1 and FOXA1 in BRCA or MITF in SKCM. Note that for ZEB1, a transcriptional repressor involved in epithelial-to-mesenchymal transition (EMT; ref. 22), higher protein activities correspond to downregulation of the regulon. Importantly, these tendencies are maintained in the cell lines with the exception of androgen receptor (AR), where 4 of 6 prostate cell lines display AR-independent proliferation (Supplementary Table S1C). Taken together, these results show that our approach captures expected activity patterns of known cancer-specific TFs.

Figure 2.

TF activities across primary tumors and cancer cell lines. A, Heatmap of the differential TF activity (log-fold change) between normal and tumoral samples across 14 tumor types. Red and blue indicate lower- and higher activity in tumors, respectively, whereas white indicates nonsignificant (Padj > 0.05) associations. Only TFs with significance in at least 50% of the tumors are plotted. B, Tumor type similarity: correlation-based hierarchical clustering of tumor type–level TF activities for 23 primary tumors. C, Comparison of TF activities between primary tumors and cell lines for 19 common tumor types. Each value in the heatmap represents the Pearson correlation coefficients between tumor type–level TF activities. Asterisks, significant correlations (Padj < 0.05). D, Activity distributions for tissue-specific TFs. Each point represents the TF activity in a given patient or cell line.

TF activities dissect mutant-specific aberrations

Previous studies demonstrated that different mutations in the same protein could cause a continuum of effects, ranging from neutrality to a significant functional impact (23). We thus set out to characterize the effect of mutations occurring in TFs on their own activity. As proof of concept, we focused on TP53 due to its high mutation frequency and heterogeneity. We curated TP53 mutations according to specific mutations, hotspots, protein consequence, zygosity (in cell lines), affected domain, PTM, or structural property and previously proposed mutation stratifiers (Supplementary Table S4A; ref. 24). Subsequently, we compared predicted TP53 activities of each TP53 mutation group with wild-type samples (Fig. 3A). To avoid confounding effects due to the use of different samples and tumor types, we regressed out the tissue lineage from the TF activity profiles through linear modeling. Our results indicated that all TP53 mutation groups significantly affecting TP53 transcriptional activity decreased it (Supplementary Fig. S8; Supplementary Table S4B). Overall, homozygous mutations and deletions had a stronger effect size than heterozygous mutations (Fig. 3B). Focusing on the most frequent mutational hotspots revealed that R231X and R273C reached larger effect sizes than R248Q and R175H substitutions in both primary tumors and cell lines. However, direct pairwise comparison between mutants did not yield significant results alone. Importantly, these significant changes in activity were correlated between primary tumors and cell lines (R² = 0.551, P = 1.14 × 10⁻⁸; Fig. 3C). This suggests that transcriptional activity prediction may better capture effect on TP53 activity than mutation alone. This is supported by comparison of activity predictions with experimentally defined TP53-mutant yeast transactivation class from the IARC TP53 Database (25), where each possible TP53 missense mutation is assigned to a transactivation class, functional, partial, or nonfunctional, according to its effects on the transcription of 8 TP53-responsive promoters in yeast. Comparison between nonfunctional and the other missense mutants showed a significant agreement with our predictions in cell lines (one-tailed t tests, P = 0.00535) and, although marginally significant, in primary tumors (P = 0.0418).

Motivated by these results, we investigated systematically theeffect of mutations in TFs on their activity. To distinguishmutant-specific effects, these were studied individually. Impor-tantly, to consider nonrecurrent yet potentially functionaldriver mutations, we also grouped mutations that, although

introducing different changes in different residues, could affectprotein function in a similar way (e.g., same structural region,interaction, or posttranslational modification site). We recov-ered 1,200 mutation groups in 122 TFs from primary tumors(n � 3). Pan-cancer analysis in primary tumors identified 9 TFsthat, when mutated, exhibit a significant change in activity(FDR < 5%; Fig. 3D; Supplementary Table S4C). In general,mutations in TFs with known oncogenic roles, such as NFE2L2,HIF1A, and AHR, were associated with increased regulatoryactivity, pointing to gain-of-function mutations. In contrast,mutations in the proposed tumor suppressors STAT2 andFOXA1 are associated with decreased activity. Also, truncatingmutations in the transcriptional repressor REST resulted inincreased regulon expression (Fig. 3E). Analyzing cell linesshowed similar trends for REST truncating mutations (P ¼0.00367, FDR ¼ 0.0367) and NFE2L2 missense mutation inD29 (P ¼ 0.009, FDR ¼ 0.099).

Closer examination of the results again revealed differences in the effect of mutation types on protein activity. In NFE2L2, a cytoprotective oncogene, missense mutations affecting p.W24/p.D29 residues at the surface or at the KEAP1 interface (positions 77–82) are associated with higher NFE2L2 activity, with NFE2L2 W24R/C mutations causing the strongest increase (Fig. 3E). Mutations at the KEAP1-binding site were already proposed to be positively selected to abolish NFE2L2 degradation (26).

Figure 3.

Functional characterization of mutant TFs on transcriptional activities. A, Functional evaluation of TF mutations on their own activity. B, Boxplot depicting TF activities according to TP53 mutation zygosity in cell lines. C, Comparison of the predicted effect size of significant TP53 mutations between primary tumors and cell lines. D, Systematic characterization of mutant TFs in primary tumors. Each bar represents the number of significant mutant groups in the TF impacting its activity. Red and blue indicate positive and negative effects, respectively. E, Boxplots comparing TF activities across different variants for NFE2L2, AHR, FOXA1, REST, and SREBF2. Red dots indicate that the mutation is reported in COSMIC v70.


Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1679


Associations with known driver mutations

Next, we evaluated how mutations in any cancer driver genes, proposed in refs. 27, 28, could impact TF activities. We grouped mutations in driver genes following the same strategy described for TFs. This yielded 1,774 mutation groups (n ≥ 5) in 171 driver genes. Systematic comparisons of TF activities in mutant against wild-type primary tumors yielded 3,565 driver mutation group–TF associations involving 97 driver genes and 75 TFs (FDR < 5%; Fig. 4A and B; Supplementary Table S5A). The same analysis in cell lines allowed us to study only 533 mutation groups and rendered fewer associations (probably due to lower sample number) that involved 36 interactions between 17 drivers and 25 TFs (FDR < 5%; Supplementary Table S5B). Importantly, 12 hits were shared between primary tumors and cell lines with concordant effect (Fisher exact test, OR = 7.89, P < 1.32 × 10⁻⁶; Fig. 4C). Some of these associations represent proposed mechanisms of TF regulation, such as the repression of E2F1 by RB1, perhaps the best-described inhibitor of TF function (29), or ELK1 regulation by the ERK–MAPK pathway (20, 30).
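The overlap statistic reported above comes from a Fisher exact test on a 2 × 2 contingency table. The following is a minimal sketch of a one-sided (enrichment) version of that test. The function name and the counts in the example table are illustrative assumptions and do not reproduce the table used in the study.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact test on the 2x2 table [[a, b], [c, d]].

    Returns the sample odds ratio and the probability of seeing a
    top-left count of at least `a` under the hypergeometric null.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    denominator = comb(total, col1)
    upper = min(row1, col1)
    p_value = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, upper + 1)
    ) / denominator
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, p_value


# Hypothetical counts: associations significant in both settings, in only
# one of them, or in neither (invented numbers, for illustration only).
example_or, example_p = fisher_exact_greater(12, 24, 40, 600)
```

A two-sided test or a conditional maximum-likelihood odds ratio, as reported by some statistics packages, would give slightly different numbers from the sample odds ratio used here.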

To assess whether the detected associations represent plausible driver–TF regulatory events, we extracted directed edges from literature-curated signaling networks from OmniPath (31) and quantified shortest path lengths between every driver gene–TF pair. Enrichment analysis confirmed that significant driver gene–TF associations tend to be closer than nonsignificant associations (Fig. 4D). Next, we investigated whether the predicted effect of driver mutations on TF activities (association sign) agrees with the TF's role in cancer. We classified TFs into 3 groups according to their role in cancer: (i) upregulated in cancer, if the TF displays significantly greater activity in tumor than in normal samples or is a known oncogene (27, 28); (ii) downregulated, if the TF function is repressed in tumor samples or is a tumor suppressor; or (iii) neutral. Enrichment analysis revealed that positive driver/TF interactions (i.e., potential TF-activating events) tend to involve cancer-upregulated TFs; in contrast, negative interactions are more prone to involve cancer-downregulated TFs (Fig. 4E). Taken together, our results suggest that the identified associations point to potential mechanisms of driver-mediated transcriptional dysregulation in cancer.
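The network-distance step described above (shortest directed path between a driver gene and a TF) can be sketched with a breadth-first search. This is an illustrative sketch only: the function name is hypothetical and the edge list is a toy cascade written by hand, not data extracted from OmniPath.

```python
from collections import deque


def shortest_path_length(edges, source, target):
    """Number of directed edges on the shortest source -> target path.

    Returns None when the target cannot be reached from the source.
    """
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, distance = queue.popleft()
        if node == target:
            return distance
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None


# Toy directed signaling cascade (hand-written for illustration).
toy_edges = [("BRAF", "MAP2K1"), ("MAP2K1", "MAPK1"), ("MAPK1", "ELK1")]
braf_to_elk1 = shortest_path_length(toy_edges, "BRAF", "ELK1")
```

Because the edges are directed, the reverse query (from ELK1 to BRAF) finds no path. If distance is counted as the number of intermediates, as in the legend of Fig. 4D, it equals the edge count minus one.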

Drug sensitivity interactions in 984 cancer cell lines

We next investigated the potential of TF activities as markers of response to 265 compounds across 984 cancer cell lines (10). Associations between TF–drug pairs were tested with a linear regression approach accounting for potentially confounding factors (tissue lineage, microsatellite instability, and cell line growth media).
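To make the regression setup concrete, the sketch below fits an ordinary least-squares model of drug response on a TF activity plus dummy-coded covariates. It is a simplified illustration under stated assumptions: one binary column per confounder, noise-free toy data, and hypothetical names throughout. It is not the study's implementation.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(design, response):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, response)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Toy cell lines: (tissue dummy, MSI status, growth-media dummy, TF activity).
samples = [
    (0, 0, 0, -1.2), (0, 1, 0, 0.3), (1, 0, 1, 0.8), (1, 1, 0, -0.5),
    (0, 0, 1, 1.5), (1, 0, 0, -0.9), (0, 1, 1, 0.1),
]
# Simulated log IC50 with a known TF effect of -0.57 (hypothetical values).
log_ic50 = [2.0 + 1.0 * t + 0.5 * m + 0.25 * g - 0.57 * a for t, m, g, a in samples]
design = [[1.0, t, m, g, a] for t, m, g, a in samples]
coefficients = fit_ols(design, log_ic50)
tf_effect = coefficients[-1]  # negative sign: higher activity, lower IC50
```

In this parameterization a negative TF coefficient means that higher activity goes with a lower IC50, i.e. greater sensitivity. With real data, a significance test on the TF coefficient would follow the fit.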

A pan-cancer analysis identified 3,300 significant TF–drug associations (P < 0.001, FDR < 5%), with 251 drugs (95%) and 123 TFs (97%) implicated in at least one interaction (Supplementary Table S6A). Most drugs were associated with multiple TFs, which, considering the relatively low overlap between regulons (Supplementary Fig. S1D), may correspond to functional cooperation among transcriptional regulators rather than target redundancy. In fact, interacting TFs display similar activity profiles (Supplementary Fig. S9). Most TF–drug associations involved relevant oncogenic TFs, such as MYC, PAX5, MYCN, FOXA1, and GATA3 (Fig. 5A; Supplementary Table S6B and S6C). Significant associations were enriched for cytotoxic drugs and compounds targeting cytoskeleton, metabolism, DNA replication, JNK-p38, and ERK–MAPK signaling (Fisher exact test, P < 0.001; Fig. 5B; Supplementary Table S6D).
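The double threshold used above (nominal P value plus FDR) can be illustrated with a short sketch. It assumes the Benjamini-Hochberg procedure for FDR control, which this section does not state explicitly, and it uses invented P values.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Invented nominal P values for four TF-drug pairs.
nominal = [0.0002, 0.0008, 0.02, 0.3]
fdr = benjamini_hochberg(nominal)
# Keep pairs passing both thresholds (P < 0.001 and FDR < 5%).
significant = [p < 0.001 and q < 0.05 for p, q in zip(nominal, fdr)]
```

Applying both cutoffs is stricter than either alone: a pair must have a small nominal P value and survive multiple-testing correction across all tested pairs.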

Some of the investigated TFs are recurrently mutated in cancer and have already been proposed as genomic markers of drug sensitivity. To validate whether TF activities are able to recapitulate the same drug–gene associations observed at the genomic level, we compared our findings with the pharmacogenomic interactions (FDR < 25% and P < 0.001) previously identified for these cell lines (10). Our approach identified 12 of the 21 significant pharmacogenomic interactions involving a TF in our panel (Fisher exact test, P = 8.39 × 10⁻⁴, OR = 4.45), including TP53 mutations interacting with response to Nutlin-3a, MYC with vismodegib, and PAX5 with bleomycin, among others. The same drug association analysis on TF activities derived from reverse-engineered regulons (6) instead of CTFRs (on the overlapping TFs) rendered fewer hits than CTFRs, none in the pharmacogenomic marker list (Supplementary Fig. S10).

Figure 4.

Functional characterization of driver mutations on TF activities. A, Volcano plot with effect size (x) and adjusted P value (y) of all tested pan-cancer associations. B, Number of significant associations per TF in primary tumors and cell lines, colored according to the sign of the association. Red and blue correspond to significantly higher or lower TF activities in mutants compared with wild type, respectively. C, Significant (FDR < 5%) TF–driver associations from primary tumors and cell lines and the overlap. Shared driver–TF pairs are indicated in the table. D, Log OR of finding a significant interaction by network distance (minimum number of directed intermediates between the driver and the corresponding TF). **, P < 0.05; ***, P < 0.001 (Fisher exact test). E, Enrichment in positive/negative driver–TF associations (red/blue, respectively) to involve oncogenic/tumor suppressor TFs, respectively.

Figure 5.

Associations between TF activities and drug sensitivity. A, Frequency of TFs in significant pan-cancer TF–drug associations. B, Enrichment P values for drug classes that are overrepresented among significant pan-cancer associations. C and D, Heatmaps of significant associations with drugs targeting the ERK–MAPK pathway (C) and cytotoxic drugs (D). E, Volcano plot with effect size (x) and adjusted P value (y) of all tested pan-cancer TF–drug associations. Red and blue indicate positive (resistance) and negative (sensitivity) effects, respectively. F, Volcano plot with effect size (x) and adjusted P value (y) of all tested cancer-specific TF–drug associations. G, Examples of cancer-specific TF–drug associations. Red and blue indicate positive (resistance) and negative (sensitivity) effects, respectively.


Next, we investigated whether TF activities predict sensitivity to direct intervention on their upstream regulators with targeted drugs. We extracted from OmniPath the interactions involving the proteins targeted by the drugs. Enrichment analysis confirmed that significant hits were more likely to involve TFs directly connected to the targets of the associated drug (Fisher exact test, P = 0.0155, OR = 1.2), suggesting that predicted activities may indeed be indicative of upstream pathway activation and therefore useful sensitivity markers for drugs targeting their components. For example, sensitivity to drugs targeting the ERK–MAPK pathway (Fig. 5C) was associated with increased activities in several MEK-targeted TFs, including SPI1, JUN, JUND, and STAT3 (32, 33), whereas vulnerability to the two tested RSK inhibitors correlates with ELK1, another known downstream MAPK target (20, 30).

Our analyses also identified TFs showing simultaneous sensitivity interactions with drugs targeting common processes. For example, sensitivity to cytotoxic compounds was associated with TFs classically upregulated in actively proliferating cells, such as MYC, whereas activity of tissue-specific TFs (such as MITF, REST, or HNF4A) was associated with resistance to these drugs (Fig. 5D).

The strongest detected association involved TP53 and Nutlin-3a [regression coefficient (coeff) = −0.57, P = 1.58 × 10⁻³⁰; Fig. 5E]. Nutlin-3a is an MDM2 inhibitor that blocks MDM2-mediated TP53 degradation. Our results agree with pharmacogenomic studies in that samples with lower TP53 activities show lower sensitivity to MDM2 inhibition (9, 10). Another strong interaction was the upregulation of ZEB1, an EMT marker, associated with resistance to the EGFR inhibitors afatinib (coeff = −0.53, P = 5.19 × 10⁻¹⁵) and gefitinib (coeff = −0.24, P = 5.9 × 10⁻⁷). This is in agreement with a recent study in NSCLC describing ZEB1-mediated acquired resistance to EGFR inhibitors (34).

Cancer-specific analysis revealed fewer associations compared with the pan-cancer analysis, probably due to reduced sample size (Fig. 5F; Supplementary Table S6E). Still, we recovered 125 TF–drug associations (P < 0.001, FDR < 10%), most in lymphoma, the largest subpopulation. Some hits involved drugs with no associated genomic markers (Fig. 5G). Among the top hits, we found that NFKB1 activity was associated with sensitivity to the ITK inhibitor BMS-509744 in lymphoma cells (coeff = 0.612, P = 4.5 × 10⁻⁷); in STAD, sensitivity to PHA-793887, a pan-CDK inhibitor, was associated with YY1, recently proposed to contribute to gastric oncogenesis (coeff = −1.05, P = 5.9 × 10⁻⁷; ref. 35); in myeloma, resistance to the tyrosine kinase inhibitor sorafenib was associated with the activity of IRF1, a proposed tumor suppressor for acute myeloid leukemia (coeff = 0.8, P = 8.27 × 10⁻⁷; ref. 36); finally, sensitivity to the MEK inhibitor RDEA119 in HNSC was associated with ZEB1 activity (coeff = −0.898, P = 1.13 × 10⁻⁶), a key EMT effector in HNSC development (37).

TF activities enhance the predictive ability of genomic markers

We showed before that the strongest TF–drug association detected involved the well-known interaction between TP53 and Nutlin-3a. According to previous studies, samples with TP53 mutations are Nutlin-3a resistant (9, 10), whereas our results suggest that samples with higher TP53 activities are more sensitive. We reasoned that protein activities might complement mutation-based markers to further improve the stratification of sensitive and nonsensitive cell lines. To test this hypothesis, we used a likelihood ratio (LR) test to compare pharmacogenomic models with and without TF activities (Fig. 6A). We confirmed that TP53 activity was able to further identify sensitive cell lines among wild-type samples (Fig. 6B, P = 2.1 × 10⁻¹⁵). No other TF outperformed TP53. This observation was reproduced in 3 of 5 individual tumor types (OV, GBM, and LAML; P < 0.05) where TP53 mutations are markers of Nutlin-3a response.
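The model comparison described above can be sketched as a likelihood-ratio test between two nested Gaussian linear models: a null model with the genomic marker only and a test model that adds one TF activity term. The sketch assumes a single added, additive term (so the statistic is referred to a chi-square distribution with 1 degree of freedom); whether the study's test model also includes an interaction term is not stated in this section. All names and data are hypothetical.

```python
from math import erfc, log, sqrt


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def residual_sum_of_squares(design, response):
    """Fit ordinary least squares and return the residual sum of squares."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, response)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in design]
    return sum((y - f) ** 2 for y, f in zip(response, fitted))


def likelihood_ratio_test(null_design, test_design, response):
    """LR statistic n*ln(RSS0/RSS1) and its chi-square P value (1 df)."""
    n = len(response)
    rss_null = residual_sum_of_squares(null_design, response)
    rss_test = residual_sum_of_squares(test_design, response)
    statistic = max(n * log(rss_null / rss_test), 0.0)
    # Survival function of a chi-square with 1 degree of freedom.
    return statistic, erfc(sqrt(statistic / 2.0))


# Toy data: mutation status, TF activity, and a simulated log IC50.
mutated = [0, 0, 0, 0, 1, 1, 1, 1]
activity = [-1.5, -0.5, 0.5, 1.5] * 2
noise = [0.1, -0.1, -0.1, 0.1, -0.1, 0.1, 0.1, -0.1]
log_ic50 = [1.0 + 2.0 * m - 0.6 * a + e for m, a, e in zip(mutated, activity, noise)]
null_design = [[1.0, m] for m in mutated]
test_design = [[1.0, m, a] for m, a in zip(mutated, activity)]
statistic, p_value = likelihood_ratio_test(null_design, test_design, log_ic50)
```

With only eight toy samples the chi-square approximation is crude; the point is the structure of the comparison, in which a small P value indicates that adding the TF activity explains drug response beyond the mutation marker alone.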

Motivated by this finding, we ran a systematic analysis to search for TFs able to refine known pharmacogenomic interactions. Overall, 95 of 160 (59.4%) tested strong-effect pharmacogenomic interactions identified in ref. 10 are improved by at least one TF (FDR < 5%, LR test; Supplementary Table S7). The second most significant hit after TP53–Nutlin-3a involved the interaction between BRAF mutations and the FDA-approved BRAF inhibitor dabrafenib. Specifically, in BRAF-mutant samples, resistance to dabrafenib interacts with ATF2 and MITF activity (Fig. 6C, P = 1.2 × 10⁻¹² and P = 4.68 × 10⁻¹⁰). Resistance in BRAF mutants to dabrafenib was still observable in SKCM samples with higher ATF2 target expression (P = 0.00123). The importance of ATF2 in melanoma is supported by several lines of evidence: ATF2 is required for melanoma tumor development (38), and nuclear (transcriptionally active) ATF2 is associated with poor prognosis and genotoxic stress resistance (39). Moreover, PKCε, the kinase mediating ATF2 transcriptional activity, is among the top 10 kinases associated with BRAF inhibition resistance, which supports the relationship between ATF2 and dabrafenib resistance (40). In fact, ATF2 essentiality scores from the Achilles project (41) correlate with the predicted activity for ATF2 in BRAF V600E-mutant samples at the pan-cancer level (Pearson correlation, R = −0.615, P = 0.0332; Supplementary Fig. S11A) but not in BRAF wild-type samples (R = 0.082, P = 0.347; Supplementary Fig. S11B).

Interestingly, the most significant improvements in predictions were observed between drugs targeting ERK–MAPK signaling (Fisher exact test, P = 5.28 × 10⁻⁶) and the driver genes BRAF, KRAS, or HRAS. For example, in BRAF wild-type samples, prediction of sensitivity to MEK inhibitors improved when JUND, among others, was included in the model (P = 1.86 × 10⁻¹¹, P = 3.12 × 10⁻¹¹, and P = 1.77 × 10⁻⁸ for trametinib, RDEA119, and AZD6244, respectively; Fig. 6D). JUND is a downstream substrate in ERK–MAPK signaling (32). Our previous analysis already suggested that JUND activity alone is predictive of sensitivity to MEK inhibition. Here, we show how JUND also improves response prediction to the MEK inhibitor AZD6244 within HRAS-mutant pan-cancer samples (P = 1.21 × 10⁻⁷). Taken together, our results suggest that JUND regulon expression may be used as a sensor of ERK–MAPK pathway activity and vulnerability to MEK inhibition.

Finally, other potential interactions affecting well-established pharmacogenomic markers are: the interaction of JUND with a cell-cycle CDK4/CDK6 inhibitor in RB1 mutants (P = 1.9 × 10⁻⁶), which modulate cyclins (42); the interaction of sensitivity to the AKT inhibitor GSK690693 with several TFs in OV PIK3CA mutants, where the strongest hits involve EGR1 and CREB1 (P = 5.1 × 10⁻⁶ and P = 7.6 × 10⁻³), downstream effectors of the PI3K–Akt pathway (43); and, in HER2+ BRCA samples, the sensitivity interaction between the ERBB2 inhibitors lapatinib and CP724714 and the activity of ELF1 (P = 3.9 × 10⁻⁶ and P = 8.9 × 10⁻⁵), a candidate regulator of ERBB2 expression (44).

Discussion

TF activities derived from gene expression data have attracted much attention in cancer research during the past years. Recent studies have used different strategies to derive TF activity profiles across different cancers, evaluated their potential as prognostic markers (8), and applied them to characterize the impact of cancer somatic alterations (6, 7). Although based on different definitions of TF regulons, the common outcome is that estimating regulatory activities from the mRNA levels of the targeted genes can reveal known and novel mechanisms involved in tumor development. However, the potential of TF activities as markers to guide personalized treatments, alone or in combination with established genomic markers, has not yet been explored.

Here, we applied a pipeline to derive signatures of TF activity from new and existing RNA-seq data in 1,056 cancer cell lines and 9,250 primary tumors. Our approach combines CTFRs and gene-wise normalized expression data with unsupervised single-sample enrichment algorithms. This circumvents the need for a prior classification of samples into subtypes, which is of particular benefit when working with heterogeneous groups of cancer samples, and does not require unperturbed reference samples (often not available). Moreover, comparable TF activity signatures can be obtained for new samples by normalizing the expression values of each gene against our reference panel of samples.
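As an illustration of normalizing a new sample against a reference panel, the sketch below converts each gene's expression to a z-score using the mean and standard deviation of that gene across reference samples. The z-score form is an assumption made for illustration; this section does not specify the exact gene-wise transformation. The function name and expression values are hypothetical.

```python
from statistics import mean, stdev


def normalize_against_reference(sample, reference):
    """Gene-wise z-scores of a new sample relative to a reference panel.

    sample: {gene: expression}; reference: {gene: [expression per sample]}.
    Genes missing from the panel or with zero variance are skipped.
    """
    normalized = {}
    for gene, value in sample.items():
        panel = reference.get(gene)
        if not panel or len(panel) < 2:
            continue
        spread = stdev(panel)
        if spread == 0:
            continue
        normalized[gene] = (value - mean(panel)) / spread
    return normalized


# Toy expression values (invented, log scale).
reference_panel = {"CDKN1A": [2.0, 4.0, 6.0], "FLAT": [1.0, 1.0, 1.0]}
new_sample = {"CDKN1A": 8.0, "FLAT": 1.0, "UNSEEN": 3.0}
z_scores = normalize_against_reference(new_sample, reference_panel)
```

Scoring new samples against a fixed panel keeps their normalized values on the same scale as the original cohort, which is what allows activity signatures to be compared across datasets.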

TF activity profiles enabled us to (i) functionally characterize different TF mutations; (ii) link genomic aberrations in cancer driver genes with TF dysregulation; (iii) suggest new mechanisms for response to specific compounds in cancer models; and (iv) propose new markers of drug response, alone or in combination with genomic markers. Although we expect some interactions to reflect the cooperative behavior between TFs controlling common processes rather than causal associations, these recapitulated known pharmacogenomic relationships and were enriched for TF–drug pairs close in the signaling network. Thus, we envision that the identified associations provide reliable evidence to refine existing hypotheses or formulate new ones to understand therapeutic outcomes. Particularly, our study shows that predictions of therapeutic response can be improved if, in addition to the mutational status of a marker gene, the regulatory activity of the coded protein is also considered. This can be achieved directly when the marker gene codes for a TF (as exemplified by TP53–Nutlin-3a response), or indirectly when the protein targeted by the drug regulates a TF (as in the case of JUND and MEK inhibitors).

Figure 6.

Modeling the combined effect on drug sensitivity of known pharmacogenomic markers and TF activities. A, Analysis strategy: two pharmacogenomic regression models are built, one without any TF information (null model) and another including the activity of a TF (test model). Both models are compared using a LR test. B, Improvement of the association TP53 MUT–Nutlin-3a by including TP53 TF activity. C, Improvement of the association BRAF MUT–dabrafenib by including ATF2 TF activity. D, Improvement of the association BRAF MUT–trametinib by including JUND TF activity. Left box represents the top TFs improving the null pharmacogenomic model. Indicated P values are nominal, with FDR < 0.05. First boxplot represents the IC50 (y) in mutant (blue) and WT (red) samples (x). The second scatterplot represents the IC50 (y) and the predicted TF activity (x). The third scatterplot represents the interaction between the IC50 (y) and the predicted TF activity (x) in mutant (blue) and WT (red) samples. The fourth boxplot represents the IC50 (y) in mutant and WT samples (x) colored according to the TF activity (low: activity < −1; basal: −1 < activity < 1; high: activity > 1).

The critical factor in the quantification of TF activities is the definition of the targets putatively regulated. Here, we chose to use a curated compendium of regulatory interactions (CTFRs) derived from different lines of TF–DNA binding evidence, such as in vivo ChIP-seq experiments, in silico TFBS predictions, and manual curation. The major limitations of our approach are: (i) CTFRs are restricted by prior knowledge, which may render incomplete regulons; (ii) the assumption that a TF either induces or represses its targets (but TFs may have both roles); and (iii) the cell-type dependencies of some TF–target interactions. Taken together, these limitations may cause inaccurate activity estimations for TFs with a dual activator/repressor role or for TFs whose targets vary across cell types (45). Under these considerations, approaches inferring condition-specific regulons from transcriptomic associations have become popular (46). The underlying principle is that TF circuits can be inferred by correlating mRNA levels of the TFs with all other genes (47). However, our comparison revealed that activities derived from CTFRs perform slightly better than those from inferred regulons. Potential explanations may be that: (i) inference methods assume that mRNA levels are good activity indicators of the coded proteins, which may fail for TFs whose activity depends on posttranslational regulation (such as phosphorylation) or indeed their stoichiometric assembly as heteromeric complexes (48); (ii) these methods are susceptible to being confounded by indirect associations or coexpression with other TFs (49); (iii) regulons inferred from primary tumors may not capture regulatory events occurring in cell lines; and (iv) the pervasiveness of somatic mutations changing the function of TFs. Pertinent examples are loss-of-function TP53 missense mutants, which, although abundantly present at the mRNA and protein level, are unable to regulate the expression of their canonical targets. Finally, the inference of such condition-specific networks requires a prior classification of samples, which may not be trivial for heterogeneous cancer cell line panels. An alternative could combine CTFRs with network inference approaches (50).

Nonetheless, our TF predictions based on CTFRs agree with independent essentiality screenings and genomic data and mimic changes in transactivation potential observed in mutagenesis studies. Importantly, CTFRs are able to reproduce known pharmacogenomic interactions, whereas inferred regulons fail to do so. However, it is worth mentioning that our strategy to retrieve CTFRs may favor well-studied TFs, whose targets are thoroughly characterized, thus resulting in biased performances. Further refinement of the approaches to define TF regulon activity in cancer should enable the discovery of further pharmacogenomic interactions, novel markers, and therapeutic opportunities.

Briefly, our results demonstrate that TF activity profiles derived from CTFRs can be used to characterize genomic alterations and drug response in cancer patients, proposing these as promising complementary therapeutic markers. The proposed approach may have strong implications in the refinement of personalized treatment methodologies. We envision that with the increase in the coverage and quality of the CTFRs, the proposed strategy will become instrumental to interpret transcriptional dysregulation in cancer and elucidate its clinical implications.

Disclosure of Potential Conflicts of Interest

No potential conflicts of interest were disclosed.

Authors' Contributions

Conception and design: L. Garcia-Alonso, F. Iorio, J. Saez-Rodriguez
Development of methodology: L. Garcia-Alonso, F. Iorio, S.S. McDade
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): L. Garcia-Alonso, C.H. Benes, S.S. McDade, M.J. Garnett
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): L. Garcia-Alonso, A. Matchan, N. Fonseca, P. Jaaks, F. Falcone, G. Bignell, S.S. McDade
Writing, review, and/or revision of the manuscript: L. Garcia-Alonso, F. Iorio, P. Jaaks, C.H. Benes, I. Dunham, S.S. McDade, J. Saez-Rodriguez
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): L. Garcia-Alonso, A. Matchan, G. Peat, M. Pignatelli, I. Dunham, G. Bignell
Study supervision: C.H. Benes, I. Dunham, J. Saez-Rodriguez

Acknowledgments

This work was supported by Open Targets (grant number OTAR016). Research in the M.J. Garnett laboratory is funded by the Wellcome Trust (102696) and Open Targets (OTAR014). We thank the Gene Expression Atlas team for the help with the RNA sequencing processing, especially Laura Huerta for the curation of sample annotations and Robert Petryszak and Alvis Brazma for the general support. We thank Nils Blüthgen for the help in the curation of JASPAR data. We thank Ultan McDermott, Simon Cook, Stacey Price, Jayeta Saxena, and Hayley Francies for feedback on targeted therapies in melanoma and colorectal cancer cells. We thank Pedro Beltrao, Ivan Costa, Luis Tobalina, and Denes Turei for insightful discussions and providing valuable feedback on the manuscript and Euan Stronach, Paul Fisher, and Glyn Bradley for input in design and analysis. We thank Roberto Battisti for designing and implementing a first prototype version of the DoRothEA online tool.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received June 6, 2017; revised October 16, 2017; accepted December 4, 2017; published OnlineFirst December 11, 2017.

References

1. Levine AJ. p53, the cellular gatekeeper for growth and division. Cell 1997;88:323–31.

2. Semenza GL. Hypoxia-inducible factor 1: oxygen homeostasis and disease pathophysiology. Trends Mol Med 2001;7:345–50.


3. Oliner JD, Kinzler KW, Meltzer PS, George DL, Vogelstein B. Amplification of a gene encoding a p53-associated protein in human sarcomas. Nature 1992;358:80–3.

4. Ohh M, Park CW, Ivan M, Hoffman MA, Kim TY, Huang LE, et al. Ubiquitination of hypoxia-inducible factor requires direct binding to the beta-domain of the von Hippel-Lindau protein. Nat Cell Biol 2000;2:423–7.

5. Gonda TJ, Ramsay RG. Directly targeting transcriptional dysregulation in cancer. Nat Rev Cancer 2015;15:686–94.

6. Alvarez MJ, Shen Y, Giorgi FM, Lachmann A, Ding BB, Ye BH, et al. Functional characterization of somatic mutations in cancer using network-based inference of protein activity. Nat Genet 2016;48:838–47.

7. Osmanbeyoglu HU, Toska E, Chan C, Baselga J, Leslie CS. Pancancer modelling predicts the context-specific impact of somatic mutations on transcriptional programs. Nat Commun 2017;8:14249.

8. Falco MM, Bleda M, Carbonell-Caballero J, Dopazo J. The pan-cancer pathological regulatory landscape. Sci Rep 2016;6:39709.

9. Garnett MJ, Edelman EJ, Heidorn SJ, Greenman CD, Dastur A, Lau KW, et al. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature 2012;483:570–5.

10. Iorio F, Knijnenburg TA, Vis DJ, Bignell GR, Menden MP, Schubert M, et al. A landscape of pharmacogenomic interactions in cancer. Cell 2016;166:740–54.

11. Rees MG, Seashore-Ludlow B, Cheah JH, Adams DJ, Price EV, Gill S, et al. Correlating chemical sensitivity and basal gene expression reveals mechanism of action. Nat Chem Biol 2016;12:109–16.

12. Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, et al. The cancer cell line encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 2012;483:603–7.

13. Forbes SA, Beare D, Boutselakis H, Bamford S, Bindal N, Tate J, et al. COSMIC: somatic cancer genetics at high-resolution. Nucleic Acids Res 2017;45:D777–83.

14. Klijn C, Durinck S, Stawiski EW, Haverty PM, Jiang Z, Liu H, et al. A comprehensive transcriptional portrait of human cancer cell lines. Nat Biotechnol 2014;33:306–12.

15. Fonseca NA, Petryszak R, Marioni J, Brazma A. iRAP - an integrated RNA-seq Analysis Pipeline [Internet]. bioRxiv. 2014 [cited 2017 Feb 27]. Available from: http://biorxiv.org/content/early/2014/06/06/005991.

16. Rahman M, Jackson LK, Johnson WE, Li DY, Bild AH, Piccolo SR. Alternative preprocessing of RNA-sequencing data in the cancer genome atlas leads to improved analysis results. Bioinformatics 2015;31:3666–72.

17. Cowley GS, Weir BA, Vazquez F, Tamayo P, Scott JA, Rusin S, et al. Parallel genome-scale loss of function screens in 216 cancer cell lines for the identification of context-specific genetic dependencies. Sci Data 2014;1:140035.

18. Vaquerizas JM, Kummerfeld SK, Teichmann SA, Luscombe NM. A census of human transcription factors: function, expression and evolution. Nat Rev Genet 2009;10:252–63.

19. Hänzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics 2013;14:7.

20. Müller JM, Krauss B, Kaltschmidt C, Baeuerle PA, Rupec RA. Hypoxia induces c-fos transcription via a mitogen-activated protein kinase-dependent pathway. J Biol Chem 1997;272:23435–9.

21. Uhlén M, Fagerberg L, Hallström BM, Lindskog C, Oksvold P, Mardinoglu A, et al. Proteomics. Tissue-based map of the human proteome. Science 2015;347:1260419.

22. Spaderna S, Schmalhofer O, Hlubek F, Berx G, Eger A, Merkel S, et al.A transient, EMT-linked loss of basement membranes indicates metas-tasis and poor survival in colorectal cancer. Gastroenterology 2006;131:830–40.

23. Ory K, Legros Y, Auguin C, Soussi T. Analysis of the most representativetumour-derived p53mutants reveals that changes in protein conformationare not correlated with loss of transactivation or inhibition of cell prolif-eration. EMBO J 1994;13:3496–504.

24. Zhu J, Sammons MA, Donahue G, Dou Z, Vedadi M, Getlik M, et al. Gain-of-function p53 mutants co-opt chromatin pathways to drive cancergrowth. Nature 2015;525:206–11.

25. Bouaoun L, SonkinD, ArdinM,HollsteinM, ByrnesG, Zavadil J, et al. TP53variations in human cancers: new lessons from the IARC TP53 databaseand genomics data. Hum Mutat 2016;37:865–76.

26. Shibata T, Ohta T, Tong KI, Kokubu A, Odogawa R, Tsuta K, et al. Cancerrelated mutations in NRF2 impair its recognition by Keap1-Cul3 E3 ligaseand promote malignancy. Proc Natl Acad Sci USA 2008;105:13568–73.

27. Vogelstein B, Papadopoulos N, Velculescu VE, Zhou S, Diaz LA Jr, KinzlerKW. Cancer genome landscapes. Science 2013;339:1546–58.

28. Gonzalez-Perez A, Perez-Llamas C, Deu-Pons J, Tamborero D, SchroederMP, Jene-Sanz A, et al. IntOGen-mutations identifies cancer drivers acrosstumor types. Nat Methods 2013;10:1081–2.

29. Dyson N. The regulation of E2F by pRB-family proteins. Genes Dev1998;12:2245–62.

30. Janknecht R, Ernst WH, Pingoud V, Nordheim A. Activation of ternarycomplex factor Elk-1 by MAP kinases. EMBO J 1993;12:5097–104.

31. T€urei D, Korcsm�aros T, Saez-Rodriguez J. OmniPath: guidelines andgateway for literature-curated signaling pathway resources. Nat Methods2016;13:966–7.

32. Hess J, Angel P, Schorpp-Kistner M. AP-1 subunits: quarrel and harmony among siblings. J Cell Sci 2004;117:5965–73.

33. Ceresa BP, Horvath CM, Pessin JE. Signal transducer and activator of transcription-3 serine phosphorylation by insulin is mediated by a Ras/Raf/MEK-dependent pathway. Endocrinology 1997;138:4131–7.

34. Yoshida T, Song L, Bai Y, Kinose F, Li J, Ohaegbulam KC, et al. ZEB1 mediates acquired resistance to the epidermal growth factor receptor-tyrosine kinase inhibitors in non-small cell lung cancer. PLoS One 2016;11:e0147344.

35. Kang W, Tong JHM, Chan AWH, Zhao J, Dong Y, Wang S, et al. Yin Yang 1 contributes to gastric carcinogenesis and its nuclear expression correlates with shorter survival in patients with early stage gastric adenocarcinoma. J Transl Med 2014;12:80.

36. Green WB, Slovak ML, Chen IM, Pallavicini M, Hecht JL, Willman CL. Lack of IRF-1 expression in acute promyelocytic leukemia and in a subset of acute myeloid leukemias with del(5)(q31). Leukemia 1999;13:1960–71.

37. Duque-Afonso J, Wei MC, Lin C-H, Feng J, Buechele C, Wong SH-K, et al. Oncogenic role for the Lck/ZAP70/PLCG2 signaling pathway in Pre-B-ALL pathogenesis. Blood 2015;126:810–810.

38. Berger AJ, Kluger HM, Li N, Kielhorn E, Halaban R, Ronai Z, et al. Subcellular localization of activating transcription factor 2 in melanoma specimens predicts patient survival. Cancer Res 2003;63:8103–7.

39. Lau E, Kluger H, Varsano T, Lee K, Scheffler I, Rimm DL, et al. PKCε promotes oncogenic functions of ATF2 in the nucleus while blocking its apoptotic function at mitochondria. Cell 2012;148:543–55.

40. Sharma V, Young L, Cavadas M, Owen K, Reproducibility Project: Cancer Biology. Registered report: COT drives resistance to RAF inhibition through MAP kinase pathway reactivation. Elife 2016;5:e11414.

41. Shao DD, Tsherniak A, Gopal S, Weir BA, Tamayo P, Stransky N, et al. ATARiS: computational quantification of gene suppression phenotypes from multisample RNAi screens. Genome Res 2013;23:665–78.

42. Vanden Bush TJ, Bishop GA. CDK-mediated regulation of cell functions via c-Jun phosphorylation and AP-1 activation. PLoS One 2011;6:e19468.

43. Clarkson AN, Parker K, Nilsson M, Walker FR, Gowing EK. Combined ampakine and BDNF treatments enhance poststroke functional recovery in aged mice via AKT-CREB signaling. J Cereb Blood Flow Metab 2015;35:1272–9.

44. Scott GK, Chang CH, Erny KM, Xu F, Fredericks WJ, Rauscher FJ III, et al. Ets regulation of the erbB2 promoter. Oncogene 2000;19:6490–502.

45. Slattery M, Zhou T, Yang L, Dantas Machado AC, Gordan R, Rohs R. Absence of a simple code: how transcription factors read the genome. Trends Biochem Sci 2014;39:381–99.

46. Marbach D, Costello JC, Küffner R, Vega NM, Prill RJ, Camacho DM, et al. Wisdom of crowds for robust gene network inference. Nat Methods 2012;9:796–804.

47. Tegnér J, Yeung MKS, Hasty J, Collins JJ. Reverse engineering gene networks: Integrating genetic perturbations with dynamical modeling. Proc Natl Acad Sci USA 2003;100:5944–9.

48. Margolin AA, Califano A. Theory and limitations of genetic network inference from microarray data. Ann N Y Acad Sci 2007;1115:51–72.

49. Marbach D, Prill RJ, Schaffter T, Mattiussi C, Floreano D, Stolovitzky G. Revealing strengths and weaknesses of methods for gene network inference. Proc Natl Acad Sci USA 2010;107:6286–91.

50. Ernst J, Beg QK, Kay KA, Balázsi G, Oltvai ZN, Bar-Joseph Z. A semi-supervised method for predicting transcription factor–gene interactions in Escherichia coli. PLoS Comput Biol 2008;4:e1000044.


Luz Garcia-Alonso, Francesco Iorio, Angela Matchan, et al. Transcription Factor Activities Enhance Markers of Drug Sensitivity in Cancer. Cancer Res 2018;78:769-780. Published OnlineFirst December 11, 2017. doi: 10.1158/0008-5472.CAN-17-1679

Supplementary material: http://cancerres.aacrjournals.org/content/suppl/2017/12/09/0008-5472.CAN-17-1679.DC1
