predictive models of eukaryotic transcriptional regulation ... · predictive models of eukaryotic...
TRANSCRIPT
Predictive models of eukaryotic transcriptional regulationreveals changes in transcription factor roles and promoter usagebetween metabolic conditions
Downloaded from: https://research.chalmers.se, 2020-12-28 19:46 UTC
Citation for the original published paper (version of record):Holland, P., Bergenholm, D., Börlin, C. et al (2019)Predictive models of eukaryotic transcriptional regulation reveals changes in transcriptionfactor roles and promoter usage between metabolic conditionsNucleic Acids Research, 47(10): 4986-5000http://dx.doi.org/10.1093/nar/gkz253
N.B. When citing this work, cite the original published paper.
research.chalmers.se offers the possibility of retrieving research publications produced at Chalmers University of Technology.It covers all kind of research output: articles, dissertations, conference papers, reports etc. since 2004.research.chalmers.se is administrated and maintained by Chalmers Library
(article starts on next page)
4986–5000 Nucleic Acids Research, 2019, Vol. 47, No. 10 Published online 12 April 2019doi: 10.1093/nar/gkz253
Predictive models of eukaryotic transcriptionalregulation reveals changes in transcription factorroles and promoter usage between metabolicconditionsPetter Holland 1, David Bergenholm 1, Christoph S. Borlin 1, Guodong Liu1 andJens Nielsen1,2,3,*
1Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg SE-41296,Sweden, 2Novo Nordisk Foundation Center for Biosustainability, Chalmers University of Technology, GothenburgSE-41296, Sweden and 3Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark,Kgs. Lyngby DK-2800, Denmark
Received January 11, 2019; Revised March 26, 2019; Editorial Decision March 28, 2019; Accepted April 04, 2019
ABSTRACT
Transcription factors (TF) are central to transcrip-tional regulation, but they are often studied in relativeisolation and without close control of the metabolicstate of the cell. Here, we describe genome-widebinding (by ChIP-exo) of 15 yeast TFs in four chemo-stat conditions that cover a range of metabolicstates. We integrate this data with transcriptomicsand six additional recently mapped TFs to identifypredictive models describing how TFs control geneexpression in different metabolic conditions. Contri-butions by TFs to gene regulation are predicted to bemostly activating, additive and well approximated byassuming linear effects from TF binding signal. No-tably, using TF binding peaks from peak finding al-gorithms gave distinctly worse predictions than sim-ply summing the low-noise and high-resolution TFChIP-exo reads on promoters. Finally, we discoverindications of a novel functional role for three TFs;Gcn4, Ert1 and Sut1 during nitrogen limited aerobicfermentation. In only this condition, the three TFshave correlated binding to a large number of genes(enriched for glycolytic and translation processes)and a negative correlation to target gene transcriptlevels.
INTRODUCTION
The relationship between transcription factor (TF) bind-ing to DNA and gene transcription in eukaryotes is com-plex. This is highlighted in several studies integrating chro-matin immunoprecipitation (ChIP)-based TF binding data
with transcriptomics from knockout or knockdown exper-iments of the TF with the goal of defining regulatory tar-gets. Studies of transcriptional response to hormones foundthat in mice with the TF glucocorticoid receptor knockedout, only 11% of the differently expressed genes were tar-geted by the TF (1) and a similar study of human estro-gen receptor function found only 6% of differentially ex-pressed genes to be targeted (2). Integration of a large-scalestudy using microarray transcriptomics in yeast TFs dele-tion strains (3) with previously generated ChIP-chip data(4) for 188 TFs showed even less overlap with an average3% of differentially expressed genes being targeted by thecorresponding TF (5). Thus, combining ChIP methods withtranscriptomics to understand transcriptional regulation ineukaryotic systems gives disappointing results compared tothe demonstrated successes of this approach in bacteria (6).
The increased difficulty in understanding eukaryal generegulation in comparison to bacteria may be explainedby the additional levels of regulation present, such asnucleosome–TF interactions, histone modifications andlong-range effects of binding. The impressive ENCODEdataset containing TF binding, transcriptomics as well asother chromatin features (7) has been used to explore thecontributions of different features of human promoters togene regulation. Using machine learning approaches, strongpredictive models were created and analysis of the modelssuggested a highly interconnected regulatory system whereTF binding has functional interactions with both nucleo-some occupancy and histone modifications to regulate tran-scriptional outcomes (8). A different approach to createpredictive models of transcriptional regulation based onlyon TF binding was to build a model from TF-associationscores that includes both the strength of the binding eventand the distance from a given gene in data collected from
*To whom correspondence should be addressed. Tel: +46 31 772 3804; Fax: +46 31 772 3801; Email: [email protected]
C© The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), whichpermits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Dow
nloaded from https://academ
ic.oup.com/nar/article-abstract/47/10/4986/5446249 by C
halmers U
niversity of Technology user on 16 July 2019
Nucleic Acids Research, 2019, Vol. 47, No. 10 4987
mouse embryonic stem cells (9). For 12 TFs, these scoreswere combined into principal components and using linearregressions on the principal components it was possible topredict an impressive 65% of the variation in gene expres-sion genome wide (9). Such complex analysis methods willbe important tools to fully understand eukaryal transcrip-tional regulation and may allow cell engineering relying onpredictably changes in gene transcription.
While binding has been mapped for most central yeastTFs in one of the impressive large-scale studies (4,10–12),the majority of this data is captured only in a single state ofthe cell; exponential growth in nutrient excess. Here we per-formed a large-scale study of mapping TF binding of multi-ple yeast TFs known to be involved in metabolic regulationby ChIP-exo (chromatin immunoprecipitation with lambdaexonuclease) in four distinct metabolic states of the yeastcell. We integrate TF binding data with transcriptomics ofthe same metabolic conditions with the goal of buildingpredictive models using relatively simple statistical meth-ods that allow full transparency for insights into contribu-tions of different TFs to gene expression. Using ChIP-exoallowed us to study TF binding with high resolution andminimal background and using yeast as a model organismallowed us to study metabolic gene regulation utilizing a va-riety of nutrients with a constant growth rate in chemostats.
MATERIALS AND METHODS
Yeast strains and media
The host strain for all experiments contained in thisstudy was Saccharomyces cerevisiae CEN.PK 113-5D(URA-). CEN.PK 113–5D with Kluyveromyces lactisURA3 (KiURA3) re-integrated was used as control strainfor transcriptome analysis. Strains for ChIP-exo were cre-ated by amplifying either a TAP tag or a 9xMyc tag withKiURA3 and homology arms for recombination into theC-terminal end of the TF coding sequence.
The components of the chemostat media that were dif-ferent between the experimental conditions are as follows:Nitrogen limited media – 1 g/l (NH4)2SO4, 5.3 g/l K2SO4,150 ml/l glucose 40%, 12 drops Antifoam204. Ethanol lim-ited media – 5 g/l (NH4)2SO4, 6.67 ml/l Ethanol 96%,12 drops Antifoam204. Respiratory glucose limited me-dia – 5 g/l (NH4)2SO4, 18.75 ml/l glucose 40%, 12 dropsAntifoam204. Anaerobic glucose limited media – 5 g/l(NH4)2SO4, 25 ml/l glucose 40%, 4 ml/l ergosterol inTween80 (2.6 g/l), 16 drops Antifoam204. In addition tothe previously stated components changing between the me-dia, all media have the following: 14.4 g/l KH2PO4, 0.5g/l MgSO4, 1 ml/l of 1000× vitamin and 1000× tracemetal stock solutions. The 1000× stocks contains the fol-lowing: Vitamins – 0.05 g/l biotin, 0.2 g/l 4-aminobenzoicacid, 1 g/l nicotinic acid, 1 g/l calcium pantothenate,1 g/l pyridoxine HCl, 1 g/l thiamine HCl, and 25 g/lmyo-inositol. Trace metals – 15.0 g/l EDTA-Na2, 4.5 g/lZnSO4·7H2O, 0.84 g/l MnCl2·2H2O, 0.3 g/l CoCl2·6H2O,0.3 g/l CuSO4·5H2O, 0.4 g/l Na2MoO4·2H2O, 4.5 g/lCaCl2·2H2O, 3 g/l FeSO4·7H2O, 1g/l H3BO3 and 0.1 g/lKI. pH of the media was adjusted by adding KOH pelletsto get media pH of 6.0–6.5 that result in a final pH of allchemostat cultures close to 5.5.
Chemostat cultivation
Cells were cultivated in chemostats with a dilution rate of0.1 h−1 at 30c. Stirring and aeration was performed by ei-ther N2 (fermentative glucose metabolism) or pressurizedair (for the three other conditions) supplied to the cultures(13). Cultures were sampled for either ChIP-exo or tran-scriptomics after steady state was achieved for 48–60 h.
ChIP-exo
When chemostat cultures were measured to be stable for48–60 h, formaldehyde with a final concentration of 1%(wt/vol) and distilled water were added to the cultures tocreate a final OD600 of 1.0 and a total volume of 100ml. Cellswere incubated in formaldehyde for 12 min at room temper-ature followed by quenching by addition of L-glycine to a fi-nal concentration of 125 mM. Cells were then washed twicewith cold TBS and snap-frozen with liquid N2. ChIP-exowas then performed according to a protocol based on theoriginally established protocol (14) with certain modifica-tions, as described in (15). Presentation of the ChIP-exo rawdata and replicates is included in Supplementary Data 1.
Peak finding and target gene identification
Peak detection was performed by GEM (16) with defaultparameters. A peak signal threshold of >2-fold peak signalover the local genomic noise was applied and peaks wereannotated to a gene if it was found within –500 to +500 bpof a given genes TSS, as defined by (17). The full list of peaksdetected by GEM (without peak signal threshold) for eachTF is included in Supplementary Data 2.
RNA sequencing
From chemostats at steady-state, 10 OD600 from three bio-logical replicates were collected into tubes and put directlyon ice. Cells were washed twice in cold TBS and snap-frozedin liquid N2. RNA extraction was performed as describedin the manual for the RNeasy® Mini kit (QIAGEN). RNAquality was inspected by Nanodrop, Qubit and Bioanalyzerbefore proceeding with sample preparation for Illumina se-quencing and following sequencing on the NextSeq 500 sys-tem (2 × 75 bp, mid-output mode; Illumina). The RNA se-quencing read counts per gene in each metabolic conditionis included in Supplementary Data 3.
Sequencing data processing
For both the ChIP-exo data and transcriptomics, the rawsequencing output (.fastq) was mapped to a recently pub-lished CEN.PK genome (18) using Bowtie2 (19) with the -Uparameter. Samtools (20) was then used to generate sortedand indexed .bam files by first creating .bam files by the‘view’ command with parameters –bS –q 20 and further the‘sort’ and ‘index’ commands.
For the ChIP-exo data, the read count covering each nu-cleotide position genome-wide was determined from .bamfiles using the genomecov function of BEDTools (21), wherea read length of 10 was used for all TFs. The read counts ofbiological duplicate were then averaged and output as .wig
Dow
nloaded from https://academ
ic.oup.com/nar/article-abstract/47/10/4986/5446249 by C
halmers U
niversity of Technology user on 16 July 2019
4988 Nucleic Acids Research, 2019, Vol. 47, No. 10
files that can be found in our Mendeley Data archives asdescribed under the Data Availability section. All furtherprocessing of ChIP-exo data is through import of .wig filesinto R (22) and executable scripts to reproduce all figuresare included in Supplementary Data 6. We also supply inSupplementary Data 4 processed versions of the .wig data,containing only the total TF read counts for each gene pro-moter (TSS –500 to TSS +500 bp) for each metabolic con-dition. These files can be used directly as inputs to replicateour linear regression analysis.
For RNA sequencing data analysis, Subread (23) wasused to map the aligned reads (.bam) to gene annotations.The resulting output of transcript read counts for the repli-cates from Subread (171206 CENPK subOut wt.txt) canbe found in Supplementary Data 3. Using R (22), geneswere first filtered to only include those with minimum 1read detected in all samples and then the edgeR (24) pack-age was used to calculate FPKM values before averagingthe triplicates. All subsequent analysis using the transcrip-tomics data in the manuscript can be reproduced using theR scripts supplied in Supplementary Data 6.
Linear regressions
The mathematical expression of a simple linear regressionusing one explanatory variable (TF binding of a single TFin this case) to model the relationship to the dependent vari-able (FPKM transcript level in our case) is Yi = �0 + �1 X1i+ εi. Yi is the FPKM of gene i, �0 is the intercept (commonto all genes in the regression), X1i is the amount of TF bind-ing to gene i, �1 is the coefficient that is selected to give thebest fit together with the intercept to the transcript levels(common to all genes in the regression) and εi is the errorof the prediction for gene i.
In multiple linear regressions, several predictors areadded together, each with their own coefficient. The equa-tion then takes the format: Yi = �0 + �1 X1i + . . . + �k Xki+ εi where in our case k indicates the index of a TF. Whileour analysis using TF binding contains 21 TFs, we alwaysuse variable selection (TF selection) from the earth pack-age (25) in R for multiple linear regressions, in which onlythe most predictive set of TFs will be included and addedtogether for predictions of transcript levels of a given setof genes. In some of our analysis we also allow the earthscripts to introduce splines (earth() parameters ‘linpreds =F’ and ‘endspan = 100’), effectively allowing the algorithmto model regions where it is advantageous to have nonlin-earities in the explanatory variables.
RESULTS
Most yeast metabolic TFs show large changes in genes tar-gets in different metabolic states
To get information about several distinctly different statesof yeast metabolism we decided to analyze gene expressionregulation in chemostat cultures operated at the same spe-cific growth rate, but still causing a range of different typesof metabolism; aerobic fermentation using nitrogen limita-tion, respiratory glucose metabolism using glucose limita-tion, fermentative glucose metabolism using anaerobic con-ditions, and gluconeogenic respiration using ethanol limita-
tion. These four states of metabolism should involve largechanges in central carbon metabolism and hence we focusedon TFs that have enriched binding (relative to all otherbinding targets) to central carbon metabolism enzymes. Todefine a list of TFs to focus on we started from the land-mark dataset collected by Harbison et al. containing TFpromoter enrichment genome-wide for a majority of yeastTFs mapped by ChIP-chip in batch cultures with rich media(4). All TFs that had >50 total targeted genes and >5% en-richment of central carbon metabolism genes were selectedas candidates, as well as certain additional TFs suggestedfrom other studies to be important for controlling centralcarbon metabolism. The criteria and process of selectingTFs for this study is described in more detail in Supplemen-tary Data 5.
To map and quantify TF binding, strains were createdwith TFs tagged by a C-terminal TAP or 9xMyc tag. Allstrains were validated for presence of the tag as well as func-tional binding of the tagged TF to a known target gene’spromoter by ChIP-qPCR. The successfully validated strainswere cultivated as biological duplicates in the four differ-ent chemostat conditions and genome-wide binding eventswere mapped and quantified by ChIP-exo. This methodis an improvement over ChIP-seq, including exonucleasetreatment of the cross-linked TF-DNA complex to increasethe resolution and reduce unspecific background binding(14). A demonstration of our raw data and replicates isshown for each TF in Supplementary Data 1.
Peaks were identified by GEM (16), the duplicates aver-aged, and a signal threshold of >2 peak signal relative to thenoise of the local genomic context was applied. Comparingto what degree the targeted genes overlap between the ex-perimental conditions (Figure 1A), only two of the 15 TFsfirst reported in this study show a relatively stable set of tar-gets between the studied nutrient limited conditions: Cbf1and Gcr1. For the remaining TFs there are large changes inwhich genes are being targeted between conditions. By an-alyzing the targeted genes for the most relatively enrichedgene ontology (GO) term, we confirm many well-knownmetabolic roles for these TFs, but also find indications fornew functions such as drug transport for Ert1 and carbohy-drate transport for Sut1 (Figure 1A).
For further analysis of the relationship between TF bind-ing and transcriptional outcomes, we combined binding in-formation of the 15 TFs reported here with data for an ad-ditional 6 TFs obtained with the same experimental con-ditions and using the same protocols (Ino2, Ino4, Hap1,Oaf1 and Pip2 from Bergenholm et al. (26) and Stb5 fromOuyang et al. (27) (Supplementary Figure S1A). We firstexplored the general distribution of peaks on promotersrelative to the TSS and we found the expected strong en-richment upstream of the TSS for all metabolic conditions(Supplementary Figure S1b). Notably, when comparing thenumber of peaks detected for each condition, we discov-ered a significant decrease in count of peaks for most of thestudied TFs in aerobic fermentation (Supplementary Fig-ure S1C).
To look for an explanation for the broad changes in whichgenes are targeted between conditions we compared themost enriched DNA motif bound by the TFs for the dif-ferent metabolic conditions. For most TFs, we found only
Dow
nloaded from https://academ
ic.oup.com/nar/article-abstract/47/10/4986/5446249 by C
halmers U
niversity of Technology user on 16 July 2019
Nucleic Acids Research, 2019, Vol. 47, No. 10 4989
A
B
Figure 1. (A) Genome-wide TF binding characterization for multiple conditions of 15 TFs first described in this study. The most enriched DNA sequence(defined by MEME) is indicated under the TF name. The euler diagrams indicate the number of genes targeted between the experimental conditions andhow the conditions overlap. Numbers are shown for all overlaps with more than two genes. For each condition, the top significant (if there is any withP.adj < 0.05) GO category for the genes targeted that condition is shown in the same color color of the condition in the euler diagram. (B) A summary ofour experimental approach.
Dow
nloaded from https://academ
ic.oup.com/nar/article-abstract/47/10/4986/5446249 by C
halmers U
niversity of Technology user on 16 July 2019
4990 Nucleic Acids Research, 2019, Vol. 47, No. 10
minor differences in the enriched motif between conditions(Supplementary Figure S2A). There are a few notable situa-tions where the most enriched motif may be different in onecondition, for example Gcn4 and Rtg1 in aerobic fermen-tation, but we cannot conclude if these cases are from a truechange of DNA preference for the TF or if it is due to noisyvariation from the TF having fewer peaks in aerobic fer-mentation. For the lipid metabolism TFs Ino2, Ino4, Hap1,Oaf1 and Pip2, this analysis was performed in Bergenholmet al. (26), with similar conclusions.
Our data and peak detection was further compared totwo previously published datasets. We focused on eight TFsthat are first reported in this manuscript where binding inmultiple conditions was reported by Harbison et al. (4)and we also compared to the refined peak definitions re-ported in the MacIsaac et al. (28). We see strong over-lap for many TFs and generally stronger overlap with theHarbison et al. experiments using synthetic media (Supple-mentary Figure S2B), an expected observation because thismedia more closely resembles our experimental conditions.Two TFs show relatively poor overlap with existing datasets,Rtg1 and Rtg3, but we do see a strong enrichment of bothTFs to genes involved in amino acid biosynthesis (Figure1A), which is a role for these TFs that is thoroughly demon-strated (29,30) and makes us confident in the quality of ourdata also for these TFs.
While the selection of TFs was focused on finding TFs en-riched for binding to central carbon metabolism genes, wedecided to expand the gene sets for further studies of howthe TFs are affecting transcriptional regulation to cover allmetabolic genes. Metabolic genes were defined as being in-cluded in the latest published yeast genome-scale model,v7.6 (31); in total 849 genes from the model that have aclearly defined TSS (17) and where we also have robust geneexpression data from transcriptomics were selected for fur-ther analysis. Using all metabolic genes was a compromiseto have enough genes for strong statistical power and re-liable observations from predictive models, but also retainthe property of having relatively good TF-coverage of thegenes. Our experimental approach is summarized in Figure1B.
Comparing predictive models of transcriptional regulation
We next compared performance of different types of pre-processing of the TF binding data in predicting transcriptlevels (measured by RNA sequencing) using multiple linearregressions. The regressions assume a linear relationship be-tween TF binding and effects on transcriptional regulationand we build a model where TFs binding signal is multipliedby a coefficient and added together to predict transcript lev-els. We first tested different signal/noise ratio (SNR) thresh-olds for TF peak binding signal, but found only a mini-mal effect on performance of the predictive models (Figure2A). A different numeric representation of TF binding isto sum TF binding over an interval of DNA and we foundthat summing all binding -50 to +50bp around the identifiedpeaks gave stronger predictive power to transcriptional out-comes (Figure 2A). We further tested an even simpler sum-mation of the whole promoter region and found that thisgave even better predictive power (Figure 2A). We think this
improvement is most likely driven by contributions to tran-scriptional regulation from relatively weaker TF bindingevents that are not strong enough to be detected by a peakfinding algorithm. The reduced background of ChIP-exo ishere leveraged to be able to detect such weaker events overbackground noise. The promoter signal sum data formatwas also tested with multivariate adaptive regression splines(MARS) (32). In MARS, if it is advantageous for predictionperformance, the algorithm can introduce splines in the lin-ear regressions, effectively allowing a type of peak definitionwhere the peak threshold (spline) is introduced to create alinear relationship between TF binding and transcript levelsonly for a certain range of TF binding strength. We foundthat with MARS, the performance of the predictions fur-ther increased.
We were curious to see where in the promoter region TFbinding is most strongly contributing to gene regulation. Wetested the predictive power of binding in segments of thepromoter using linear regressions and found that bindingsignal upstream of the TSS (where we also detect the ma-jority of strong TF-binding peaks, Supplementary FigureS1B) is predicted to be most consequential for transcrip-tional regulation (Supplementary Figure S2C), but with anotable influence also from binding directly downstream ofthe TSS. Comparing the conditions, it appears that there isa relative increase in influence of TF binding directly down-stream of the TSS in aerobic fermentation (SupplementaryFigure S2c; highest point of red line is downstream of TSSwhile highest point of the other conditions is upstream ofTSS). To select a region of a gene’s promoter which capturesas much as possible of the consequential TF binding for fur-ther analysis, we started with the assumption of a symmet-rical region around the TSS (assumed based on Supplemen-tary Figure S2c) and tested extensions of this region in 50 bpincrements for predicting transcript levels (SupplementaryFigure S2d). The performance of predictions increase untilit reaches –500 to +500 around the TSS, after which thereis no further increase, indicating that this region contains amajority of the consequential TF binding.
MARS define a set of core TFs for different conditions and re-veal general quantitative features of the relationship betweenTF binding and transcriptional regulation
Based on the finding that MARS provided the best pre-dictions of transcript levels from TF binding (Figure 2A),we explored what we could learn about the roles and func-tions of TFs from MARS regressions. For multiple linearregressions, the interesting parameters to describe TF func-tion are the coefficients, which can tell us if the TF is anactivator (positive correlation between binding and tran-script levels) or a repressor (negative correlation betweenbinding and transcript levels). The MARS algorithm canalso introduce splines to improve prediction performance,which can be another parameter of TF function, describ-ing the range of TF binding where there is a linear relation-ship with transcriptional outcomes. Finally, variable selec-tion in MARS will select only the best combination of TFsto predict as much as possible while penalizing increasingthe complexity of the model. To define a set of TFs that aremost strongly predictive of transcriptional regulation for
Dow
nloaded from https://academ
ic.oup.com/nar/article-abstract/47/10/4986/5446249 by C
halmers U
niversity of Technology user on 16 July 2019
Nucleic Acids Research, 2019, Vol. 47, No. 10 4991
A
TF binding data preprocessing
Predicting transcript levels from TF binding
2.5 3.5 4.5 5.5
1.5
2.0
2.5
3.0
Cbf1
log10(binding)
log1
0(N
it FP
KM
)
3.5 4.0 4.5 5.0
1.5
2.0
2.5
3.0 Gcr1
log10(binding)
log1
0(N
it FP
KM
)
2.0 2.5 3.0 3.5 4.0 4.5 5.0
1.5
2.0
2.5
3.0 Hap1
log10(binding)
log1
0(N
it FP
KM
)
3.5 4.0 4.5 5.0 5.5 6.0
1.5
2.0
2.5
3.0
Ino4
log10(binding)
log1
0(N
it FP
KM
)
1 2 3 4 5
1.5
2.0
2.5
3.0
Rds2
log10(binding)
log1
0(N
it FP
KM
)
2.5 3.0 3.5 4.0 4.5
1.5
2.0
2.5
3.0 Rtg1
log10(binding)
log1
0(N
it FP
KM
)
2 3 4 5
1.5
2.0
2.5
3.0
Cat8
log10(binding)
log1
0(E
th F
PK
M)
3.0 3.5 4.0 4.5 5.0 5.5
1.5
2.0
2.5
3.0
Gcn4
log10(binding)
log1
0(E
th F
PK
M)
3.0 3.5 4.0 4.5 5.0 5.5
1.5
2.0
2.5
3.0
Gcr1
log10(binding)
log1
0(E
th F
PK
M)
2.5 3.0 3.5 4.0 4.5
1.5
2.0
2.5
3.0 Hap4
log10(binding)
log1
0(E
th F
PK
M)
3 4 5 6
1.5
2.0
2.5
3.0
Ino2
log10(binding)
log1
0(E
th F
PK
M)
1.5 2.5 3.5 4.5
1.5
2.0
2.5 Cat8
log10(binding)
log1
0(A
na F
PK
M)
2.5 3.5 4.5 5.5
1.5
2.0
2.5 Gcn4
log10(binding)
log1
0(A
na F
PK
M)
1 2 3 4
1.5
2.0
2.5 Gcr2
log10(binding)
log1
0(A
na F
PK
M)
3.5 4.0 4.5 5.0
1.5
2.0
2.5 Hap1
log10(binding)
log1
0(A
na F
PK
M)
3.5 4.0 4.5 5.0 5.5 6.0
1.5
2.0
2.5 Ino4
log10(binding)
log1
0(A
na F
PK
M)
2.0 2.5 3.0 3.5 4.0 4.5
1.5
2.0
2.5 Tye7
log10(binding)
log1
0(A
na F
PK
M)
3.0 3.5 4.0 4.5 5.0 5.5 6.0
2.0
2.5
3.0 Gcn4
log10(binding)
log1
0(G
lu F
PK
M)
1 2 3 4 5
2.0
2.5
3.0 Gcr2
log10(binding)
log1
0(G
lu F
PK
M)
3.5 4.0 4.5 5.0 5.5
2.0
2.5
3.0 Hap1
log10(binding)
log1
0(G
lu F
PK
M)
3.0 3.4 3.8 4.2
2.0
2.5
3.0 Hap4
log10(binding)
log1
0(G
lu F
PK
M)
3 4 5 6
2.0
2.5
3.0 Ino4
log10(binding)
log1
0(G
lu F
PK
M)
C D E
B
TFs and splines used TFs and splines used TFs and splines used
TFs and splines used
GDH3
BDH1ACS1
GCV3
CDC19
PMT2FUN26
CYS3
ADE1
YAT1
PHO11
IMA5
OPT1
PHO90
ELO1
ERG20
FBP26
INO1
YUR1
GLG2
LCB3URA2
TRK1
RPE1
GSH1LSB6
PHS1
GWT1
ARG3
ARG2YJL068C
BNA3
TDH1
YJL045W
RNR2
CYR1
AVT1
TDH2
MET3GPI14
ILV3
TES1
BNA1
CYC1
UTR1 CDC8
OPI3
MIR1
BNA2
SFC1
URA8
ADO1CPA2
YMR1
ATP2
STR2
XPT1
MET5
HOM6
PMT4
BAT2
DAL5
JEN1
URA1SAC1
TRP3MST1
PXA2
SPE1
PRS1
TPO5
MCD4
GPM1
SDH1
AVT3
SDH3
TGL1
PGM1
OAC1
AAT1
GFA1
YJU3
CAB3MDH1
VMA5
YNK1
FBA1
OAR1UGP1
MAE1
GPX1URA6
RAM2
ATP7
LAC1
AUR1
MET14
FOX2
SPO14
GAP1
SHB17
YSR3GLG1
KTR2
CCP1
GPT2
MET1
SIS2
MTD1
TGL4
PCK1
MHT1
MMP1
AQY2YBT1 FPS1
SDH2
GPI13
TPO1
DPS1
BPT1
YEH1
LOT6
MEU1
YEH2
AAT2ADE16 COX12
PDC1
ERG3
SHM2
FRS1XYL2
ALT1
SUL2
ERG27
AHP1
CKI1
PDC5
NHA1
PUT1
SPE4
ACS2
ASP3−2
ASP3−3DPH5
IDP2 SAM1
ATG26
COQ9
PNP1
BNA5
VPS34
CDD1
GSY2
LCB5
ECI1
NNT1
ATP14
ECM38
EXG1
MET17
ACO1
STT4
CDA1 NMA1FKS1
DIC1 TAL1
ILV5
ADE13
SUR4
FBP1
VAC14
COX8
VIP1URA4
IMD3
CAR2
VMA6
HMG2
ERG13PHO84
NDI1
COQ5
URA5
TSL1
ALO1
ATP18
HMG1
DAK1NTE1
IMD4
CYB2
CAT2
AMD1
APT1
ERG6
GLO1
PLB2
PLB1
ADI1
HXT2
SEC59
ERG5
FMS1
ARA2
STV1
AAC1
ARG7
ADH3
VBA1
PGM2 ILV2
FOL3ADE17
NDE1
PAH1
ALD3
ALD2
GCV2
ERG2
PFK2
HFA1ERG12 GUA1
ERG8
YMR226C
YHM2
FAA4
GAD1
COX7
TPS3
URA10
SCS7PGM3
GPI12
ABZ2
HER2
LCB1LIP1
ADE4
ADH2
TGL3
ADH6
FET4
FIG4
PHA2PUS4
ERG24
MET2
ALP1
LYP1
PIK1
FOL1
YNL247W
ZWF1
ADE12
SPS19
CHS1
PSD1
MEP2
AAH1
CPT1 NRK1
MLS1
INP52
LEU4
AVT4
MSK1LAT1
COX5A
LAP2
IDH1
KTR5IDP3
PET8
CIT1
LRO1URK1PHO91
ARE2
ABZ1
COQ2
MVD1
LYS9
BIO5 BIO4
DSE4
GRE2
RIB4
ARG8
PFK27MDH2
ITR2
WRS1
COQ3
ADH1
HST1
RIB2
INP54
MET22
PRS5
GPD2ARG1
SPE2
GSH2
MSE1TAT2
PLB3HST3GLO4
VHS3
CYT1
NRT1
CDC21
TGL5RKI1
KTR1
CRC1
LEU9
INP53
GCY1
CAT5
IAH1
ADE2
ORT1
IDH2
LSC1
THI80
ISN1
DDP1GLN4
LCB4
ALE1
HEM15
DCI1
SER1
THI72
HIS3
NPT1
MCT1
ODC2
DFR1
MET7
DGA1
VPH1HEM4
CPA1
DGK1
FAA1
PMT3
PRO2
VMA4
ALA1
PYK2
PUT4
PDE2
ALD4
GDH1
ATF1
IMD2
SAM3
SAM4
ATP15
PLC1
DIP5
FUM1YAH1
HUT1
VMA11
FAS2THI6 PGC1
POS5
COX10
CDC60PPT2
ODC1
IDI1
CAR1
MSD1
MSY1
SSU1
GLR1
YDC1
ATP4
BTS1
ALD6
GRX5
SUR1
KTR6
ISM1
PMA2
ERG10
MET12
CIT3
PDH1
ICL2
ATP20AGC1
ATH1
HTS1
GLN1
VMA13
MSF1 ARO7FCY1
SPE3TKL1
GRS2
PIS1
ANT1
MEP3
TAZ1
ASN1
TPO3
KRE6
GPH1 MET16
DPM1GDB1
QCR2
AQY1
ATP1
BNA4
ILS1
PRS4
PRX1
COR1
FUI1
URA7
RIB1
PET9
ACH1
SCT1 RER2COQ1
UGA2
IPP1
FUR4
CHS3
ETR1
CDS1
HMT1
PDX3
CSG2
CHS2
ATP3
FAT1
CST26
TSC3
BAP2
TAT1
MIS1
AAC3 VPS15
ALG1
LYS2
TKL2
GRS1
TPS1
VMA2
AGP2
ADH5
ARA1
RIB7
IFA38
CSH1
TYR1
ECM31 YPC1RIM2
PGI1
KTR4
LDH1 KTR3
MET8
PYC2
PDB1HIS7
ARO4
DUT1
RIB5
SHM1
TSC10CTP1
VBA2
SUL1
PHO89
CHA1PBN1
APA1
GLK1
ATG22
GRX1
HIS4ILV6
PGS1
CIT2
ADY2
PGK1
SLM5
PMP1
FEN2
FEN1 RBK1
PHO87ARE1
THR4ERS1
TRX3
MPH2
GUD1
GDH2
HEM3
GGC1
VMA1
LYS20
DLD2
DLD1
SFA1
CRD1BPL1
LYS21
QRI1
GET3PMT1
PMT5
RAM1NDE2
THI3
MDH3
COX9IDP1
PSA1
SLC1
FAD1
SIR2
GPD1
TSC13
ATP16
NTH1
GCV1
SES1
ARO3HEM13
BAP3
HEM12
TPI1
TGL2
LCB2
IPT1
TPS2GRX3
ARO1
YCF1EKI1
KGD2
HOM2
ARG82
SDH4HST4
CAB5COQ4
MSS4
ADK1LYS4
FMN1
CTA1
EXG2
MSW1
GLO2DPP1
DPL1
SUR2ATP5
PRO1
GPI11
HNT2IPK1
ASP1
TIM11
YDR341C
HXT7HXT6
HXT3
TRR1TRP4
KEI1
YPR1FRQ1
ARH1ATP17
ATO3
HPT1
URH1
DIT1
ADE8
BNA7
GUK1PHO8
KRE2RIB3
ITR1
SAM2
LPP1
GNP1
GRX2
QCR7
CAB1
STL1
DLD3
HXT13CAN1
PCM1
VMA8
YEL047C
GLY1
GDA1 CYC7
UTR4
BUD16
VMA3
RIP1
URA3
PMP2
YEA6
PMI40YND1
HEM14FAA2 ISC1PRO3YAT2
CHO1
PHM8
SAH1
HOM3PIC2
HIS1
FCY2
FCY21
CEM1
HOR2
ICL1
ARG5
RNR1
ALD5 SER3
ILV1
TRP2
MET6
PRS2
AVT6
COX15YER152C
ADK2TMT1
PDA1
FAU1
AGP3
SEC53
AGX1FRS2
LPD1
GNA1
HXT10
DEG1
GSY1
FAB1
HIS2
MET10
QCR6
BNA6
HXK1HXK2
PDE1
GUS1
ADE5
VRG4SDT1
POX1
ARO8
COX13
COX4TPN1STR3
LYS5
ARO2
GPI10
MET13
COQ8GUP1
FMP37HNM1
NPY1PUS2
PYC1
OLE1
HEM2
PNC1
TRP5
ERG4
LEU1
PMA1
ERG26ECT1
NMA2
VMA7
GSC2
MUP1
ERG25
ADE6
VHT1PDC6
CTT1
VAS1
TPC1
CLD1
MEP1
ASN2
CYS4
CHO2
PSD2
MSM1
ERG1
ATF2
QCR9
TYS1
HIP1
TDH3
PDX1
XKS1
ADE3
SER2 TRX2
PFK1
FMP43
LSC2
CPD1
SOL4
ENO1
COQ6GND2 TNA1
MES1
FOL2
CAB4
BGL2
BIO2
IMA1
MAL11
MAL12
MUP3
GUT1
DUR3
PRS3LAG1
QCR10
LEU5
ERG11
DIA4
ARG4
DED81
YHR020W
THR1
VMA16 PUT2
VMA10
NCP1INM1
COX6
PAN5
DYS1
ERG7
QNS1
MSR1
HXT4
HXT1
HXT5
GEP4
GRE3
TRR2
EPT1 FUR1
ARO9
DCD1
MPC2
SOL3
ENO2
GND1ERG9
BAT1SUC2
POT1
GUT2
PAN6
FLX1
KGD1 AYR1
COX5B
PFK26LYS12
CAB2THS1
SER33
RHR2
HIS6
PDR11
DOT5FAA3
YIA6
INP51DAL1
DAL4
DAL2
DAL7DAL3
LYS1
HYR1HTD2INM2
DUR1
SKN1
AUS1 KCS1
FAS1
THI7
HEM1TSA2
PCT1
PXA1
HST2
NAM2 APA2
YEF1
RNR4TRP1
ACC1
THI20ADH4
GLT1UGA1
HIS5
TPO4AGP1FMT1
PUS12
3
1 2 3log10(Nit_FPKM)
Pre
dict
ed lo
g10(
Nit_
FPK
M)
R2: 0.339
GDH3
BDH1ACS1
GCV3
CDC19
PMT2
FUN26
CYS3
ADE1
YAT1
PHO11
IMA5
OPT1 PHO90ELO1
ERG20
FBP26
INO1
YUR1
GLG2
LCB3
URA2
TRK1
RPE1
GSH1
LSB6
PHS1
GWT1
ARG3
ARG2
YJL068C
BNA3
TDH1
YJL045W
RNR2
CYR1AVT1
TDH2
MET3
GPI14
ILV3
TES1
BNA1
CYC1
UTR1
CDC8
OPI3
MIR1
BNA2 SFC1
URA8
ADO1
CPA2
YMR1
ATP2
STR2
XPT1
MET5
HOM6
PMT4
BAT2
DAL5
JEN1
URA1
SAC1
TRP3
MST1PXA2
SPE1 PRS1
TPO5
MCD4
GPM1
SDH1AVT3
SDH3
TGL1
PGM1
OAC1
AAT1GFA1
YJU3
CAB3
MDH1
VMA5
YNK1
FBA1
OAR1
UGP1MAE1
GPX1
URA6RAM2
ATP7
LAC1
AUR1
MET14
FOX2
SPO14
GAP1
SHB17
YSR3
GLG1
KTR2
CCP1GPT2
MET1
SIS2
MTD1
TGL4PCK1
MHT1MMP1
AQY2
YBT1 FPS1
SDH2
GPI13
TPO1
DPS1
BPT1
YEH1
LOT6
MEU1YEH2
AAT2
ADE16
COX12
PDC1
ERG3
SHM2
FRS1
XYL2 ALT1
SUL2
ERG27
AHP1
CKI1
PDC5NHA1
PUT1
SPE4
ACS2
ASP3−2ASP3−3
DPH5
IDP2
SAM1ATG26
COQ9PNP1
BNA5
VPS34
CDD1
GSY2
LCB5ECI1NNT1
ATP14
ECM38
EXG1
MET17ACO1
STT4
CDA1
NMA1
FKS1
DIC1
TAL1
ILV5
ADE13SUR4FBP1
VAC14
COX8
VIP1
URA4
IMD3
CAR2
VMA6
HMG2
ERG13
PHO84
NDI1
COQ5
URA5
TSL1
ALO1
ATP18
HMG1
DAK1
NTE1
IMD4
CYB2
CAT2AMD1
APT1
ERG6
GLO1PLB2
PLB1
ADI1
HXT2
SEC59
ERG5
FMS1
ARA2
STV1
AAC1
ARG7
ADH3
VBA1
PGM2
ILV2
FOL3
ADE17
NDE1
PAH1
ALD3
ALD2
GCV2
ERG2
PFK2
HFA1
ERG12GUA1
ERG8
YMR226CYHM2
FAA4
GAD1
COX7
TPS3 URA10
SCS7
PGM3
GPI12
ABZ2
HER2
LCB1
LIP1
ADE4ADH2TGL3
ADH6
FET4
FIG4
PHA2
PUS4
ERG24MET2
ALP1
LYP1
PIK1
FOL1 YNL247W
ZWF1
ADE12
SPS19
CHS1PSD1 MEP2
AAH1
CPT1NRK1
MLS1
INP52
LEU4
AVT4
MSK1
LAT1
COX5A
LAP2
IDH1
KTR5
IDP3
PET8
CIT1
LRO1
URK1
PHO91
ARE2
ABZ1
COQ2MVD1
LYS9
BIO5BIO4
DSE4GRE2
RIB4
ARG8
PFK27
MDH2
ITR2 WRS1
COQ3
ADH1
HST1
RIB2INP54
MET22
PRS5
GPD2ARG1
SPE2
GSH2
MSE1
TAT2PLB3
HST3GLO4
VHS3
CYT1
NRT1
CDC21
TGL5
RKI1
KTR1
CRC1
LEU9
INP53
GCY1
CAT5
IAH1 ADE2
ORT1
IDH2
LSC1
THI80
ISN1
DDP1GLN4
LCB4
ALE1HEM15
DCI1
SER1
THI72
HIS3
NPT1
MCT1
ODC2
DFR1
MET7
DGA1
VPH1
HEM4
CPA1
DGK1
FAA1
PMT3
PRO2
VMA4
ALA1
PYK2
PUT4
PDE2
ALD4
GDH1
ATF1IMD2
SAM3 SAM4 ATP15
PLC1
DIP5FUM1
YAH1
HUT1VMA11
FAS2
THI6
PGC1
POS5
COX10
CDC60
PPT2
ODC1
IDI1
CAR1
MSD1MSY1
SSU1
GLR1
YDC1ATP4
BTS1
ALD6
GRX5
SUR1
KTR6
ISM1
PMA2
ERG10
MET12CIT3
PDH1
ICL2ATP20
AGC1
ATH1
HTS1
GLN1
VMA13
MSF1
ARO7
FCY1
SPE3
TKL1
GRS2
PIS1
ANT1
MEP3
TAZ1
ASN1
TPO3
KRE6GPH1
MET16DPM1
GDB1
QCR2
AQY1
ATP1
BNA4
ILS1
PRS4
PRX1
COR1
FUI1
URA7
RIB1
PET9
ACH1
SCT1
RER2COQ1
UGA2
IPP1
FUR4CHS3
ETR1
CDS1
HMT1 PDX3CSG2
CHS2
ATP3
FAT1
CST26
TSC3
BAP2
TAT1MIS1
AAC3
VPS15
ALG1
LYS2
TKL2
GRS1
TPS1
VMA2
AGP2ADH5
ARA1RIB7 IFA38
CSH1TYR1
ECM31
YPC1
RIM2
PGI1
KTR4
LDH1
KTR3
MET8
PYC2
PDB1
HIS7
ARO4
DUT1
RIB5SHM1
TSC10
CTP1VBA2
SUL1PHO89
CHA1
PBN1
APA1
GLK1
ATG22
GRX1
HIS4
ILV6
PGS1 CIT2
ADY2
PGK1
SLM5
PMP1
FEN2
FEN1
RBK1PHO87
ARE1
THR4
ERS1
TRX3
MPH2
GUD1
GDH2
HEM3
GGC1
VMA1
LYS20
DLD2
DLD1
SFA1
CRD1BPL1
LYS21
QRI1
GET3
PMT1
PMT5
RAM1 NDE2
THI3
MDH3
COX9IDP1
PSA1SLC1
FAD1
SIR2
GPD1
TSC13
ATP16
NTH1GCV1
SES1
ARO3
HEM13BAP3
HEM12
TPI1
TGL2
LCB2
IPT1
TPS2
GRX3
ARO1
YCF1EKI1
KGD2HOM2
ARG82
SDH4
HST4
CAB5COQ4
MSS4ADK1
LYS4
FMN1
CTA1
EXG2
MSW1
GLO2
DPP1DPL1SUR2
ATP5
PRO1
GPI11 HNT2
IPK1
ASP1
TIM11YDR341CHXT7
HXT6
HXT3TRR1
TRP4
KEI1
YPR1FRQ1ARH1
ATP17ATO3
HPT1
URH1
DIT1 ADE8
BNA7
GUK1
PHO8
KRE2
RIB3
ITR1
SAM2
LPP1
GNP1
GRX2
QCR7
CAB1 STL1
DLD3
HXT13CAN1
PCM1VMA8
YEL047C
GLY1
GDA1
CYC7UTR4
BUD16VMA3
RIP1
URA3 PMP2
YEA6
PMI40YND1
HEM14
FAA2
ISC1PRO3
YAT2
CHO1
PHM8 SAH1
HOM3
PIC2
HIS1FCY2
FCY21
CEM1
HOR2
ICL1
ARG5
RNR1
ALD5
SER3
ILV1
TRP2
MET6
PRS2
AVT6
COX15
YER152C
ADK2
TMT1PDA1
FAU1
AGP3SEC53
AGX1
FRS2
LPD1
GNA1HXT10
DEG1GSY1
FAB1
HIS2
MET10
QCR6
BNA6
HXK1
HXK2PDE1
GUS1
ADE5
VRG4
SDT1
POX1
ARO8
COX13COX4
TPN1
STR3
LYS5
ARO2
GPI10
MET13
COQ8
GUP1FMP37
HNM1NPY1
PUS2
PYC1
OLE1
HEM2
PNC1
TRP5
ERG4
LEU1PMA1
ERG26
ECT1
NMA2
VMA7
GSC2
MUP1
ERG25
ADE6 VHT1PDC6
CTT1
VAS1TPC1
CLD1MEP1
ASN2
CYS4
CHO2
PSD2
MSM1
ERG1
ATF2
QCR9
TYS1HIP1
TDH3
PDX1
XKS1
ADE3
SER2TRX2
PFK1
FMP43
LSC2
CPD1
SOL4
ENO1
COQ6
GND2
TNA1
MES1FOL2
CAB4
BGL2
BIO2
IMA1
MAL11
MAL12
MUP3
GUT1
DUR3
PRS3
LAG1
QCR10
LEU5
ERG11
DIA4
ARG4
DED81
YHR020W
THR1
VMA16
PUT2
VMA10
NCP1
INM1
COX6
PAN5
DYS1
ERG7
QNS1MSR1
HXT4
HXT1
HXT5
GEP4GRE3
TRR2
EPT1
FUR1
ARO9DCD1
MPC2
SOL3
ENO2
GND1
ERG9
BAT1SUC2
POT1
GUT2
PAN6
FLX1
KGD1
AYR1COX5B
PFK26
LYS12
CAB2
THS1
SER33
RHR2
HIS6
PDR11 DOT5FAA3YIA6
INP51
DAL1DAL4
DAL2
DAL7
DAL3
LYS1
HYR1
HTD2
INM2
DUR1
SKN1AUS1
KCS1
FAS1
THI7
HEM1
TSA2PCT1
PXA1
HST2NAM2
APA2
YEF1
RNR4
TRP1
ACC1
THI20
ADH4
GLT1
UGA1
HIS5
TPO4
AGP1FMT1
PUS1
1.5
2.0
2.5
3.0
3.5
1 2 3log10(Glu_FPKM)
Pred
icte
d lo
g10(
Glu
_FPK
M)
R2: 0.342
GDH3
BDH1
ACS1
GCV3
CDC19
PMT2
FUN26
CYS3ADE1
YAT1
PHO11
IMA5OPT1
PHO90
ELO1 ERG20
FBP26
INO1
YUR1
GLG2LCB3
URA2
TRK1
RPE1
GSH1
LSB6
PHS1
GWT1
ARG3
ARG2
YJL068C
BNA3
TDH1
YJL045W
RNR2
CYR1AVT1
TDH2
MET3
GPI14
ILV3
TES1
BNA1
CYC1
UTR1CDC8
OPI3
MIR1BNA2
SFC1URA8
ADO1
CPA2YMR1
ATP2
STR2 XPT1
MET5
HOM6
PMT4
BAT2
DAL5
JEN1
URA1
SAC1
TRP3
MST1PXA2
SPE1PRS1
TPO5
MCD4
GPM1
SDH1AVT3
SDH3
TGL1
PGM1
OAC1
AAT1GFA1
YJU3
CAB3
MDH1VMA5
YNK1
FBA1
OAR1
UGP1
MAE1
GPX1
URA6
RAM2
ATP7LAC1 AUR1
MET14
FOX2SPO14
GAP1SHB17
YSR3
GLG1
KTR2CCP1GPT2
MET1SIS2
MTD1
TGL4
PCK1
MHT1MMP1AQY2
YBT1 FPS1
SDH2GPI13
TPO1
DPS1
BPT1
YEH1
LOT6
MEU1
YEH2
AAT2
ADE16
COX12
PDC1
ERG3
SHM2
FRS1
XYL2
ALT1
SUL2
ERG27
AHP1
CKI1
PDC5NHA1
PUT1
SPE4
ACS2
ASP3−2
ASP3−3
DPH5
IDP2
SAM1
ATG26
COQ9PNP1
BNA5
VPS34CDD1
GSY2LCB5ECI1 NNT1
ATP14
ECM38
EXG1MET17
ACO1
STT4
CDA1
NMA1
FKS1
DIC1
TAL1
ILV5
ADE13
SUR4FBP1VAC14
COX8
VIP1 URA4
IMD3CAR2
VMA6
HMG2
ERG13
PHO84
NDI1
COQ5URA5
TSL1
ALO1
ATP18
HMG1
DAK1
NTE1
IMD4
CYB2
CAT2AMD1 APT1
ERG6
GLO1PLB2
PLB1
ADI1
HXT2
SEC59
ERG5
FMS1
ARA2
STV1
AAC1ARG7
ADH3
VBA1PGM2
ILV2
FOL3
ADE17
NDE1
PAH1
ALD3
ALD2
GCV2
ERG2
PFK2
HFA1
ERG12
GUA1
ERG8
YMR226C
YHM2FAA4
GAD1
COX7
TPS3
URA10
SCS7
PGM3 GPI12
ABZ2
HER2
LCB1
LIP1
ADE4
ADH2
TGL3ADH6
FET4
FIG4
PHA2
PUS4
ERG24
MET2
ALP1
LYP1
PIK1
FOL1YNL247W
ZWF1
ADE12
SPS19
CHS1PSD1
MEP2
AAH1
CPT1NRK1
MLS1INP52
LEU4
AVT4
MSK1
LAT1
COX5A
LAP2
IDH1
KTR5
IDP3
PET8
CIT1
LRO1URK1
PHO91ARE2
ABZ1
COQ2
MVD1LYS9
BIO5BIO4
DSE4
GRE2
RIB4
ARG8
PFK27MDH2
ITR2WRS1COQ3
ADH1
HST1
RIB2
INP54
MET22
PRS5
GPD2
ARG1
SPE2
GSH2MSE1
TAT2
PLB3
HST3
GLO4VHS3
CYT1
NRT1
CDC21TGL5
RKI1
KTR1
CRC1
LEU9
INP53 GCY1CAT5
IAH1
ADE2
ORT1IDH2
LSC1THI80 ISN1
DDP1
GLN4
LCB4 ALE1HEM15
DCI1
SER1
THI72
HIS3
NPT1
MCT1
ODC2DFR1
MET7DGA1
VPH1
HEM4
CPA1
DGK1
FAA1
PMT3PRO2
VMA4ALA1
PYK2
PUT4
PDE2
ALD4
GDH1
ATF1
IMD2
SAM3SAM4
ATP15
PLC1
DIP5
FUM1
YAH1
HUT1VMA11
FAS2
THI6
PGC1POS5COX10
CDC60
PPT2ODC1
IDI1
CAR1
MSD1MSY1
SSU1
GLR1
YDC1
ATP4
BTS1
ALD6
GRX5
SUR1
KTR6
ISM1PMA2
ERG10
MET12
CIT3PDH1
ICL2
ATP20
AGC1
ATH1
HTS1
GLN1VMA13
MSF1
ARO7
FCY1
SPE3
TKL1
GRS2 PIS1ANT1MEP3
TAZ1
ASN1
TPO3KRE6GPH1
MET16
DPM1
GDB1
QCR2AQY1
ATP1
BNA4 ILS1PRS4PRX1
COR1
FUI1
URA7RIB1
PET9ACH1
SCT1
RER2COQ1
UGA2
IPP1FUR4
CHS3
ETR1
CDS1
HMT1
PDX3
CSG2CHS2
ATP3
FAT1CST26
TSC3
BAP2
TAT1 MIS1AAC3
VPS15ALG1
LYS2
TKL2
GRS1
TPS1VMA2AGP2
ADH5
ARA1
RIB7IFA38
CSH1TYR1
ECM31
YPC1
RIM2
PGI1
KTR4
LDH1KTR3
MET8
PYC2
PDB1
HIS7
ARO4
DUT1
RIB5SHM1
TSC10CTP1
VBA2
SUL1PHO89
CHA1
PBN1APA1
GLK1
ATG22
GRX1
HIS4
ILV6
PGS1
CIT2
ADY2
PGK1
SLM5
PMP1
FEN2
FEN1RBK1PHO87
ARE1
THR4ERS1
TRX3
MPH2
GUD1
GDH2
HEM3
GGC1
VMA1
LYS20
DLD2
DLD1
SFA1CRD1
BPL1
LYS21
QRI1
GET3PMT1PMT5
RAM1
NDE2
THI3 MDH3COX9
IDP1
PSA1SLC1
FAD1
SIR2
GPD1
TSC13
ATP16NTH1
GCV1
SES1
ARO3
HEM13
BAP3
HEM12
TPI1
TGL2LCB2
IPT1
TPS2
GRX3ARO1
YCF1
EKI1
KGD2
HOM2
ARG82SDH4
HST4
CAB5
COQ4
MSS4
ADK1
LYS4
FMN1
CTA1EXG2MSW1
GLO2DPP1
DPL1
SUR2
ATP5
PRO1GPI11HNT2
IPK1
ASP1TIM11
YDR341C
HXT7
HXT6
HXT3
TRR1
TRP4
KEI1
YPR1FRQ1
ARH1 ATP17
ATO3 HPT1
URH1
DIT1
ADE8
BNA7
GUK1
PHO8
KRE2
RIB3ITR1
SAM2
LPP1
GNP1
GRX2
QCR7
CAB1
STL1
DLD3
HXT13
CAN1
PCM1
VMA8
YEL047C
GLY1
GDA1
CYC7UTR4
BUD16 VMA3
RIP1
URA3
PMP2
YEA6 PMI40YND1HEM14
FAA2
ISC1PRO3
YAT2
CHO1
PHM8
SAH1HOM3
PIC2
HIS1
FCY2
FCY21
CEM1
HOR2
ICL1
ARG5
RNR1
ALD5
SER3
ILV1
TRP2
MET6
PRS2AVT6
COX15
YER152C
ADK2TMT1
PDA1
FAU1AGP3
SEC53
AGX1
FRS2
LPD1GNA1
HXT10
DEG1
GSY1
FAB1
HIS2 MET10
QCR6
BNA6
HXK1
HXK2
PDE1GUS1
ADE5
VRG4SDT1
POX1
ARO8
COX13COX4
TPN1
STR3
LYS5
ARO2
GPI10
MET13
COQ8
GUP1
FMP37
HNM1
NPY1
PUS2
PYC1
OLE1
HEM2
PNC1TRP5
ERG4
LEU1
PMA1
ERG26
ECT1
NMA2
VMA7GSC2
MUP1
ERG25
ADE6VHT1
PDC6
CTT1VAS1
TPC1
CLD1
MEP1
ASN2
CYS4
CHO2PSD2
MSM1
ERG1
ATF2
QCR9
TYS1HIP1
TDH3
PDX1XKS1
ADE3
SER2 TRX2
PFK1 FMP43
LSC2
CPD1
SOL4
ENO1
COQ6
GND2TNA1MES1 FOL2
CAB4 BGL2
BIO2
IMA1
MAL11
MAL12
MUP3
GUT1DUR3
PRS3LAG1
QCR10
LEU5
ERG11
DIA4
ARG4
DED81YHR020W
THR1
VMA16
PUT2
VMA10
NCP1
INM1
COX6
PAN5
DYS1
ERG7
QNS1
MSR1HXT4
HXT1
HXT5
GEP4
GRE3
TRR2EPT1
FUR1
ARO9
DCD1
MPC2
SOL3
ENO2
GND1
ERG9
BAT1
SUC2
POT1
GUT2
PAN6
FLX1
KGD1
AYR1
COX5B
PFK26
LYS12
CAB2
THS1SER33
RHR2
HIS6PDR11
DOT5FAA3
YIA6INP51DAL1
DAL4
DAL2
DAL7
DAL3
LYS1
HYR1
HTD2
INM2
DUR1
SKN1
AUS1
KCS1
FAS1
THI7
HEM1
TSA2
PCT1
PXA1
HST2
NAM2
APA2
YEF1
RNR4
TRP1
ACC1
THI20
ADH4GLT1
UGA1
HIS5
TPO4
AGP1
FMT1
PUS1
1
2
3
1 2 3log10(Ana_FPKM)
R2: 0.347
GDH3BDH1
ACS1
GCV3
CDC19
PMT2
FUN26
CYS3
ADE1
YAT1
PHO11
IMA5
OPT1
PHO90
ELO1ERG20
FBP26
INO1
YUR1
GLG2
LCB3
URA2
TRK1
RPE1
GSH1LSB6
PHS1
GWT1
ARG3
ARG2YJL068CBNA3
TDH1YJL045W
RNR2
CYR1AVT1
TDH2
MET3
GPI14
ILV3
TES1BNA1
CYC1
UTR1CDC8
OPI3
MIR1
BNA2
SFC1
URA8
ADO1CPA2YMR1
ATP2
STR2
XPT1
MET5
HOM6PMT4
BAT2
DAL5
JEN1
URA1
SAC1
TRP3
MST1PXA2
SPE1
PRS1
TPO5MCD4
GPM1
SDH1
AVT3
SDH3
TGL1
PGM1
OAC1
AAT1
GFA1 YJU3
CAB3 MDH1
VMA5
YNK1
FBA1
OAR1UGP1
MAE1
GPX1URA6
RAM2
ATP7
LAC1
AUR1
MET14
FOX2
SPO14
GAP1
SHB17
YSR3
GLG1KTR2
CCP1
GPT2
MET1SIS2
MTD1
TGL4
PCK1
MHT1
MMP1
AQY2
YBT1
FPS1
SDH2
GPI13
TPO1
DPS1
BPT1
YEH1
LOT6
MEU1YEH2
AAT2
ADE16
COX12
PDC1
ERG3SHM2
FRS1
XYL2 ALT1
SUL2
ERG27
AHP1
CKI1
PDC5 NHA1
PUT1
SPE4
ACS2
ASP3−2
ASP3−3
DPH5
IDP2
SAM1
ATG26COQ9
PNP1
BNA5
VPS34CDD1
GSY2
LCB5
ECI1NNT1
ATP14
ECM38
EXG1
MET17
ACO1
STT4CDA1
NMA1
FKS1DIC1
TAL1
ILV5
ADE13SUR4
FBP1
VAC14
COX8
VIP1
URA4
IMD3
CAR2
VMA6
HMG2
ERG13PHO84
NDI1
COQ5URA5
TSL1
ALO1
ATP18
HMG1
DAK1NTE1
IMD4
CYB2
CAT2
AMD1APT1
ERG6
GLO1PLB2
PLB1
ADI1 HXT2SEC59
ERG5
FMS1
ARA2
STV1
AAC1
ARG7
ADH3
VBA1 PGM2
ILV2
FOL3
ADE17
NDE1
PAH1ALD3
ALD2
GCV2
ERG2
PFK2
HFA1
ERG12
GUA1
ERG8
YMR226C
YHM2FAA4
GAD1
COX7
TPS3URA10
SCS7
PGM3
GPI12
ABZ2
HER2
LCB1
LIP1
ADE4
ADH2
TGL3ADH6
FET4
FIG4 PHA2PUS4
ERG24
MET2
ALP1LYP1
PIK1
FOL1
YNL247W
ZWF1
ADE12
SPS19
CHS1
PSD1
MEP2
AAH1
CPT1
NRK1
MLS1
INP52
LEU4
AVT4
MSK1 LAT1
COX5A
LAP2
IDH1
KTR5
IDP3
PET8
CIT1
LRO1URK1
PHO91
ARE2
ABZ1COQ2 MVD1
LYS9
BIO5
BIO4
DSE4GRE2
RIB4
ARG8
PFK27
MDH2
ITR2
WRS1COQ3
ADH1HST1
RIB2INP54
MET22
PRS5
GPD2 ARG1
SPE2
GSH2MSE1
TAT2
PLB3
HST3GLO4
VHS3
CYT1
NRT1
CDC21
TGL5
RKI1
KTR1
CRC1
LEU9
INP53
GCY1
CAT5IAH1
ADE2
ORT1
IDH2
LSC1
THI80
ISN1
DDP1
GLN4
LCB4ALE1HEM15
DCI1
SER1
THI72HIS3
NPT1
MCT1 ODC2
DFR1
MET7
DGA1
VPH1
HEM4
CPA1
DGK1
FAA1
PMT3
PRO2VMA4
ALA1
PYK2
PUT4
PDE2
ALD4
GDH1
ATF1
IMD2
SAM3SAM4
ATP15
PLC1DIP5
FUM1
YAH1
HUT1
VMA11FAS2
THI6
PGC1POS5
COX10
CDC60
PPT2
ODC1
IDI1
CAR1
MSD1
MSY1
SSU1
GLR1
YDC1
ATP4
BTS1
ALD6
GRX5
SUR1
KTR6
ISM1PMA2
ERG10
MET12
CIT3
PDH1
ICL2
ATP20
AGC1
ATH1
HTS1
GLN1
VMA13
MSF1ARO7
FCY1SPE3
TKL1
GRS2
PIS1
ANT1
MEP3
TAZ1
ASN1
TPO3
KRE6GPH1 MET16
DPM1GDB1
QCR2
AQY1
ATP1
BNA4
ILS1PRS4
PRX1
COR1
FUI1
URA7
RIB1
PET9
ACH1
SCT1
RER2COQ1UGA2 IPP1
FUR4CHS3
ETR1
CDS1
HMT1
PDX3
CSG2
CHS2
ATP3
FAT1
CST26
TSC3BAP2TAT1
MIS1
AAC3
VPS15
ALG1
LYS2
TKL2
GRS1
TPS1
VMA2AGP2
ADH5
ARA1
RIB7IFA38
CSH1
TYR1
ECM31
YPC1
RIM2
PGI1
KTR4
LDH1 KTR3
MET8
PYC2
PDB1
HIS7
ARO4
DUT1
RIB5SHM1
TSC10CTP1
VBA2
SUL1
PHO89
CHA1 PBN1
APA1
GLK1
ATG22
GRX1
HIS4
ILV6
PGS1
CIT2ADY2
PGK1
SLM5
PMP1
FEN2
FEN1RBK1
PHO87ARE1
THR4
ERS1
TRX3
MPH2
GUD1
GDH2
HEM3
GGC1
VMA1
LYS20
DLD2
DLD1
SFA1CRD1BPL1
LYS21
QRI1
GET3
PMT1PMT5RAM1
NDE2
THI3MDH3
COX9IDP1
PSA1SLC1FAD1
SIR2
GPD1
TSC13
ATP16
NTH1
GCV1
SES1
ARO3HEM13
BAP3
HEM12
TPI1
TGL2LCB2
IPT1
TPS2
GRX3
ARO1
YCF1
EKI1
KGD2
HOM2ARG82
SDH4
HST4CAB5
COQ4
MSS4
ADK1
LYS4
FMN1
CTA1
EXG2MSW1
GLO2
DPP1
DPL1
SUR2
ATP5
PRO1
GPI11HNT2
IPK1
ASP1
TIM11
YDR341C
HXT7HXT6
HXT3
TRR1TRP4
KEI1
YPR1
FRQ1
ARH1
ATP17
ATO3
HPT1URH1
DIT1
ADE8
BNA7
GUK1
PHO8
KRE2
RIB3
ITR1 SAM2
LPP1
GNP1
GRX2
QCR7
CAB1
STL1
DLD3
HXT13CAN1
PCM1
VMA8YEL047C
GLY1
GDA1CYC7UTR4 BUD16
VMA3
RIP1
URA3
PMP2
YEA6PMI40
YND1
HEM14
FAA2
ISC1
PRO3
YAT2
CHO1
PHM8
SAH1
HOM3
PIC2HIS1
FCY2
FCY21
CEM1
HOR2
ICL1
ARG5
RNR1
ALD5
SER3
ILV1
TRP2
MET6
PRS2AVT6COX15YER152C
ADK2TMT1 PDA1
FAU1
AGP3
SEC53
AGX1
FRS2
LPD1
GNA1
HXT10
DEG1
GSY1
FAB1
HIS2
MET10
QCR6
BNA6
HXK1
HXK2
PDE1
GUS1
ADE5
VRG4
SDT1
POX1ARO8
COX13
COX4
TPN1
STR3
LYS5ARO2
GPI10
MET13
COQ8
GUP1
FMP37
HNM1
NPY1
PUS2
PYC1
OLE1
HEM2
PNC1TRP5
ERG4
LEU1
PMA1
ERG26
ECT1NMA2
VMA7
GSC2
MUP1
ERG25
ADE6
VHT1PDC6
CTT1VAS1
TPC1
CLD1
MEP1
ASN2CYS4CHO2
PSD2
MSM1
ERG1ATF2
QCR9
TYS1
HIP1
TDH3
PDX1XKS1
ADE3
SER2
TRX2
PFK1
FMP43
LSC2
CPD1 SOL4
ENO1
COQ6GND2
TNA1
MES1 FOL2
CAB4BGL2
BIO2IMA1
MAL11
MAL12
MUP3
GUT1
DUR3
PRS3
LAG1
QCR10
LEU5
ERG11
DIA4
ARG4DED81
YHR020W
THR1
VMA16
PUT2
VMA10NCP1
INM1
COX6
PAN5
DYS1
ERG7QNS1
MSR1HXT4
HXT1
HXT5
GEP4GRE3
TRR2EPT1 FUR1
ARO9DCD1
MPC2
SOL3
ENO2
GND1ERG9
BAT1
SUC2
POT1 GUT2PAN6
FLX1
KGD1
AYR1
COX5B
PFK26
LYS12CAB2
THS1
SER33
RHR2
HIS6PDR11
DOT5
FAA3
YIA6INP51
DAL1DAL4
DAL2DAL7DAL3
LYS1
HYR1HTD2
INM2
DUR1
SKN1
AUS1
KCS1
FAS1
THI7
HEM1
TSA2
PCT1
PXA1
HST2
NAM2
APA2
YEF1
RNR4
TRP1
ACC1
THI20ADH4
GLT1
UGA1 HIS5
TPO4
AGP1
FMT1 PUS1
2
3
1 2 3 4log10(Eth_FPKM)
Pre
dict
ed lo
g10(
Eth
_FP
KM
)
R2: 0.428
Aerobic fermentation(nitrogen limited)
Gluconeogenic respiration(ethanol limited)
Respiratory glucose metabolism(glucose limited)
Fermentative glucose metabolism(glucose limited, anaerobic)
Gluconeogenic respiration(ethanol limited)
Aerobic fermentation(nitrogen limited)
Fermentative glucose metabolism(glucose limited, anaerobic)Respiratory glucose metabolism(glucose limited)
NitEth
AnaGlu
Nit
Eth
Ana
Glu
Nit
Eth
AnaGlu
Nit
Eth
Ana
Glu
Nit
Eth
Ana
Glu Nit
Eth
Ana
Glu Nit
Eth
Ana
Glu
0.2
0.3
0.4
GEMSNR > 1
Peak signal
GEMSNR > 2
Peak signal
GEMSNR > 3
Peak signal
GEMSNR > 2Sum of
−50 to +50 bpsurrounding
peaks
Sum of TSS−500 −> +500 bp
promotersignal
Sum of TSS−500 −> +500 bp
promotersignal
allowingMARS hinges
R2
Pre
dict
ed lo
g10(
Ana
_FP
KM
)
Figure 2. Comparing performance of TF binding data preprocessing in linear regressions to predict transcript levels and details of multivariate adaptiveregression splines (MARS) models. (A) Correlations between predicted transcript levels and real transcript levels for the different formats of TF bindingdata. The black line indicates the mean of the four metabolic conditions. (B–E) MARS used to predict metabolic gene transcript levels of the differentconditions from the amount of TF binding per gene promoter. The boxes shown below the predictions plots represent the different TFs that are selectedby MARS to give strongest predictive performance in the conditions and how their signal is contributing to predictions in the model.
each condition, we used MARS with cross validation for theTF and spline selection to ensure that only the most robustTFs and splines were included in the final model (modelbuilding is illustrated in Supplementary Figure S3A). Theresulting predictions of transcript levels from TF bindingusing MARS for the four metabolic conditions are shownin Figure 2B–E. Using these conservative MARS models,
we could predict 34–43% of the variation in expression lev-els of metabolic genes at the four metabolic conditions weinvestigated. We judged this predictive power as being suf-ficiently good to assume that the parameters given to theTFs in these predictive models can give insights into howthe TFs are contributing to gene regulation in metabolism.Typical quality control checks of the MARS models pre-
Dow
nloaded from https://academ
ic.oup.com/nar/article-abstract/47/10/4986/5446249 by C
halmers U
niversity of Technology user on 16 July 2019
4992 Nucleic Acids Research, 2019, Vol. 47, No. 10
sented in Figure 2B–E, such as the distribution of residualsand QQ plots are shown in Supplementary Figure S3B.
The details of how TFs are contributing at different TFbinding strengths in these predictive models are shown inthe lower panels of Figure 2B–E. The coefficient of the TFsfrom the regressions are illustrated here by either an up-ward slope (activation) or a downward slope (repression).A striking observation from these models is that of the 22TF-transcript level correlations selected by MARS over thefour metabolic conditions, the linear correlations were allpositive, suggesting a predominance of activation in yeastmetabolic gene regulation by TFs. MARS can also createsegments where there is no correlation between TF bindingand transcript levels if that is advantageous for the predic-tions. The most common way that the MARS algorithmscreated splines from TF binding (15/22 cases) is to intro-duce a threshold after which there is a linear relationshipbetween TF binding and transcript level. MARS also foundrelationships between TF binding and transcription thatshow a saturation effect, where more binding does not leadto higher expression (6/22 cases). Of these six cases, fourof them are from the well-known interaction partners Ino2and Ino4, suggesting that nonlinearity between binding andfunctional outcomes may be a general feature of this TFcomplex.
Contributions from several TFs on gene regulation are gen-erally additive, but exploring collinear TFs indicates cases ofmore complex functional interactions
In multiple linear regressions, correlation between explana-tory variables (here TFs) leads to multicollinearity, a redun-dancy in the information contained in explanatory variableswhich may complicate the interpretation of the model, Inour data, collinearity between TFs could be due to TFsinteracting in a common protein complex, or respondingindependently to the same cellular cues that regulate TF-DNA binding. A feature of the TF selection in MARS isthat if there is collinearity between two TFs that are bothstrongly predictive of transcript levels, only the TF thatgives slightly better predictions will be included while theother TF will not be visible. To look for such collinearTFs we first mapped the correlations in binding signal overmetabolic genes and calculated significance of the correla-tions (Figure 3A–D). To find cases in the MARS models ofcollinearity where a TF is not included because there is aslightly better predictor selected, we tested if all includedTFs could be substituted by other TFs with significantlycorrelated binding. Such cases are shown with black bordersin Figure 3A–D and they are TF pairs predicted to regulatesimilar sets of genes in similar ways, indicating overlappingfunctions and/or possibly more complex TF–TF interac-tions. This analysis revealed several known cases of TF in-teractions such as Ino2–Ino4, Gcr1-Gcr2-Tye7 and Cat8-Sip4, but also novel potential interactions such as Gcn4-Rtg1 and Ert1-Ino4.
In the MARS models shown in Figure 2B–E, the contri-bution of TFs binding to each gene is multiplied by a coef-ficient and then added to get the final predicted transcriptlevel for that gene. We further looked for TF-TF interac-tions that contribute to transcriptional regulation in ways
that are numerically more complex than simple addition.All the significantly correlated TFs were tested if the multi-plication of the signal of two collinear TFs give additionalpredictive power compared to addition of the two TFs (Fig-ure 3E–H). Most collinear TF pairs do not show a strongimprovement in predictive power by including a multiplica-tive interaction term, for example the mentioned potentialTF interactions of Cat8-Sip4 and Gcn4-Rtg1 during glu-coneogenic respiration which only gave a 3% and 4% in-crease in predictive power, respectively (Figure 3F, percent-age improvement calculated by (multiplicative R2 increase(y-axis) + additive R2 (x-axis))/additive R2 (x-axis)). TheTF pair that shows the clearest indications of having a morecomplex functional interaction is Ino2–Ino4, having 19%,11%, 39% and 20% improvement (Figure 3E–H) in predic-tive power in the tested metabolic conditions by including amultiplication of the binding signals. TF pairs that togetherexplain >10% of the metabolic gene variation using an onlyadditive regression and also show minimum 10% improvedpredictive power when allowing multiplication are indicatedin red in Figure 3E–H. For Ino2–Ino4, the strongest effectof the multiplication term is seen during fermentative glu-cose metabolism with 39% improved predictive power (Fig-ure 3G). The plot for how the multiplied Ino2–Ino4 signalis contributing to the regression in this condition reveal thatin the genes where both TFs bind strongest together, thereis a predicted reduced activation as compared to interme-diate binding strengths of both TFs, and a similar trend isseen for the Ino2–Ino4 pair for other metabolic conditions(Supplementary Figure S3c).
Clustering metabolic genes based on their relative change inexpression gives a strong enrichment of metabolic processesand improved predictive power of TF binding in linear regres-sions
Linear regressions of metabolic genes with TF selectionthrough MARS defined a small set of TFs that wererobustly associated with transcriptional changes over allmetabolic genes (Figure 2B–E), but TFs that only regu-late a smaller group of genes would be unlikely to get se-lected by this method. We clustered genes by their sum-of-squares normalized expression between conditions to getsmaller clusters of genes with a range of gene expression lev-els that are appropriate for predictive modeling by multiplelinear regressions. The motivation for clustering genes intosmaller groups is to be able to link TFs to specific patterns ofgene expression changes between the tested metabolic con-ditions and to functionally connected groups of genes– thusallowing more detailed predictions about the TFs’ biologi-cal roles. The optimal number of clusters to maximize theseparation of the normalized expression values of metabolicgenes was 16, as determined by Bayesian information crite-rion (Supplementary Figure S4A). Genes were sorted into16 clusters by k-means clustering and we found that mostclusters then show significant enrichment of metabolic pro-cesses, represented by GO categories (Figure 4). We furtherselected four clusters (indicated by black frames in Figure4) that are both enriched for genes of central metabolic pro-cesses and have large transcriptional changes across the dif-ferent metabolic conditions for further studies of how TFs
Dow
nloaded from https://academ
ic.oup.com/nar/article-abstract/47/10/4986/5446249 by C
halmers U
niversity of Technology user on 16 July 2019
Nucleic Acids Research, 2019, Vol. 47, No. 10 4993
p-value*** < 0.001** < 0.01* < 0.05
p-value*** < 0.001** < 0.01* < 0.05
*
*
*
*
*
*
*
* *
*
*
*
** **
**
**
**
**
***
−1
−0.8
−0.6
−0.4
−0.2
0
0.2
0.4
0.6
0.8
1
Gcn
4R
tg1
Hap
1E
r t1
Sut
1O
af1
Stb
5C
bf1
Ino2
Ino4
Gcr
2G
cr1
T ye7
Cat
8R
ds2
Rgt
1P
ip2
Rtg
3H
ap4
Sip
4
Leu3Gcn4
Rtg1Hap1
Ert1Sut1
Oaf1Stb5
Cbf1Ino2
Ino4Gcr2
Gcr1Tye7
Cat8Rds2
Rgt1Pip2
Rtg3Hap4
*
*
*
* *
**
**
**
** **
***
***
***
−1
−0.8
−0.6
−0.4
−0.2
0
0.2
0.4
0.6
0.8
1
Gcn
4R
tg1
Hap
4Le
u3R
gt1
Sip
4S
ut1
Pip
2E
rt1C
at8
Rds
2G
cr2
Gcr
1Ty
e7C
bf1
Hap
1In
o2In
o4O
af1
Stb
5
Rtg3Gcn4
Rtg1Hap4
Leu3Rgt1
Sip4Sut1
Pip2Ert1
Cat8Rds2
Gcr2Gcr1
Tye7Cbf1
Hap1Ino2
Ino4Oaf1
Interchangable byMARS TF selection
Interchangable byMARS TF selection
A
B
C
D
p-value*** < 0.001** < 0.01* < 0.05
*
*
*
*
*
*
*
*
*
**
**
**
**
***
***
***
***
−1
−0.8
−0.6
−0.4
−0.2
0
0.2
0.4
0.6
0.8
1
Ino4
Cbf
1H
ap1
Leu3
Rtg
1R
tg3
Sut
1H
ap4
Er t1
Gcn
4S
tb5
Oaf
1P
ip2
Gcr
2S
ip4
Gcr
1Ty
e7R
gt1
Cat
8R
ds2
Ino2Ino4
Cbf1Hap1
Leu3Rtg1
Rtg3Sut1
Hap4Ert1
Gcn4Stb5
Oaf1Pip2
Gcr2Sip4
Gcr1Tye7
Rgt1Cat8
Interchangable byMARS TF selection Cbf1−Ert1
Cbf1−Rtg3Ert1−Rtg3
Gcr1−Tye7Ert1−Hap1
Ert1−Gcn4
Ert1−Hap4
Ert1−Ino2
Ert1−Ino4
Gcn4−Hap4
Gcn4−Ino2 Gcn4−Ino4
Hap4−Ino2
Hap4−Ino4
Ino2−Ino4Cat8−Rds2
0.00
0.02
0.04
0.06
0.00 0.05 0.10 0.15 0.20R2 when adding TF pair contributions
R2
incr
ease
allo
win
g TF
pai
r mul
tiplic
atio
n
p-value*** < 0.001** < 0.01* < 0.05
*
**
**
***
***
−1
−0.8
−0.6
−0.4
−0.2
0
0.2
0.4
0.6
0.8
1
Gcr
1Ty
e7C
bf1
Ino2
Ino4
Leu3
Gcn
4R
tg1
Rtg
3P
ip2
Oaf
1H
ap1
Stb
5R
gt1
Sut
1S
ip4
Cat
8R
ds2
Ert1
Hap
4
Gcr2Gcr1
Tye7Cbf1
Ino2Ino4
Leu3Gcn4
Rtg1Rtg3
Pip2Oaf1
Hap1Stb5
Rgt1Sut1
Sip4Cat8
Rds2Ert1
Interchangable byMARS TF selection
Cat8−Sip4
Gcn4−Rtg1
Gcr1−Tye7
Ino2−Ino4
0.00
0.02
0.04
0.06
0.00 0.05 0.10 0.15 0.20R2 when adding TF pair contributions
R2
incr
ease
allo
win
g TF
pai
r mul
tiplic
atio
n
Cat8−Rds2Gcn4−Rtg1
Gcr1−Gcr2Gcr1−Tye7
Gcr2−Tye7
Ert1−Hap1Ert1−Ino2
Ert1−Ino4
Ert1−Oaf1Ert1−Sut1 Hap1−Ino2
Hap1−Ino4
Hap1−Oaf1
Hap1−Sut1
Ino2−Ino4
Ino2−Oaf1
Ino2−Sut1Ino4−Oaf1Ino4−Sut1
Oaf1−Sut1
Hap1−Sip4Ino2−Sip4
Ino4−Sip4
Cbf1−Gcr1Cbf1−Tye70.00
0.02
0.04
0.06
0.00 0.05 0.10 0.15 0.20R2 when adding TF pair contributions
R2
incr
ease
allo
win
g TF
pai
r mul
tiplic
atio
n
Gcn4−Rtg1
Gcr1−Gcr2Gcr1−Tye7
Gcr2−Tye7Hap1−Ino2 Hap1−Ino4
Hap1−Oaf1
Hap1−Stb5
Ino2−Ino4
Ino2−Oaf1Ino2−Stb5
Ino4−Oaf1
Ino4−Stb5Oaf1−Stb50.00
0.02
0.04
0.06
0.00 0.05 0.10 0.15 0.20R2 when adding TF pair contributions
R2
incr
ease
allo
win
g TF
pai
r mul
tiplic
atio
n
E
F
G
H
Aerobic fermentation (nitrogen limited)
Gluconeogenic respiration (ethanol limited)
Respiratory glucose metabolism(glucose limited)
Fermentative glucose metabolism(glucose limited, anaerobic)
Aerobic fermentation (nitrogen limited)
Gluconeogenic respiration (ethanol limited)
Fermentative glucose metabolism(glucose limited, anaerobic)
Respiratory glucose metabolism(glucose limited)
Figure 3. Exploring contributions of collinear TF pairs to transcriptional regulation. (A–D) Correlation plots illustrating Pearsons correlations (in color)between TF binding in promoters of metabolic genes. Significance (Pearson’s product moment correlation coefficient) is illustrated for TF pairs with P< 0.05, by one or several asterisks, as indicated. Pairs of significantly collinear TFs that are interchangeable in the MARS TF selection in Figure 2B–Eare indicated by a stronger border in (A–D). (E–H) Linear regressions of collinear TF pairs were tested with and without allowing a multiplication of TFsignals of the two TFs. TF pairs indicated in red and with larger fonts have an R2 of the additive regression >0.1 and increased performance with includinga multiplication of the TF pairs of at least 10%.
Dow
nloaded from https://academ
ic.oup.com/nar/article-abstract/47/10/4986/5446249 by C
halmers U
niversity of Technology user on 16 July 2019
4994 Nucleic Acids Research, 2019, Vol. 47, No. 10
Genes clustered (k-means)by their relative transcript levelsacross conditions
#3 - cellular response to oxidative stress, p.adj: 2.5e−3
#8
#6
#16 - fatty acid beta−oxidation, p.adj: 8.8e−12
#2
#7 - ATP synthesis coupled proton transport, p.adj: 7.7e−9
#12 - tricarboxylic acid cycle, p.adj: 7.2e−4
#15 - glycolytic process, p.adj: 4.7e−10
#10
#5
#9 - cellular response to nitrogen starvation, p.adj: 2.1e−3
#11 - tRNA aminoacylation for protein translation, p.adj: 2.5e-4
#1 - maltose metabolic process, p.adj: 3.5e-3
#4 - ergosterol biosynthetic process, p.adj: 1.7e-5
#13
#14
0.2 0.6RSS-normalized transcript
0.4 0.8
Cluster number and most enriched GO-term (FDR-adj. p < 0.01)
Aerobicfermentation
(nitrogenlimited)
Gluconeogenicrespiration
(ethanollimited)
Respiratory glucose
metabolism(glucoselimited)
Fermentativeglucose
metabolism(glucoselimited,
anaerobic)
Figure 4. Clustering genes by their relative change in expression (sum of squares normalization) over the four experimental conditions gives enrichment offunctional groups of genes. For clusters which have one or several significantly (FDR-adj P < 0.01) enriched GO terms, the top GO term is indicated withp.adj-value. Clusters containing central metabolic processes selected for further analysis with linear regressions in Figure 5 are indicated by a black frame.
are affecting gene regulation in these clusters through multi-ple linear regressions. While the introduction of splines washighly stable for linear regressions over all metabolic genes,we found the process of model building with MARS usingsplines to be less stable in smaller groups of genes (meancluster size with 16 clusters is 55 genes). For the multiplelinear regressions in the clusters, we retained TF selection(by variable selection in the MARS algorithm) to define themost important TFs, but without introduction of splines.
Using this framework of multiple linear regression, pre-dictions of transcriptional regulation on the clustered genesgives an improvement in predictive power compared to pre-dictions of all metabolic genes (Figure 5E–H, R2: 0.57–0.68). To compare the importance of different TFs for thepredictions of transcript levels in the groups over differentconditions, we calculate the ‘TF importance’ by multiply-ing R2 of the multiple linear regression predictions with therelative contribution of the TF in the linear regression (0–1, calculated by model construction algorithm) and also acoefficient for activation or repression (+1 or –1, respec-tively). Some TFs were found to regulate a certain processover several conditions, such as Hap1 for Cluster 4, enrichedfor ergosterol biosynthesis genes (Figure 5A), but Cluster4 is generally an example of a cluster with relatively largechanges in importance of different TFs for gene regulationin different conditions. To get information about the com-plete set of TFs regulating these clusters of genes, we alsoincluded collinear TFs that were not initially included in
the variable selection, but could replace a significantly cor-related TF (illustrated by a red link under the TF’s namesin the heatmaps of Figure 5). For Cluster 4, Oaf1 was notselected during TF selection for this cluster and was thusnot used in the predictions illustrated in the prediction plotof Figure 5E, but was included in the heatmap because itwas correlated to the Hap1 binding and when excludingHap1 from the TF selection, Oaf1 was included. Becausethe contribution of each TF is linear in these regressions,the heatmaps give a complete view of how each gene is pre-dicted to be regulated by different TFs. For Cluster 4 in fer-mentative glucose metabolism, the main contributors to er-gosterol genes (ERG27, ERG26, ERG11, ERG25, ERG3)are predicted to be Ert1, Hap1 and Oaf1 (Figure 5E).
Cluster 15 is highly enriched for glycolytic processes andacross conditions we see that the TFs predicted to be mostimportant in several conditions are the well-known gly-colytic regulators Gcr1, Gcr2 and Tye7 (Figure 5B). In res-piratory glucose metabolism, Gcr1, Hap1, Ert1 and Rtg1are included by variable selection and together these TFsare able to explain 66% of the variation in the cluster (Fig-ure 5F). Both Gcr2 and Tye7 were found to be collinearand able to replace Gcr1 if it was excluded and these threeTFs together are predicted to be regulating the glycolyticgenes TPI1, CDC19, TDH2, ENO2, PGK1, PGI1, FBA1,TDH3, GPM1, PFK2, TDH1 and PFK1 (Figure 5F). Wenext focused on Cluster 7, containing genes with relativelyhigher expression in the two respiratory conditions (seen
Dow
nloaded from https://academ
ic.oup.com/nar/article-abstract/47/10/4986/5446249 by C
halmers U
niversity of Technology user on 16 July 2019
Nucleic Acids Research, 2019, Vol. 47, No. 10 4995
Cluster 4
Gcn
4C
at8
Oaf
1H
ap1
Exp
ress
ion
Er t1
ILV5MPC2ALD5ILV3OLE1ERG3ERG25ERG11SHM2ERG6COX15ERG26SLC1RNR4NCP1ERG7ERG27GUA1ZWF1ADE5ADE13IMD3ADH3KTR1ADE3TAT2HEM2LYP1MTD1ERG24ERG4BNA2SER3SUL2SER2PHO90HMT1YEL047CADE4GPD2HEM12SCT1HEM14HEM1DYS1SAC1COX10PYC1PMT5ADH6PDR11GDB1FMT1CDA1IMD4PMT3
SCT1
FMT1
HMT1SLC1
PMT5
HEM12HEM1
YEL047C
HEM14
ALD5 SER3
COX15
ERG26
ERG4
HEM2
OLE1
PYC1
ADE5
ERG25
RNR4ADE3
SER2
ERG11
NCP1
DYS1
ERG7
MPC2
PDR11
PHO90
ILV3
BNA2
SAC1
MTD1
ERG3
SHM2
SUL2
ERG27
CDA1
ILV5
ADE13IMD3
ERG6
IMD4
ADH3
GUA1
ADE4
ADH6
ZWF1
LYP1
ERG24
TAT2
GPD2
KTR1
PMT3
COX10
GDB1
2.0
2.5
3.0
3.5
1.5 2.0 2.5 3.0 3.5log10(Ana_FPKM)
Pre
dict
ed lo
g10(
Ana
_FP
KM
)
R2: 0.68Ert1, Cat8, Gcn4, Hap1
Cluster 7
Oaf
1
Hap
1
Exp
ress
ion
Hap
4
QCR7INO1RIP1QCR6ATP16ATP14COX12IDH1ADK2COX6COX5AATP1ATP17MHT1ATP20NDI1KGD1SDH1TIM11SDH3SDH2LSC1FEN2ATP7HIS6ATP3CIT1KGD2ATP5BIO4AGP2AQY2QCR2LEU4PPT2IMD2ALP1DIT1ODC1AVT3ACO1SDH4GDH2ATP4MPH2YNK1
ATP1
ATP3
AGP2
FEN2
ATP16
GDH2
MPH2
KGD2
SDH4ATP5
TIM11
ATP17
DIT1
QCR7RIP1
ADK2
QCR6
COX6
IMD2
HIS6
KGD1
INO1 ATP7
YNK1
SDH3
AVT3SDH1
SDH2
AQY2
MHT1
COX12 ATP14
ACO1
NDI1
IDH1
COX5A
LEU4
ALP1
CIT1
BIO4
LSC1
ATP4
ODC1PPT2
ATP20
QCR2
2.0
2.5
3.0
3.5
2 3log10(Eth_FPKM)
Pre
dict
ed lo
g10(
Eth
_FP
KM
)
R2: 0.57Hap4, Hap1, Oaf1
A
Hap1Oaf1
Ino2Ino4
Hap1Gcn4
Rtg1
Hap1Oaf1
Gcn4
Cat8
Ert1Hap1
Gcn4
Rds2
Rtg3
−0.50
−0.25
0.00
0.25
0.50
0.75
Cluster 15 Adjusted p−value4.7e−10glycolytic process6.8e−08vacuolar acidification
3.8e−06ATP hydrolysis coupled proton transport
6.2e−06gluconeogenesis1.2e−05proton transport3.1e−04ion transport
B
Gcr1Tye7
Hap4
Ino2Ino4
Leu3
Rds2
Rtg1Gcr1Tye7
Hap4Ino2
Gcr2
Gcr1
Tye7
Ino2
Gcr2
Rgt1
Gcr1
Tye7
Rtg1
Gcr2
Ert1
Hap1
−0.50
−0.25
0.00
0.25
0.50
0.75
Cat8Ino4
Hap1
Hap4
Oaf1Hap1
Hap4
Oaf1Pip2
Rgt1
−0.50
−0.25
0.00
0.25
0.50
0.75
R2
* R
elat
ive
impo
rtanc
e of
TF
in re
gres
sion
(0−1
) *
corr
. coe
ffici
ent(+
1/-1
)
Adjusted p−value8.8e−12fatty acid beta−oxidation3.1e−03very long−chain fatty acid catabolic process3.1e−03fatty−acyl−CoA transport3.1e−03peroxisomal long−chain fatty acid import8.7e−03fatty acid metabolic process8.7e−03glycerol−3−phosphate metabolic process9.9e−03NADH oxidation
Cluster 16
Gcn4
Ert1Sut1
Gcr1Hap1
Ert1
Gcr1
Cbf1
Ino2Tye7
Leu3
Rtg3
Ino2
Ino4Hap1
Ino2Tye7
Ino4
Cat8
Rgt1
−0.50
−0.25
0.00
0.25
0.50
0.75
CDC19
VMA2
PGI1
TSC10
HIS4
PGK1
PSA1
VMA1
TPI1
EKI1
ITR1
VMA3
BUD16
VMA8
FCY2
FCY21RNR1
GUS1
HXK2
ADH4
UGA1VMA7MUP1
CYS4
TDH3
PFK1
MUP3
VMA10
ENO2
FAA3CYR1
RNR2
TDH1
URA2
ERG20
TDH2
OPI3
FBA1
VMA5
GPM1
SAM1
CDD1
EXG1
FKS1VMA6
ERG13
PFK2
PHO91
MVD1
MET22
VPH1
VMA4
YDC1FAS2
GPH1
2.0
2.5
3.0
3.5
2 3log10(Glu_FPKM)
Pre
dict
ed lo
g10(
Glu
_FP
KM
)
R2: 0.66Gcr1, Hap1, Ert1, Rtg1
Hap
1E
xpre
ssio
nTy
e7G
cr1
Gcr
2R
tg1
Er t1
PFK1TDH1PFK2GPM1TDH3FBA1PGI1PGK1ENO2TDH2CDC19TPI1HXK2RNR2GPH1YDC1VMA7VMA4MUP1CYS4OPI3FAS2ITR1SAM1FCY2FKS1FAA3VMA1UGA1URA2RNR1CDD1EKI1CYR1GUS1VMA10BUD16FCY21TSC10VMA3HIS4VPH1VMA5VMA8MET22ADH4PSA1VMA6VMA2MUP3PHO91MVD1ERG13ERG20EXG1
Sut
1
Gcn
4
Ert1
Gcr
1
Exp
ress
ion
Hap
1
IDP3GPD1ALD4GUT2NDE1STL1PXA2ECI1SFA1TES1MMP1CLD1CTA1HXT5ICL2GUT1JEN1VHT1ATH1POX1YJL045WFMP43POT1SPS19DCI1NDE2FAA2FOX2PXA1
GPD1
NDE2SFA1
CTA1STL1FAA2 POX1
VHT1CLD1
FMP43
GUT1
HXT5
GUT2
POT1
YJL045W
TES1
PXA2
JEN1
FOX2
MMP1
ECI1
NDE1
IDP3
SPS19
DCI1
ALD4
PXA1
ICL2ATH1
1.25
1.50
1.75
2.00
2.25
2.50
1.6 2.0 2.4log10(Nit_FPKM)
Pre
dict
ed lo
g10(
Nit_
FPK
M)
R2: 0.60Gcn4, Hap1, Gcr1
Adjusted p−value1.7e−05ergosterol biosynthetic process1.7e−05steroid biosynthetic process2.7e−04lipid biosynthetic process3.0e−04sterol biosynthetic process3.0e−04purine nucleotide biosynthetic process4.1e−03purine nucleobase biosynthetic process1.5e−02heme biosynthetic process1.9e−02protoporphyrinogen IX biosynthetic process3.6e−02GMP biosynthetic process
Adjusted p−value7.7e−09ATP synthesis coupled proton transport7.8e−06ATP biosynthetic process1.8e−05tricarboxylic acid cycle6.8e−04mitochondrial electron transport, succinate to ubiquinone8.2e−04ion transport1.6e−03proton transport1.9e−03cellular respiration6.6e−03cristae formation8.9e−03obsolete ATP catabolic process1.9e−02hydrogen ion transmembrane transport1.9e−02protein complex oligomerization2.3e−02mitochondrial electron transport, ubiquinol to cytochrome c
E F
C
G
D
H
−4 −2 0 2 4Column Z−Score −2 0 2
Column Z−Score
−3 −1 1 3Column Z−Score
0 −4 −2 0 2 4Column Z−Score
R2
* R
elat
ive
impo
rtanc
e of
TF
in re
gres
sion
(0−1
) *
corr
. coe
ffici
ent(+
1/-1
)
R2
* R
elat
ive
impo
rtanc
e of
TF
in re
gres
sion
(0−1
) *
corr
. coe
ffici
ent(+
1/-1
)
R2
* R
elat
ive
impo
rtanc
e of
TF
in re
gres
sion
(0−1
) *
corr
. coe
ffici
ent(+
1/-1
)
Aerobic fermentation(nitrogen limited)
Gluconeogenic respiration(ethanol limited)
Respiratory glucose metabolism(glucose limited)
Fermentative glucose metabolism(glucose limited, anaerobic)
Cluster 16Cluster 7Cluster 15Cluster 4
Aerobicfermentation
(nitrogenlimited)
Gluconeogenicrespiration(ethanollimited)
Respiratory glucose
metabolism(glucoselimited)
Fermentativeglucose
metabolism(glucoselimited,
anaerobic)
Aerobicfermentation
(nitrogenlimited)
Gluconeogenicrespiration(ethanollimited)
Respiratory glucose
metabolism(glucoselimited)
Fermentativeglucose
metabolism(glucoselimited,
anaerobic)
Aerobicfermentation
(nitrogenlimited)
Gluconeogenicrespiration(ethanollimited)
Respiratory glucose
metabolism(glucoselimited)
Fermentativeglucose
metabolism(glucoselimited,
anaerobic)
Aerobicfermentation
(nitrogenlimited)
Gluconeogenicrespiration(ethanollimited)
Respiratory glucose
metabolism(glucoselimited)
Fermentativeglucose
metabolism(glucoselimited,
anaerobic)
Figure 5. Clustering genes by relative expression gives strong predictive models of the clustered genes. (A–D) All significant (P.adj < 0.05) GO terms for theclustered genes and the relative importance of the TFs selected to give the strongest predictions of transcript levels for the genes in the clusters in differentconditions. Linear regressions (without splines) are used and importance is calculated by R2 (of regression with selected TFs) *relative importance of eachTF (0 to 1) *sign of coefficient (+1 is activation, –1 is repression). (E–H) Prediction plots showing the predicted transcript levels compared to the realtranscript levels from using the selected TFs (written in subtitle of plots). R2 of predicted transcript levels compared to real transcript level is shown inred text. Heatmaps demonstrate the real transcript levels as well as binding signal of each TF normalized column-wise (Z-score). TFs linked by a red lineunder the heatmap have significant collinearity over the cluster genes and were demonstrated to be able replace the other(s) in the variable selection, thushaving overlapping functions in regulation of genes in a given cluster.
Dow
nloaded from https://academ
ic.oup.com/nar/article-abstract/47/10/4986/5446249 by C
halmers U
niversity of Technology user on 16 July 2019
4996 Nucleic Acids Research, 2019, Vol. 47, No. 10
in Figure 4) and most strongly enriched for genes of mito-chondrial ATP biosynthesis (Figure 5C). The most impor-tant TF for predictions in this cluster for both the condi-tions with mostly respiratory metabolism is the well-knownmitochondrial regulator Hap4 (Figure 5C). During glu-coneogenic respiration, Hap4 and Hap1 are predicted toboth contribute to the regulation of most electron transportchain genes such as SHD1, NDI1, COX5A, COX6, QCR6,RIP1, QCR7 and ATP14 (Figure 5G). Several subunits ofthe ATP synthase (ATP17, ATP1 and ATP16) as well as cer-tain TCA cycle enzymes (KGD1, KGD2) are predicted to beregulated by Hap4 without Hap1. Oaf1 also contributes toregulation of various genes of the cluster to a lesser extent,in some cases alone, or together with Hap1 and/or Hap4.We also demonstrate the same analysis for Cluster 16, mostenriched for containing genes of fatty acid beta-oxidation(Figure 5D). While most of the predicted effects of TF bind-ing on transcriptional regulation explored thus far has beenactivation, from the importance of TFs in the different con-ditions of Cluster 16, three TFs showed negative correla-tion to transcriptional changes during aerobic fermentation(Figure 5D). The TF selection used three TFs – Gcn4, Hap1and Gcr1 to predict 60% of the variation in the cluster dur-ing aerobic fermentation (Figure 5H). Ert1 and Sut1 werefurther included in the analysis because they were foundto be collinear and able to replace Gcn4 in TF selection.Exploring the contributions of the TFs to expression lev-els of beta-oxidation genes in the heatmap supports an in-verse relationship between Sut1-Gcn4-Ert1 binding an ex-pression levels of several beta-oxidation genes, most notablyfor PXA1, DCI1, CTA1, CLD1, PXA2 and IDP3 (Figure5H).
The influence of TFs on gene regulation from different regionsof the promoter changes between metabolic conditions
In the analysis shown in Supplementary Figure S2C we no-ticed that there were apparent differences in importance ofdifferent regions of the promoter between the conditions,most notably a predicted shift towards more consequentialTF binding downstream of the TSS during aerobic fermen-tation. We reasoned that this could be because TFs withmore consequential binding downstream of the TSS couldbe more important at this condition, or it could be becauseTFs shift their importance from one region of the promoterto another between conditions. To distinguish these twopossibilities we performed simple linear regressions usingthe binding signal for each TF individually in 75 bp regionsof the promoter to look for potential changes in TF im-portance in different regions of the promoter between themetabolic conditions. Comparing the resulting profiles ofthe explanatory power of TF binding in the experimentalconditions revealed a distinctly different profile during aer-obic fermentation compared to the other three metabolicconditions (Figure 6A compared to Figure 6B–D). Impor-tantly, the differences during aerobic fermentation do notappear to be driven by one or a few TFs that regulate fromdownstream of the TSS with stronger importance duringthis condition, but rather a shift of where in the promoterseveral TFs are most consequential to regulation (CompareGcr1, Ino2, Ino4, Stb5, Cbf1 in Figure 6A to B–D).
Another striking observation from the importance of TFsduring aerobic fermentation was a negative correlation be-tween transcriptional changes and binding of three TFsfrom an overlapping region upstream of the TSS; Sut1,Gcn4 and Ert1 (Figure 6A). These regressions were of allmetabolic genes, suggesting that the negative correlationseen with these TFs on beta-oxidation genes, as seen in Fig-ure 5H, may be a more general phenomena of aerobic fer-mentation. To see if these three TFs are acting on the samegenes or separately we compared the binding of the threeTFs in the region 250–450 upstream of the TSS togetherwith expression levels over all metabolic genes to look forgroups of genes where binding of Sut1, Gcn4 and Ert1 arecorrelated to each other and anti-correlated to expressionlevels (Figure 6E). In aerobic fermentation, we found twodistinct groups of genes where all three TFs are generallycorrelated to each other, one with low binding of the threeTFs and high expression levels and another group with rel-atively higher binding of all three TFs and relatively lowerexpression levels. Interestingly, the group of genes where thethree TFs have the strongest negative correlation to tran-script levels is slightly enriched for genes of translation pro-cesses (Figure 6E, Group 1), genes that are likely closelycontrolled due to the nitrogen limitation of these cultures.We also looked for similar relationships between binding ofSut1-Gcn4-Ert1 and transcript levels for the other experi-mental conditions (Supplementary Figure S4B–D), but wefound no coordinated changes in the other conditions, sug-gesting this phenomena is specific to aerobic fermentation.We summarize some of our main findings from Figures 5and 6A–E regarding how TFs regulate difference metabolicprocesses in different conditions in Figure 6F.
DISCUSSION
The latest count of known and putative yeast TFs is 264(33). With coverage of 21 TFs we only see part of the sys-tem of gene regulation by TFs, but by selecting the TFs mostenriched to binding metabolic genes and building predic-tive models of metabolic gene regulation we achieved goodTF coverage per gene. Through various types of analysiswe discovered surprisingly large changes between the stud-ied metabolic conditions, first encountered when comparingsets of target genes for the TFs between conditions shown inFigure 1A. To explain the changing sets of gene targets be-tween conditions, we hypothesized that the TFs could havechanges in preference of DNA motifs between different con-ditions. We cannot exclude changes in motif preference incertain cases such as Rtg1 and Gcn4 in aerobic fermenta-tion (Supplementary Figure S2a), but in general the changesin motif preference between conditions are small and wethink this is not a major determinant of changes in whichgenes are being targeted between conditions. We think themost likely alternative explanation is that there are changesin nucleosome occupancy or histone modifications that al-low more or less binding to different sets of genes over thedifferent conditions studied and that binding site preferenceis only partly driven by the recognized DNA motif. Thisidea is generally supported by the complex two-way inter-actions that have been suggested between TF binding andhistones in eukaryotic gene regulation (8).
Dow
nloaded from https://academ
ic.oup.com/nar/article-abstract/47/10/4986/5446249 by C
halmers U
niversity of Technology user on 16 July 2019
Nucleic Acids Research, 2019, Vol. 47, No. 10 4997
Gcn4
Ert1
Sut1
Hap1
Ino4
Cbf1Ino2
Cat8Stb5
Gcr1
−0.06
−0.03
0.00
0.03
0.06
Pip2
Cbf1
Oaf1
Gcr1
Rtg1
Gcn4
Ino4
Cat8Hap4
Ino2
Stb5
Tye7
Hap1
Rgt1Ert1
0.000
0.025
0.050
0.075
0.100
Gcr1
Cbf1
Ino4Gcn4
Stb5
Sut1
Rtg1
Tye7
Ino2Ert1
Hap1
Cat8
Oaf1Rds2
0.00
0.02
0.04
0.06
R2
* co
rrel
atio
n co
effic
ient
(+1/
−1) Gcr1
Gcn4
Cat8
Stb5Cbf1
Ino4
Rtg1
Tye7
Oaf1
Ino2
Gcr2
Hap1
Ert1
0.000
0.025
0.050
0.075
R2
* co
rrel
atio
n co
effic
ient
(+1/
−1)
R2
* co
rrel
atio
n co
effic
ient
(+1/
−1)
R2
* co
rrel
atio
n co
effic
ient
(+1/
−1)
A B
C D
Exp
ress
ion
Sut
1
Ert1
Gcn
4
−4 0 2 4Column Z−Score
2
Adjusted p−value
2.4e−02tRNA aminoacylationfor protein translation
4.2e−02translation
Adjusted p−value5.1e−13glycolytic process7.5e−07gluconeogenesis3.8e−03glucose metabolic process4.1e−02electron transport chain
Group 1
Group 2
E
Aerobic fermentation(nitrogen limited)
Gluconeogenic respiration(ethanol limited)
Respiratory glucose metabolism(glucose limited)
Fermentative glucose metabolism(glucose limited, anaerobic)
Aerobic fermentation(nitrogen limited)
F
GLUCOSE TRAN PORTG
LYC
OLY
SIS
MITOCHONDRIA
T
A
CY
CLE
OXIDATIVE PHOSPHORYLATION
BETA OXIDATION
Increasing carbohydrate transportenzymes during glucose limitation
ERGOSTEROL BIOSYNTHESIS
Increasing mitochondrial processesduring respiratory metabolism
Increasing ergosterol biosynthesis enzymes in anaerobic metabolism
Increased glycolytic enzymes during fermentation
Increasing beta oxidationenzymes in respiratory metabolism
Oaf1 Hap4Hap1Pip2Cat8
Cat8
Ert1 Hap1 Gcn4Cat8 Oaf1
Gcr1Gcr2
Tye7
Ino2Ino4
Rgt1
Rds2
Gcr1Hap1
Rgt1
Ino2Ino4 Hap1AEROBICFERMENTATION(NITROGENLIMITED)
Ert1
Gcn4Sut1
Repressive correlation to highlyexpressed glycolytic and respiratory genes
Activating correlation togenes linked to translationprocesses
TSS TSS +750bpTSS -750bp
TF binding signal integrated over 75 bp intervals
TSS TSS +750bpTSS -750bp
TF binding signal integrated over 75 bp intervals
TSS TSS +750bpTSS -750bp
TF binding signal integrated over 75 bp intervals TSS TSS +750bpTSS -750bp
TF binding signal integrated over 75 bp intervals
Rds2
Figure 6. Comparing TF importance in different segments of the promoter shows a distinct pattern of TF importance during aerobic fermentation ascompared to the other conditions. (A–D) Linear regressions, regressing the TF binding signal of each TF individually, in the indicated segments of thepromoters, against the expression levels. Lines are labeled by the TF name near the highest absolute y-axis value. (E) Two groups of genes where bindingof Sut1, Ert1 and Gcn4 in −450 to −250 relative to the TSS is correlated to each other and oppositely correlated to transcript levels. (F) Summary of thestrongest predicted links between TF function and selected metabolic processes. Based on data presented in Figures 4 and 5, Supplementary Figure S4E–Fand Figure 6A–E.
Dow
nloaded from https://academ
ic.oup.com/nar/article-abstract/47/10/4986/5446249 by C
halmers U
niversity of Technology user on 16 July 2019
4998 Nucleic Acids Research, 2019, Vol. 47, No. 10
Based on the success of multiple linear regressions as pre-dictive models of gene regulation from TF binding data, wepropose that a large portion of transcriptional regulationby TFs in yeast is achieved from a linear effect of TF bind-ing on transcriptional outcome. Our predictions also sug-gest that a large amount of the contributions to gene reg-ulation from several metabolic TFs are additive. However,by allowing multiplication of TF binding signal, we do de-tect cases that indicate more complex contributions frompairs of TFs to gene regulation than a simple addition ofthe two TFs binding signal can capture. Most noteworthyis the relationship between Ino2 and Ino4, where the addi-tive contribution from both TFs seem to saturate at a cer-tain strength of binding (Supplementary Figure S3c). TheIno2–Ino4 relationship is best known for being required forphospholipid biosynthesis and Ino2 is described as most im-portant for transcriptional activation of the targeted genes,but Ino2 also depends on Ino4 for translocation into thenucleus (34). There is additional complexity in the regula-tion of this complex by involvement of the Opi1 repressor,which can bind Ino2 to inhibit activation by the complex(35). We cannot conclude on why we see a saturation effectin the transcriptional activation due to increased binding ofboth Ino2 and Ino4, but it could be due to TF–TF competi-tion of the complex with other components of the transcrip-tional machinery, or other nonlinear relationships betweenbinding of the TFs and transcriptional outcomes. In aero-bic fermentation we detect several additional TF pairs forwhich there may be more complex relationships than simpleaddition of the contributing signal. Of these, both Ert1 andGcn4 have relationships to Ino2 and Ino4 binding wherea multiplication of the binding signal clearly improves pre-dictive power (Figure 3E). It is striking that Ert1 and Gcn4are two of only a few TFs that show this kind of complex-ity seen together with the negative correlation seen betweentranscript levels and binding of Sut1, Gcn4 and Ert1 in aer-obic fermentation, but we do not know if these observationsare related.
The main advantage of using linear regressions, as com-pared to more complex machine learning analysis, is thateach TF’s contribution in the predictions can be fully de-scribed. We highlight this feature of our analysis in Fig-ure 5, where the groups of genes are biologically linked andsmall enough to illustrate the amount of binding from eachselected TF in the individual genes of the clusters. In thisanalysis we confirm several previously demonstrated regu-lators of important cellular processes such as Hap1 activat-ing ergosterol biosynthesis (36), Gcr1-Gcr2-Tye7 activatingglycolytic genes (37) and Hap4 activating respiratory pro-cesses (38). We also propose previously unknown contribu-tions from other TFs to these processes such as Ert1 activat-ing ergosterol biosynthesis at anaerobic conditions (Figure5E).
The minimal media used in all chemostats of this studyare without added amino acids, meaning that the feedbackcontrols activating amino acid biogenesis should generallybe activated. Gcn4 is best known as an activator of aminoacid biosynthesis, but also demonstrated in several stud-ies to be a repressor of ribosomal protein expression by anunknown mechanism (39,40). The mechanistic details onhow Gcn4 can activate some genes and repress others re-
main mostly unknown, but the repression has been linkedto functional interactions with the repressive TF Rap1 andhistone acetyltransferase Esa1 (41). A recent study of theyeast strain Y. lipolytica, describing increased lipid accumu-lation of this strain during nitrogen limitation, found thatbeta-oxidation genes were down-regulated in this condition(42). In that study, they also noted that genes with the Gcn4motif in their promoter tended to be down-regulated dur-ing nitrogen limitation, an observation that they could notexplain, but is in line with the negative correlation we ob-serve between Gcn4 binding and transcript levels for betaoxidation genes during nitrogen limitation (Figure 5H). Wepropose that a similar mechanism as was observed in Y.lipolytica exists in S. cerevisiae and further extend it to af-fect a larger set of genes (Figure 6E) and to also involveSut1 and Ert1. Sut1 is previously described to interact withthe co-repressor complex Cyc8-Tup1 (43) while Ert1 is pre-viously described to function as both an activator and re-pressor with the major described functional role being reg-ulating genes driving the diauxic shift (44). Our data doesnot give any further insight into the mechanism of the Sut1-Ert1-Gcn4 repression, but based on the generally reducednumber of peaks (Supplementary Figure S1C) and changesin importance of different regions of the promoter (Supple-mentary Figure S2C, Figure 6A–D) during aerobic fermen-tation, we speculate that Sut1-Ert1-Gcn4 binding is corre-lated to changes to the nucleosomes on each side of theTSS in certain genes, for example histone modifications orchanges in nucleosome occupancy. If the changes in bind-ing of these three TFs is not driven by changes in their pref-erence for motifs, but instead by nucleosomes, it is also notnecessarily the case that the three TFs are directly repressing– the correlation between TF binding and transcriptionalregulation could potentially be a secondary effect of nucleo-some changes that are also causing transcriptional changes.
In conclusion, our experimental design using chemostatsto capture stable states of metabolism reveals large changesin functional roles of different TFs between metabolicstates. The previously demonstrated difficulties in definingthe regulatory targets of eukaryal TFs through transcrip-tomics after TF deletion could be partly explained by thishighly dynamic nature of eukaryal TF function. If the dele-tion of the TF changes cellular conditions enough to shiftthe regulatory roles of a range of one or several other TFs,the following secondary transcriptional changes could be asource of significant changes in genes not targeted directlyby the deleted TF. Our framework of using multiple lin-ear regressions for full transparency of TF contributions totranscriptional regulation without relying on TF deletionwill be equally applicable for future larger-scale studies asbinding data for more TFs with condition-matched tran-scriptomics accumulate to gradually build a system-levelunderstanding of eukaryotic transcriptional regulation.
DATA AVAILABILITY
ChIP-exo raw sequencing data (.fastq) can be retrievedform Arrayexpress accession E-MTAB-6673: https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6673/. Aprocessed format of the ChIP-exo data, summarized asTF reads per gene promoter is included in Supplementary
Dow
nloaded from https://academ
ic.oup.com/nar/article-abstract/47/10/4986/5446249 by C
halmers U
niversity of Technology user on 16 July 2019
Nucleic Acids Research, 2019, Vol. 47, No. 10 4999
Data 4. RNA sequencing raw sequencing data (.fastq)can be retrieved from Arrayexpress accession E-MTAB-6657: https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6657/. The number of reads annotated to each geneare found in Supplementary Data 3 and our processing ofread counts to get to FPKM values that are used for allrelevant analysis of this study can be reproduced throughthe R scripts included in Supplementary Data 6.
SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.
ACKNOWLEDGEMENTS
Authors’ contributions: Funding acquisition, J.N.; concep-tualization, P.H. and J.N.; experiments, P.H., D.B., C.S.Band G.L.; analysis, P.H., D.B. and C.S.B; writing – origi-nal draft, P.H.; writing – review & editing, D.B, C.S.B, G.L.and J.N.
FUNDING
European Union’s Horizon 2020 research and innovationprogramme [Marie Skłodowska-Curie grant agreement no.722287]; Knut and Alice Wallenberg Foundation and theNovo Nordisk Foundation [NNF10CC1016517]. Fundingfor open access charge: Internal.Conflict of interest statement. None declared.
REFERENCES1. Le,P.P., Friedman,J.R., Schug,J., Brestelli,J.E., Parker,J.B.,
Bochkis,I.M. and Kaestner,K.H. (2005) Glucocorticoidreceptor-dependent gene regulatory networks. PLoS Genet., 1,0159–0170.
2. Fan,J.-B., Benner,C., Hutt,K.R., Garcia-Bassets,I., Fu,X.-D.,Rosenfeld,M.G., Bibikova,M., Jin,M., Kwon,Y.-S., Ye,Z. et al. (2007)Sensitive ChIP-DSL technology reveals an extensive estrogenreceptor -binding program on human gene promoters. Proc. Natl.Acad. Sci. U.S.A., 104, 4852–4857.
3. Hu,Z., Killion,P.J. and Iyer,V.R. (2007) Genetic reconstruction of afunctional transcriptional regulatory network. Nat. Genet., 39,683–687.
4. Harbison,C.T., Gordon,D.B., Lee,T.I., Rinaldi,N.J., Macisaac,K.D.,Danford,T.W., Hannett,N.M., Tagne,J.B., Reynolds,D.B., Yoo,J. et al.(2004) Transcriptional regulatory code of a eukaryotic genome.Nature, 431, 1–5.
5. Gitter,A., Siegfried,Z., Klutstein,M., Fornes,O., Oliva,B., Simon,I.and Bar-Joseph,Z. (2009) Backup in gene regulatory networksexplains differences between binding and knockout results. Mol. Syst.Biol., 5, 276.
6. Fang,X., Sastry,A., Mih,N., Kim,D., Tan,J., Yurkovich,J.T.,Lloyd,C.J., Gao,Y., Yang,L. and Palsson,B.O. (2017) Globaltranscriptional regulatory network for Escherichia coli robustlyconnects gene expression to transcription factor activities. Proc. Natl.Acad. Sci. U.S.A., 114, 10286–10291.
7. ENCODE consortium. (2012) An integrated encyclopedia of DNAelements in the human genome. Nature, 489, 57–74.
8. Cheng,C., Alexander,R., Min,R., Leng,J., Yip,K.Y., Rozowsky,J.,Yan,K., Dong,X., Djebali,S., Ruan,Y. et al. (2012) Understandingtranscriptional regulation by integrative analysis of transcriptionfactor binding data. Genome Res., doi:10.1101/gr.136838.111.
9. Ouyang,Z., Zhou,Q. and Wong,W.H. (2009) ChIP-Seq oftranscription factors predicts absolute and differential geneexpression in embryonic stem cells. Proc. Natl. Acad. Sci. U.S.A., 106,21521–21526.
10. Zhu,C., Byers,K.J.R.P., McCord,R.P., Shi,Z., Berger,M.F.,Newburger,D.E., Saulrieta,K., Smith,Z., Shah,M. V.,Radhakrishnan,M. et al. (2009) High-resolution DNA-bindingspecificity analysis of yeast transcription factors. Genome Res., 19,556–566.
11. Badis,G., Pena-Castillo,L., Ansari,A.Z., van Bakel,H., Hughes,T.R.,Hasinoff,M.J., Terterov,D., Li Yeo,A., Tillo,D., Nislow,C. et al.(2008) A library of yeast transcription factor motifs reveals awidespread function for Rsc3 in targeting nucleosome exclusion atpromoters. Mol. Cell, 32, 878–887.
12. Hughes,T.R. and de Boer,C.G. (2013) Mapping yeast transcriptionalnetworks. Genetics, 195, 9–36.
13. Bergenholm,D., Liu,G., Hansson,D. and Nielsen,J. (2019)Construction of mini-chemostats for high-throughput straincharacterization. Biotechnol. Bioeng., 116, 1029–1038.
14. Rhee,H.S. and Pugh,B.F. (2012) ChiP-exo method for identifyinggenomic location of DNA-binding proteins withnear-single-nucleotide accuracy. Curr. Protoc. Mol. Biol., 100,21.24.1–21.24.14.
15. Liu,G., Bergenholm,D. and Nielsen,J. (2016) Genome-Wide mappingof binding sites reveals multiple biological functions of thetranscription factor Cst6p in saccharomyces cerevisiae. MBio, 7,1–10.
16. Guo,Y., Mahony,S. and Gifford,D.K. (2012) High resolution genomewide binding event finding and motif discovery reveals transcriptionfactor spatial binding constraints. PLoS Comput. Biol., 8, e1002638.
17. Borlin,C.S., Siewers,V., Nielsen,J., Holland,P., Bergenholm,D.,Cvetesic,N. and Lenhard,B. (2018) Saccharomyces cerevisiae displaysa stable transcription start site landscape in multiple conditions.FEMS Yeast Res., 19, 1–10.
18. Salazar,A.N., van den Broek,M., Daran,J.-M.G., Gorter deVries,A.R., Wijsman,M., Brouwers,N., Brickwedde,A., Abeel,T. andde la Torre Cortes,P. (2017) Nanopore sequencing enablesnear-complete de novo assembly of Saccharomyces cerevisiaereference strain CEN.PK113-7D. FEMS Yeast Res., 17,doi:10.1093/femsyr/fox074.
19. Langmead,B. and Salzberg,S.L. (2012) Fast gapped-read alignmentwith Bowtie 2. Nat. Methods, 9, 357–359.
20. Li,H., Handsaker,B., Wysoker,A., Fennell,T., Ruan,J., Homer,N.,Marth,G., Abecasis,G., Durbin,R. and 1000 Genome Project DataProcessing Subgroup (2009) The Sequence Alignment/Map formatand SAMtools. Bioinformatics, 25, 2078–2079.
21. Quinlan,A.R. and Hall,I.M. (2010) BEDTools: A flexible suite ofutilities for comparing genomic features. Bioinformatics, 26, 841–842.
22. R Core team (2016) R: A language and environment for statisticalcomputing. R Foundation for Statistical Computing. Vienna,https://www.R-project.org/.
23. Liao,Y., Smyth,G.K. and Shi,W. (2014) featureCounts: an efficientgeneral purpose program for assigning sequence reads to genomicfeatures. Bioinformatics, 30, 923–930.
24. Robinson,M.D., McCarthy,D.J. and Smyth,G.K. (2010) edgeR: aBioconductor package for differential expression analysis of digitalgene expression data. Bioinformatics, 26, 139–140.
25. Milborrow,S. (2017) earth: Multivariate Adaptive Regression Splines.https://CRAN.R-project.org/package=earth.
26. Bergenholm,D., Liu,G., Holland,P. and Nielsen,J. (2018)Reconstruction of a global transcriptional regulatory network forcontrol of lipid metabolism in yeast by using chromatinimmunoprecipitation with lambda exonuclease digestion. mSystems,3, e00215-17.
27. Ouyang,L., Holland,P., Lu,H., Bergenholm,D. and Nielsen,J. (2018)Integrated analysis of the yeast NADPH-regulator Stb5 revealsdistinct differences in NADPH requirements and regulation indifferent states of yeast metabolism. FEMS Yeast Res., 18, 91.
28. MacIsaac,K.D., Wang,T., Gordon,D.B., Gifford,D.K., Stormo,G.D.and Fraenkel,E. (2006) An improved map of conserved regulatorysites for Saccharomyces cerevisiae. BMC Bioinformatics, 7, 113.
29. Hashim,Z., Mukai,Y., Bamba,T. and Fukusaki,E. (2014) Metabolicprofiling of retrograde pathway transcription factors Rtg1 and Rtg3knockout yeast. Metabolites, 4, 580–598.
30. Crespo,L., Powers,T., Fowler,B. and Hall,M.N. (2002) TheTOR-controlled transcription activators GLN3, RTG1, and RTG3are regulated in response to intracellular levels of glutamine. Proc.Natl. Acad. Sci. U.S.A., 99, 6784–6789.
Dow
nloaded from https://academ
ic.oup.com/nar/article-abstract/47/10/4986/5446249 by C
halmers U
niversity of Technology user on 16 July 2019
5000 Nucleic Acids Research, 2019, Vol. 47, No. 10
31. Sanchez,B., Li,F., Lu,H., Kerkhoven,E. and Nielsen,J. (2016)Yeast-GEM: yeast 7.6.0. doi:10.5281/zenodo.1495468.
32. Friedman,J.H. (2007) Multivariate adaptive regression splines. Ann.Stat., 19, 123–141.
33. de Boer,C.G. and Hughes,T.R. (2012) YeTFaSCo: a database ofevaluated yeast transcription factor sequence specificities. NucleicAcids Res., 40, D169–D179.
34. Kumme,J., Dietz,M., Wagner,C. and Schuller,H.-J. (2008)Dimerization of yeast transcription factors Ino2 and Ino4 is regulatedby precursors of phospholipid biosynthesis mediated by Opi1repressor. Curr. Genet., 54, 35–45.
35. Lai,K. and Mcgraws,P. (1994) Dual control of inositol transport insaccharomyces cerevisiae by irreversible inactivation of permease andregulation of permease synthesis by IN02, IN04, and OPI1. JBC, 269,2246–2251.
36. Tamura,K.I., Gu,Y., Wang,Q., Yamada,T., Ito,K. and Shimoi,H.(2004) A hap1 mutation in a laboratory strain of Saccharomycescerevisiae results in decreased expression of ergosterol-related genesand cellular ergosterol content compared to sake yeast. J. Biosci.Bioeng., 98, 159–166.
37. Nishi,K., Park,C.S., Pepper,A.E., Eichinger,G., Innis,M.A. andHolland,M.J. (1995) The GCR1 requirement for yeast glycolytic geneexpression is suppressed by dominant mutations in the SGC1 gene,which encodes a novel basic-helix-loop-helix protein. Mol. Cell. Biol.,15, 2646–2653.
38. Blom,J., De Mattos,M.J.T. and Grivell,L.A. (2000) Redirection of therespiro-fermentative flux distribution in Saccharomyces cerevisiae by
overexpression of the transcription factor Hap4P. Appl. Environ.Microbiol., 66, 1970–1973.
39. Natarajan,K., Meyer,M.R., Jackson,B.M., Slade,D., Roberts,C.,Hinnebusch,A.G. and Marton,M.J. (2001) Transcriptional profilingshows that Gcn4p is a master regulator of gene expression duringamino acid starvation in yeast. Mol. Cell. Biol., 21, 4347–4368.
40. Mittal,N., Guimaraes,J.C., Gross,T., Schmidt,A., Vina-Vilaseca,A.,Nedialkova,D.D., Aeschimann,F., Leidel,S.A., Spang,A. andZavolan,M. (2017) The Gcn4 transcription factor reduces proteinsynthesis capacity and extends yeast lifespan. Nat. Commun., 8, 457.
41. Joo,Y.J., Kim,J.H., Kang,U.B., Yu,M.H. and Kim,J. (2011)Gcn4p-mediated transcriptional repression of ribosomal proteingenes under amino-acid starvation. EMBO J., 30, 859–872.
42. Pomraning,K.R., Kim,Y.-M.M., Nicora,C.D., Chu,R.K.,Bredeweg,E.L., Purvine,S.O., Hu,D., Metz,T.O. and Baker,S.E.(2016) Multi-omics analysis reveals regulators of the response tonitrogen limitation in Yarrowia lipolytica. BMC Genomics, 17, 138.
43. Regnacq,M., Alimardani,P., El Moudni,B. and Berges,T. (2001)Sut1p interaction with Cyc8p(Ssn6p) relieves hypoxic genes fromCyc8p-Tup1p repression in Saccharomyces cerevisiae. Mol.Microbiol., 40, 1085–1096.
44. Gasmi,N., Jacques,P.E., Klimova,N., Guo,X., Ricciardi,A., Robert,F.and Turcotte,B. (2014) The switch from fermentation to respiration inSaccharomyces cerevisiae is regulated by the Ert1 transcriptionalactivator/repressor. Genetics, 198, 547–560.
Dow
nloaded from https://academ
ic.oup.com/nar/article-abstract/47/10/4986/5446249 by C
halmers U
niversity of Technology user on 16 July 2019