surprisal analysis characterizes the free energy time ... › content › pnas › 111 › 36 ›...

6
Surprisal analysis characterizes the free energy time course of cancer cells undergoing epithelial-to- mesenchymal transition Sohila Zadran a , Rameshkumar Arumugam b , Harvey Herschman c , Michael E. Phelps c , and R. D. Levine b,c,d,1 a Institute for Molecular Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA 90095; b The Fritz Haber Research Center, The Hebrew University, Jerusalem 91904, Israel; and c Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, and d Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095 Contributed by R. D. Levine, August 3, 2014 (sent for review May 19, 2014) The epithelial-to-mesenchymal transition (EMT) initiates the invasive and metastatic behavior of many epithelial cancers. Mechanisms underlying EMT are not fully known. Surprisal analysis of mRNA time course data from lung and pancreatic cancer cells stimulated to undergo TGF-β1induced EMT identifies two phenotypes. Examina- tion of the time course for these phenotypes reveals that EMT re- programming is a multistep process characterized by initiation, maturation, and stabilization stages that correlate with changes in cell metabolism. Surprisal analysis characterizes the free energy time course of the expression levels throughout the transition in terms of two state variables. The landscape of the free energy changes during the EMT for the lung cancer cells shows a stable intermediate state. Existing data suggest this is the previously proposed maturation stage. Using a single-cell ATP assay, we demonstrate that the TGF- β1induced EMT for lung cancer cells, particularly during the matura- tion stage, coincides with a metabolic shift resulting in increased cy- tosolic ATP levels. Surprisal analysis also characterizes the absolute expression levels of the mRNAs and thereby examines the homeosta- sis of the transcription system during EMT. free energy landscape | transcriptions expression profile | maximal entropy | cellular thermodynamics | microarray T he epithelial-to-mesenchymal transition (EMT) is a cellular transition critical for several normal biological events, in- cluding embryonic development and wound healing. EMT has been investigated most exhaustively in cancer progression. An evoked EMT in epithelial cancer cells induces gene expression changes that result in loss of adhesive properties and acquisition of mesenchymal cell traits associated with tumor progression and metastasis, e.g., increased cellular motility, migration, and in- vasion (1, 2). Cultured epithelial tumor cells may be induced by several al- ternative stimuli to undergo an EMT, leading to acquisition of mesenchymal properties (3). Transforming growth factors (TGFs) are the most widely used EMT inducers. Microarray expression profiling enables identification of TGF-β1 EMT-induced molec- ular alterations and mechanisms (4). Gene expression transcriptional profiling is a major tool in analyzing induced changes in cells, yet interpretation of micro- array experiments is faced with challenges (57). Here we use an alternative approach, surprisal analysis (8), to better understand, characterize, and represent gene expression differences critical to the EMT (4, 9). Surprisal analysis assumes that if a system can decrease its free energy, it will do so spontaneously unless constrained. If a system does not attain its minimal free energy state, surprisal analysis seeks to recognize the constraints that prevent a reduction in en- ergy; surprisal analysis identifies the main constraints on a system that has the thermodynamic potential to change spontaneously but is restrained from doing so (10). As in physics and chemistry (11, 12), surprisal analysis in biology (1316) argues that the stable stateof the system, the state of minimal free energy, must first be specified. Only then can we consider constraints responsible for deviations from this state. In principle, one should identify the stable state from first biophysical principles. However, for systems of such molecular complexity, we must apply computational abilities beyond what is currently available. We therefore identify the stable state by using available experimental data. We then have a baseline for our second task, examining deviations from the stable state. We use the term statein the chemistry and physics sense, meaning everything we need to specify to predict how the system will change in response to a small modulation of its circum- stances. By statehere, we mean thermodynamic state.A 0.1-M, 100-mL aqueous saline solution at room temperature and pres- sure at rest is in a thermodynamic state. We can predict its os- motic pressure, but we cannot account for the Brownian motion of individual ions; a mechanical state is needed to specify the position and velocity of all the ions and water molecules. A thermodynamic description is much more parsimonious; many different mechanical states will give rise to essentially the same value for osmotic pressure. Here we describe the change in time of the thermodynamic state (state) of the transcription system during an EMT. We identify variables, analogous to the osmotic pressure, that characterize the transcription system state. The change in value of these variables describes the transition. By analogy to thermodynamics, we refer to these as the state variables. We suggest that quantification of transcripts according to their free energy (rather than fold changes) has distinct value. Pre- viously we considered the stability of the minimal free energy state during cellular processes (13). This thermodynamic con- sideration enables an approach that may be used to make state- ments about alternative cellular states and predictions about how targeted perturbations to a biological system might influence that system (17, 18). Significance Epithelial-to-mesenchymal transition (EMT) has been investi- gated extensively in cancer progression. Tumor cells induced to undergo an EMT acquire mesenchymal, invasive properties. We apply methods developed in chemical reaction dynamics to characterize the changing transcriptional profile during an EMT and trace the temporal unfolding of the transition on a free energy landscape. The analysis identifies three EMT stages, in- cluding passage through an intermediate, low-energy matura- tion state. The uphill energy requirements during the ascent past maturation correlate with an increase in ATP production. A stable cell machinery whose performance remains unperturbed underlies the EMT. Author contributions: H.H., M.E.P., and R.D.L. designed research; S.Z., R.A., and R.D.L. performed research; S.Z. contributed new reagents/analytic tools; S.Z., R.A., H.H., M.E.P., andR.D.L. analyzed data; and S.Z., H.H., M.E.P., and R.D.L. wrote the paper. The authors declare no conflict of interest. 1 To whom correspondence should be addressed. Email: [email protected]. This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. 1073/pnas.1414714111/-/DCSupplemental. www.pnas.org/cgi/doi/10.1073/pnas.1414714111 PNAS | September 9, 2014 | vol. 111 | no. 36 | 1323513240 SYSTEMS BIOLOGY CHEMISTRY Downloaded by guest on July 12, 2020

Upload: others

Post on 27-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Surprisal analysis characterizes the free energy time ... › content › pnas › 111 › 36 › 13235.full.pdfSurprisal analysis characterizes the free energy time course of cancer

Surprisal analysis characterizes the free energy timecourse of cancer cells undergoing epithelial-to-mesenchymal transitionSohila Zadrana, Rameshkumar Arumugamb, Harvey Herschmanc, Michael E. Phelpsc, and R. D. Levineb,c,d,1

aInstitute for Molecular Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA 90095; bThe Fritz Haber Research Center, TheHebrew University, Jerusalem 91904, Israel; and cDepartment of Molecular and Medical Pharmacology, David Geffen School of Medicine, and dDepartment ofChemistry and Biochemistry, University of California, Los Angeles, CA 90095

Contributed by R. D. Levine, August 3, 2014 (sent for review May 19, 2014)

The epithelial-to-mesenchymal transition (EMT) initiates the invasiveand metastatic behavior of many epithelial cancers. Mechanismsunderlying EMT are not fully known. Surprisal analysis of mRNA timecourse data from lung and pancreatic cancer cells stimulated toundergo TGF-β1–induced EMT identifies two phenotypes. Examina-tion of the time course for these phenotypes reveals that EMT re-programming is a multistep process characterized by initiation,maturation, and stabilization stages that correlate with changes incell metabolism. Surprisal analysis characterizes the free energy timecourse of the expression levels throughout the transition in terms oftwo state variables. The landscape of the free energy changes duringthe EMT for the lung cancer cells shows a stable intermediate state.Existing data suggest this is the previously proposed maturationstage. Using a single-cell ATP assay, we demonstrate that the TGF-β1–induced EMT for lung cancer cells, particularly during the matura-tion stage, coincides with a metabolic shift resulting in increased cy-tosolic ATP levels. Surprisal analysis also characterizes the absoluteexpression levels of the mRNAs and thereby examines the homeosta-sis of the transcription system during EMT.

free energy landscape | transcriptions expression profile |maximal entropy | cellular thermodynamics | microarray

The epithelial-to-mesenchymal transition (EMT) is a cellulartransition critical for several normal biological events, in-

cluding embryonic development and wound healing. EMT hasbeen investigated most exhaustively in cancer progression. Anevoked EMT in epithelial cancer cells induces gene expressionchanges that result in loss of adhesive properties and acquisitionof mesenchymal cell traits associated with tumor progression andmetastasis, e.g., increased cellular motility, migration, and in-vasion (1, 2).Cultured epithelial tumor cells may be induced by several al-

ternative stimuli to undergo an EMT, leading to acquisition ofmesenchymal properties (3). Transforming growth factors (TGFs)are the most widely used EMT inducers. Microarray expressionprofiling enables identification of TGF-β1 EMT-induced molec-ular alterations and mechanisms (4).Gene expression transcriptional profiling is a major tool in

analyzing induced changes in cells, yet interpretation of micro-array experiments is faced with challenges (5–7). Here we use analternative approach, surprisal analysis (8), to better understand,characterize, and represent gene expression differences criticalto the EMT (4, 9).Surprisal analysis assumes that if a system can decrease its free

energy, it will do so spontaneously unless constrained. If a systemdoes not attain its minimal free energy state, surprisal analysisseeks to recognize the constraints that prevent a reduction in en-ergy; surprisal analysis identifies the main constraints on a systemthat has the thermodynamic potential to change spontaneously butis restrained from doing so (10).As in physics and chemistry (11, 12), surprisal analysis in biology

(13–16) argues that the “stable state” of the system, the state ofminimal free energy, must first be specified. Only then can we

consider constraints responsible for deviations from this state. Inprinciple, one should identify the stable state from first biophysicalprinciples. However, for systems of such molecular complexity, wemust apply computational abilities beyond what is currentlyavailable. We therefore identify the stable state by using availableexperimental data. We then have a baseline for our second task,examining deviations from the stable state.We use the term “state” in the chemistry and physics sense,

meaning everything we need to specify to predict how the systemwill change in response to a small modulation of its circum-stances. By “state” here, we mean “thermodynamic state.” A 0.1-M,100-mL aqueous saline solution at room temperature and pres-sure at rest is in a thermodynamic state. We can predict its os-motic pressure, but we cannot account for the Brownian motionof individual ions; a mechanical state is needed to specify theposition and velocity of all the ions and water molecules. Athermodynamic description is much more parsimonious; manydifferent mechanical states will give rise to essentially the samevalue for osmotic pressure. Here we describe the change in time ofthe thermodynamic state (state) of the transcription system duringan EMT. We identify variables, analogous to the osmotic pressure,that characterize the transcription system state. The change invalue of these variables describes the transition. By analogy tothermodynamics, we refer to these as the state variables.We suggest that quantification of transcripts according to their

free energy (rather than fold changes) has distinct value. Pre-viously we considered the stability of the minimal free energystate during cellular processes (13). This thermodynamic con-sideration enables an approach that may be used to make state-ments about alternative cellular states and predictions about howtargeted perturbations to a biological system might influence thatsystem (17, 18).

Significance

Epithelial-to-mesenchymal transition (EMT) has been investi-gated extensively in cancer progression. Tumor cells induced toundergo an EMT acquire mesenchymal, invasive properties. Weapply methods developed in chemical reaction dynamics tocharacterize the changing transcriptional profile during an EMTand trace the temporal unfolding of the transition on a freeenergy landscape. The analysis identifies three EMT stages, in-cluding passage through an intermediate, low-energy matura-tion state. The uphill energy requirements during the ascentpast maturation correlate with an increase in ATP production. Astable cell machinery whose performance remains unperturbedunderlies the EMT.

Author contributions: H.H., M.E.P., and R.D.L. designed research; S.Z., R.A., and R.D.L.performed research; S.Z. contributed new reagents/analytic tools; S.Z., R.A., H.H., M.E.P.,and R.D.L. analyzed data; and S.Z., H.H., M.E.P., and R.D.L. wrote the paper.

The authors declare no conflict of interest.1To whom correspondence should be addressed. Email: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1414714111/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1414714111 PNAS | September 9, 2014 | vol. 111 | no. 36 | 13235–13240

SYST

EMSBIOLO

GY

CHEM

ISTR

Y

Dow

nloa

ded

by g

uest

on

July

12,

202

0

Page 2: Surprisal analysis characterizes the free energy time ... › content › pnas › 111 › 36 › 13235.full.pdfSurprisal analysis characterizes the free energy time course of cancer

Constraints for biological processes are not static; they maychange in response to cellular and environmental perturbations(19). Stipulating the premise of minimal free energy subject toconstraints, a change to the constraints will be followed by amove of the system toward a new constrained state. At any giventime, a state of minimal free energy is unique and stable—stablein the technical sense that at that time, all other states that areconsistent with the constraints will have a higher free energy.Surprisal analysis ascertains whether, at a given time, a state canbe described as one of minimal free energy subject to constraint.If the description is appropriate, the state at that time is ther-modynamically stable. We also want to characterize a state(s)that is kinetically stable. This presents a stronger requirement;states at nearby time points must have a higher free energy. Forexample, we show below that the (starting) epithelial state isstable both thermodynamically and kinetically.When transcription system constraints change, the system may

move to a new constrained state, one with a different free energyvalue. If the free energy increases, the constraints must do workon the transcription system; the process will not follow unlesswork is done to induce the change. With a constraint change, freeenergy also may decrease. The practical difference is that whenfree energy declines, the state change may occur spontaneously andthe state is not kinetically stable.At the molecular level, state changes are reflected in differ-

ential transcript abundance. Surprisal analysis characterizes bi-ological transitions by considering independent phenotypes; eachphenotype is a set of characteristic genes as well as a separateand independent constraint on the state. Each phenotype is aconstraint that prevents the system from spontaneously decreasingin free energy.The state change with time is described by the weight of the

different phenotypes, where each weight changes with time in itsown way. These time-dependent weights characterize the state;they are the state variables. Surprisal analysis of transcriptome datadetermines the time dependence of the state variables (20). Gen-erally few in number, these variables identify the state at any time.To provide a molecular-level interpretation, surprisal analysis

shows how individual transcript levels are related to phenotypes:each phenotype is a set of transcripts. For any given phenotype,each transcript has a weight. This weight is time independent.Transcripts that constitute a phenotype all change with time inthe same way; this common change is described by the statevariable for that phenotype. The change in time for the expres-sion levels of individual transcripts is represented as the sum overthe contributions from the changes of the different phenotypes.

Results and DiscussionSurprisal analysis was applied as described previously (13–15,21). The input expression level of mRNA i at time t, XiðtÞ, isexpressed as a sum of terms; here two, representing deviationsfrom the balance state. Each deviation term (i.e., phenotype) isa product of a time-dependent weight of the phenotype; the statevariable and the time-independent contribution of the transcriptto that phenotype:

Xoi is the expression level of transcript i in the balance state. λαðtÞ

is the state variable for phenotype α at time t. mRNA expressionlevel during EMT is well described by the two state variables,α = 1,2. α = 1 is the leading term of the deviation. Giα is the time-independent contribution of gene i to the phenotype α remainingthe same throughout the transition.An implication of the analysis is a correlation between the

transcript’s thermodynamic stability and its expression level. Thiscorrelation is most reliable when deviations from the balancedstate are not large. Then, XiðtÞ≈Xo

i . Xoi is the expression level at

thermodynamic equilibrium—the balance state of minimal freeenergy. Therefore, the more stable the gene (i.e., the lower itsfree energy) the higher its expression level in the balance state.The surprisal itself at each point in time is defined as the (log-arithmic) deviation from the balance state.

The Balance State During EMT for TGF-β1–Treated A549 Human LungCancer Cells. Surprisal analysis was applied to the gene microarrayexpression profiles collected, in triplicates, at time points 0, 0.5,1, 2, 4, 8, 16, 24, and 72 h during an EMT-initiated TGF-β1stimulation of A549 cells (22). The first step in EMT surprisalanalysis is to identify a balance state common to epithelial andmesenchymal states and throughout the transition. This balancestate is the reference point at which all cellular processes areassumed balanced in forward and reverse directions and forwhich there is no net change in the surprisal system. It thereforeis of central concern to demonstrate by analysis of expressiondata that the balance state is invariant throughout the transition.The results for the A549 cell balance state at different times

after TGF-β1 treatment are shown in Fig. 1. The expression levelof transcript i in the balance state at time t after treatment isXoi ðtÞ= expð−λ0ðtÞ Gi0Þ. We expect that the level in the balanced

state will not depend on the time point, but this is not assumed inour procedure; analyses at different points in time validate thispoint of principle. λ0ðtÞGi0 is the free energy (in thermal energyunits, RT) of transcript i. Consequently, transcripts with thelowest (i.e., most negative) free energy are those with the highestlevels in the balance state. mRNAs identified with the lowest freeenergy, listed in Table S1, are those most correlated with cellularnetworks that regulate and maintain cellular homeostasis ma-chinery. Our analysis illustrates the robustness of cellular ho-meostasis during EMT (13, 16).A graphical representation of free energy time variations for the

most stable genes is shown in Fig. 1A. The heat map shows(negative) free energy for the 20 transcripts that have the highestweight in the balance state vs. time. The most stable transcripts areat the top, and they hardly vary with time. Fig. 1B shows the statevariable of the stable state, λ0ðtÞ  , vs. time. λ0ðtÞ  variation withtime is less than 0.1%, less than the SD of the state variable. Fig.1B establishes the balance state time invariance during the EMT.Of course, biochemical changes during an EMT may change

the transcript free energies. To fully validate balance state sta-bility, we therefore must show that EMT-induced changes aresmall compared with the transcript free energies of the stablestate. Fig. 1C shows the individual transcript free energies (in

ln XiðtÞ|fflfflfflffl{zfflfflfflffl}

measured expression  level of   transcript  i 

at  time  t

  = ln Xoi

|fflffl{zfflffl}

  balanced  stateexpression  level of   transcript  i

   −    λ1ðtÞzffl}|ffl{

state  variableof   phenotype  1

at  time  t

Gi1

z}|{

the  contribution of   transcript  ito  phenotype  1

 + λ2ðtÞzffl}|ffl{

state  variableof   phenotype  2

at  time  t

Gi2

z}|{

the  contribution of   transcript  ito  phenotype  2

|fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl}

the  two  deviation  terms  from  the  balanced  stateof   transcript  i  at  time  t

:

13236 | www.pnas.org/cgi/doi/10.1073/pnas.1414714111 Zadran et al.

Dow

nloa

ded

by g

uest

on

July

12,

202

0

Page 3: Surprisal analysis characterizes the free energy time ... › content › pnas › 111 › 36 › 13235.full.pdfSurprisal analysis characterizes the free energy time course of cancer

thermal energy units) in the balance state (abscissa) vs. thechange in the free energy of that transcript (ordinate) during theEMT. No transcript increases or decreases its free energy bymore than 10 units; all balance-state transcripts are stable bymore than −20 units. Consequently, for no transcript does itsfree energy become positive (i.e., fully destabilized) during theEMT. This conclusion is central; we show it in a differentgraphical representation in Fig. 1D, a free energy histogram.Transcript free energy distributions in the stable state are in red.Blue bins show transcript free energy changes due to the EMT.The red bars reiterate the point that there are no transcripts inthe balance state whose free energy change becomes positive.About as many genes are stabilized (have a decrease in their freeenergy) further during EMT as are destabilized, illustrated by thetwo red bars of roughly the same height.

EMT-Specific Differential Gene Expression in A549 Cells During theEMT. Each constraint identifies a transcript expression patternunique to net ongoing biological processes in the system: a pheno-type. The phenotypes are ranked such that the largest deviationfrom the balance state is α= 1. This phenotype has the largestvalue for its state variable, λ1ðtÞ; it is most responsible for thechanges in free energy during the process.Free energy changes during the EMT for genes whose free

energies undergo the greatest deviation (both up and down)from their balance state values are shown in Fig. 2A. Genes thatdominate this phenotype for the 549-cell EMT are listed in TableS2 in order of their contribution to free energy change. Some arehighly up-regulated, and some are very down-regulated. Thesegenes and their weights are time independent but are associatedwith the first time-dependent state variable, λ1ðtÞ (Fig. 2B); allthese genes vary with time in the same way. Genes up-regulated

during the first few hours following TGF-β1 treatment in A549cells are down-regulated during the subsequent time, ending in themesenchymal state at ∼24 h. These data demonstrate that the firstconstraint provides a signature that distinguishes the two EMT endstates; the phenotype that corresponds to this largest deviationdistinguishes the epithelial and mesenchymal cellular states. Be-cause of this first constraint, the mesenchymal free energy state ishigher than for the epithelial state. Note that most transcripts havea negligible contribution to this phenotype (Fig. S1).

On Transcript Regulation. The state variables determine the di-rection of deviation of the process from the balance state. Whenthe state variables change sign, transcripts that were overex-pressed with reference to the balance state become underex-pressed, and vice versa. To avoid unambiguity, we rank thetranscripts according to their time-independent weights, the Giαs(see phenotypes α= 0;1;2 in SI Appendix).From a biological perspective, surprisal analysis weighs the

impact of individual genes differently from the ranking of genesby fold change in expression. For example, the down-regulatedanterior gradient-2 (AGR2) gene, a transcript that decreases inabundance, tops the EMT-specific surprisal A549-cell gene sig-nature list (Table S2), whereas in fold-change microarray anal-ysis, AGR2 levels do not vary significantly. However, AGR2 has

Fig. 1. Stability of the balanced state of the mRNA expression in A549 cellsduring TGF-β1–induced EMT. (A) The heat map shows as the ordinate thelogarithm of expression levels for the 20 most highly expressed transcripts ofthe EMT balance state. Each column is a transcript free energy at that time.(B) λ0ðtÞ  vs. time, where −ln Xo

i ðtÞ= λ0ðtÞ  Gi0. Fig. 1B is to be compared withFigs. 2B and 3B, which show variations of λ1ðtÞ  and λ2ðtÞ with time. (C) Freeenergy of all of the transcripts in the balance state. The ordinate is thechange in the free energy of that transcript during the EMT. (D) Stability ofall transcripts of the balance state, and the free energy changes of alltranscripts during EMT. All transcripts in the balance state are shown in red;free energy changes resulting from the EMT are shown in blue. The ordinateis the number of transcripts per bin. The width of the energy bins is 10 unitsof thermal energy, RT.

Fig. 2. The EMT-specific gene signature distinguishes epithelial and mes-enchymal states in A549 cells. (A) Logarithm of expression levels for the 20uppermost and 20 most down-regulated transcripts of the first phenotype,α= 1. (B) The first state variable, λ1ðtÞ, during the EMT. The λ1ðtÞ sign changeidentifies the time (∼8 h) when the transition from the epithelial to themesenchymal state occurs. This sign change is correlated with the minimumin free energy in A.

Zadran et al. PNAS | September 9, 2014 | vol. 111 | no. 36 | 13237

SYST

EMSBIOLO

GY

CHEM

ISTR

Y

Dow

nloa

ded

by g

uest

on

July

12,

202

0

Page 4: Surprisal analysis characterizes the free energy time ... › content › pnas › 111 › 36 › 13235.full.pdfSurprisal analysis characterizes the free energy time course of cancer

the greatest free energy contribution to the first deviation fromthe balance state. Note also that AGR2 has been implicated asa major player in other studies when observed in proteomic data(23). Near the bottom of the list of phenotype 1 is an up-regu-lated gene by fold-increase analysis, serine protease inhibitor 2/protease-nexin 1 (SERPINE2) (Table S2, opposite column).AGR2and the plasminogen activator SERPINE1 illustrate the resolvingpower of surprisal analysis compared with purely statistical meth-ods. SERPINE2 is known to greatly enhance the local invasivenessof pancreatic tumors, accompanied by a massive increase in ECMproduction in the invasive tumors (24).The list of genes shown in Table S2 is representative of phe-

notype 1, the phenotype that distinguishes the epithelial andmesenchymal states. The state variable sign change of this phe-notype can be seen in Fig. 2B.

TGF-β1–Stimulated A549-Cell EMT Proceeds Through a MaturationStage. Surprisal analysis of TGF-β1 EMT-induced changes inA549 mRNA levels shows a second leading deviation from the

balance state (Fig. 3A). The time dependence of the second statevariable, λ2ðtÞ, exhibits a change in sign (Fig. 3B) that coincideswith the epithelial state transitioning (at about 2 h) to what wecall, following ref. 9, a “maturation state.” This state is a state oflow free energy. The maturation state is followed by a secondchange (at about 24 h) out of the maturation stage and into themesenchymal state (Fig. 3B and Fig. S2). Transition from thematuration stage is costly in free energy and might possibly offera target for inhibiting the EMT.This second phenotype identified by surprisal analysis reveals

that EMT cellular reprogramming is a multistep process thatincludes an intermediate low free energy state. The free energychanges reflected in the sign changes for the state variable sug-gest that this multistep process is correlated with changes in cellmetabolism as the external source of energy. The correspondinggene signature (Table S3) for this second phenotype indeedincludes an increased contribution of metabolic genes.

The EMT Free Energy Landscape. At any time t during an EMT, thework done to bring the system to its current state can be com-puted from the surprisal analysis (10, 19). Two important con-straints for the A549 TGF-β1–induced EMT have been identified;therefore, there are two variables in a plot of the free energychange during the EMT. We chose as the two variables the statevariables λ1ðtÞ and λ2ðtÞ (Fig. 4). After exiting the epithelial state,the system dissipates energy as it moves to a lowerminimal energy,the maturation state. However, work must be done to drive thesystem from the maturation state into the mesenchymal state.Surprisal analysis shows that this EMT proceeds through a low-

energy, intermediate maturation state. It is not the case thata trough invariably must be traversed between two states; a pathalso might exist in which energy is first required and subsequentlyreleased. Careful examination of Fig. 4 shows that here, too, thereare barriers to cross. Transition from the epithelial state is overa small barrier, showing that the epithelial state is stable not only ata given time but also for small changes in time. Similarly, thetransition into the mesenchymal state also crosses a small barrier. Insummary, the EMT involves three states—the epithelial, matura-tion, and mesenchymal states—that not only are thermodynamically

Fig. 3. Surprisal analysis reveals an intermediate maturation stage duringTGF-β1–induced EMT in A549 cells. (A) Free energy for the 20 uppermost and20 most down-regulated transcripts of the second phenotype, α= 2. Be-tween ∼2 and 24 h, the free energy switches signs. We identify this intervalwith a maturation stage (16). (B) During this maturation stage, the secondstate variable, λ2ðtÞ, has a sign that differs from that before or after enteringthe maturation stage; the first sign change coincides with the epithelial stateentering the maturation stage, the second sign change coincides with theexit from the maturation stage into the mesenchymal state.

Fig. 4. The free energy landscape for the TGF-β1–induced EMT in A549 cells.At every point in time, the energetic state of the cell undergoing the EMTmay be characterized by two time-dependent state variables, λ1ðtÞ andλ2ðtÞ. The path of the transition is along the orange arrows and proceedsthrough a stable intermediate, the maturation point. Energy is released intransition from the epithelial state to the maturation point and consumedfrom the maturation point to the higher-in-energy mesenchymal state.

13238 | www.pnas.org/cgi/doi/10.1073/pnas.1414714111 Zadran et al.

Dow

nloa

ded

by g

uest

on

July

12,

202

0

Page 5: Surprisal analysis characterizes the free energy time ... › content › pnas › 111 › 36 › 13235.full.pdfSurprisal analysis characterizes the free energy time course of cancer

stable but also are kinetically stable, that is, stable against smallchanges in time. As in traditional chemical kinetics, there is a bar-rier between any two connected, kinetically stable states.

Error Analysis: Additional Constraints. The TGF-β1–induced EMTA549 microarray study (www.ncbi.nlm.nih.gov/sites/GDSbrowser?acc=GDS3710) was conducted in triplicate, permitting evaluationof error bars on the state variables (21). The balance state is robust;error bars are negligible (Fig. 1B). Error bars on the potential λ1ðtÞof the first constraint also are small (Fig. 2B). Error bars on thepotential λ2ðtÞ of the second constraint are more significant(Fig. 3B). Surprisal analysis for this experiment determines a thirdconstraint and a corresponding state variable λ3ðtÞ (Fig. S3).However, error bars for δ  λ3ðtÞ are large; the error bars are com-parable to the value of the variable itself, i.e., δ  λ3ðtÞ≈ λ3ðtÞ at alltime points. Consequently, the value zero is within the range ofpossibility for λ3ðtÞ. The state variable measures the importance ofthe constraint as judged by the search for a minimal free energystate. When δ  λ3ðtÞ≈ λ3ðtÞ, we arrive at the same state, regardlessof whether the third constraint is included in the search for thisminimum. In such cases, the imposition of a third (or any higher)constraint is not supported.

EMT Time Scales. The time course of the TGF-β1–elicited A549-cell EMT is shown through the time dependence of the statevariables λ1ðtÞ and λ2ðtÞ (Figs. 2B and 3B). The TGF-β1–inducedprogression from the epithelial to the mesenchymal state lasts∼24 h. At ∼8 h, the switch from the epithelial to the mesen-chymal state occurs [shown by the sign change for λ1ðtÞ, Fig. 2Band Fig. S1]. Somewhat earlier, at ∼4 h, the maturation phasebegins [as shown in the first sign change for λ2ðtÞ (Fig. 3B)]. Theenergy release along the transition to maturation occurs rapidlyand quite early; in contrast, the energy buildup from the matu-ration point to the mesenchymal end state is far slower [λ2ðtÞ,Fig. 3B], extending beyond the 24-h time point. The energybuildup is complete by 72 h; without mRNA expression leveldata between 24–72 h, it is not possible to conclude when thetransition to the mesenchymal state is finished.

An Increase in Cytosolic ATP Levels Correlates with the MaturationStage of the TGF-β1–Induced A549-Cell EMT. TGF-β1 A549 treat-ment induces mesenchymal-like protrusions and filopodia at 4 and8 h (Fig. 5A). The TGF-β1 maturation stage of this EMT beginswhen the second state variable, λ2ðtÞ, is positive (Fig. 3B). A single-cell enhanced-acceptor fluorescence (EAF)-based ATP biosensor(25, 26) demonstrates that individual cell ATP levels in the mes-enchymal state appear higher than in cells in the epithelial state(Fig. 5B), consistent with the idea that the energy required to bringthe cells from the maturation point to the mesenchymal state isgreater than that for the epithelial state (Fig. 3B and Fig. S1).The data (Figs. 2–4 and Fig. S1) suggest that the minimum

free energy change in the TGF-β1–driven EMT, the maturationpoint, occurs ∼8 h after TGF-β1 application, and that work inputis required to drive an energetically dependent transition fromthe maturation point to the mesenchymal state. By using bothsingle-cell (Fig. 5C) and bulk-cell ATP measurements (Fig. 5D),increased cytosolic ATP levels are observed at 8 h. Transitionfrom the maturation state to the mesenchymal state occurs at∼20 h (Fig. 3B); elevated ATP levels are observed at 20 h (Fig. 5C and D), when cells have left the maturation state and enteredthe mesenchymal state. The data suggest that ATP levels reflect,and possibly may drive, the maturation process.

The Balance State During EMT for TGF-β1–Treated Pancreatic CancerCells. Surprisal analysis also was applied to microarray data forcultured Panc-1 epithelial cells 48 h after TGF-β1 treatment,when the cells attained a mesenchymal state (27). The balancestate of Panc-1 cells, like that of A549 cells, was unaffected byTGF-β1 treatment (Fig. S2A). Like A549 cells, ribosomal genesand structural proteins for Panc-1 cells have the greatest negativefree energy (Table S1). When free energy changes for the most

Fig. 5. Cytosolic ATP levels during the EMT. (A) EMT is induced in A549 cellswith TGF-β1. At the times indicated, cells were collected for analysis. (B) Atthe times shown, the cells were imaged by phase contrast microscopy (20×).Cellular fasciculation is present within 4 h; an increase in filopodia migratoryprotrusions occurs in 8 h. (C) EAF-based single-cell ATP biosensor analysisof A549 cells following TGF-β1 treatment. (Upper) Donor fluorescence wasmonitored (GFP, pseudocolored red for better contrast). (Lower) Fluorescenceimages are overlaid on bright-field images. n = 4 across three independentcultures. (D) Quantification of EAF-based ATP biosensor donor signal. GFPfluorescence was monitored in individual cells at the times shown (n = 5, threecells per field); data are means ± SEM of five independent experiments. *P <0.05, **P < 0.01, using Student t test. (E) Whole-cell ATP levels were moni-tored with a modified luciferin/luciferase-based ATP assay at the time pointsshown (n = 5, three cells per field); data are means ± SEM of five independentexperiments. *P < 0.05, **P < 0.01, using Student t test.

Zadran et al. PNAS | September 9, 2014 | vol. 111 | no. 36 | 13239

SYST

EMSBIOLO

GY

CHEM

ISTR

Y

Dow

nloa

ded

by g

uest

on

July

12,

202

0

Page 6: Surprisal analysis characterizes the free energy time ... › content › pnas › 111 › 36 › 13235.full.pdfSurprisal analysis characterizes the free energy time course of cancer

up- and down-regulated genes in the epithelial and mesenchymalstates, relative to the balance state, are evaluated, a definitive“switch” is observed (Fig. S4B).

Gene Ontogeny Analysis of the Two Major Phenotypes for A549 CellsDuring the EMT. In SI Appendix, we outline and apply a synergisticprocedure for analysis of microarray data. Gene ontogeny (GO)analysis (28) is applied separately to each of the phenotypes thatsurprisal analysis identifies as relevant for the transition and tothe gene list of the balance state (Figs. S5–S7). GO analysis of thesecond constraint suggests that the critical cellular functions thatdominate this phenotypic state are functions involved in receptorbinding mechanisms (Fig. S7). We suggest that this phenotypeintegrates information from the cell environment, through recep-tors and subsequent ligand-dependent signaling pathways, into itscellular decision-making processes. It is expected that during anEMT, the ability of cells to assimilate environmental cues is criticalas they take on a mesenchymal-like state and acquire migratoryproperties (29, 30).In conclusion, we applied surprisal analysis to characterize and

elucidate gene expression behavior responsible for an EMT. Weidentified an intermediate state in the process and observeda correlation of metabolic shifts with progress through the EMT.Surprisal analysis can monitor cell phenotype-specific thermo-dynamic behavior, trace the change of state with time on a freeenergy landscape, and provide an increased understanding of celltransitions and cellular state-dependent processes. Surprisal anal-ysis also begins to provide fundamental insights into the mecha-nisms that regulate the unbalanced processes—by identification of“process-specifying” genes. Surprisal analysis thus suggests that thestate variables are potential drug targets.

Experimental ProceduresEMT in Lung Adenocarcinoma Cells. Expression profiles are available publicallyat www.ncbi.nlm.nih.gov/sites/GDSbrowser?acc=GDS3710 and ref. 22. HumanA549 lung adenocarcinoma cells were treated with 5 ng/mL TGF-β for 0, 0.5, 1,2, 4, 8, 16, 24, and 72 h to induce EMT. The experiment was performed in

triplicate. Samples were assayed using Affymetrix HG_U133_plus_2 arrays with54,675 probe sets, following standard techniques.

EMT in Pancreatic Adenocarcinoma Cells. Expression profiles are publicallyavailable at www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE23952 by Zadranet al. (26), who treated human Panc-1 pancreatic adenocarcinoma cells with5 ng/mL TGF-β for 0 and 48 h to induce EMT. The experiment was performed intriplicate. Samples were assayed using Affymetrix HG_U133_plus_2 arrays.

ATP Imaging and Measurement. A549 cells, from D. Shackelford, UCLA, LosAngeles CA, were cultured and maintained as adherent monolayers inDMEM (Cellgro) supplemented with 5% (vol/vol) FBS (Omega Scientific) and100 U/mL penicillin and streptomycin (Gibco) in 5% (vol/vol) CO2 at 37 °C.

Single-Cell ATP Imaging. An EAF-based single-cell ATP biosensor was gener-ated as described previously (24, 25). A549 cells were plated on laminin-coated glass-bottom dishes and transfected with FuGENE6 transfection re-agent (Roche Diagnostics) and the EAF-based ATP biosensor. Cells weresubjected to confocal imaging 2–3 d after transfection. Wide-field obser-vations used a Nikon TE2000-PFS inverted microscope (Nikon Instruments);cells were maintained at 37 °C with a continuous supply of a 95% air and 5%carbon dioxide mixture, by using a stage-top incubator. Untreated A549 cellsexpressing the EAF-based ATP biosensor initially were monitored via con-focal microscopy for donor (GFP) and acceptor (YFP) emission. After TGF-β1(5 ng/mL) treatment, EAF behavior in single cells was monitored over 72 h.The experiment was repeated in triplicate across three independent cul-tures. Image analysis used ImageJ.

Luciferase-Based ATP Analysis. Cellular ATP also was monitored with a mod-ified luciferase/luciferin-based assay (24).

GO Enrichment Analysis. GO analysis was conducted as described previously(ref. 27 and http://cbl-gorilla.cs.technion.ac.il). Threshold P value was 10−11.

ACKNOWLEDGMENTS. We thank the Institute of Molecular Medicine andthe David Geffen School of Medicine at the University of California, LosAngeles, for support to S.Z. This work was supported by a Prostate CancerFoundation Creativity Award to R.D.L. and European Commission FP7 Futureand Emerging Technologies–Open Project BAMBI 618024 (to R.A. and R.D.L.).

1. Kalluri R, Weinberg RA (2009) The basics of epithelial-mesenchymal transition. J ClinInvest 119(6):1420–1428.

2. Mani SA, et al. (2008) The epithelial-mesenchymal transition generates cells withproperties of stem cells. Cell 133(4):704–715.

3. Willis BC, Borok Z (2007) TGF-β-induced EMT: Mechanisms and implications for fibroticlung disease. Am J Physiol Lung Cell Mol Physiol 293(3):L525–L534.

4. De Craene B, Berx G (2013) Regulatory networks defining EMT during cancer initia-tion and progression. Nat Rev Cancer 13(2):97–110.

5. Hoos A, Cordon-Cardo C (2001) Tissue microarray profiling of cancer specimens andcell lines: Opportunities and limitations. Lab Invest 81(10):1331–1338.

6. Berrar DP, Werner D, GranzowM, eds (2003) A Practical Approach to Microarray DataAnalysis (Kluwer, Boston).

7. Quackenbush J (2002) Microarray data normalization and transformation. Nat Genet32(Suppl):496–501.

8. Facciotti MT (2013) Thermodynamically inspired classifier for molecular phenotypes ofhealth and disease. Proc Natl Acad Sci USA 110(948):19181–19182.

9. Samavarchi-Tehrani P, et al. (2010) Functional genomics reveals a BMP-driven mes-enchymal-to-epithelial transition in the initiation of somatic cell reprogramming. CellStem Cell 7(1):64–77.

10. Procaccia I, Levine RD (1976) Potential work: A statistical-mechanical approach forsystems in disequilibrium. J Chem Phys 65(8):3357–3364.

11. Levine RD (2005) Molecular Reaction Dynamics (Cambridge Univ Press, Cambridge,UK).

12. Levine RD, Bernstein RB (1974) Energy disposal and energy consumption in ele-mentary chemical reactions. Information theoretic approach. Acc Chem Res 7(12):393–400.

13. Kravchenko-Balasha N, et al. (2012) On a fundamental structure of gene networks inliving cells. Proc Natl Acad Sci USA 109(12):4702–4707.

14. Kravchenko-Balasha N, et al. (2011) Convergence of logic of cellular regulation in dif-ferent premalignant cells by an information theoretic approach. BMC Syst Biol 5:42.

15. Remacle F, Kravchenko-Balasha N, Levitzki A, Levine RD (2010) Information-theoreticanalysis of phenotype changes in early stages of carcinogenesis. Proc Natl Acad SciUSA 107(22):10324–10329.

16. Zadran S, Remacle F, Levine RD (2013) miRNA and mRNA cancer signatures de-termined by analysis of expression levels in large cohorts of patients. Proc Natl AcadSci USA 110(47):19160–19165.

17. Shin YS, et al. (2011) Protein signaling networks from single cell fluctuations andinformation theory profiling. Biophys J 100(10):2378–2386.

18. Wei W, et al. (2013) Hypoxia induces a phase transition within a kinase signalingnetwork in cancer cells. Proc Natl Acad Sci USA 110(15):E1352–E1360.

19. Gross A, Li CM, Remacle F, Levine RD (2013) Free energy rhythms in S. cerevisiae: Adynamic perspective with implications for ribosomal biogenesis. Biochemistry 52(9):1641–1648.

20. Levine RD (1978) Information theory approach to molecular reaction dynamics. AnnuRev Phys Chem 29(1):59–92.

21. Gross A, Levine RD (2013) Surprisal analysis of transcripts expression levels in thepresence of noise: A reliable determination of the onset of a tumor phenotype. PLoSOne 8(4):e61554.

22. Sartor MA, et al. (2010) ConceptGen: A gene set enrichment and gene set relationmapping tool. Bioinformatics 26(4):456–463.

23. Pohler E, et al. (2004) The Barrett’s antigen anterior gradient-2 silences the p53transcriptional response to DNA damage. Mol Cell Proteomics 3(6):534–547.

24. Buchholz M, et al. (2003) SERPINE2 (protease nexin I) promotes extracellular matrixproduction and local invasion of pancreatic tumors in vivo. Cancer Res 63(16):4945–4951.

25. Imamura H, et al. (2009) Visualization of ATP levels inside single living cells withfluorescence resonance energy transfer-based genetically encoded indicators. ProcNatl Acad Sci USA 106(37):15651–15656.

26. Zadran S, et al. (2013) Enhanced-acceptor fluorescence-based single cell ATP bio-sensor monitors ATP in heterogeneous cancer populations in real time. BiotechnolLett 35(2):175–180.

27. Maupin KA, et al. (2010) Glycogene expression alterations associated with pancreaticcancer epithelial-mesenchymal transition in complementary model systems. PLoS One5(9):e13002.

28. Eden E, Navon R, Steinfeld I, Lipson D, Yakhini Z (2009) GOrilla: A tool for discovery andvisualization of enriched GO terms in ranked gene lists. BMC Bioinformatics 10:48.

29. Catalano V, et al. (2013) Tumor and its microenvironment: A synergistic interplay. SemCancer Biol 23(6):522–532.

30. Chu K, Boley KM, Moraes R, Barsky SH, Robertson FM (2013) The paradox of E-cad-herin: Role in response to hypoxia in the tumor microenvironment and regulation ofenergy metabolism. Oncotarget 4(3):446–462.

13240 | www.pnas.org/cgi/doi/10.1073/pnas.1414714111 Zadran et al.

Dow

nloa

ded

by g

uest

on

July

12,

202

0