arxiv:0811.4149v1 [q-bio.mn] 25 nov 2008 · andrew muglery department of physics, columbia...

10
A stochastic spectral analysis of transcriptional regulatory cascades Aleksandra M. Walczak * Princeton Center for Theoretical Science, Princeton University, Princeton, NJ 08544 Andrew Mugler Department of Physics, Columbia University, New York, NY 10027 Chris H. Wiggins Department of Applied Physics and Applied Mathematics, Center for Computational Biology and Bioinformatics, Columbia University, New York, NY 10027 The past decade has seen great advances in our understanding of the role of noise in gene regulation and the physical limits to signaling in biological networks. Here we introduce the spectral method for computation of the joint probability distribution over all species in a biological network. The spectral method exploits the natural eigenfunctions of the master equation of birth-death processes to solve for the joint distribution of modules within the network, which then inform each other and facilitate calculation of the entire joint distribution. We illustrate the method on a ubiquitous case in nature: linear regulatory cascades. The efficiency of the method makes possible numerical optimization of the input and regulatory parameters, revealing design properties of, e.g., the most informative cascades. We find, for threshold regulation, that a cascade of strong regulations converts a unimodal input to a bimodal output, that multimodal inputs are no more informative than bimodal inputs, and that a chain of up-regulations outperforms a chain of down-regulations. We anticipate that this numerical approach may be useful for modeling noise in a variety of small network topologies in biology. Transcriptional regulatory networks are composed of genes and proteins, which are often present in small num- bers in the cell [1, 2], rendering deterministic models poor descriptions of the counts of protein molecules ob- served experimentally [3, 4, 5, 6, 7, 8, 9]. Probabilis- tic approaches have proven necessary to account fully for the variability of molecule numbers within a homogenous population of cells. A full stochastic description of even a small regulatory network proves quite challenging. Many efforts have been made to refine simulation approaches [10, 11, 12, 13, 14], which are mainly based on the vary- ing step Monte Carlo or ‘Gillespie’ method [15, 16]. Yet expanding full molecular simulations to larger systems and scanning parameter space is computationally expen- sive. On the other hand the interaction of many protein and gene types makes analytical methods hard to im- plement. A wide class of approximations to the master equation, which describes the evolution of the probabil- ity distribution, focuses on limits of large concentrations or small switches [17, 18, 19]. Approximations based on timescale separation of the steps of small signaling cascades have been successfully used to calculate escape properties [20, 21, 22]. In this paper we introduce a new method for calculat- ing the steady-state distributions of chemical reactants. The procedure, which we call the spectral method, re- lies on exploiting the natural basis of a simpler problem from the same class. The full problem is then solved nu- merically as a an expansion in this basis, reducing the master equation to a set of linear algebraic equations. We break up the problem into two parts: a preprocess- ing step, which can be solved algorithmically; and the parameter-specific step of obtaining the actual probabil- ity distributions. The spectral method allows for huge computational gains with respect to simulations. We illustrate the spectral method for the case of regula- tory cascades: downstream genes responding to concen- trations of transcription factors produced by upstream genes which are linked to external cues. Cascades play an important role in a diversity of cellular processes [23, 24, 25], from decision making in development [26] to quorum sensing among cells [27]. We take a coarse- grained approach, modeling each step of a cascade with a general regulatory function that depends on the copy number of the reactant at the previous step (cf. Fig. 1). While the method as implemented describes arbitrary regulation functions, we optimize the information trans- mission in the case of the most biologically simple reg- ulation function: a discontinuous threshold, in which a species is created at a high or low rate depending on the copy count of the species directly upstream. In the next sections, we outline the spectral method and present in detail our findings regarding signaling cascades. METHOD We calculate the steady-state joint distribution for L chemical species in a cascade (cf. Fig. 1). The approach we take involves two key observations: the master equa- tion, being linear,[40] benefits from solution in terms of its eigenfunctions; and the behavior of a given species arXiv:0811.4149v1 [q-bio.MN] 25 Nov 2008

Upload: others

Post on 24-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: arXiv:0811.4149v1 [q-bio.MN] 25 Nov 2008 · Andrew Muglery Department of Physics, Columbia University, New York, NY 10027 Chris H. Wigginsz Department of Applied Physics and Applied

A stochastic spectral analysis of transcriptional regulatory cascades

Aleksandra M. Walczak∗Princeton Center for Theoretical Science, Princeton University, Princeton, NJ 08544

Andrew Mugler†Department of Physics, Columbia University, New York, NY 10027

Chris H. Wiggins‡Department of Applied Physics and Applied Mathematics,

Center for Computational Biology and Bioinformatics, Columbia University, New York, NY 10027

The past decade has seen great advances in our understanding of the role of noise in gene regulationand the physical limits to signaling in biological networks. Here we introduce the spectral method forcomputation of the joint probability distribution over all species in a biological network. The spectralmethod exploits the natural eigenfunctions of the master equation of birth-death processes to solvefor the joint distribution of modules within the network, which then inform each other and facilitatecalculation of the entire joint distribution. We illustrate the method on a ubiquitous case in nature:linear regulatory cascades. The efficiency of the method makes possible numerical optimizationof the input and regulatory parameters, revealing design properties of, e.g., the most informativecascades. We find, for threshold regulation, that a cascade of strong regulations converts a unimodalinput to a bimodal output, that multimodal inputs are no more informative than bimodal inputs,and that a chain of up-regulations outperforms a chain of down-regulations. We anticipate that thisnumerical approach may be useful for modeling noise in a variety of small network topologies inbiology.

Transcriptional regulatory networks are composed ofgenes and proteins, which are often present in small num-bers in the cell [1, 2], rendering deterministic modelspoor descriptions of the counts of protein molecules ob-served experimentally [3, 4, 5, 6, 7, 8, 9]. Probabilis-tic approaches have proven necessary to account fully forthe variability of molecule numbers within a homogenouspopulation of cells. A full stochastic description of even asmall regulatory network proves quite challenging. Manyefforts have been made to refine simulation approaches[10, 11, 12, 13, 14], which are mainly based on the vary-ing step Monte Carlo or ‘Gillespie’ method [15, 16]. Yetexpanding full molecular simulations to larger systemsand scanning parameter space is computationally expen-sive. On the other hand the interaction of many proteinand gene types makes analytical methods hard to im-plement. A wide class of approximations to the masterequation, which describes the evolution of the probabil-ity distribution, focuses on limits of large concentrationsor small switches [17, 18, 19]. Approximations basedon timescale separation of the steps of small signalingcascades have been successfully used to calculate escapeproperties [20, 21, 22].

In this paper we introduce a new method for calculat-ing the steady-state distributions of chemical reactants.The procedure, which we call the spectral method, re-lies on exploiting the natural basis of a simpler problemfrom the same class. The full problem is then solved nu-merically as a an expansion in this basis, reducing themaster equation to a set of linear algebraic equations.We break up the problem into two parts: a preprocess-

ing step, which can be solved algorithmically; and theparameter-specific step of obtaining the actual probabil-ity distributions. The spectral method allows for hugecomputational gains with respect to simulations.

We illustrate the spectral method for the case of regula-tory cascades: downstream genes responding to concen-trations of transcription factors produced by upstreamgenes which are linked to external cues. Cascades playan important role in a diversity of cellular processes[23, 24, 25], from decision making in development [26]to quorum sensing among cells [27]. We take a coarse-grained approach, modeling each step of a cascade witha general regulatory function that depends on the copynumber of the reactant at the previous step (cf. Fig. 1).While the method as implemented describes arbitraryregulation functions, we optimize the information trans-mission in the case of the most biologically simple reg-ulation function: a discontinuous threshold, in which aspecies is created at a high or low rate depending on thecopy count of the species directly upstream. In the nextsections, we outline the spectral method and present indetail our findings regarding signaling cascades.

METHOD

We calculate the steady-state joint distribution for Lchemical species in a cascade (cf. Fig. 1). The approachwe take involves two key observations: the master equa-tion, being linear,[40] benefits from solution in terms ofits eigenfunctions; and the behavior of a given species

arX

iv:0

811.

4149

v1 [

q-bi

o.M

N]

25

Nov

200

8

Page 2: arXiv:0811.4149v1 [q-bio.MN] 25 Nov 2008 · Andrew Muglery Department of Physics, Columbia University, New York, NY 10027 Chris H. Wigginsz Department of Applied Physics and Applied

2

s n mA

B

D

C

A’

B’

D’

C’

. . .

. . . qs qn

s n m

FIG. 1: A schematic representation of a general signalingcascade. Interactions between species of interest may includeintermediate processes; we take a coarse-grained approach,condensing these intermediate processes into a single effectiveregulatory function. For example, the regulatory function qn

describes the creation rate of a species with copy count m asa function of the copy count n of the previous species.

should depend only weakly on distant nodes given theproximal nodes.

The second of these observations can be illustrated suc-cinctly by considering a three-gene cascade in which thefirst may be eliminated by marginalization. For threespecies obeying s

qs−→ nqn−→ m as in Fig. 1, we have the

linear master equation

psnm = ρ[gp(s−1)nm − gpsnm + (s+ 1)p(s+1)nm − spsnm

]+qsps(n−1)m − qspsnm + (n+ 1)ps(n+1)m − npsnm

+ρ[qnpsn(m−1) − qnpsnm + (m+ 1)psn(m+1) −mpsnm

].(1)

Here time is rescaled by the first gene’s degradation rate,so that each gene’s creation rate (g, qs, or qn) is normal-ized by its respective degradation rate; ρ and ρ are theratios of the first and third gene’s degradation rate to thesecond’s, respectively.

To integrate out the first species, we sum over s. Wethen introduce gn, the effective regulation of n, by∑

s

qspnms = pnm

∑s

qsps|nm ≈ pnm

∑s

qsps|n ≡ gnpnm.(2)

Here we have made the Markovian approximation thats is conditionally independent of m given n. Gener-ally speaking, the probability distribution depends on allsteps of the cascade. However since there are no loops inthe cascades we consider here, we assume in Eqn. 2 thatat steady-state each species is not affected by species twoor more steps downstream in the cascade. The validity ofthe Markovian approximation is tested using both a non-Markovian tensor implementation of the spectral methodand a stochastic simulation using the Gillespie algorithm[16], as discussed in Supplementary Material. We findthat the approximation produces accurate results for allbut the most strongly discontinuous regulation functions;even in these cases qualitative features such as modalityof the output distribution and locations of the modes are

preserved. Armed with the Markovian approximationthe equation for the remaining two species simplifies to

pnm = gn−1p(n−1)m − gnpnm + (n+ 1)p(n+1)m − npnm

+ρ[qnpn(m−1) − qnpnm + (m+ 1)pn(m+1) −mpnm

].(3)

This procedure can be extended indefinitely for a cascadeof arbitrary length L, in which modules consisting of pairsof adjacent species are each described by two-dimensionalmaster equations.

The distribution for the first two species is obtained bysumming over all other species, which gives an equationof the same form as Eqn. 3 but with gn = g = constant.If instead the input distribution is an arbitrary pn, thedistribution for the first two species is still described byEqn. 3, but with gn calculated recursively from pn viagn = (−npn+gn−1pn−1+(n+1)pn+1)/pn with g0 = p1/p0

to initialize.[41] Describing the start of a cascade (witharbitrary input distribution) and describing subsequentsteps both amount to solving Eqn. 3 with gn given byeither the recursive equation or Eqn. 2 respectively.

We solve Eqn. 3 by defining the generating func-tion [28] G(x, y) =

∑n,m pnmx

nym over complex vari-ables x and y.[42] It will prove more convenient towrite the generating function in a state space as |G〉 =∑

n,m pnm|n,m〉,[43] with inverse transform pnm =〈n,m|G〉, where the states |i〉 and 〈i|, for i ∈ {n,m},along with the inner product 〈i|i′〉 = δii′ , define the pro-tein number basis. With these definitions, Eqn. 3 atsteady-state becomes 0 = H|G〉, where

H = b+n b−n (n) + ρb+mb

−m(n). (4)

Here we have introduced raising and lowering operatorsin protein space [29, 30, 31, 32] obeying b+i |i〉 = |i+ 1〉−|i〉 for i ∈ {n,m}, b−n (n)|n〉 = n|n− 1〉 − gn|n〉 andb−m(n)|n,m〉 = m|n,m− 1〉 − qn|n,m〉,[44] and the regu-lation functions have become operators obeying gn|n〉 =gn|n〉 and qn|n〉 = qn|n〉.

Were the operators b−n (n) and b−m(n) not n-dependent,H would be easily diagonalizable. In fact, this corre-sponds to the uncoupled case, in which there is no regu-lation, and both upstream and downstream gene undergoindependent birth-death processes with Poisson steady-state distributions. We exploit this fact by working withthe respective deviations of gn and qn from some con-stant creation rates g and q. Then H can be partitionedas H = H0 + H1, where

H0 = b+n b−n + ρb+mb

−m (5)

H1 = b+n Γn + ρb+m∆n, (6)

and we define new operators b−n |n〉 = n|n− 1〉 − g|n〉,b−|m〉 = m|m− 1〉− q|m〉, Γn = g− gn, and ∆n = q− qn.Γn and ∆n capture the respective deviations of gn and qnfrom g and q, and H0 is diagonal in the eigenbases |j〉 and

Page 3: arXiv:0811.4149v1 [q-bio.MN] 25 Nov 2008 · Andrew Muglery Department of Physics, Columbia University, New York, NY 10027 Chris H. Wigginsz Department of Applied Physics and Applied

3

|k〉 of the uncoupled birth-death processes at rates g andq respectively;[45] specifically H0|j, k〉 = (j + ρk)|j, k〉.Projecting Eqn. 4 onto the eigenbasis yields the linearequation of motion

(j+ρk)Gjk+∑j′

Γj−1,j′Gj′k+ρ

∑j′

∆jj′Gj′,k−1 = 0, (7)

where Gjk = 〈j, k|G〉, Γjj′ =∑

n(g − gn)〈j|n〉〈n|j′〉, and∆jj′ =

∑n(q− qn)〈j|n〉〈n|j′〉. Eqn. 7 exploits the subdi-

agonal nature of the k-dependence; it is initialized usingGj0 =

∑n pn〈j|n〉, then solved exactly by matrix in-

version for each subsequent k. The joint distribution isretrieved via the inverse transform as

pnm =∑jk

〈n|j〉Gjk〈m|k〉. (8)

One computational advantage is that the overlap in-tegrals 〈n|j〉 and 〈j|n〉 need only be evaluated explicitlyfor 〈n|j = 0〉 = e−g gn/n! and 〈j|n = 0〉 = (−g)j/j!; allother values can be obtained recursively using the selec-tion rules 〈n|j + 1〉 = 〈n− 1|j〉 − 〈n|j〉 and 〈j|n+ 1〉 =〈j − 1|n〉 + 〈j|n〉.[46] The same holds for 〈m|k〉, takingn→ m, j → k, and g → q. Note that once g and q havebeen chosen,[47] the calculation can be separated into apreprocessing step, in which the matrices 〈n|j〉, 〈j|n〉,and 〈m|k〉 are calculated (and potentially reused at sub-sequent steps of the cascade or for subsequent steps inan optimization), and the actual step of calculating Gjk

via Eqn. 7.By exploiting the basis of the uncoupled system, we

have reduced Eqn. 3 to a set of simple linear algebraicequations,[48] Eqn. 7, which dramatically speeds up thecalculation without sacrificing accuracy (cf. Results andSupplementary Material). The method is applicable forany input function gn and regulation function qn. So-lutions using other bases and further generalizations tosystems with feedback will be discussed in future work.

RESULTS

The spectral method is fast and accurate

To demonstrate[49] the accuracy and computationalefficiency of the spectral method, we compare it bothto an iterative numerical solution of Eqn. 3 and to astochastic simulation using the ‘Gillespie’ algorithm [16]for a cascade of length L = 2 with a Poisson input (gn =g = constant) and the discontinuous threshold regulationfunction

qn =

{q− for n ≤ n0

q+ for n > n0.(9)

The spectral method achieves an agreement up to ma-chine precision with the iterative method in ∼0.01s,

which is ∼1000 times faster than the iterative method’sruntime and ∼108 faster than the runtime necessary forthe stochastic simulation to achieve the same accuracy;see Supplementary Materials for details. The huge gain incomputational efficiency over both the iterative methodand the stochastic simulation makes the spectral methodextremely useful, particularly for optimization problems,in which the probability function must be evaluated mul-tiple times. In the next sections we exploit this feature tooptimize information transmission in signaling cascades.

Information processing in signaling cascades

Linear signaling cascades are a ubiquitous featureof biological networks, used to transmit relevant infor-mation from one part of a cellular system to another[23, 24, 25, 26, 27]. Information processing in a cas-cade is quantified by the mutual information [33], whichmeasures in bits how much information about an inputsignal is transmitted to the output signal in a noisy pro-cess. For a cascade of length L, the mutual informa-tion between an input species (with copy number n1)and an output species (with copy number nL)[50] isI =

∑n1,nL

p(n1, nL) log2[p(n1, nL)/p(n1)p(nL)]. In thisstudy we define the capacity I∗ as the maximum mutualinformation over either regulatory parameters, the inputdistribution, or both. Depending on the signal to noiseratio, a high-capacity cascade functions either where theinput signal is strongest or where the transmission pro-cess is least noisy [34, 35].

We first consider a cascade of length L in which theregulation function qn is a simple threshold (Eqn. 9) withfixed parameters that are identical for each cascade step.It is worth noting that while a threshold-regulated cre-ation rate represents the simplest choice biologically, itis the most taxing choice computationally: as the dis-continuity ∆ = |q+ − q−| increases, we find both that(a) a larger cutoff K in eigenmodes is required for a de-sired accuracy, and (b) the accuracy of the Markovian ap-proximation decreases (cf. Supplementary Material). Theresults herein therefore constitute a stringent numericalchallenge for the spectral method.

We take the input p(n1) to be a Poisson distribution(i.e. gn = g = constant). In the extreme cases, when thethreshold is infinite or zero, the output is a Poisson distri-bution centered at q− or q+, respectively. Similarly, whenthe input median is below or above threshold, the outputmean should be biased toward q− or q+, respectively. Forexample, in Fig. 2A, 〈n1〉 < n0, and the output distri-bution is shifted toward q−. This effect is amplified ateach step of the cascade, such that 〈nL〉 → q− for largeL. Similarly, 〈nL〉 → q+ for large L when 〈n1〉 > n0 (Fig.2C). When 〈n1〉 ∼ n0 (Fig. 2B), the output is balancedbetween q− and q+; if the discontinuity ∆ = |q+ − q−| issufficiently large, the output is bimodal, as discussed in

Page 4: arXiv:0811.4149v1 [q-bio.MN] 25 Nov 2008 · Andrew Muglery Department of Physics, Columbia University, New York, NY 10027 Chris H. Wigginsz Department of Applied Physics and Applied

4

more detail in the next section.Mutual information I decreases monotonically with L

for all 〈n1〉 (cf. Fig. 2F), as required by the data pro-cessing inequality [36] (i.e. one cannot learn more infor-mation from the output of an (L+1)-gene cascade thanone could from an L-gene cascade with identical regula-tion, only less). I is maximal for 〈n1〉 ∼ n0 which makesintuitive sense, as it corresponds to the input taking ad-vantage of both rates q− and q+ roughly equally in pro-ducing the output. A simple calculation quantifies thisintuition. Approximating the steady-state distributionfor the moment as a strict switch conditional on n0 (i.e.p(nL|n1) = p−(nL) if n1 < n0 and p(nL|n1) = p+(nL) ifn1 ≥ n0 for some distributions p±(nL)), it follows fromthe definition of I that

Iswitch = S−∑±

∑nL

p±(nL) log2

[1 +

π∓p∓(nL)π±p±(nL)

], (10)

where S = −∑± π± log2 π±, π− =

∑n1<n0

p(n1) andπ+ =

∑n1≥n0

p(n1) = 1 − π−. If there is little overlapbetween p−(nL) and p+(nL), then Iswitch ∼ S, which ismaximal when π− = π+, i.e. when the median of theinput distribution p(n1) lies at the threshold n0. Addi-tionally, since the maximal value of S (and Iswitch, sincethe summand of the second term in Eqn. 10 is alwaysnonnegative) is 1 bit, this calculation also suggests thatthe capacity of threshold regulation (in the limit of strictswitch-like behavior) is limited to 1 bit. Again, this resultis intuitive, as the cascade is only passing on the binaryinformation of whether particles in the input distributionare below or above the threshold.

In an experimental setup one might have access toonly the mean response (or “transfer function”), or thevariance in response across cells (or “noise”), of a sig-naling cascade to its input. Since our method yieldsthe full distributions, such summary statistics are read-ily computed. Despite the sharpness of threshold regu-lation, the transfer functions are quite smooth even atL = 2 (cf. Fig. 2D). The effect of the intrinsic noise is tosmooth out a sharp discontinuity in creation rates, pro-ducing a continuous mean response. The transfer func-tions shown are least-squares fit to Hill functions of theform 〈nL|n1〉 = α− + (α+ − α−)(n1)h/[(n1)h + (kd)h].As one would expect, for all L, best fit values of α−, α+

are near the rates q− and q+ respectively, and best fitkd values are near the threshold n0. As L increases, thetransfer function sharpens, and cooperativity h increases(cf. inset in D), due to the amplified migration of the out-put to either q− and q+ in longer cascades (as in Figs.2A and 2C).

The strength of the noise increases with L (cf. Fig. 2E),consistent with the reduction in I with L (cf. Fig. 2F),and the noise is peaked at the threshold. The “switchapproximation” (cf. Eqn. 10) illustrates the gain in infor-mation when the median of the input coincides with the

0 5 10 150

5

10

15

aver

age

outp

ut, <

n L>

A CBD

qn

2 4 6 84

5

6

L

h

0 5 10 150

5

10

nois

e in

out

put, !

(nL)E

0 5 10 150

0.1

0.2

0.3

average input, <n1>

mut

ual i

nfor

mat

ion,

I (b

its)

F

0 10 200

0.1

0.2

0.3

0.4n0q− q+

outp

ut d

istri

butio

ns, p

(nL)

A inputL = 2L = 3L = 4L = 5L = 6L = 7L = 8

0 10 200

0.05

0.1

0.15

0.2

outp

ut d

istri

butio

ns, p

(nL)

copy count

C inputL = 2L = 3L = 4L = 5L = 6L = 7L = 8

0 10 200

0.05

0.1

0.15

0.2

outp

ut d

istri

butio

ns, p

(nL)

B inputL = 2L = 3L = 4L = 5L = 6L = 7L = 8

FIG. 2: Transfer functions and noise in a signaling cascade.A-C: Plots of input distribution p(n1) (black) and outputdistributions p(nL) (colors; see legend) for various cascadelengths L. Input distribution is a Poisson centered at 〈n1〉 = 4(in A), 〈n1〉 = n0 = 6 (in B), or 〈n1〉 = 10 (in C). Theregulation function qn for all steps is a threshold (Eqn. 9)shown in D (black line with dots), with parameters q−, q+,and n0 overlaid as dashed lines in A-C. The degradation rateratio is ρ = 1, and Eqn. 3 is solved using the spectral methodwith q = 10 and g = 〈gn〉 for each step in the cascade. D-F:Transfer functions (average output 〈nL〉) (D), noise (standarddeviation of the output σ(nL)) (E), and mutual informationI (F) as functions of average input 〈n1〉 for various cascadelengths L (colors as in A-C). As in A-C, input is Poisson atevery 〈n1〉; dashed lines correspond to the specific 〈n1〉 valuesin A, B, and C. Inset in D: Cooperativity h as a function ofL.

threshold; the “small noise approximation” [34, 35], how-ever, illustrates the loss in information when the peak ofthe input coincides with the peak of the noise. The trade-off between these two trends thwarts information trans-mission with unimodal input distributions (e.g., thoseused in Fig. 2) and suggests an input distribution withtwo or more modes should be able to transduce more in-formation. Such distributions are the subject of studybelow, and, in related work [35, 37] are shown to be theoptimal strategy and to be observed in biology for a regu-latory system in which peak noise and threshold coincide.

Bimodal output from a unimodal input

A striking feature of Fig. 2B is that the unimodal inputis converted to a bimodal output for cascades of lengthL = 3 or longer. Bimodality can arise from a system withtwo genes whose proteins repress each other or from a sin-gle gene whose proteins activate its own expression. Herewe demonstrate that cascades with sufficiently strongregulation constitute an information-optimal mechanism

Page 5: arXiv:0811.4149v1 [q-bio.MN] 25 Nov 2008 · Andrew Muglery Department of Physics, Columbia University, New York, NY 10027 Chris H. Wigginsz Department of Applied Physics and Applied

5

for a cell to achieve bimodality.Recall that Fig. 2B corresponds to the case where the

input distribution is optimally matched with the regula-tion function, i.e. the bimodal output represents optimalinformation transmission. By optimizing over the meang of a Poisson input distribution, we find that the mostinformative output distribution in a cascade with uni-modal input can be unimodal or bimodal, depending onregulatory parameters and the length of the cascade. Fig.3A shows examples of regulation functions which produceoutput distributions that are unimodal, bimodal for cas-cades as long or longer than some L∗ (which we term“persistent” bimodality), and bimodal for short cascadesbut unimodal at both initial and final nodes for longercascades (which we term “localized” bimodality).

Bimodality is found both in cascades in which each stepis down-regulating, which we call “AC” cascades, and inthose in which each step is up-regulating, which we call“DC” cascades. In DC cascades, as seen in the insets ofpanels 1-3 in Fig. 3A, the average output either mono-tonically decreases or monotonically increases with L. Inthe former case, since q− < n0, the probability that theoutput is below the threshold given that the input is be-low threshold is large. Successive such regulations drivethe probability of being below the threshold towards 1,successively decreasing 〈n〉 at each step in the cascade. Inthe latter case, since q+ > n0, the same picture holds, and〈n〉 monotonically increases with L. Whether the mono-tonically increasing or decreasing behavior is the moreinformative is determined by the relationship among q+,q−, and n0. In AC cascades, an analogous picture holdsbut with alternation: π− < π+ for the even-numberedlinks (cf. Eqn. 10), and the AC condition q− > q+ leadsto π+ < π− for the odd-numbered links, as illustratedin the insets of panels 4-6 in Fig 3A. These behaviorsmotivate the names “AC” and “DC,” analogous to al-ternating and direct current flow. Performance of ACand DC cascades is compared in more detail in the nextsection.

Fig. 3B shows a phase diagram of optimal outputmodality as a function of the rates q− and q+: bimodalityis found at high values of the discontinuity ∆ = |q+−q−|(specifically, for ∆ & 11 in AC circuits and ∆ & 12 inDC circuits when n0 = 8). Intuitively, since the weightof the output is distributed between q− and q+ for longcascades, increasing their separation spreads the weightsapart and creates bimodal distributions. Furthermore,as ∆ increases, the bimodality becomes more robust: itgoes from localized to persistent, and its onset occurs ata smaller cascade length L∗. The inset of Fig. 3B showsthat capacity I∗ also increases with ∆; cascades with bi-modal output therefore have higher capacities than thosewith unimodal output. As ∆ increases, the informationtransmission properties of a regulatory cascade are betterapproximated by simple switch-like regulation (cf. Eqn.10). In short, summarizing the input distribution by π+

and π− is a more informative summary of the distribu-tion as the regulation becomes more discontinuous.

Channel capacity in AC/DC cascades

Our setup provides a way to ask quantitativelywhether a cascade with down-regulating steps (AC) cantransmit information with more or less fidelity than a cas-cade with down-regulating steps (DC). Since a cell mustexpend time and energy to make proteins, a fair com-parison between cascade types can only be made whenthe species involved in each type are present in equalcopy number. As in [38], we introduce the objectivefunction L = I − λ〈n〉 where I is mutual informationand 〈n〉 =

∑L`=1〈n`〉/L is an average copy count over all

species in the cascade. Here λ represents the metaboliccost of making proteins, and optimizing L for differentvalues of λ allows a comparison of AC and DC capacitiesI∗ at similar values of 〈n〉.

For both AC and DC cascades, I∗ increases with 〈n〉 asmore proteins are made available to encode the signal,[51]and I∗ decreases[52] with L at all 〈n〉 (cf. Fig. 4A).Both AC and DC capacities converge to an L-dependentasymptotic value at high copy count, but DC cascadesattain higher capacities per output protein than AC cas-cades. The difference is most pronounced at low copycount (〈n〉 . 7), and more pronounced still for longercascades. The difference is easily explained: AC andDC cascades of the same length with the same discon-tinuity ∆ = |q+ − q−| have the same capacity but havedifferent mean numbers of proteins. Recall from Fig. 3Bthat large ∆ leads to high-capacity, bimodal solutions.The difference between AC and DC cascades is in theplacement of their optimal distributions for a given ∆.We observe that optimal AC cascades tend to exhibit〈n〉 & n0, while optimal DC cascades tend to exhibit〈n〉 . n0. Ultimately, this allows DC cascades to achievethe same capacity for the same regulation parameters(cf. Fig. 3B inset), but use fewer proteins. These resultssuggest that DC cascades transmit with higher fidelityper protein than AC cascades when protein productionis costly.

The most informative input to a threshold isbimodal

If the first species is governed by more than a sim-ple birth-death process, the input to a cascade will notbe a simple Poisson distribution. To investigate the roleof input multimodality in information transmission, weconsider inputs defined by a mixture of Poisson distribu-tions, p(n1) =

∑Zi=1 πie

−gign1i /n1!, (with

∑Zi=1 πi = 1).

As before, we expect information to increase with copy

Page 6: arXiv:0811.4149v1 [q-bio.MN] 25 Nov 2008 · Andrew Muglery Department of Physics, Columbia University, New York, NY 10027 Chris H. Wigginsz Department of Applied Physics and Applied

6

0

0.05

0.1

0.15

0.2

optim

al o

utpu

t, p*

(nL)

unimodal

1

DC

A

0 5 10 15 20 250

10

copy count

q n

2 4 6 82

4

6

L

<nL>

bimodal, persistent

2

0 5 10 15 20 25copy count

2 4 6 85

6

7

L

<nL>

bimodal, localized

3

inputL = 2L = 3L = 4L = 5L = 6L = 7L = 8

0 5 10 15 20 25copy count

2 4 6 88

10

12

L

<nL>

0

0.05

0.1

0.15

0.2

optim

al o

utpu

t, p*

(nL) 4

AC

0 5 10 15 20 250

10

copy count

q n

2 4 6 84

6

8

L

<nL>

5

0 5 10 15 20 25copy count

2 4 6 86789

L

<nL>

6

inputL = 2L = 3L = 4L = 5L = 6L = 7L = 8

0 5 10 15 20 25copy count

2 4 6 85678

L

<nL>

1 2 3 4 5 6 7 8 9 10 11 12 13 14 151

2

3

4

5

6

7

8

9

10

11

12

13

14

15

q−

q +

xxxxxxxxx

xxxxxxx

xxxxxx

xxx

xxxxxxxxxxxxxxxxx

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxx

xxxxxxxxxxxxxxxxxxxxxxxxx

xxxxxxx

xxxxxxxx

xxxxxx

xxxxxxx

xxxxxxx

xxxxxxxxxxxxxx

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

xx

B

o 1

o2

o 3

o 4o 5o 6

UNIMODAL

BIMODAL

BIMODAL

DC

AC

L*0

1

2

3

4

5

6

7

8

9

10

FIG. 3: Optimal output modality in cascades with unimodalinput. A: Plots of optimal input distribution p∗(n1) (black)and optimal output distributions p∗(nL) (colors; see legend)for different cascade lengths L (optimal input distributions arequalitatively the same; only that for L = 2 is shown), corre-sponding to regulation functions qn (identical for each step)plotted underneath (ρ = 1, and solutions used q = 10 andg = 〈gn〉). Mutual information I is optimized as a function ofthe mean g of the Poisson input distribution. Magenta num-bers on plots correspond to magenta points in B. Insets showplots of average output 〈nL〉 vs. cascade length L. In the firstcolumn of A, the output is always unimodal; in the secondcolumn, the output is bimodal for cascade lengths L ≥ L∗ forsome L∗ (“persistent” bimodality); in the third column, theoutput is bimodal for a range of L values, then unimodal oncemore for large L (“localized” bimodality). The first row shows“DC” cascades, in which each step is up-regulating, and thesecond row shows “AC” cascades, in which each step is down-regulating. B: Phase diagram of optimal output modality asa function of q− and q+ (n0 = 8). White is unimodal, andcolor is bimodal, with color corresponding to L∗. Distinctionbetween persistent (no ‘x’) and localized (‘x’) bimodality isshown up to L = 10. Dashed line separates DC cascades fromAC cascades. Inset: Capacity I∗ in bits as a function of q−and q+ for the same data.

number, and we use the objective function L when opti-mizing over the input distribution p(n1).

All optimal input distributions with Z ≥ 2 are bi-modal, with one mode on either side of the threshold (cf.Fig. 4B). When Z is 3 or more, either all but two πi valuesare driven by the optimization to 0, or all the gi valueswith nonzero weights are driven to one of two uniquevalues. The two modes are roughly equally weighted (i.e.π1 ≈ π2 ≈ 0.5), consistent once more with our calcula-tion that the median of the optimal input distributionfalls roughly at the threshold (cf. Eqn. 10). As an addi-tional verification, a plot of capacity I∗ vs. Z in the insetof Fig. 4B reveals that I∗ remains constant for Z = 2and beyond. A threshold regulation function presents abinary choice, and the optimal input is a bimodal distri-bution that equally utilizes both sides. Lastly, we pointout that the capacities in the inset of Fig. 4B are below 1bit. Even with a bimodal input distribution, a short cas-cade (L = 2), and strong regulation functions (i.e. largediscontinuities ∆), we do not find capacities above 1 bit.This is consistent with our calculation under the approx-imation of threshold regulation as a switch (cf. Eqn. 10)and with the intuition that a threshold represents a bi-nary decision.

CONCLUSIONS

We have introduced a method, the spectral method,which exploits the linear algebraic structure of the mas-ter equation, and expands the full problem in terms of itsnatural eigenfunctions. We have illustrated our methodby probing the optimal transmission properties of signal-ing cascades with threshold regulation. We have shownthat sufficiently long cascades with sufficiently strong reg-ulation functions optimally convert a unimodal input to abimodal output. A bimodal input is optimal for informa-tion transmission across a threshold, and a multimodalinput offers no further processing power. Sustained bi-modality of the input distribution requires large disconti-nuities ∆ between the production rates below and abovethe threshold. The value of ∆ controls the maximum in-formation transmitted by a cascade with threshold reg-ulation in a similar way for cascades of up-regulations(DC) and cascades of down-regulations (AC), but a DCcascade outperforms an AC cascade by using fewer av-erage copies of its species. We emphasize that the ap-plication of the spectral method to signaling cascadesrepresents only a beginning. Variations on the naturalbases in which to expand, and extensions of the methodto other small network topologies, will be the subject offuture work. More generally, however, we anticipate thatthe method will prove useful in the direct solution of alarge class of master equations describing a wide varietyof biological systems.

Page 7: arXiv:0811.4149v1 [q-bio.MN] 25 Nov 2008 · Andrew Muglery Department of Physics, Columbia University, New York, NY 10027 Chris H. Wigginsz Department of Applied Physics and Applied

7

2 4 6 8 10 12 14 160.05

0.1

0.15

0.2

0.25ca

pacit

y, I*

(bits

)

mean copy count, <n>

A

L = 3L = 4L = 5L = 6L = 7L = 8DCAC

0

0.1

0.2

0.3

0.4

0.5

optim

al in

put,

p*(n

1)

B Z = 1Z = 2Z = 3Z = 4Z = 5Z = 6Z = 7Z = 8

0 5 10 15 20 25 3005

1015

q n

copy count, n1

0 2 4 6 80

0.2

0.4

0.6

0.8

number of input Poissons, Z

capa

city,

I* (b

its)

FIG. 4: A: Capacities of AC and DC cascades. Capacity I∗

vs. average copy count 〈n〉 (over all species) for AC (circles)and DC (plus signs) cascades of different lengths L (color),with Poisson input distributions. Results were obtained byoptimizing the objective function L over all parameters (q−,q+, n0, and g) for λ = 1× 10−6 − 3× 10−2 (n0 is constrainedto be an integer, the regulation function is the same for everystep, ρ = 1, and solutions used q = 10 and g = 〈gn〉). Linesshow convex hulls. B: Optimal input distributions with differ-ent numbers Z of Poisson distributions. The cascade lengthis L = 2, the degradation rate ratio is ρ = 1, and the regula-tion function qn is plotted in the bottom panel; solutions usedq = 10 and g = 〈gn〉. The objective function L is optimizedwith λ = 10−4 over all input parameters gi and πi. InsetCapacity I∗ vs. Z, averaged at each Z over 7 optimizationswith different initial conditions.

SUPPLEMENTARY MATERIAL

The spectral method is fast and accurate

To demonstrate the accuracy and computational effi-ciency of the spectral method, we compare it both to

an iterative numerical solution and to a stochastic sim-ulation (using the varying step Monte Carlo or ‘Gille-spie’ method [15, 16]) of Eqn. 3. Fig. 5A shows theagreement among output distributions pm generated bythe three methods for a cascade of length L = 2 witha Poisson input (gn = g = constant) and the thresh-old regulation function in Eqn. 9. The spectral methodachieves accuracy up to machine precision in ∼0.01 s,which is ∼1000 times faster than the iterative method’sruntime and ∼108 faster than the runtime necessary forthe stochastic simulation to achieve the same accuracy(cf. Fig. 5B). As a measure of error we use the Jensen-Shannon divergence [39] (a measure in bits between twoprobability distributions) between the distribution pnm

generated by the iterative method and that generatedby either the spectral method or the stochastic simula-tion. We plot this measure against the runtime of eithermethod, scaled by the runtime of the iterative method,in Fig. 5B.

Validity of the Markovian approximation

For a cascade of length L, we reduce an L-dimensionalmaster equation to a set of 2-dimensional equationsby employing the Markov approximation, i.e. that eachspecies is conditionally independent of distant nodesgiven proximal nodes (cf. main text). Here we compareresults under this approximation with those from a solu-tion of the full master equation, using both a stochasticsimulation [16] and a non-Markovian implementation ofthe spectral method.

The non-Markovian spectral method is implementedas follows. The full master equation for the process

n1q2(n1)−−−−→ n2

q3(n2)−−−−→ . . .qL(nL−1)−−−−−−→ nL is

p(~n) = gp(~n− e1)− gp(~n) + (n1 + 1)p(~n+ e1)− n1p(~n)

+L∑

`=2

ρ` [q`(n`−1)p(~n− e`)− q`(n`−1)p(~n)

+(n` + 1)p(~n+ e`)− n`p(~n)] , (11)

where time is rescaled by the degradation rate of thefirst species, g is the creation rate of the first species, q`is the creation rate of the `th species, creation rates arenormalized by corresponding degradation rates, ρ` is theratio of the degradation rate of the `th species to thatof the first, ~n = (n1, . . . , nL), and e` represents a 1 inn`th direction. Denoting |j1, . . . , jL〉 as the eigenstate ofspecies ~n at constant rates g, q2, . . . , qL respectively, Eqn.

Page 8: arXiv:0811.4149v1 [q-bio.MN] 25 Nov 2008 · Andrew Muglery Department of Physics, Columbia University, New York, NY 10027 Chris H. Wigginsz Department of Applied Physics and Applied

8

0

0.05

0.1

0.15

mar

gina

l

A pn (input)pm: spectralpm: iterativepm: stochastic

0 5 10 15 20 25 3005

1015

q n

copy count

10−4 10−2 100 102 10410−8

10−6

10−4

10−2

100

Jens

en−S

hann

on d

iver

genc

e (b

its)

runtime / iterative method’s runtime

Bspectralstochastic

FIG. 5: Demonstration of the spectral method’s accuracy andefficiency. A: Output distributions pm obtained by solvingEqn. 3 with gn = g = constant using the spectral methoddescribed in the main text (blue triangles), an iterative nu-merical method (red circles), and a stochastic simulation [16](green squares). Also shown is the input pn (a Poisson distri-bution with mean g = 8), and the regulation function qn (Eqn.9) with q− = 1, q+ = 13, and n0 = 8; the degradation rateratio is ρ = 1. Spectral solution used q = 10 and mode num-ber cutoffs J = K = 50, and all solutions used copy countcutoffs N = M = 50. B: Jensen-Shannon divergence [39]between pnm obtained using the iterative numerical methodand that obtained using the spectral method (blue triangles)or stochastic simulation (green squares), plotted against thelatter two methods’ respective runtimes. Runtimes are scaledby iterative method’s runtime, 15.9 s. Spectral method dataobtained by varying K from 3 to 12, 589; plateau begins atK ≈ 50. Stochastic simulation data obtained by varying in-tegration time from 100 to 2 × 107, in units of the upstreamgene’s reciprocal degradation rate.

(11) has the spectral decomposition

0 =

(j1 +

L∑`=2

ρ`j`

)Gj1,...,jL

+L∑

`=2

ρ`

∑j′`−1

∆j`−1,j′`−1Gj1,...,j′`−1,j`−1,...,jL , (12)

where Gj1,...,jL = 〈j1, . . . , jL|G〉 andthe deviation operator ∆j`−1,j′`−1

=∑n`−1

[q` − q`(n`−1)] 〈j`−1|n`−1〉〈n`−1|j′`−1〉. Eqn.(12) is solved by iteration, initialized withGj1,...,jL = δj1,0 . . . δjL,0. The joint distribution isobtained via inverse transform:

p(n1, . . . , nL) =∑

j1,...,jL

〈n1|j1〉 . . . 〈nL|jL〉Gj1,...,jL . (13)

For a cascade of length L = 4, with two different val-ues of the discontinuity ∆ = |q+ − q−| (cf. Eqn. 9), Fig-ure 6 compares the marginal distributions calculated un-der the Markov approximation (using both the spectralmethod and an iterative numerical solution) with thosecalculated from the full master equation (using both thenon-Markovian spectral method and a stochastic simula-tion). When ∆ is small there is full agreement betweenthe Markovian and non-Markovian distributions (cf. Fig.6A, with ∆ = 4.5); as ∆ grows, the Markovian distribu-tions begin to deviate from the non-Markovian distribu-tions (cf. Fig. 6B, with ∆ = 8.5). The deviation is morepronouned at higher `, i.e. for species more downstreamin the cascade. We emphasize, however, that importantqualitative features of the distributions, such as modal-ity and locations of the modes, are retained under theapproximation.

A note on degradation rates

The ratio ρ of a downstream gene’s degradation rate tothat of an upstream gene is fixed to 1 in all results in themain text. Increasing ρ is a computationally straight-forward way to obtain, e.g., a bimodal output, since itcorresponds to the case in which the downstream geneequilibrates more quickly than the upstream gene, suchthat the output is simply a weighted sum of distributionspeaked at each of the threshold values q− and q+ (cf. Eqn.9). Specifically, in the ρ → ∞ limit, for the two-genecascade n

qn−→ m, pm = π−e−q−qm

− /m! + π+e−q+qm

+ /m!,where π− =

∑n≤n0

pn and π+ =∑

n>n0pn. However

most degradation rates are dominated by cell division,so degradation rate ratios far from than ∼1 are unreal-istic. Our results demonstrate that even with all speciesoperating on the same timescale, relatively short cascadeswith strong enough regulation provide a information-optimal mechanism of converting a unimodal signal toa bimodal signal.

Page 9: arXiv:0811.4149v1 [q-bio.MN] 25 Nov 2008 · Andrew Muglery Department of Physics, Columbia University, New York, NY 10027 Chris H. Wigginsz Department of Applied Physics and Applied

9

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35m

argi

nal

A p(n1) (input)p(n2)p(n3)p(n4) (output)

Markov: spectralMarkov: exactfull: spectralfull: Gillespie

0 2 4 6 8 10 12 14 16 18 200

5

10

15

q n

copy count

0

0.05

0.1

0.15

mar

gina

l

B p(n1) (input)p(n2)p(n3)p(n4) (output)

Markov: spectralMarkov: exactfull: spectralfull: Gillespie

0 5 10 15 20 25 300

5

10

15

q n

copy count

FIG. 6: Marginal distributions from a Markovian and a non-Markovian stochastic description of a cascade. A: Marginaldistributions for the species in a cascade of length L = 4 (col-ors indicate order of species in the cascade; see legend), withthreshold regulation function qn shown in the bottom panel(q− = 0.5, q+ = 5, and n0 = 7). Distributions were obtainedusing the Markov approximation (cf. main text), either viathe spectral method (dots) or via an iterative numerical so-lution (plus signs), and using the full master equation (Eqn.11), either via the non-Markovian spectral method (circles)or via stochastic simulation [16] (squares). B: As A, but withq+ = 9.

It is a pleasure to acknowledge Jake Hofman andMelanie Lee for their help in this project. C.W.was supported by NIH 5PN2EY016586-03 and NIH1U54CA121852-01A1; C.W. and A.M. were supported byNSF ECS-0332479; A.M. was supported by NSF DGE-0742450.

[email protected][email protected][email protected]

[1] Hooshangi, S, Thiberge, S, Weiss, R (2005) Ultrasensitiv-ity and noise propagation in a synthetic transcriptional

cascade. Proc Natl Acad Sci USA 102:3581–6.[2] Thattai, M, van Oudenaarden, A (2002) Attenuation

of noise in ultrasensitive signaling cascades. Biophys J82:2943–50.

[3] Paulsson, J, Berg, OG, Ehrenberg, M (2000) Stochasticfocusing: fluctuation-enhanced sensitivity of intracellularregulation. Proc Natl Acad Sci USA 97:7148–53.

[4] Elowitz, MB, Levine, AJ, Siggia, ED, Swain, PS (2002)Stochastic gene expression in a single cell. Science297:1183–6.

[5] Ozbudak, EM, Thattai, M, Kurtser, I, Grossman, AD,van Oudenaarden, A (2002) Regulation of noise in theexpression of a single gene. Nat Genet 31:69–73.

[6] Swain, PS, Elowitz, MB, Siggia, ED (2002) Intrinsic andextrinsic contributions to stochasticity in gene expres-sion. Proc Natl Acad Sci USA 99:12795–800.

[7] Acar, M, Becskei, A, van Oudenaarden, A (2005) En-hancement of cellular memory by reducing stochastictransitions. Nature 435:228–32.

[8] Pedraza, JM, van Oudenaarden, A (2005) Noise propa-gation in gene networks. Science 307:1965–9.

[9] Thattai, M, van Oudenaarden, A (2001) Intrinsic noisein gene regulatory networks. Proc Natl Acad Sci USA98:8614–9.

[10] van Zon, JS, Morelli, MJ, Tanase-Nicola, S, ten Wolde,PR (2006) Diffusion of transcription factors can dras-tically enhance the noise in gene expression. Biophys J91:4350–67.

[11] van Zon, JS, ten Wolde, PR (2005) Green’s-function reac-tion dynamics: a particle-based approach for simulatingbiochemical networks in time and space. J Chem Phys123:234910.

[12] Allen, RJ, Warren, PB, ten Wolde, PR (2005) Samplingrare switching events in biochemical networks. Phys RevLett 94:18104.

[13] Macnamara, S, Burrage, K, Sidje, RB (2007) Multiscalemodeling of chemical kinetics via the master equation.Multiscale Model & Simul 6:1146–1168.

[14] Ushikubo, T, Inoue, W, Yoda, M, Sasai, M (2006) Test-ing the transition state theory in stochastic dynamics ofa genetic switch. Chem Phys Lett 430:139–43.

[15] Bortz, AB, Kalos, MH, Lebowitz, JL (1975) A new algo-rithm for Monte Carlo simulation of Ising spin systems.J Comput Phys 17:10–8.

[16] Gillespie, DT (1977) Exact stochastic simulation of cou-pled chemical reactions. J Phys Chem 81:2340–2361.

[17] Paulsson, J (2004) Summing up the noise in gene net-works. Nature 427:415–8.

[18] Tanase-Nicola, S, Warren, PB, ten Wolde, PR (2006)Signal detection, modularity, and the correlation be-tween extrinsic and intrinsic noise in biochemical net-works. Phys Rev Lett 97:68102.

[19] Walczak, AM, Onuchic, JN, Wolynes, PG (2005) Abso-lute rate theories of epigenetic stability. Proc Natl AcadSci USA 102:18926–31.

[20] Lan, Y, Papoian, GA (2006) The interplay between dis-crete noise and nonlinear chemical kinetics in a signalamplification cascade. J Chem Phys 125:154901.

[21] Lan, Y, Wolynes, PG, Papoian, GA (2006) A variationalapproach to the stochastic aspects of cellular signal trans-duction. J Chem Phys 125:124106.

[22] Lan, Y, Papoian, GA (2007) Stochastic resonant signal-ing in enzyme cascades. Phys Rev Lett 98:228301.

[23] Gomperts, BD, Kramer, IM, Tantham, PER (2002) Sig-

Page 10: arXiv:0811.4149v1 [q-bio.MN] 25 Nov 2008 · Andrew Muglery Department of Physics, Columbia University, New York, NY 10027 Chris H. Wigginsz Department of Applied Physics and Applied

10

nal transduction (San Diego, CA: Academic Press).[24] Ting, AY, Endy, D (2002) Decoding NF-kappaB signal-

ing. Science 298:1189–90.[25] Detwiler, PB, Ramanathan, S, Sengupta, A, Shraiman,

BI (2000) Engineering aspects of enzymatic signal trans-duction: photoreceptors in the retina. Biophys J 79:2801–17.

[26] Bolouri, H, Davidson, EH (2003) Transcriptional regu-latory cascades in development: initial rates, not steadystate, determine network kinetics. Proc Natl Acad SciUSA 100:9371–6.

[27] Bassler, BL (1999) How bacteria talk to each other: reg-ulation of gene expression by quorum sensing. Curr OpinMicrobiol 2:582–7.

[28] van Kampen, NG (1992) Stochastic processes in physicsand chemistry (Amsterdam: North-Holland).

[29] Mattis, DC, Glasser, ML (1998) The uses of quantumfield theory in diffusion-limited reactions. Rev Mod Phys70:979–1001.

[30] Zel’Dovich, YB, Ovchinnikov, AA (1978) The mass ac-tion law and the kinetics of chemical reactions with al-lowance for thermodynamic fluctuations of the density.Sov JETP 47:829.

[31] Doi, M (1976) Second quantization representation forclassical many-particle system. J Phys A: Math Gen9:1465–77.

[32] Sasai, M, Wolynes, PG (2003) Stochastic gene expres-sion as a many-body problem. Proc Natl Acad Sci USA100:2374–9.

[33] Shannon, CE (1949) Communication in the presence ofnoise. Proc IRE 37:10–21.

[34] Tkacik, G, Callan, CG, Bialek, W (2008) Informationcapacity of genetic regulatory elements. Phys. Rev. E78:11910.

[35] Tkacik, G, Callan, CG, Bialek, W (2008) Informationflow and optimization in transcriptional regulation. ProcNatl Acad Sci USA 105:12265–70.

[36] Cover, TM, Thomas, JA (1991) Elements of InformationTheory (New York, NY: John Wiley and Sons).

[37] Gregor, T, Tank, DW, Wieschaus, EF, Bialek, W(2007) Probing the limits to positional information. Cell130:153–64.

[38] Ziv, E, Nemenman, I, Wiggins, CH (2007) Optimal sig-nal processing in small stochastic biochemical networks.PLoS ONE 2:e1077.

[39] Lin, J (1991) Divergence measures based on the Shan-non entropy. Information Theory, IEEE Transactions on37:145–151.

[40] A master equation in which the coordinate appears ex-plicitly (e.g. through qs or qn in Eqn. 1) is sometimesmistakenly termed “nonlinear” in the literature, perhapsdiscouraging calculations which exploit its inherent lin-ear algebraic structure. We remind the reader that thisequation is perfectly linear.

[41] The recursion can equivalently be performed in the re-

verse direction, with gN = 0, gN−1 = NpN/pN−1, andgn−1 = (gnpn + npn − (n + 1)pn+1)/pn−1, where N is acutoff in n. In several test cases we found that reverse re-cursion is more numerically stable than forward recursionat large N .

[42] Setting x = eik1 and y = eik2 makes clear that the gen-erating function is simply the Fourier transform.

[43] G(x, y) can be recovered by projecting onto positionspace 〈x, y|, with 〈x|n〉 = xn and 〈y|m〉 = ym.

[44] The adjoint operations are 〈i|b+i = 〈i− 1| − 〈i| for

i ∈ {n,m}, 〈n|b−n (n) = (n + 1)〈n+ 1| − gn〈n|, and

〈n,m|b−m(n) = (m+ 1)〈n,m+ 1| − qn〈n,m|.[45] In position space the eigenfunctions are 〈x|j〉 = (x −

1)jeg(x−1) and 〈y|k〉 = (y− 1)keq(y−1). The operators b+

and b− raise and lower in eigenspace: b+n |j〉 = |j + 1〉and b−n |j〉 = j|j − 1〉 (or 〈j|b+n = 〈j − 1| and 〈j|b−n =(j + 1)〈j + 1|), and similarly for n→ m and j → k.

[46] The selection rules are derived by starting with 〈n|b+n |j〉or 〈j|b+n |n〉 and allowing b+n to act both to the left andto the right. Alternatively, one may use b−n , obtaining〈n+ 1|j〉 = (j〈n|j − 1〉+ g〈n|j〉)/(n+ 1) and 〈j + 1|n〉 =(n〈j|n− 1〉− g〈j|n〉)/(j+1), initialized with 〈n = 0|j〉 =(−1)je−g and 〈j = 0|n〉 = 1. We find the latter relationsyield smoother distributions pnm for large cutoffs N andJ .

[47] The choices of g and q can affect the numerical stabilityof the method, as will be discussed in future work.

[48] More precisely, we have turned an N2×N2 matrix solve(whereN is a cutoff in copy count) intoK length-J vectorsolves (where J and K are cutoffs in eigenmodes j and krespectively).

[49] All numerical procedures in the paper are implementedin MATLAB.

[50] Although the spectral method only involves the calcula-tion of joint distributions between adjacent species in thecascade, the input-output distribution can be obtainedusing p(n1, n`) =

Pn`−1

p(n1, n`−1)p(n`−1, n`)/p(n`−1),

initialized with ` = 3 and run up to ` = L. This assumesp(n1|n`−1, n`) = p(n1|n`−1), which at worst (at ` = 3) isequivalent to the Markovian approximation, Eqn. 2.

[51] There is a slight decrease in I∗ with 〈n〉 beginning near〈n〉 ≈ 8 that is more pronounced at higher L. This islikely due to the decrease in accuracy of the Markovianapproximation with increasing ∆ (cf. Supplementary Ma-terial), since large 〈n〉 requires large ∆. Calculations withthe full joint distribution (via stochastic simulation) atL = 3 and 4 give qualitatively similar results, but withI∗ increasing monotonically with 〈n〉.

[52] The decrease of I∗ with L is consistent with, but not adirect consequence of, the data processing inequality [36],as each p∗(n1, . . . , nL) results from a separate optimiza-tion for each subsequent choice of L.