arXiv:2105.08681v3 [physics.bio-ph] 23 Jul 2021



Estimating entropy production from waiting time distributions

Dominic J. Skinner and Jörn Dunkel

Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307, USA (Dated: July 26, 2021)

Living systems operate far from thermal equilibrium by converting the chemical potential of ATP into mechanical work to achieve growth, replication or locomotion. Given time series observations of intra-, inter- or multicellular processes, a key challenge is to detect non-equilibrium behavior and quantify the rate of free energy consumption. Obtaining reliable bounds on energy consumption and entropy production directly from experimental data remains difficult in practice as many degrees of freedom typically are hidden to the observer, so that the accessible coarse-grained dynamics may not obviously violate detailed balance. Here, we introduce a novel method for bounding the entropy production of physical and living systems which uses only the waiting time statistics of hidden Markov processes and hence can be directly applied to experimental data. By determining a universal limiting curve, we infer entropy production bounds from experimental data for gene regulatory networks, mammalian behavioral dynamics and numerous other biological processes. Further considering the asymptotic limit of increasingly precise biological timers, we estimate the necessary entropic cost of heartbeat regulation in humans, dogs and mice.

Living systems break time reversal symmetry [1, 2]. Growth [3], self-replication [2] and locomotion [4] are all irreversible processes, violating the principle of detailed balance, which states that every forward microstate trajectory is as likely to occur as its time reversed counterpart [1, 5–7]. Irreversibility places thermodynamic constraints on sensing [8, 9], reproduction [2], and signaling [10], and imposes trade-offs between free energy consumption and the precision, speed, or accuracy of performing some function [8, 11]. To quantify these constraints, one has to bound the rate at which free energy is consumed, or equivalently the entropy production rate (EPR) [5, 12, 13]. However, most experimental measurements can only observe a small number of degrees of freedom [13–15], making such inference challenging. In particular, many systems are only accessible at the coarsest possible level of two states, such as a gene switching [16], or a molecule docking and undocking to a sensor [9, 17]. In these cases, commonly used inference tools cannot be applied, due to the absence of observable coarse-grained currents [18–21], and as the forward and reverse trajectories need not be asymmetric [22–25], even if the underlying system operates far from equilibrium.

Here, we introduce a broadly applicable method that makes it possible to estimate the EPR of a coarse-grained two-state dynamics by measuring the variance of time spent in a state. This is achieved by finding a canonical formulation and then solving a numerical optimization problem to obtain a universal limiting curve, above which the EPRs of all systems with the observed waiting time statistics must lie. We illustrate that this method outperforms a recently introduced thermodynamic uncertainty relation (TUR) [9, 18, 20, 26] on synthetic test data, and demonstrate its practical usefulness in applications to experimental data for a gene regulatory network [16], the behavioral dynamics of cows [27], and several other biological processes (Table S1 [28]). Furthermore, by considering a stochastic timer [29–32], we derive an asymptotic formula relating the waiting time variance and the EPR. This analytical result can be used to bound the EPR required to maintain precise biophysical timing processes, which we illustrate on data from experimental measurements [33, 34] of heartbeats of humans, dogs and mice.

We start from the standard assumption [5] that mesoscale systems in contact with a heat bath can be described by a Markovian stochastic dynamics on a finite set of discrete states {1, ..., N_T}. Transitions from state i to j occur at rate W_ij, so a probability distribution over the states, p_i(t), evolves according to

dp_i/dt = ∑_j p_j W_ji,    (1)

with W_ii = −∑_{j≠i} W_ij [5, 35]. Assuming that the system is irreducible, meaning there exists a path with non-zero probability between any two states, Eq. (1) has a unique stationary state π = (π_i), satisfying 0 = πW, to which every initial probability distribution tends [36]. The EPR at steady state π is [20],
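Numerically, the stationary state of Eq. (1) is the left null vector of the rate matrix. A minimal sketch, using a hypothetical 3-state rate matrix (the rates are illustrative, not from the paper):

```python
import numpy as np

# Hypothetical 3-state rate matrix: off-diagonal W[i, j] is the rate i -> j;
# diagonal entries W_ii = -sum_{j != i} W_ij make each row sum to zero.
W = np.array([[0.0, 2.0, 1.0],
              [0.5, 0.0, 3.0],
              [1.5, 0.5, 0.0]])
np.fill_diagonal(W, -W.sum(axis=1))

# Stationary state: left null vector of W (0 = pi W), normalized to a distribution.
evals, evecs = np.linalg.eig(W.T)
pi = np.real(evecs[:, np.argmin(np.abs(evals))])
pi /= pi.sum()

print(np.allclose(pi @ W, 0.0))  # True: pi is stationary
```

For an irreducible generator the zero eigenvalue is simple, so picking the eigenvector with the smallest-magnitude eigenvalue recovers π uniquely.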

σ = k_B ∑_{i<j} (π_i W_ij − π_j W_ji) log(π_i W_ij / π_j W_ji),    (2)

which for isothermal systems, when multiplied by the bath temperature, is the rate of free energy dissipation required to maintain the state away from equilibrium [37]. The system is in thermal equilibrium when the EPR vanishes, which is exactly when detailed balance is satisfied, π_i W_ij = π_j W_ji, for all i, j.
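Eq. (2) translates directly into a sum over ordered pairs. A minimal sketch with hypothetical rates, confirming that detailed balance gives σ = 0:

```python
import numpy as np

def entropy_production_rate(W, pi, kB=1.0):
    """EPR of Eq. (2): kB * sum_{i<j} (pi_i W_ij - pi_j W_ji) log(pi_i W_ij / pi_j W_ji)."""
    sigma = 0.0
    for i in range(len(pi)):
        for j in range(i + 1, len(pi)):
            if W[i, j] > 0 and W[j, i] > 0:
                Jf, Jr = pi[i] * W[i, j], pi[j] * W[j, i]
                sigma += (Jf - Jr) * np.log(Jf / Jr)
    return kB * sigma

# Detailed balance (symmetric rates with uniform pi) gives sigma = 0, i.e. equilibrium.
W_eq = np.array([[-3.0,  2.0,  1.0],
                 [ 2.0, -2.5,  0.5],
                 [ 1.0,  0.5, -1.5]])
print(entropy_production_rate(W_eq, np.ones(3) / 3))  # 0.0
```

Conversely, a uniformly driven ring (clockwise rate w+ exceeding counter-clockwise rate w−) yields the positive value (w+ − w−) log(w+/w−) used later in the sensor example.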

Only a coarse-grained view of the system is typically available in experiments [24, 39], with the observed dynamics taking place on a set of metastates, where a metastate I may contain several underlying states i (Fig. 1). Here, we focus on the common case where only two complementary metastates, A and B, are accessible




FIG. 1. A Markovian process on discrete states (black circles), observed at a coarse-grained level, with only the metastates A and B visible to the observer (left). From observing this system we can deduce the distribution of time spent in metastates A and B (right). A non-monotonic waiting time distribution signals [38] out-of-equilibrium dynamics.

and the observed trajectory jumps between them (Fig. 1). After observing many such jumps, the distributions f_A(t) and f_B(t) of time spent in A and B can be empirically reconstructed (Fig. 1). As the coarse-grained observations are non-Markov [40], there may be additional information in conditional statistics [15, 39], but these are hard to measure experimentally and we do not consider them here. At first glance, the distributions f_A(t) and f_B(t) alone may not appear to contain much information about the underlying system; however, for an equilibrium system both f_A(t) and f_B(t) must decay monotonically [38], implying that the schematic example in Fig. 1 reflects an out-of-equilibrium dynamics.
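The empirical reconstruction of f_A from a coarse-grained trajectory can be sketched with a Gillespie simulation. The 4-state driven ring below and its rates are hypothetical, chosen only to produce a non-equilibrium dwell-time distribution:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 4-state driven ring; states {0, 1, 2} are lumped into metastate A,
# state 3 into B. Clockwise rate wp, counter-clockwise rate wm.
wp, wm, Nst = 3.0, 0.3, 4
A_states = {0, 1, 2}

def dwell_times_in_A(n_jumps=100000):
    """Gillespie trajectory; record the duration of each visit to metastate A."""
    times, state, t, t_enter, in_A = [], 0, 0.0, 0.0, True
    for _ in range(n_jumps):
        t += rng.exponential(1.0 / (wp + wm))  # waiting time to the next jump
        state = (state + 1) % Nst if rng.random() < wp / (wp + wm) else (state - 1) % Nst
        now_in_A = state in A_states
        if in_A and not now_in_A:
            times.append(t - t_enter)          # leaving A: record the dwell time
        elif not in_A and now_in_A:
            t_enter = t                        # entering A: restart the clock
        in_A = now_in_A
    return np.array(times)

tA = dwell_times_in_A()
print(tA.mean(), tA.var() / tA.mean() ** 2)    # <t>_A and Var t_A / <t>_A^2
```

Because the ring is strongly driven, the dwell time in A is close to a sum of several exponential steps, and the normalized variance falls below the value 1 of a single exponential, a signature an equilibrium two-state observation could not produce.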

To quantify the extent to which an observed two-state system is out-of-equilibrium, we reformulate the problem of EPR estimation within an optimization framework, extending the approach introduced in Ref. [39]. Calling the underlying system S, the EPR is σ(S), and the observed waiting time distributions are f_A(t, S) and f_B(t, S). If R describes some other underlying system with the same observables, we cannot know which is the true system. However, we can find a lower bound on the EPR by minimizing over all such R,

σ(S) ≥ min_R {σ(R) | f_A(t, S) = f_A(t, R), f_B(t, S) = f_B(t, R)},    (3)

which holds, since S is contained in the set on the right. By construction, this is the tightest possible lower bound given the observed distributions f_A and f_B. Limited experimental data may prevent us from precisely measuring the full distribution, but we can typically measure the first few moments 〈t^n〉_A and 〈t^n〉_B. We therefore introduce the estimators σ_T^(n),

σ(S) ≥ min_R {σ(R) | 〈t^k〉_{I,R} = 〈t^k〉_{I,S}, for I = A, B, k = 1, ..., n} ≡ σ_T^(n),    (4)

where σ_T^(n) ≤ σ_T^(n+1), since increasing n decreases the size of the set over which minimization is performed.

Instead of specifying the transition rates of a system R, we can alternatively specify the net transition rates, n_ij = π_i W_ij, for i ≠ j, provided they satisfy mass conservation at each vertex, ∑_j n_ij = ∑_j n_ji, and independently specify the stationary state, π_i. With n_ij fixed, modifying π_i does not affect the EPR. In particular, suppose that we have found the optimal π_i, n_ij, subject to the constraints; then the fraction of time spent in A is r_A = ∑_{i∈A} π_i, and similarly r_B = ∑_{i∈B} π_i. Rescaling π_i ↦ π_i/2r_A for i ∈ A and π_i ↦ π_i/2r_B for i ∈ B does not affect the EPR, but makes the fraction of time spent in each metastate exactly 1/2. The statistics are rescaled as 〈t^k〉_I ↦ (1/2r_I)^k 〈t^k〉_I for I = A, B. The average time spent in A or B in the rescaled system is τ = (〈t〉_A + 〈t〉_B)/2. A second rescaling, n_ij ↦ τ n_ij with π fixed, changes the EPR as σ ↦ τσ, and the moments as 〈t^k〉_I ↦ 〈t^k〉_I/τ^k. Therefore, we can rewrite Eq. (4) as

σ(S) ≥ (1/τ) min_R {σ | 〈t^k〉_{I,R} = 〈t^k〉_{I,S}/〈t〉_{I,S}^k, for I = A, B, k = 2, ..., n},    (5)

allowing us to optimize over a single canonical system and rescale to bound any other non-canonical system.
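The rescaling can be made concrete with hypothetical measured moments: only the dimensionless combinations 〈t²〉_I/〈t〉_I² and the time scale τ survive in Eq. (5).

```python
# Hypothetical measured dwell-time moments (arbitrary time units).
t1_A, t2_A = 0.8, 1.1    # <t>_A, <t^2>_A
t1_B, t2_B = 1.4, 4.5    # <t>_B, <t^2>_B

# After the canonical rescaling, only dimensionless moments enter Eq. (5):
m2_A = t2_A / t1_A**2    # <t^2>_{A,R} constraint for the canonical system
m2_B = t2_B / t1_B**2
tau = (t1_A + t1_B) / 2  # prefactor 1/tau converts the canonical bound back

print(m2_A, m2_B, tau)
```

Any two systems whose moments differ only by the scale factors r_I and τ produce the same canonical constraints, which is why a single tabulated curve suffices.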

After the canonical rescaling, to express constraints in terms of the transition rates, we label the states so that the first N belong to the metastate A, with 1 ≤ N < N_T, and define (W_A)_ij = W_ij for i, j ≤ N, so W_A represents the transitions within A. We also write π = (π_A, π_B), with π_A the first N components of the stationary distribution. For any such Markovian process in the stationary distribution, we have that f_A(t) = π_A W_A² exp(W_A t) 1^T/(−π_A W_A 1^T) [28, 36], and similarly for f_B(t). For the canonical system, 〈t〉_{A,R} = 1, so −π_A W_A 1^T = 1/2, and higher moment constraints become

〈t^k〉_{A,R} = 2(−1)^{k+1} k! π_A W_A^{1−k} 1^T,    (6)

and similarly for B. While this provides a general minimization framework, from now on we will only impose the constraint on 〈t²〉_{A,R}, which allows us to minimize over systems with B consisting of a single state [28], and results in a single curve. Specifically, we minimize over π_i > 0, n_ij ≥ 0, subject to the linear constraints n_ii = −∑_{j≠i} n_ij, ∑_{j≠i} n_ij = ∑_{j≠i} n_ji, together with 〈t〉_{A,R} = 〈t〉_{B,R} = 1 and 〈t²〉_{A,R} fixed, finding a curve k_B Λ(〈t²〉_{A,R}) in the σ–〈t²〉_{A,R} plane, or equivalently the σ–Var t_{A,R} plane, above which all Markovian systems must lie [Fig. 2(a)]. For arbitrary systems, this bound becomes

σ_T = [2k_B/(〈t〉_A + 〈t〉_B)] Λ(Var t_A/〈t〉_A²),    (7)

which we use throughout to bound the EPR. The function Λ(x) can be computed numerically [28]; it converges



FIG. 2. Bounding the EPR from a single limiting curve. (a) All realizable systems lie above the curve Λ (black), which represents the optimal variance-entropy production trade-off. Minimizing over increasing numbers of internal states leads to rapid convergence, with results shown for up to 8 internal states (green). Given measured values of the waiting time variance, a lower bound on entropy production can be read off, as demonstrated here for experimental data in (b, dashed) and (c, dotted). (b) Genes stochastically switch from being active to inactive; mRNA is produced when the gene is active, proteins are produced when mRNA is present, and both stochastically degrade (left). The distribution of time the gene spends in the inactive state as measured in recent experiments, for the genes Glutaminase and Bmal1 (right, see also Fig. 2C in Ref. [16]). (c) Distribution of time cows spent lying before standing across three experiments from Ref. [27].

rapidly as the number N of internal states of A is increased [Fig. 2(a)]. Given the tabulated values of Λ, we can apply Eq. (7) directly to experimental data. Before doing so, it is instructive to demonstrate the usefulness of σ_T in simulations of an active sensor.
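As a numerical aside, the phase-type density f_A and the moment formula of Eq. (6) can be cross-checked against each other. A sketch with a hypothetical 4-state generator whose first three states form A (an eigendecomposition stands in for the matrix exponential):

```python
import math
import numpy as np

# Hypothetical 4-state generator; the first N = 3 states form metastate A.
W = np.array([[-3.0,  2.0,  0.5,  0.5],
              [ 0.3, -2.3,  1.5,  0.5],
              [ 0.2,  0.4, -1.6,  1.0],
              [ 1.0,  0.5,  0.5, -2.0]])
N = 3

# Stationary state pi (left null vector of W).
evals, evecs = np.linalg.eig(W.T)
pi = np.real(evecs[:, np.argmin(np.abs(evals))])
pi /= pi.sum()

piA, WA, one = pi[:N], W[:N, :N], np.ones(N)
norm = -piA @ WA @ one   # normalization of f_A; equals 1/2 in the canonical system

def moment(k):
    """<t^k>_A = k! (-1)^(k+1) piA WA^(1-k) 1 / (-piA WA 1); cf. Eq. (6) with norm = 1/2."""
    M = np.linalg.matrix_power(np.linalg.inv(WA), k - 1)    # WA^(1-k)
    return math.factorial(k) * (-1) ** (k + 1) * (piA @ M @ one) / norm

# Cross-check by quadrature of f_A(t) = piA WA^2 exp(WA t) 1 / norm.
lam, V = np.linalg.eig(WA)
coef = (piA @ WA @ WA @ V) * (np.linalg.inv(V) @ one)
ts = np.linspace(0.0, 60.0, 20001)
fA = np.real(np.exp(np.outer(ts, lam)) @ coef) / norm

def integrate(y):
    return np.sum((y[1:] + y[:-1]) / 2) * (ts[1] - ts[0])

print(integrate(fA), integrate(ts * fA), moment(1))   # ~1, then matching <t>_A values
```

The quadrature and closed-form values agree because Eq. (6) is just the analytic integral of the same density; the check guards against indexing or sign mistakes when partitioning W.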

To perform vital functions like chemotaxis, cells sense chemical concentration levels within their environment [8] through the stochastic binding and unbinding of molecules at surface receptors [9] [Fig. 3(a)]. Recent works [8, 9, 41] showed that active sensors can overcome the equilibrium Berg-Purcell sensing limit by expending energy, with a trade-off between increased sensing accuracy and energy expended. We apply Eq. (7) to a simple model of an active sensor [9], with the receptor system modeled as a ring with 5 states, one of which corresponds to an unbound receptor, and with 4 internal states corresponding to a bound receptor [Fig. 3(a)]. The rate of clockwise transitions is taken to be W+, with counter-clockwise transitions occurring at rate W− < W+, resulting in an EPR of σ = (W+ − W−) log(W+/W−). The σ_T bound is reasonably close to the exact value σ in intermediate regimes of entropy production [Fig. 3(c)]. For small values of σ, the same variance could be generated by an equilibrium process, whereas for large values the variance could be generated at much lower cost by a system with more internal states [28]. The original thermodynamic uncertainty relation (TUR) [18], whilst a powerful tool for inference [19, 20], cannot gain nontrivial bounds for this system, as there are no observed currents. A more recently introduced TUR [9, 28], relating the mean and variance of the fraction of time spent in a metastate to the EPR, is the only existing estimator which we are aware of that can yield a nontrivial bound. Figure 3(c) shows that the TUR bound is substantially lower, and for a finite sample size has a much larger variance, thus requiring a large amount of data for a reliable prediction compared to σ_T.
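For this ring model, both the exact EPR and the bound-state waiting-time statistics that enter Eq. (7) follow in a few lines. The rate values below are illustrative, not those used in the paper's figures:

```python
import numpy as np

# 5-state ring: state 0 = unbound, states 1-4 = bound metastate. Illustrative rates.
Wp, Wm, Nring = 2.0, 0.5, 5
W = np.zeros((Nring, Nring))
for i in range(Nring):
    W[i, (i + 1) % Nring] = Wp   # clockwise
    W[i, (i - 1) % Nring] = Wm   # counter-clockwise
np.fill_diagonal(W, -W.sum(axis=1))

sigma_exact = (Wp - Wm) * np.log(Wp / Wm)   # exact EPR of the uniform ring, kB = 1

# Bound-time moments from the phase-type density of the bound metastate:
pi = np.full(Nring, 1.0 / Nring)            # uniform stationary state on the ring
piA, WA, one = pi[1:], W[1:, 1:], np.ones(Nring - 1)
norm = -piA @ WA @ one
t1 = (piA @ one) / norm                               # <t>_bound
t2 = -2.0 * (piA @ np.linalg.inv(WA) @ one) / norm    # <t^2>_bound
print(sigma_exact, t2 / t1**2 - 1.0)        # exact EPR and Var t_A / <t>_A^2
```

The normalized variance printed here is what one would look up on the Λ curve to obtain σ_T and compare against sigma_exact.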

As the first of many experimental applications of σ_T, we consider stochastic gene regulatory networks [42, 43]. Cells regulate the intra-cellular concentration of proteins, changing the concentration in response to external stimuli, or maintaining a constant level by balancing production and degradation [43]. Proteins are created by enzymes translating mRNA, which in turn is transcribed by enzymes from DNA instructions or genes [44]. Genes switch between active and inactive metastates, the rate of switching regulated by protein concentration, amongst other things, and whilst active mRNA is transcribed [42] [Fig. 2(b)]. By examining the distribution of times which a gene spends inactive, we can deduce that the regulatory system is out-of-equilibrium. Recent experiments [16] on mammalian gene transcription measured the times for which certain genes were active or inactive. From the histogram of inactivity periods [Fig. 2(b)] for the Glutaminase gene, one finds 〈t²〉_off/〈t〉_off² = 1.6 and τ = 0.9 h, so that σ ≥ 2.2 k_B/h, whereas for the Bmal1 promoter, 〈t²〉_off/〈t〉_off² = 1.5 and τ = 1.1 h, resulting in σ ≥ 2.4 k_B/h.
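The moment arithmetic behind these inputs is straightforward (the final bounds additionally require the tabulated Λ curve [28]). A sketch using the values quoted above:

```python
# Dimensionless moments and time scales quoted in the text for Ref. [16].
genes = {"Glutaminase": {"m2": 1.6, "tau": 0.9},   # <t^2>_off/<t>_off^2, tau in hours
         "Bmal1":       {"m2": 1.5, "tau": 1.1}}

for name, g in genes.items():
    var_norm = g["m2"] - 1.0   # Var t/<t>^2, the argument of Lambda in Eq. (7)
    print(name, var_norm, g["tau"])
```

Both normalized variances lie below 1, which no monotonically decaying (equilibrium) waiting-time distribution could produce.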

From complex neural networks to simple feedback loops, living systems make decisions on which behavior to execute, and such decision making requires the expenditure of free energy [45]. Recent experiments identified



FIG. 3. Bounding the EPR of a model active sensor. (a) Cells sense through receptors on their surface, which detect the concentration of some molecular species (black squares) by switching between unbound (top left) and bound (bottom left) configurations. We consider a simple model of a sensor, with 4 internal states for the bound receptor configuration [9]. Clockwise transitions occur at a rate W+, counter-clockwise with rate W−. (b) Distribution of time spent bound for σ = 0, 1, 2, 3, 4 with k_B = 1 = W+, W− ≤ 1. (c) Estimates of the EPR as a function of the exact EPR, σ, as calculated from 200 trajectories of length T = 2000 for each value of σ. The σ_T and TUR estimators are shown with 95% range of predictions.

behavioral states of different organisms [46, 47]. From dynamics on these metastates, it is possible to bound the EPR [47]. By attaching sensors to cows, the authors of Ref. [27] recorded whether the cows were standing or lying, as well as the waiting time distribution of each [Fig. 2(c)]. From the time the cows spend lying (metastate A), we can calculate a non-zero bound on the EPR. Specifically, for the three experiments involving pregnant indoor-housed beef cows, out-wintered beef cows, and indoor-housed dairy cows [27], we find σ_T = 4.1 k_B/h, 3.2 k_B/h, 2.4 k_B/h, respectively. Assuming k_B T ≈ 9.9 × 10⁻²² Cal, we can therefore deduce that the cows consume at least 2.4 × 10⁻²¹ Cal/h in deciding whether to lie or stand. While this is a significant underestimate, the σ_T-estimator does not assume any specific model of the decision making process, and so does not require the cows to 'possess spherical symmetry' [48].
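The unit conversion here is a one-line check using the constants quoted above (T σ is the free energy dissipation rate):

```python
kBT_cal = 9.9e-22        # Cal, the value of kB*T assumed in the text
sigma_min = 2.4          # smallest of the three sigma_T bounds, in kB per hour

free_energy_rate = sigma_min * kBT_cal   # T*sigma, in Cal/h
print(free_energy_rate)                  # ~2.4e-21 Cal/h, matching the text
```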

The systems considered so far could be interpreted as timers; by paying an energetic cost, they control the time spent in a subset of states more precisely than an equilibrium system would allow. Previous studies of biological clocks assume measurements of time are performed by counting the number of full cycles around a circular network topology [29], although such analysis can be extended to more general biological oscillators [30, 32]. However, the resulting TUR bound cannot be straightforwardly converted into bounds on Var t_A, and moreover the networks which saturate the TUR bound do no better than equilibrium systems at minimizing Var t_A [28]. To elucidate the relationship between σ and Var t_A, one can neglect the dynamics of metastate B by letting 〈t〉_B → 0, so that σ ≥ (2k_B/〈t〉_A) Λ(Var t_A/〈t〉_A²). Our particular goal is to estimate a bound on the EPR of precise timers exhibiting small variance Var t_A/〈t〉_A² ≪ 1, such as heartbeats.

Unfortunately, it is not feasible to explore the limit Var t_A → 0 numerically, as for a fixed number, N, of states in A, the largest precision possible is Var t_A/〈t〉_A² ≥ 1/N, and numerical minimization becomes prohibitively expensive for N ≳ 20 [28]. To gain analytical insight, we consider the continuum limit of an infinite number of internal states, N → ∞, such that the dynamics in A can be represented by a Langevin equation [28, 50]. A continuous system can be approximated arbitrarily well by a discrete system [28, 39, 51], but a general discrete system need not be well approximated by a continuous Langevin equation [28, 52–54], so by searching over the space of Langevin dynamics we will find an upper bound on the true optimal precision vs. entropy trade-off curve [28]. In the limit of infinite precision, Var t_A/〈t〉_A² → 0, we find analytically [28] the asymptotic relation Var t_A/〈t〉_A² = 1/σ̄ + 4 ln σ̄/σ̄² + o(ln σ̄/σ̄²), where σ̄ = σ〈t〉_A/2k_B; achieving this in practice would require many internal states, a common trade-off in stochastic networks [30, 55, 56]. We apply this asymptotic prediction to heartbeat data [57] for humans [34] and other species [33]. Although the precision of the observed beating patterns renders numerical σ_T-estimates unfeasible, the asymptotic formula implies entropic costs of at least 280 k_B/s, 480 k_B/s, 270 k_B/s, and 2360 k_B/s to maintain the heartbeats of young humans, older humans, dogs, and mice, respectively (Fig. 4).
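In practice the asymptotic relation is inverted: given a measured normalized variance, one solves Var t_A/〈t〉_A² ≈ 1/s + 4 ln s/s² for the rescaled EPR s = σ〈t〉_A/2k_B. A sketch by bisection; the target variance and mean beat interval below are illustrative, not one of the heartbeat datasets:

```python
import numpy as np

def var_norm(s):
    """Asymptotic Var t_A/<t>_A^2 as a function of s = sigma <t>_A / (2 kB)."""
    return 1.0 / s + 4.0 * np.log(s) / s**2

def s_from_var(target, lo=2.0, hi=1e9):
    """Invert var_norm by bisection; it is monotone decreasing on [2, inf)."""
    for _ in range(200):
        mid = np.sqrt(lo * hi)          # bisect geometrically across decades
        if var_norm(mid) > target:
            lo = mid
        else:
            hi = mid
    return np.sqrt(lo * hi)

s = s_from_var(0.005)                   # hypothetical normalized variance of 0.5%
print(s, 2.0 * s / 0.8)                 # s, and sigma in kB/s for a 0.8 s mean beat
```

The leading 1/s term alone would give s = 200 here; the logarithmic correction shifts the root by roughly ten percent, which is why the full relation matters for quantitative bounds.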

In addition to the systems discussed above, non-equilibrium waiting time distributions have been measured for switching processes in cell fate decision making [58], dwell times of kinesin motors [59, 60], cluster lifetimes of RNA polymerase [61], the directional switching of the bacterial motor [38, 62], swim-turn dynamics of algae and bacteria [63, 64], direction reversal in swarming bacteria [65–67], the repolarization times of migrating cancer cells [68], visual perception switching between metastable states [69], and the duration of animal and insect flights [70, 71]. The framework introduced here makes it possible to bound the extent to which these systems are out-of-equilibrium (Table S1) and, thereby, to quantify the trade-offs which biological systems are forced to make.

This work was supported by a MathWorks Fellowship (D.J.S.), a James S. McDonnell Foundation Complex Systems Scholar Award (J.D.), and the Robert E. Collins Distinguished Scholar Fund (J.D.). We thank the MIT SuperCloud and Lincoln Laboratory Supercomputing Center for providing HPC resources. The source



FIG. 4. EPR in the small variance limit. (a) Numerical optimization (green) is only feasible for small σ, although extrapolating to N = ∞ extends this range (purple). Alternatively, the asymptotic limit of a continuous Langevin system extends beyond the range of finite networks (dashed). Using the asymptotic relation, we can examine the entropic cost of building an oscillator as precise as a heartbeat across different species (grey dotted). (b) Typical ECG measurements of heartbeats for humans, dogs and mice [33, 49]. (c) Distribution of time between beats, scaled so the mean time is 1, from a single experiment from each species.

code is available at https://github.com/Dom-Skinner/EntropyProductionFromWaitingTimes.

[1] J. M. R. Parrondo, C. V. den Broeck, and R. Kawai, New J. Phys. 11, 073008 (2009).
[2] J. L. England, J. Chem. Phys. 139, 121923 (2013).
[3] P. Wang, L. Robert, J. Pelletier, W. L. Dang, F. Taddei, A. Wright, and S. Jun, Curr. Biol. 20, 1099 (2010).
[4] J. A. Nirody, Y.-R. Sun, and C.-J. Lo, Adv. Phys.: X 2, 324 (2017).
[5] U. Seifert, Rep. Prog. Phys. 75, 126001 (2012).
[6] C. Maes, F. Redig, and M. Verschuere, J. Stat. Phys. 106, 569 (2002).
[7] C. Battle, C. P. Broedersz, N. Fakhri, V. F. Geyer, J. Howard, C. F. Schmidt, and F. C. MacKintosh, Science 352, 604 (2016).
[8] G. Lan, P. Sartori, S. Neumann, V. Sourjik, and Y. Tu, Nat. Phys. 8, 422 (2012).
[9] S. E. Harvey, S. Lahiri, and S. Ganguli, arXiv:2002.10567.
[10] K. Thurley, S. C. Tovey, G. Moenke, V. L. Prince, A. Meena, A. P. Thomas, A. Skupin, C. W. Taylor, and M. Falcke, Sci. Signal. 7, ra59 (2014).
[11] Y. Cao, H. Wang, Q. Ouyang, and Y. Tu, Nat. Phys. 11, 772 (2015).
[12] U. Seifert, Phys. Rev. Lett. 95, 040602 (2005).
[13] U. Seifert, Ann. Rev. Cond. Matter Phys. 10, 171 (2019).
[14] M. Esposito, Phys. Rev. E 85, 041125 (2012).
[15] G. Teza and A. L. Stella, Phys. Rev. Lett. 125, 110601 (2020).
[16] D. M. Suter, N. Molina, D. Gatfield, K. Schneider, U. Schibler, and F. Naef, Science 332, 472 (2011).
[17] M. Skoge, S. Naqvi, Y. Meir, and N. S. Wingreen, Phys. Rev. Lett. 110, 248102 (2013).
[18] T. R. Gingrich, J. M. Horowitz, N. Perunov, and J. L. England, Phys. Rev. Lett. 116, 120601 (2016).
[19] J. Li, J. M. Horowitz, T. R. Gingrich, and N. Fakhri, Nat. Commun. 10, 1666 (2019).
[20] J. M. Horowitz and T. R. Gingrich, Nat. Phys. 16, 15 (2020).
[21] M. Polettini and M. Esposito, Phys. Rev. Lett. 119, 240601 (2017).
[22] F. S. Gnesotto, G. Gradziuk, P. Ronceray, and C. P. Broedersz, Nat. Commun. 11, 5378 (2020).
[23] É. Roldán and J. M. R. Parrondo, Phys. Rev. Lett. 105, 150607 (2010).
[24] I. A. Martínez, G. Bisker, J. M. Horowitz, and J. M. R. Parrondo, Nat. Commun. 10, 3542 (2019).
[25] É. Roldán, J. Barral, P. Martin, J. M. R. Parrondo, and F. Jülicher, arXiv:1803.04743.
[26] T. Van Vu, V. T. Vo, and Y. Hasegawa, Phys. Rev. E 101, 042138 (2020).
[27] B. J. Tolkamp, M. J. Haskell, F. M. Langford, D. J. Roberts, and C. A. Morgan, Appl. Animal Behav. Sci. 124, 1 (2010).
[28] Supplementary Information.
[29] A. C. Barato and U. Seifert, Phys. Rev. X 6, 041053 (2016).
[30] R. Marsland, W. Cui, and J. M. Horowitz, J. Royal Soc. Interface 16, 20190098 (2019).
[31] A. N. Pearson, Y. Guryanova, P. Erker, E. A. Laird, G. A. D. Briggs, M. Huber, and N. Ares, Phys. Rev. X 11, 021029 (2021).
[32] C. del Junco and S. Vaikuntanathan, J. Chem. Phys. 152, 055101 (2020).
[33] J. A. Behar, A. A. Rosenberg, I. Weiser-Bitoun, O. Shemla, A. Alexandrovich, E. Konyukhov, and Y. Yaniv, Front. Physiol. 9, 1390 (2018).
[34] K. Umetani, D. H. Singer, R. McCraty, and M. Atkinson, J. Am. Coll. Cardiol. 31, 593 (1998).
[35] N. G. Van Kampen, Stochastic Processes in Physics and Chemistry, Vol. 1 (Elsevier, 1992).
[36] G. Rubino and B. Sericola, J. Appl. Prob. 26, 744 (1989).
[37] H. Ge and H. Qian, Phys. Rev. E 87, 062125 (2013).
[38] Y. Tu, Proc. Natl Acad. Sci. U.S.A. 105, 11737 (2008).
[39] D. J. Skinner and J. Dunkel, Proc. Natl Acad. Sci. U.S.A. 118, e2024300118 (2021).
[40] A. M. Jurgens and J. P. Crutchfield, J. Stat. Phys. 183, 32 (2021).
[41] A. H. Lang, C. K. Fisher, T. Mora, and P. Mehta, Phys. Rev. Lett. 113, 148103 (2014).
[42] C. Jia, P. Xie, M. Chen, and M. Q. Zhang, Sci. Rep. 7, 16037 (2017).
[43] P. C. Bressloff, J. Phys. A 50, 133001 (2017).
[44] J. Paulsson, Phys. Life Rev. 2, 157 (2005).
[45] P. Mehta and D. J. Schwab, Proc. Natl Acad. Sci. U.S.A. 109, 17978 (2012).
[46] N. Cermak, S. K. Yu, R. Clark, Y.-C. Huang, S. N. Baskoylu, and S. W. Flavell, eLife 9, e57093 (2020).
[47] K. Y. Wan and R. E. Goldstein, Phys. Rev. Lett. 121, 058103 (2018).
[48] S. D. Stellman, Science 182, 1296 (1973).
[49] A. L. Goldberger, L. A. N. Amaral, L. Glass, J. M. Hausdorff, P. C. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C.-K. Peng, and H. E. Stanley, Circulation 101, e215 (2000).
[50] G. A. Pavliotis, Stochastic Processes and Applications, Vol. 60 (Springer, 2014).
[51] V. Y. Chernyak, M. Chertkov, and C. Jarzynski, J. Stat. Mech. 2006, P08001 (2006).
[52] D. M. Busiello, J. Hidalgo, and A. Maritan, New J. Phys. 21, 073004 (2019).
[53] T. Herpich, K. Shayanfard, and M. Esposito, Phys. Rev. E 101, 022116 (2020).
[54] J. M. Horowitz, J. Chem. Phys. 143, 044111 (2015).
[55] I. Lestas, G. Vinnicombe, and J. Paulsson, Nature 467, 174 (2010).
[56] J. Yan, A. Hilfinger, G. Vinnicombe, and J. Paulsson, Phys. Rev. Lett. 123, 108101 (2019).
[57] M. Costa, A. L. Goldberger, and C.-K. Peng, Phys. Rev. Lett. 95, 198102 (2005).
[58] T. M. Norman, N. D. Lord, J. Paulsson, and R. Losick, Nature 503, 481 (2013).
[59] H. Isojima, R. Iino, Y. Niitani, H. Noji, and M. Tomishige, Nat. Chem. Biol. 12, 290 (2016).
[60] C. L. Asbury, A. N. Fehr, and S. M. Block, Science 302, 2130 (2003).
[61] I. I. Cisse, I. Izeddin, S. Z. Causse, L. Boudarene, A. Senecal, L. Muresan, C. Dugast-Darzacq, B. Hajj, M. Dahan, and X. Darzacq, Science 341, 664 (2013).
[62] F. Bai, R. W. Branch, D. V. Nicolau, T. Pilizota, B. C. Steel, P. K. Maini, and R. M. Berry, Science 327, 685 (2010).
[63] M. Theves, J. Taktikos, V. Zaburdaev, H. Stark, and C. Beta, Biophys. J. 105, 1915 (2013).
[64] M. Polin, I. Tuval, K. Drescher, J. P. Gollub, and R. E. Goldstein, Science 325, 487 (2009).
[65] Y. Wu, A. D. Kaiser, Y. Jiang, and M. S. Alber, Proc. Natl Acad. Sci. U.S.A. 106, 1222 (2009).
[66] O. Sliusarenko, J. Neu, D. R. Zusman, and G. Oster, Proc. Natl Acad. Sci. U.S.A. 103, 1534 (2006).
[67] A. Be'er, S. K. Strain, R. A. Hernandez, E. Ben-Jacob, and E.-L. Florin, J. Bacteriol. 195, 2709 (2013).
[68] F. Zhou, S. A. Schaffer, C. Schreiber, F. J. Segerer, A. Goychuk, E. Frey, and J. O. Rädler, PLOS ONE 15, e0230679 (2020).
[69] K. Krug, E. Brunskill, A. Scarna, G. M. Goodwin, and A. J. Parker, Proc. Royal Soc. B 275, 1839 (2008).
[70] K. H. Elliott, R. D. Bull, A. J. Gaston, and G. K. Davoren, Behav. Ecol. Sociobiol. 63, 1773 (2009).
[71] N. E. Raine and L. Chittka, Entomol. Gen. 29, 179 (2007).

Page 7: arXiv:2105.08681v3 [physics.bio-ph] 23 Jul 2021

Supplementary information:
Estimating entropy production from waiting time distributions

Dominic J. Skinner1 and Jörn Dunkel1

1Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307, USA
(Dated: July 26, 2021)

CONTENTS

Introduction and preliminaries
Non-monotonic waiting times imply out of equilibrium dynamics
Numerical minimization
Asymptotic solution
    Small entropy asymptotics
    Large entropy asymptotics
Numerical convergence
Thermodynamic Uncertainty Relation
    Active sensor
    Random transition rates
Continuous Langevin system
    Numerical minimization
    Analytic gradients
    Asymptotic limit
Further examples from the literature
References

INTRODUCTION AND PRELIMINARIES

We start with Markovian dynamics on a finite set of discrete states, {1, . . . , NT}, where a stochastic transition between states i and j occurs at a rate Wij ≥ 0. It follows that a probability distribution pi(t) evolves through the master equation

d/dt pi = ∑_j pj Wji, (S1)

with Wii = −∑_{j≠i} Wij. For irreducible, recurrent systems, meaning there exists a path of non-zero probability between any two states, the system has a unique steady-state distribution, πi, satisfying ∑_j πj Wji = 0 for all i, and all initial conditions tend to this distribution. The steady-state average entropy production rate is

σ = kB ∑_{i,j} πi Wij log[πi Wij / (πj Wji)]. (S2)
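The stationary distribution and the entropy production rate of Eq. (S2) can be computed in a few lines of linear algebra. The following is a minimal illustrative sketch (Python/NumPy, not code from the paper); the driven three-state cycle used as an example is a made-up test case whose EPR is analytically log 2.

```python
import numpy as np

def stationary_distribution(W):
    """Solve pi @ W = 0 with sum(pi) = 1, as below Eq. (S1).

    Convention: off-diagonal W[i, j] >= 0 is the rate i -> j, and each row
    sums to zero, W[i, i] = -sum_{j != i} W[i, j].
    """
    N = W.shape[0]
    A = np.vstack([W.T, np.ones(N)])   # append normalization to pi W = 0
    b = np.zeros(N + 1)
    b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]

def entropy_production_rate(W, kB=1.0):
    """Steady-state EPR of Eq. (S2); assumes all off-diagonal rates are positive."""
    pi = stationary_distribution(W)
    N = W.shape[0]
    return kB * sum(pi[i] * W[i, j] * np.log(pi[i] * W[i, j] / (pi[j] * W[j, i]))
                    for i in range(N) for j in range(N) if i != j)

# Made-up driven three-state cycle: forward rates 2, backward rates 1.
W = np.array([[-3.0, 2.0, 1.0],
              [1.0, -3.0, 2.0],
              [2.0, 1.0, -3.0]])
sigma = entropy_production_rate(W)   # analytically log 2 for this cycle
```

For this uniform cycle the stationary distribution is uniform, and each of the three edges contributes (1/3) log 2 to σ.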


Entropy production can be defined along single trajectories for systems starting with arbitrary probability distributions, but here we only consider this steady-state expression [S1]. While we formally assume a mathematical non-equilibrium steady state (NESS), our bound on the entropy production rate extends to systems which are not a true NESS but instead represent a quasi-stationary state (QSS) [S2, S3]. For an isothermal NESS at temperature T, Tσ can be interpreted as a physical heat, but in an isothermal QSS this interpretation no longer holds [S2]. However, Tσ retains the interpretation of a rate of free energy dissipation in both cases [S2]. Therefore, when there is a time scale separation between the Markov dynamics on the discrete states and dynamics slower than the observation period, over which the transition rates may change, our estimator will still bound the entropy production rate, and Tσ will be a lower bound for the free energy dissipation rate [S2, S3].

An equivalent form of Eq. (S2) is given by

kB ∑_{i,j} πi Wij log[πi Wij / (πj Wji)] = kB ∑_{i,j} πi Wij log(Wij/Wji) + kB ∑_{i,j} πi Wij log(πi/πj) (S3)
= kB ∑_{i,j} πi Wij log(Wij/Wji) + kB ∑_i πi (∑_j Wij) log πi − kB ∑_j [∑_i πi Wij] log πj
= kB ∑_{i,j} πi Wij log(Wij/Wji),

since ∑_j Wij = ∑_i πi Wij = 0, and this form is often stated as the entropy production rate [S1]. We also note here that upon arrival at a state i, the time spent in that state is exponentially distributed with parameter λ = −Wii.
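The equality of Eq. (S2) and the reduced final form of Eq. (S3) can be checked numerically. A small illustrative sketch (NumPy; the rate matrix is randomly generated, not a system from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4
W = rng.uniform(0.1, 1.0, size=(N, N))   # strictly positive off-diagonal rates
np.fill_diagonal(W, 0.0)
np.fill_diagonal(W, -W.sum(axis=1))      # enforce W_ii = -sum_{j != i} W_ij

# Stationary distribution: pi W = 0 with sum(pi) = 1.
pi = np.linalg.lstsq(np.vstack([W.T, np.ones(N)]),
                     np.r_[np.zeros(N), 1.0], rcond=None)[0]

# Eq. (S2) and the reduced final form of Eq. (S3); both should agree,
# since the pi-dependent terms cancel by stationarity.
s_full = sum(pi[i] * W[i, j] * np.log(pi[i] * W[i, j] / (pi[j] * W[j, i]))
             for i in range(N) for j in range(N) if i != j)
s_rates = sum(pi[i] * W[i, j] * np.log(W[i, j] / W[j, i])
              for i in range(N) for j in range(N) if i != j)
```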

Once the NT(NT − 1)/2 transition rates Wij ≥ 0, i ≠ j, are specified, the steady-state distribution follows from the eigenvalue problem, and the entropy production rate can then be computed. It will later be useful to take an alternative formulation, where we specify NT(NT − 1)/2 rates of mass transfer, nij = πi Wij, together with the steady-state distribution πi, subject to the NT linear constraints ∑_j nij = ∑_j nji. After choosing nij and πi, we easily recover the transition rates as Wij = nij/πi for i ≠ j, and Wii = −∑_{j≠i} Wij = −∑_{j≠i} nij/πi. The relation ∑_i πi Wij = 0 is then satisfied, since it becomes equivalent to the linear constraint imposed on nij. From Eq. (S3), we see that σ depends only on the values of nij, and so πi can be specified independently.
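The (nij, πi) parametrization can be sketched concretely. In the illustrative example below (NumPy; the symmetric part S, the cycle flux phi, and the distribution pi are arbitrary choices, not from the paper), a symmetric matrix plus a uniform cyclic flux automatically satisfies the balance constraint, and σ is computed from the nij alone.

```python
import numpy as np

N = 3
# A symmetric part plus a uniform cycle flux phi gives row sums equal to
# column sums, i.e. sum_j n_ij = sum_j n_ji, as required of n_ij = pi_i W_ij.
S = np.array([[0.0, 0.3, 0.2],
              [0.3, 0.0, 0.4],
              [0.2, 0.4, 0.0]])
C = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
phi = 0.25
n = S + phi * C

# Freely chosen stationary distribution.
pi = np.array([0.5, 0.3, 0.2])

# Recover the transition rates: W_ij = n_ij / pi_i for i != j, rows sum to zero.
W = n / pi[:, None]
np.fill_diagonal(W, 0.0)
np.fill_diagonal(W, -W.sum(axis=1))

stationarity = pi @ W    # should vanish: sum_i pi_i W_ij = 0 for all j

# EPR depends only on n: sigma = sum_{i<j} (n_ij - n_ji) log(n_ij / n_ji).
sigma = sum((n[i, j] - n[j, i]) * np.log(n[i, j] / n[j, i])
            for i in range(N) for j in range(i + 1, N))
```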

We now consider a system with two coarse-grained metastates. Typically, we consider the case where only two coarse-grained states are accessible because of experimental constraints. However, we can also apply our framework if more than two metastates are accessible, as we are free to coarse-grain further into just two states. Calling these metastates A and B, we suppose that the first N micro-states belong to the metastate A, with 1 ≤ N < NT. We can then write W as a block matrix,

W = ( WA  WAB ; WBA  WB ), (S4)

where WA is of size N × N and represents the within-A transition rates. We then have the following lemma from Ref. [S4].

Lemma 1. For any such Markovian process in the stationary distribution, upon entering A, the distribution of time spent in A before leaving is given by fA(t) = (1/K) πA WA² exp(WA t) 1^T, where K = −πA WA 1^T and 1 is a vector of ones.


From this lemma, it is straightforward to derive the moment statistics of the time spent in A, as

〈t^k〉A = ∫₀^∞ t^k fA(t) dt (S5)
= [t^k (1/K) πA WA exp(WA t) 1^T]_{t=0}^∞ − ∫₀^∞ k t^{k−1} (1/K) πA WA exp(WA t) 1^T dt
= (−1)^k k! ∫₀^∞ (1/K) πA WA^{2−k} exp(WA t) 1^T dt
= (−1)^{k+1} k! πA WA^{1−k} 1^T / K,

and similarly for B.
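The moment formula of Eq. (S5) translates directly into a few lines of linear algebra. An illustrative NumPy sketch (the three-state rate matrix is made up, and `waiting_time_moments` is a hypothetical helper name, not from the paper); for a metastate with a single internal state the dwell time is exponential, so 〈t²〉 = 2〈t〉², which serves as a sanity check.

```python
import numpy as np

def waiting_time_moments(W, A, pi):
    """First and second dwell-time moments for metastate A, via Eq. (S5):
    <t^k>_A = (-1)^{k+1} k! piA WA^{1-k} 1 / K, with K = -piA WA 1."""
    WA = W[np.ix_(A, A)]
    piA, one = pi[A], np.ones(len(A))
    K = -piA @ WA @ one
    m1 = (piA @ one) / K                               # k = 1: piA 1 / K
    m2 = -2.0 * (piA @ np.linalg.solve(WA, one)) / K   # k = 2: -2 piA WA^{-1} 1 / K
    return m1, m2

# Made-up three-state rate matrix; coarse-grain into A = {0, 1}, B = {2}.
W = np.array([[-2.0, 1.5, 0.5],
              [0.7, -1.2, 0.5],
              [0.4, 0.6, -1.0]])
pi = np.linalg.lstsq(np.vstack([W.T, np.ones(3)]),
                     np.r_[np.zeros(3), 1.0], rcond=None)[0]

m1, m2 = waiting_time_moments(W, np.array([0, 1]), pi)
# A single-state metastate has an exponential dwell time, so <t^2> = 2 <t>^2:
e1, e2 = waiting_time_moments(W, np.array([2]), pi)
```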

NON-MONOTONIC WAITING TIMES IMPLY OUT OF EQUILIBRIUM DYNAMICS

We rederive an expression for the waiting time distributions of a system in equilibrium, found earlier in Ref. [S5]. We start with the expression for the waiting time distribution of metastate A, fA(t) = (1/K) πA WA² exp(WA t) 1^T, which holds regardless of whether detailed balance holds. If we further assume that the system is in thermal equilibrium, so that detailed balance applies, πi Wij = πj Wji, then W is diagonalizable [S6]. Specifically, we can define a new matrix

Dij = √(πAi) WAij / √(πAj) = √(πAj) WAji / √(πAi) = Dji, (S6)

which is symmetric by construction. Therefore, D admits a spectral decomposition,

Dij = ∑_k λk wki wkj, (S7)

where the λk are real eigenvalues and the wk are orthonormal eigenvectors. Since D is related to WA through a similarity transformation, D = Λ WA Λ^{-1}, with Λij = δij √(πAi), we have that

fA(t) = (1/K) πA Λ^{-1} D² exp(Dt) Λ 1^T, (S8)

where [D² exp(Dt)]ij = ∑_k λk² e^{λk t} wki wkj. Therefore,

fA(t) = (1/K) ∑_{i,j} √(πAi) (∑_k λk² e^{λk t} wki wkj) √(πAj) = (1/K) ∑_k λk² e^{λk t} 〈√πA, wk〉². (S9)

As this represents a normalizable waiting time probability distribution, fA → 0 as t → ∞, and so λk < 0; but λk²〈√πA, wk〉² ≥ 0, so fA(t) is a sum of decaying exponentials, each with a non-negative weight. There are several natural corollaries of this result. As noted in Ref. [S5], it implies that waiting time distributions of equilibrium systems are monotonically decaying functions, and all derivatives are monotonic as well (see Eqs. (1) and (2) in Ref. [S5]). Here, we also note that the requirement that fA = ∑_i pi |λi| exp(−|λi| t), for pi > 0, λi < 0, ∑_i pi = 1, implies the moments 〈t^n〉A = n! ∑_i pi/|λi|^n, and hence the Cauchy-Schwarz inequality implies that

(∑_i pi/|λi|)² ≤ (∑_i pi/λi²)(∑_i pi), (S10)

and so 2〈t〉A² ≤ 〈t²〉A for equilibrium systems.
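The equilibrium corollary 2〈t〉A² ≤ 〈t²〉A can be sanity-checked on a chain constructed to satisfy detailed balance. An illustrative sketch (the chosen pi and symmetric matrix S are arbitrary test values): with S symmetric and Wij = Sij/πi, detailed balance πi Wij = Sij = πj Wji holds by construction.

```python
import numpy as np

# A reversible chain: detailed balance pi_i W_ij = S_ij = pi_j W_ji.
pi = np.array([0.1, 0.2, 0.3, 0.4])
S = np.array([[0.0, 0.05, 0.02, 0.01],
              [0.05, 0.0, 0.04, 0.03],
              [0.02, 0.04, 0.0, 0.06],
              [0.01, 0.03, 0.06, 0.0]])
W = S / pi[:, None]
np.fill_diagonal(W, -W.sum(axis=1))

# Dwell-time moments of metastate A = {0, 1} from Lemma 1 / Eq. (S5).
A = [0, 1]
WA, piA, one = W[np.ix_(A, A)], pi[A], np.ones(2)
K = -piA @ WA @ one
t1 = (piA @ one) / K
t2 = -2.0 * (piA @ np.linalg.solve(WA, one)) / K
theta = t2 / t1 ** 2   # the equilibrium result requires theta >= 2
```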


NUMERICAL MINIMIZATION

As discussed in the main text, we focus on the numerical minimization problem of finding the curve

Λ(θ) = min_R { σ(R)/kB | 〈t〉A,R = 〈t〉B,R = 1, 〈t²〉A,R = θ }, (S11)

which yields the bound

σ ≥ [2kB/(〈t〉A + 〈t〉B)] Λ(〈t²〉A/〈t〉A²), (S12)

for an arbitrary system. More specifically, finding Λ results in the following constrained minimization problem,

Λ(θ) = min_{n,π} ∑_{i<j} (nij − nji) log(nij/nji), (S13)
s.t. πA 1^T = πB 1^T = −1 nA 1^T = 1/2,
−4 πA nA^{-1} πA^T = θ,
π ≥ 0, nij ≥ 0, nii = −∑_{j≠i} nij = −∑_{j≠i} nji,

where the minimum is taken over all possible network topologies. We consider the subproblem ΛN,M(θ), where the minimum is taken over N internal states of A (which we take as i = 1, . . . , N) and M internal states of B. We make the following claim.

Lemma 2. The minimum is achieved when only one internal state of B (M = 1) is used.

To see this, start from any solution that satisfies the constraints with M internal states of B. Then take (πN+1, . . . , πN+M) ↦ πN+1 = 1/2, and for i ≤ N, (ni,N+1, . . . , ni,N+M) ↦ ni,N+1 = ∑_j ni,N+j, similarly for nN+1,i, and nN+1,N+1 = −∑_{i≤N} ni,N+1. This new system satisfies the constraints, as the θ constraint does not depend on nB. To see that the entropy production rate does not increase, note that

∑_{i<j} (nij − nji) log(nij/nji) ≥ ∑_{i<j≤N} (nij − nji) log(nij/nji) + ∑_{i≤N, j>N} (nij − nji) log(nij/nji) (S14)
≥ ∑_{i<j≤N} (nij − nji) log(nij/nji) + ∑_{i≤N} (∑_{j>N} nij − ∑_{j>N} nji) log(∑_{j>N} nij / ∑_{j>N} nji)
= ∑_{i<j≤N} (nij − nji) log(nij/nji) + ∑_{i≤N} (ni,N+1 − nN+1,i) log(ni,N+1/nN+1,i),

where we have used the convexity of the function f(x, y) = (x − y) log(x/y). Therefore, the new system n produces entropy at a rate less than or equal to that of the original system, and so it is unnecessary to minimize over multiple internal states of B. From now on, we compute numerically ΛN(θ), where the minimum is taken over N internal states of A and a single internal state of B.
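The key step in Eq. (S14), that merging all B-states can only lower the entropy cost, reduces to subadditivity of g(x, y) = (x − y) log(x/y), which follows from joint convexity and positive homogeneity. This can be checked directly (illustrative sketch with random rates; not code from the paper):

```python
import numpy as np

def g(x, y):
    """Per-edge entropy cost g(x, y) = (x - y) log(x / y); jointly convex and
    positively homogeneous of degree one, hence subadditive."""
    return (x - y) * np.log(x / y)

rng = np.random.default_rng(3)
u = rng.uniform(0.1, 1.0, size=6)   # rates from an A-state into separate B-states
v = rng.uniform(0.1, 1.0, size=6)   # rates from the separate B-states back

cost_separate = np.sum(g(u, v))     # entropy cost before merging the B-states
cost_merged = g(u.sum(), v.sum())   # entropy cost after merging, as in Eq. (S14)
```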

To perform the numerical minimization, we use MATLAB's fmincon function [S7], together with a global search strategy [S8] and random initial conditions. Since the curves ΛN(θ) are monotonic, finding ΛN is equivalent to fixing σ and minimizing θ, yielding instead a curve θ = ΓN(σ). We chose to solve this latter problem, as the minimization was found to converge more consistently, although both approaches are ultimately equivalent. We sample feasible initial conditions by first sampling a random matrix with positive entries and successively projecting onto the constraint subspaces until convergence, where projection onto the entropy constraint is done approximately with a Newton step. We also supply the minimization algorithm with the gradient and Hessian of the objective function and constraints, which we briefly derive here. We take nA, πA as our variables, with the remaining variables determined by the constraints, where nAij = nij = πi Wij for i, j ≤ N. We therefore must consider the gradient and Hessian of

σ(nA) = ∑_{i<j} (nAij − nAji) log(nAij/nAji) + ∑_i (ci − di) log(ci/di), (S15a)
f(nA, πA) = πA (nA)^{-1} πA^T, (S15b)

with ci = −∑_j nAij and di = −∑_j nAji, and we have defined f = −θ/4 for clarity. We first note that

∂[(nA)^{-1}]ij / ∂nApq = −[(nA)^{-1}]ip [(nA)^{-1}]qj, (S16)

and define y = (nA)^{-1} πA, z = (nA)^{-T} πA, so that the first-order derivatives of f are (with summation over repeated indices)

∂f/∂πAi = ([(nA)^{-1}]ij + [(nA)^{-1}]ji) πAj = yi + zi, (S17a)
∂f/∂nApq = −πAi [(nA)^{-1}]ip [(nA)^{-1}]qj πAj = −zp yq, (S17b)

and the second-order derivatives are

∂²f/∂πAi ∂nApq = −([(nA)^{-1}]ip [(nA)^{-1}]qj + [(nA)^{-1}]jp [(nA)^{-1}]qi) πAj
= −[(nA)^{-1}]ip yq − zp [(nA)^{-1}]qi, (S18a)
∂²f/∂πAi ∂πAj = [(nA)^{-1}]ij + [(nA)^{-1}]ji, (S18b)
∂²f/∂nApq ∂nArs = πAi ([(nA)^{-1}]ir [(nA)^{-1}]sp [(nA)^{-1}]qj + [(nA)^{-1}]ip [(nA)^{-1}]qr [(nA)^{-1}]sj) πAj
= zr [(nA)^{-1}]sp yq + zp [(nA)^{-1}]qr ys. (S18c)

Moving to σ, for the purposes of numerical stability, we replace log(a/b) with log((a + ε)/(b + ε)), where ε is a small numerical parameter which ensures that the constraint and its derivatives remain finite at the boundaries. We introduce the functions

g(a, b) = g(b, a) = (a − b) log[(a + ε)/(b + ε)], (S19)

and

h(a, b) = ∂g/∂a = log[(a + ε)/(b + ε)] + (a − b)/(a + ε), (S20)

so that

σ = ∑_{i<j} g(nAij, nAji) + ∑_i g(ci, di), (S21)

where we recall that ci = −∑_j nAij and di = −∑_j nAji. To calculate the derivatives, we make use of

∂ci/∂nAjk = −δij, ∂di/∂nAjk = −δik, (S22)

to find

∂σ/∂nAij = h(nAij, nAji) − h(ci, di) − h(dj, cj). (S23)

Defining the second-derivative functions

ha = ∂h/∂a = (a + b + 2ε)/(a + ε)², (S24a)
hb = ∂h/∂b = −(a + b + 2ε)/[(a + ε)(b + ε)], (S24b)

we have that

∂²σ/∂nAij ∂nApq = ha(nAij, nAji) δip δjq + hb(nAij, nAji) δiq δjp + δip ha(ci, di) + δiq hb(ci, di) + δjq ha(dj, cj) + δjp hb(dj, cj). (S25)

ASYMPTOTIC SOLUTION

Here, we derive analytically the small and large entropy asymptotics for a system with a fixed number, N, of internal states within A. We use the asymptotic expansion notation f(x) ∼ a0 f0(x) + a1 f1(x) to mean that f − a0 f0 = o(f0), and f − a0 f0 − a1 f1 = o(f1), in the limit.

Small entropy asymptotics

Numerically, we find that the small entropy asymptotic bound is achieved with N = 2. Here, we derive the asymptotic form θ ∼ 2 − σ/4 for σ ≪ 1. We take

nA = ( A11  A12 ; A21  A22 ), (S26)

so that we must minimize

θ = [−4/(A11 A22 − A12 A21)] πA ( A22  −A12 ; −A21  A11 ) πA^T, (S27)

subject to

σ = (A12 − A21) log(A12/A21) − (A12 − A21) log[(A11 + A12)/(A11 + A21)] − (A12 − A21) log[(A22 + A12)/(A22 + A21)], (S28)
−1/2 = A11 + A12 + A21 + A22,

in addition to the inequality constraints

A11, A22 ≤ 0, (S29a)
A12, A21 ≥ 0, (S29b)
A11 + A12, A11 + A21, A22 + A21, A22 + A12 ≤ 0. (S29c)

Let

A12 = α + β, A21 = α − β, (S30)

and assume that α = O(1). We have that

θ = [−4/(A11 A22 − α² + β²)] πA ( A22  −α ; −α  A11 ) πA^T, (S31)

where, since πA1 + πA2 = 1/2, we have that

θ = [−A22 + 4(A22 + α)πA2 − 4(A11 + A22 + 2α)πA2²] / (A11 A22 − α² + β²)
= [2πA2² + 4(A22 + α)πA2 − A22] / (A11 A22 − α² + β²), (S32)

since A11 + A22 + 2α = −1/2. This is minimized when πA2 = −(A22 + α), so that

θ = (2 A11 A22 − 2α²)/(A11 A22 − α² + β²) ∼ 2 − 2β²/(A11 A22 − α²). (S33)

The leading-order expression for the entropy production rate is

σ = 4β²/α − 4β²/(A11 + α) − 4β²/(A22 + α). (S34)

Since we can interchange A11 and A22 without changing the objective or constraints, the minimum has A11 = A22, so that A11 = −α − 1/4, and

σ = 4β²(1/α + 8), (S35)

so

θ ∼ 2 − 8ασ/(1 + 8α)², (S36)

which is minimized when α = 1/8, giving

θ ∼ 2 − σ/4. (S37)

As a consistency check, we observe that the system

nA = ( −3/8  (1 + √δ)/8 ; (1 − √δ)/8  −3/8 ),  c = (1/4 − √δ/8, 1/4 + √δ/8),  d = (1/4 + √δ/8, 1/4 − √δ/8), (S38)

produces entropy at a rate

σ = (√δ/4) log[(1 + √δ)/(1 − √δ)] + (√δ/2) log[(1 + √δ/2)/(1 − √δ/2)] = δ + O(δ²),

while

θ = 2/(1 + δ/8) ∼ 2 − δ/4 + O(δ²), (S39)

consistent with θ ∼ 2 − σ/4. This asymptotic solution agrees well with numerics for σ ≪ 1, Fig. S1.
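The consistency check above can be run numerically: evaluating σ and θ = −4 πA nA^{-1} πA^T for the explicit network of Eq. (S38) at small δ reproduces θ ≈ 2 − σ/4. An illustrative NumPy sketch:

```python
import numpy as np

def sigma_theta(delta):
    """sigma and theta for the explicit N = 2 network of Eq. (S38)."""
    s = np.sqrt(delta)
    sigma = (s / 4) * np.log((1 + s) / (1 - s)) \
          + (s / 2) * np.log((1 + s / 2) / (1 - s / 2))
    nA = np.array([[-3 / 8, (1 + s) / 8],
                   [(1 - s) / 8, -3 / 8]])
    piA = np.array([0.25, 0.25])                  # the optimal piA for this nA
    theta = -4 * piA @ np.linalg.solve(nA, piA)   # theta = -4 piA nA^{-1} piA^T
    return sigma, theta

sigma, theta = sigma_theta(1e-3)   # sigma ~ delta, theta ~ 2 - sigma/4
```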

FIG. S1. Small entropy asymptotics: for σ ≪ 1, the prediction θ ∼ 2 − σ/4 (black dotted line) for the canonically rescaled system agrees well with numerical results, shown in shades of red for N = 2, 3, 4, 5 internal states, which rapidly converge to a single curve. (Axes: normalized entropy production rate, σ; normalized second moment, θ.)

Large entropy asymptotics

In the limit of infinite entropy production with a fixed number, N, of internal states in A, we can choose a deterministic path through the internal states, but uncertainty in the waiting time distribution still occurs due to the exponentially distributed dwell times in each internal state. The total uncertainty in the overall wait duration can be minimized by choosing a path that moves sequentially through each internal state, so that

nA = (1/2) M, (S40)

where M is the N × N bidiagonal matrix with −1 on the diagonal and +1 on the superdiagonal, with c = (0, . . . , 0, 1/2), d = (1/2, 0, . . . , 0), and

(nA)^{-1} = −2 U, (S41)

where U is the upper-triangular matrix of ones, and πA = 1/(2N), so θ = 1 + 1/N. Thus, with more states we can reach θ = 1 in the N → ∞ limit, corresponding to zero variance in the wait time. Taking a perturbation from this solution, we find that to leading order

δθ = 4 πA (nA)^{-1} δnA (nA)^{-1} πA^T − 4 δπA (nA)^{-1} πA^T − 4 πA (nA)^{-1} δπA^T (S42)
= 4 πA (nA)^{-1} δnA (nA)^{-1} πA^T − 4 πA [(nA)^{-1} + (nA)^{-T}] δπA^T
= 4 πA (nA)^{-1} δnA (nA)^{-1} πA^T,

since πA = 1/(2N) and [(nA)^{-1} + (nA)^{-T}]ij = −2 − 2δij, so the condition 1^T δπA = 0 causes the term linear in δπA to vanish. We can write

δθ = 4 ∑_{i,j} Xij δnAij, (S43)

where

Xij = (1/N²) i (N + 1 − j). (S44)

That the linear term in δnA does not vanish is because the solution lies on the boundary of the feasible set. The entropy production rate to leading order is

σ = ∑_{i=1}^{N−1} [−(1/2) log(2 δnA_{i+1,i})] − (1/2) log(−2 ∑_j δnA_{1j}) − (1/2) log(−2 ∑_j δnA_{jN}).

We solve the perturbation problem by fixing δθ = ε and minimizing σ, with δnAij = ε Yij, so that we solve

min_Y σ(Y) = [(N + 1)/2] log(1/2ε) − (1/2) ∑_{i=1}^{N−1} log(Y_{i+1,i}) − (1/2) log(−∑_j Y_{1j}) − (1/2) log(−∑_j Y_{jN}) (S45a)
s.t. 4 ∑_{i,j} Xij Yij = 1, (S45b)
∑_{i,j} Yij = 0, (S45c)
∑_j Yij ≤ 0, for i = 1, . . . , N − 1, (S45d)
∑_j Yji ≤ 0, for i = 2, . . . , N, (S45e)
Yij ≥ 0, for i ≠ j and i ≠ j − 1. (S45f)

We can already predict that σ = (1/2)(N + 1) log(1/2ε) + C, where C = O(1), which yields the asymptotic formula

θ(σ) ∼ 1 + 1/N + (1/2) exp[2(C − σ)/(N + 1)] (S46)

in the large σ limit. To find C, we take Yi,i ≠ 0, Yi+1,i ≠ 0, and Yij = 0 otherwise, and suppose the constraints on the row and column sums of Yij are tight. Then Y is the lower-bidiagonal matrix

Y = ( −b ; a  −a ; a  −a ; . . . ; a  −c ), (S47)

with Y11 = −b, YNN = −c, Yii = −a for 1 < i < N, and Yi+1,i = a, where a, b, c > 0 are variables to be minimized over. Since b and c can be interchanged without changing the objective or constraints, and the linearized problem is convex, we may take b = c, so that b = c = a/2 from the constraint ∑_{i,j} Yij = 0. The constraint 4 ∑_{i,j} Yij Xij = 1 gives

1 = (4/N²) [−a ∑_{i=1}^{N−1} i(N + 1 − i) + a ∑_{i=1}^{N−1} (i + 1)(N + 1 − i)] = 2a(N² + N − 2)/N², (S48)


FIG. S2. (a) Numerical convergence of ΓN − 1 − 1/N → 0 as σ → ∞, against the predicted asymptotic formula, Eq. (S51). Curves for 2 ≤ N ≤ 6 are shown from bottom to top (green points) together with respective asymptotic predictions (dotted lines). As N increases, larger values of σ are required for the asymptotic regime to hold. (b) Exponential convergence of ΓN to our estimate of Γ, for σ = 5, 10, . . . , 35 bottom to top. Values of ΓN from N = 12, . . . , 18 were used to fit the estimate of Γ.

so that

a = N²/[2(N² + N − 2)], (S49)

and hence

σ = [(N + 1)/2] log(1/2ε) − [(N + 1)/2] log[N²/(2(N² + N − 2))] + log 2, (S50)

meaning that

θ(σ) ∼ 1 + 1/N + [(N² + N − 2)/N²] 2^{2/(N+1)} exp[−2σ/(N + 1)]. (S51)

This agrees well with numerical results, Fig. S2(a), although for larger N the asymptotic correction to θ = 1 + 1/N only holds for very large σ. This means that for larger N this asymptotic formula does not provide much insight into intermediate values of σ, and so cannot replace performing the full numerical minimization in those regions.
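The starting value θ = 1 + 1/N for the sequential cycle of Eq. (S40) can be verified directly from the constraint θ = −4 πA nA^{-1} πA^T. An illustrative sketch (`theta_cycle` is a hypothetical helper name):

```python
import numpy as np

def theta_cycle(N):
    """theta = <t^2>_A / <t>_A^2 for the sequential N-state cycle of Eq. (S40):
    nA has -1/2 on the diagonal and +1/2 on the superdiagonal, piA = 1/(2N)."""
    nA = -0.5 * np.eye(N) + 0.5 * np.eye(N, k=1)
    piA = np.full(N, 1.0 / (2 * N))
    return -4.0 * piA @ np.linalg.solve(nA, piA)

theta5 = theta_cycle(5)   # equals 1 + 1/5 = 1.2
```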

NUMERICAL CONVERGENCE

To determine the exact bound Λ(θ), we need to take the limit N → ∞ of infinitely many internal states. Of course, this is not practical numerically, so we must verify that finite-N values converge, and estimate their error. Recall that we find successive values of θ for fixed σ, calling these ΓN(σ) = ΛN^{-1}(σ). Convergence appears exponential,

ΓN ≈ Γ + Γ0 e^{−kN}, (S52)

from which we wish to estimate Γ using finite-N data. To do so, we define a residual

R = ∑_{N=M}^{L} (log[ΓN − C1] − C2 − C3 N)², (S53)

so that by minimizing the residual with respect to C1, C2, C3, we find the best exponential fit and can estimate Γ ≈ C1. Calling yN = log[ΓN − C1] and xN = N, finding the values of C2 and C3 that minimize the residual is straightforward linear regression,

C3 = ∑_N (xN − x̄)(yN − ȳ) / ∑_N (xN − x̄)², C2 = ȳ − C3 x̄. (S54)

The resulting residual is a non-linear function of C1, and so we find its minimum numerically. This serves as an estimate for Γ, and indeed we see exponential convergence of ΓN − Γ to zero, Fig. S2(b). Moreover, the discrepancy between ΓN and our estimate of Γ serves as an order-of-magnitude estimate for the error of the Γ estimate. This error ranges from 10⁻³ for σ = 5 to 1.5 × 10⁻² for σ = 35, which is small compared to the other sources of error introduced by estimating the mean and second moment of the waiting time distribution.
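The fitting procedure of Eqs. (S52)-(S54) can be sketched on synthetic data with a known limit. An illustrative Python sketch (a simple grid search over C1 stands in for the numerical minimization; the synthetic values of Γ, Γ0, k are arbitrary):

```python
import numpy as np

def fit_residual(C1, N, Gamma_N):
    """Residual of Eq. (S53), with C2 and C3 eliminated via the linear
    regression of Eq. (S54)."""
    y = np.log(Gamma_N - C1)
    x = N.astype(float)
    C3 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    C2 = y.mean() - C3 * x.mean()
    return np.sum((y - C2 - C3 * x) ** 2)

# Synthetic data with a known limit Gamma = 1.2.
N = np.arange(6, 19)
Gamma_N = 1.2 + 0.5 * np.exp(-0.4 * N)

# One-dimensional numerical minimization over C1 (a fine grid for simplicity).
grid = np.linspace(0.5, Gamma_N.min() - 1e-9, 20001)
Gamma_est = grid[np.argmin([fit_residual(c, N, Gamma_N) for c in grid])]
```

For noiseless synthetic data the residual vanishes at the true C1, so the recovered Gamma_est is accurate to the grid resolution.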

THERMODYNAMIC UNCERTAINTY RELATION

Given two states i and j, suppose JT(i, j) counts the net number of transitions i → j in some time T; then we can define a coarse-grained current JT = ∑_{i<j} d(i, j) JT(i, j) for some weights d, with d(i, j) = −d(j, i). The thermodynamic uncertainty relation (TUR) states that σT ≥ 2kB〈JT〉²/Var JT, for any time T [S1]. Whilst this has proved a valuable tool for inference [S1, S9], for our two-macrostate systems no coarse-grained currents (〈JT〉 ≠ 0) can be observed. Recently, however, an extension of this inequality has been derived [S10], which states that for two-state systems,

σ ≥ 8〈qA〉²〈qB〉²/(T Var qA) − 4/(〈t〉A + 〈t〉B), (S55)

where qA is the empirically observed fraction of time spent in A for an observation of length T, and 〈qA〉 = rA and Var qA are the average and variance of this quantity, respectively; similarly for qB. In Ref. [S10], the result is stated in terms of the quantity

Rπ = ∑_{i∈A, j∈B} nij = ∑_{i∈A, j∈B} nji, (S56)

but since

〈t〉A = πA 1^T / (−πA WA 1^T) = πA 1^T / Rπ, (S57)

this can be rewritten as Rπ = 1/(〈t〉A + 〈t〉B), which we have done in Eq. (S55). This TUR does not hold for finite T, but holds in the limit T → ∞.
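The identity Rπ = 1/(〈t〉A + 〈t〉B) used to rewrite Eq. (S55) holds for any rate matrix and can be checked directly. An illustrative sketch (the four-state system is made up; `mean_dwell` is a hypothetical helper):

```python
import numpy as np

# A made-up irreducible four-state rate matrix, coarse-grained into A and B.
W = np.array([[-1.5, 0.8, 0.4, 0.3],
              [0.6, -1.6, 0.7, 0.3],
              [0.2, 0.5, -1.2, 0.5],
              [0.4, 0.3, 0.6, -1.3]])
pi = np.linalg.lstsq(np.vstack([W.T, np.ones(4)]),
                     np.r_[np.zeros(4), 1.0], rcond=None)[0]
A, B = [0, 1], [2, 3]

# R_pi: stationary probability flux from A to B, Eq. (S56).
R = sum(pi[i] * W[i, j] for i in A for j in B)

def mean_dwell(states):
    """<t> for a metastate, Eq. (S57): piS 1 / (-piS WS 1)."""
    WS = W[np.ix_(states, states)]
    piS, one = pi[states], np.ones(len(states))
    return (piS @ one) / (-piS @ WS @ one)

product = R * (mean_dwell(A) + mean_dwell(B))   # should equal 1
```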

Active sensor

To compare against this result for the active sensor example, we simulated 200 long trajectories of length T = 2000, so that each trajectory completes O(400) cycles, and computed this modified TUR bound by calculating the average fraction of time spent in A and B, as well as the variance of these across trajectories, as described in the main text. We find that the σT bound significantly improves on the TUR bound, both in the value of the bound and in its uncertainty, Fig. 3. It is worth noting that as σ → ∞ in the active sensor, 〈t²〉A/〈t〉A² → 1 + 1/4, since in our model there are 4 internal states of A; yet such a value of 〈t²〉A/〈t〉A² could be achieved at a far lower σ by a system with more internal states, which is why σT results in a significant underestimate for large σ. For small enough σ, the statistics satisfy 〈t²〉A/〈t〉A² ≥ 2, which could conceivably have been generated by an equilibrium process, again resulting in a significant underestimate by σT. However, in the intermediate regime, in which the system would operate, the bound is reasonably close.

FIG. S3. Entropy production bounds for random systems for both TUR and σT. The value of the TUR bound with kB = 1 is plotted against the value of the σT bound. For each randomly generated transition matrix, the bounds were computed from 50 trajectories of length T = 2000 (black points indicate sample medians). Error bars in red show the 95% range of predictions for both TUR and σT, calculated by repeatedly computing the bounds 400 times.

Random transition rates

To further compare the TUR estimator with σT, we generated 200 systems, each containing three states, with the transition rates Wij of each system drawn from an exponential distribution with mean 1. We then coarse-grained, taking the first two states as the metastate A and the remaining state as B. For each of these systems, we computed both the TUR bound and the σT bound from 50 trajectories of length T = 2000. Plotting the two bounds against each other reveals that the σT bound is typically a significant improvement on the (median) TUR bound, Fig. S3. Moreover, when repeatedly calculating these bounds from simulations, the 95% range of predictions is small for σT, whereas for most of the TUR bounds the 95% range contains zero, and so the TUR cannot reliably identify the system as producing entropy at all, Fig. S3.

CONTINUOUS LANGEVIN SYSTEM

Gaining some understanding of the behavior of the curve θ = Γ(σ) in the large σ limit is important for understanding the precision limits of stochastic clocks and timers. When time is measured by counting the number of full rotations around some circular topology, the thermodynamic uncertainty relation (TUR) requires that στ ≥ 2kB〈Xτ〉²/Var Xτ, where Xτ is the stochastic variable counting the number of full rotations in time τ [S11]. This bound can be saturated: for instance, consider a periodic interval, S¹, where the system evolves through a Langevin equation,

dYt = F dt + √(2D) dWt, (S58)


for F and D constant. In this case, Yτ ∼ Y0 + Fτ + √(2D) N(0, τ), so in the large τ limit the average number of rotations is 〈Xτ〉 ≈ Fτ, whereas its variance is Var Xτ ≈ 2Dτ. Then 2kB〈Xτ〉²/Var Xτ ≈ 2kB(Fτ)²/(2Dτ) = kB F²τ/D = στ, since the entropy production rate for such a continuous system is kB F²/D [S12]. Given any continuous Langevin system, one can construct a discrete Markov system which approximates the dynamics of the continuous system arbitrarily well and has an entropy production rate arbitrarily close to that of the continuous system [S3, S13]. Hence we can construct a discrete system with precision and entropy production arbitrarily close to those of the continuous system, which saturates the TUR bound. However, whilst such a system saturates the TUR bound, it will be no more precise than an equilibrium system. To see this, with A defined as some subset of S¹, note that the Langevin dynamics at small times is dominated by diffusive behavior, with short-time trajectories approximating Brownian motion. A trajectory which starts in A and ends in B will cross the boundary infinitely many times on short time scales before eventually leaving for good, meaning the waiting time statistics are dominated by this short-time diffusive behavior, which could be generated by an equilibrium system.

We therefore cannot directly use the TUR bound to understand Γ(σ) in the large σ limit. Previously, we minimized over all unknown discrete network topologies with N unknown states; for large values of θ, or small values of σ, the minimum is accessible for values of N ≤ 20. However, the asymptotic regime θ → 1 would require values of N that are numerically infeasible. Instead, we formally take N → ∞ and suppose that the region A can be described through continuous coordinates undergoing Langevin dynamics. Importantly, such a coarse-graining procedure, in which the limit of infinitely many discrete states is replaced by a continuous Langevin equation, does not necessarily preserve the entropy production rate [S14–S16], and hence it is not known whether minimizing over one-dimensional Langevin systems approximates the curve θ = Γ(σ) in the σ → ∞ limit. However, we are motivated by the fact that the one-dimensional Langevin system saturates the TUR for the stochastic clock, which counts by recording the number of full rotations, and that for the stochastic timer the optimized discrete systems appear Langevin-like in the large σ limit. In any case, minimizing over the Langevin system provides an upper bound for the most optimized finite-state Markov process, and in particular will allow us to show that a stochastic timer can achieve the performance of the TUR-saturating stochastic clock to leading order in the large σ limit.

Formally, we take B to be a single point, with pB = 1/2, and A to be S¹, with

pB + ∫₀¹ dx p(x) = 1 (S59)

as the total probability. Whilst in B, the system jumps to x ∈ A at rate a(x)dx, and whilst at x ∈ A, the system jumps to B at rate b(x). Thus,

∫ dx p(x) b(x) = (1/2) ∫ dx a(x) = 1/2. (S60)

Whilst in A, we assume trajectories Xt follow the stochastic Itô equation

dXt = F dt + √(2D) dWt, (S61)

where Wt is Brownian motion, and the decision to jump to B or to stay in A is made at the start of the time step, i.e. at b(Xt), not b(Xt+dt). We assume F and D are constants, which greatly simplifies the analysis. To calculate the entropy production rate of this system, we use that it is related to the relative probability of observing forward and reverse trajectories [S13],

σ = kB lim_{τ→∞} (1/τ) ∫ log[P({x})/P({x̃})] P({x}) d{x}, (S62)

where P({x}) is the probability of seeing some trajectory {x} in [0, τ], with {x̃} its time-reversed counterpart. To evaluate this, we discretize time whilst in A, so Xt becomes X0, X1, . . . , Xi with a time step ∆t, so Xi+1 = Xi + F∆t + √(2D)∆Wi, given that no jump to B occurred, with ∆Wi ∼ N(0, ∆t). Then

P(Xi+1|Xi) = [(1 − b(Xi)∆t)/√(4πD∆t)] exp[−(∆Xi − F∆t)²/(4D∆t)], (S63)


where ∆Xi = Xi+1 − Xi. We therefore have

log[P(Xi+1|Xi)/P(Xi|Xi+1)] = log[(1 − b(Xi)∆t)/(1 − b(Xi+1)∆t)] + (1/4∆t)[(∆Xi + F∆t)²/D − (−∆Xi + F∆t)²/D]
= (F/D)∆Xi + O(∆t^{3/2}) = (F²/D)∆t + (F/D)√(2D)∆Wi + O(∆t^{3/2}) (S64)

in the limit ∆t → 0, where we have neglected higher-order terms. In particular, for a continuous trajectory {x} that stays entirely within A,

log[P({x})/P({x̃})] = log[P(x(τ))/P(x(0))] + F²τ/D + F√(2/D) ∫₀^τ dWt, (S65)

and so for a trajectory {x} that starts in B, jumps to A at time t = 0 and returns to B at time t = τ,

log[P({x})/P({x̃})] = log[a(x(0)) b(x(τ)) / (a(x(τ)) b(x(0)))] + F²τ/D + F√(2/D) ∫₀^τ dWt. (S66)

Therefore, every jump from B to x ∈ A picks up an entropic contribution of log a(x)/b(x), the reversejump picks up a contribution of log b(x)/a(x), and while in A entropy increases at a rate F 2/D, with thecontribution from the integral

∫ τ0

dWt being zero on average. Taking the long time limit, as the probabilitydistribution tends to the steady state distribution {p(x), pB}, the average rate of entropy production becomes

σ =F 2

2D+

∫dx[p(x)b(x) log(b(x)/a(x)) + pBa(x) log(a(x)/b(x))

]. (S67)

In addition to the normalization conditions, the steady state probability distribution obeys the Fokker-Planckequation,

0 = −∂x(Fp) + ∂2x(Dp) + pBa− pb. (S68)

All that remains to formalize the minimization problem, is to calculate expressions for the first and secondmoments for the distribution of time spent in A. Since the rate at which the process leaves A is b(x), the

probability that a specific trajectory is still in A at time t is exp(−∫ t0b(Xs)ds), and therefore, if v(x, t) is

the probability of still being in A at time t, given that at t = 0, the system was in state x ∈ A, then

v(x, t) = E[e−

∫ t0b(Xs)ds|X0 = x

]. (S69)

The Feynman-Kac formula allows us to instead solve the following PDE to calculate v(x, t) [S17],

∂tv = F∂xv +D∂2xv − bv, (S70)

v(x, 0) = 1,

and therefore we can find the full waiting time distribution given the initial condition by solving a PDE. However, to calculate only the first and second moments, we need only solve ODEs [S17]. Let T⁽¹⁾(x) be the average waiting time in A given the initial condition x, and T⁽²⁾(x) be the second moment. Now

T⁽¹⁾(x) = ∫₀^∞ t(−∂_t v) dt = ∫₀^∞ v dt,  (S71)

but if L = F∂_x + D∂²_x − b acts only on x, then

L T⁽¹⁾(x) = ∫₀^∞ Lv dt = ∫₀^∞ ∂_t v dt = −1.  (S72)


Similarly, as

T⁽²⁾ = ∫₀^∞ t²(−∂_t v) dt = 2 ∫₀^∞ t v dt,  (S73)

we have that

L T⁽²⁾ = 2 ∫₀^∞ t ∂_t v dt = −2T⁽¹⁾.  (S74)

We therefore need only solve two second order differential equations to recover the waiting time moments, given by

⟨t⟩_A = ∫ T⁽¹⁾ a dx,  ⟨t²⟩_A = ∫ T⁽²⁾ a dx.  (S75)

The full minimization problem can therefore be described as

min σ = σ[a, b, F, D]  subject to  (S76a)
⟨t²⟩_A = ∫ T⁽²⁾ a dx,  (S76b)
L² T⁽²⁾ = 2,  (S76c)
∫ a dx = 1,  (S76d)
∫ p dx = 1/2,  (S76e)
L* p + (1/2) a = 0,  (S76f)

where L* f = −∂_x(Ff) + ∂²_x(Df) − bf, and the condition ⟨t⟩_A = 1 is enforced implicitly by the constraints on a and p.

Numerical minimization

In order to minimize the constrained optimization problem over the infinite dimensional function space C(S¹), we instead minimize over a finite dimensional Fourier basis, so that

a(x) = Σ_{k=−N/2}^{N/2} a_k e^{2πikx},  (S77)

and similarly for b, F, D, where now we minimize over the N/2 + 1 complex coefficients a₀, . . . , a_{N/2}, with a_{−k} = a*_k. Taking a Fourier-Galerkin approach to solving the differential equations, the solution g(x) to Lg = f is approximated as the g for which

⟨Lg − f, e^{2πikx}⟩ = 0  (S78)

for −N/2 ≤ k ≤ N/2. Specifically,

Lg − f = Σ_{k,k′} (F_{k′}(2πik) + D_{k′}(2πik)² − b_{k′}) g_k e^{2πi(k+k′)x} − Σ_k f_k e^{2πikx},  (S79)

so we have that

Σ_{k=−N/2}^{N/2} L_{rk} g_k = f_r,  (S80)


where

L_{rk} = (2πik) F_{r−k} − (2πk)² D_{r−k} − b_{r−k},  (S81)

and terms with index r − k satisfying |r − k| > N/2 are taken as zero. Since the operator L* is the adjoint of L, solving the equation for p uses the Hermitian adjoint of L_{rk}, as

Σ_k (L†)_{rk} p_k = −(1/2) a_r.  (S82)
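Assembling the Galerkin matrix of Eq. (S81) is mechanical; the following sketch (the helper name `galerkin_matrix` is ours, and coefficients are passed as sparse index-to-value maps) builds L_{rk} for a given truncation N:

```python
import numpy as np

def galerkin_matrix(Fc, Dc, bc, N):
    """Assemble L_rk = (2 pi i k) F_{r-k} - (2 pi k)^2 D_{r-k} - b_{r-k}
    (Eq. S81) for Fourier indices -N/2 <= r, k <= N/2. Fc, Dc, bc are dicts
    mapping Fourier index m to the coefficient; indices outside the
    truncation are taken as zero."""
    ks = np.arange(-N // 2, N // 2 + 1)
    L = np.zeros((ks.size, ks.size), dtype=complex)
    for ri, r in enumerate(ks):
        for ki, k in enumerate(ks):
            m = r - k
            L[ri, ki] = (2j * np.pi * k) * Fc.get(m, 0) \
                        - (2 * np.pi * k) ** 2 * Dc.get(m, 0) - bc.get(m, 0)
    return ks, L
```

For constant F, D, b (only the m = 0 coefficient nonzero) the matrix is diagonal, so solving Lg = f with f_r = −δ_{r0} recovers the constant solution g₀ = 1/b₀, matching Eq. (S72).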

The minimization problem can now be stated as

min σ = σ(a_k, b_k, F_k, D_k)  subject to  (S83a)
⟨t²⟩_A = Σ_k T⁽²⁾_k a_{−k},  (S83b)
T⁽²⁾_r = 2 Σ_k (L⁻²)_{rk} δ_{k0},  (S83c)
a₀ = 1,  (S83d)
p₀ = 1/2,  (S83e)
p_r = −(1/2) Σ_k (L^{−†})_{rk} a_k.  (S83f)

No explicit expression for σ in terms of the coefficients is possible due to the non-linear terms, but we can evaluate it by numerical integration, which converges exponentially fast for periodic functions. The condition a, b, D ≥ 0 is not enforced directly, but through a series of trial points x_i, so that a(x_i) ≥ 0, which is a linear constraint on the coefficients a_k. In addition, we take only the first M/2 coefficients of a, b to be non-zero, and for now take F and D to be constant. This ensures that we solve the problem accurately for given a, b, and regularizes the solution.
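The trial-point positivity conditions above are linear in the free coefficients because a is real, so a(x) = a₀ + Σ_{k≥1} [2 Re(a_k) cos(2πkx) − 2 Im(a_k) sin(2πkx)]. A minimal sketch of the constraint matrix (the name `positivity_matrix` and the coefficient ordering are our choices):

```python
import numpy as np

def positivity_matrix(N, n_trial):
    """Rows evaluate a(x_i) at trial points x_i as a linear map of the free
    coefficients c = [a_0, Re a_1, Im a_1, ..., Re a_{N/2}, Im a_{N/2}],
    so that a(x_i) >= 0 becomes the linear constraint C @ c >= 0."""
    x = np.arange(n_trial) / n_trial
    cols = [np.ones(n_trial)]
    for k in range(1, N // 2 + 1):
        cols.append(2 * np.cos(2 * np.pi * k * x))   # multiplies Re a_k
        cols.append(-2 * np.sin(2 * np.pi * k * x))  # multiplies Im a_k
    return np.column_stack(cols)
```

For example, a(x) = 1 + cos(2πx) corresponds to c = [1, 1/2, 0, 0, 0] (for N = 4), and every row of C @ c is then non-negative.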

Analytic gradients

We calculate the following gradients analytically to perform the minimization,

∂σ/∂F = F/D + ∫ dx (∂p/∂F) b log(b/a),  (S84a)
∂σ/∂D = −F²/2D² + ∫ dx (∂p/∂D) b log(b/a),  (S84b)
∂σ/∂b₀ = ∫ dx [(∂p/∂b₀) b log(b/a) + p(log(b/a) + 1) − p_B a/b],  (S84c)

where ∂p/∂b₀ can be computed from

∂p_k/∂b₀ = Σ_r (L^{−†})_{kr} p_r,  (S85)

and

∂p_k/∂F = Σ_j (L^{−†})_{kj} (2πij) p_j,  (S86a)
∂p_k/∂D = Σ_j (L^{−†})_{kj} (2πj)² p_j.  (S86b)


The rest are more involved, as we must differentiate with respect to both the real and imaginary parts, a^r_k and a^i_k respectively. We first note that

∂a/∂a^r_k = 2 cos(2πkx),  (S87a)
∂a/∂a^i_k = −2 sin(2πkx),  (S87b)
∂σ/∂a^r_k = ∫ dx [(∂p/∂a^r_k) b log(b/a) − 2pb cos(2πkx)/a + 2p_B cos(2πkx)(log(a/b) + 1)],  (S87c)
∂σ/∂a^i_k = ∫ dx [(∂p/∂a^i_k) b log(b/a) + 2pb sin(2πkx)/a − 2p_B sin(2πkx)(log(a/b) + 1)],  (S87d)

where

∂p_k/∂a^r_j = −(1/2)[(L^{−†})_{kj} + (L^{−†})_{k,−j}],  (S88a)
∂p_k/∂a^i_j = −(i/2)[(L^{−†})_{kj} − (L^{−†})_{k,−j}].  (S88b)

Next,

∂σ/∂b^r_k = ∫ dx [(∂p/∂b^r_k) b log(b/a) + 2p cos(2πkx)(log(b/a) + 1) − 2p_B a cos(2πkx)/b],  (S89a)
∂σ/∂b^i_k = ∫ dx [(∂p/∂b^i_k) b log(b/a) − 2p sin(2πkx)(log(b/a) + 1) + 2p_B a sin(2πkx)/b],  (S89b)

where

∂p_k/∂b^r_j = Σ_l (L^{−†})_{kl}(p_{l−j} + p_{l+j}),  (S90a)
∂p_k/∂b^i_j = i Σ_l (L^{−†})_{kl}(p_{l−j} − p_{l+j}).  (S90b)

Now, defining

P_l = ∫ dx e^{2πilx} b log(b/a),  (S91a)
Q_l = ∫ dx e^{2πilx} [−2pb/a + 2p_B(log(a/b) + 1)],  (S91b)
R_l = ∫ dx e^{2πilx} [2p(log(b/a) + 1) − 2p_B a/b],  (S91c)


we can write the derivatives as

∂σ/∂F = F/D + Σ_l P_l ∂p_l/∂F,  (S92a)
∂σ/∂D = −F²/2D² + Σ_l P_l ∂p_l/∂D,  (S92b)
∂σ/∂a^r_k = Σ_l P_l ∂p_l/∂a^r_k + Re(Q_k),  (S92c)
∂σ/∂a^i_k = Σ_l P_l ∂p_l/∂a^i_k − Im(Q_k),  (S92d)
∂σ/∂b₀ = Σ_l P_l ∂p_l/∂b₀ + (1/2)R₀,  (S92e)
∂σ/∂b^r_k = Σ_l P_l ∂p_l/∂b^r_k + Re(R_k),  (S92f)
∂σ/∂b^i_k = Σ_l P_l ∂p_l/∂b^i_k − Im(R_k).  (S92g)

The only integrals now occur within the terms P_l, Q_l, R_l, and can be evaluated using a fast Fourier transform. If we further identify w_j = [Σ_l (L⁻¹)_{jl} P*_l]*, then

∂σ/∂F = F/D + i Σ_j 2πj w_j,  (S93a)
∂σ/∂D = −F²/2D² + Σ_j (2πj)² w_j,  (S93b)
∂σ/∂a^r_k = −(1/2)(w_k + w_{−k}) + Re(Q_k),  (S93c)
∂σ/∂a^i_k = −(i/2)(w_k − w_{−k}) − Im(Q_k),  (S93d)
∂σ/∂b₀ = Σ_l w_l p_l + (1/2)R₀,  (S93e)
∂σ/∂b^r_k = Σ_l w_l(p_{l−k} + p_{l+k}) + Re(R_k),  (S93f)
∂σ/∂b^i_k = i Σ_l w_l(p_{l−k} − p_{l+k}) − Im(R_k).  (S93g)
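The FFT evaluation of coefficients such as P_l in Eq. (S91a) can be sketched as follows; note that NumPy's `ifft` happens to match the sign convention c_l = ∫₀¹ e^{2πilx} f(x) dx used here (the wrapper name `fourier_coefficients` and the example functions a, b are ours):

```python
import numpy as np

def fourier_coefficients(f_vals):
    """Approximate c_l = int_0^1 e^{2 pi i l x} f(x) dx from samples of f on
    a uniform grid. np.fft.ifft implements (1/N) sum_j f_j e^{+2 pi i j l/N},
    which matches this convention; the approximation converges exponentially
    fast for smooth periodic f."""
    return np.fft.ifft(f_vals)

# Example: P_l for b(x) = 2 + cos(2 pi x) with a(x) = 1  (Eq. S91a)
N = 256
x = np.arange(N) / N
b = 2 + np.cos(2 * np.pi * x)
P = fourier_coefficients(b * np.log(b))  # log(b/a) with a = 1
```

Negative indices wrap around, so P_{−l} is stored at index N − l; for a real integrand the coefficients satisfy the Hermitian symmetry P_{−l} = P*_l.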

We must also differentiate the constraints, the first of which can be expressed as c₁ = Σ_k T⁽²⁾_k a_{−k} − ⟨t²⟩_A, so that c₁ = 0. We have immediately that

∂c₁/∂F = Σ_l a_{−l} ∂T⁽²⁾_l/∂F,  (S94a)
∂c₁/∂D = Σ_l a_{−l} ∂T⁽²⁾_l/∂D,  (S94b)
∂c₁/∂a^r_k = T⁽²⁾_k + T⁽²⁾_{−k},  (S94c)
∂c₁/∂a^i_k = −i T⁽²⁾_k + i T⁽²⁾_{−k},  (S94d)
∂c₁/∂b₀ = Σ_l a_{−l} ∂T⁽²⁾_l/∂b₀,  (S94e)
∂c₁/∂b^r_k = Σ_l a_{−l} ∂T⁽²⁾_l/∂b^r_k,  (S94f)
∂c₁/∂b^i_k = Σ_l a_{−l} ∂T⁽²⁾_l/∂b^i_k,  (S94g)

and

Σ_{k,l} L_{jk} L_{kl} ∂T⁽²⁾_l/∂F = −i Σ_l [(2πj) L_{jl} T⁽²⁾_l + L_{jl} (2πl) T⁽²⁾_l],  (S95a)
Σ_{k,l} L_{jk} L_{kl} ∂T⁽²⁾_l/∂D = Σ_l [(2πj)² L_{jl} T⁽²⁾_l + L_{jl} (2πl)² T⁽²⁾_l],  (S95b)
Σ_{k,l} L_{jk} L_{kl} ∂T⁽²⁾_l/∂b^r_s = Σ_l [L_{j−s,l} + L_{j+s,l} + L_{j,l+s} + L_{j,l−s}] T⁽²⁾_l,  (S95c)
Σ_{k,l} L_{jk} L_{kl} ∂T⁽²⁾_l/∂b^i_s = i Σ_l [L_{j−s,l} − L_{j+s,l} + L_{j,l+s} − L_{j,l−s}] T⁽²⁾_l.  (S95d)

The common form is ∂c/∂y = Σ_l a_{−l} f_l, where f is some vector solving L² f = g. There are many such y, however, and we wish to solve as few linear equations as possible. We can instead rewrite this as ∂c/∂y = Σ_{k,l} [(L²)^{−†}_{kl} a_l]* g_k = Σ_k v*_k g_k, where v_k = Σ_l (L²)^{−†}_{kl} a_l. We therefore have that

∂c₁/∂F = −i Σ_{l,j} v*_j [(2πj) L_{jl} T⁽²⁾_l + L_{jl} (2πl) T⁽²⁾_l],  (S96a)
∂c₁/∂D = Σ_{l,j} v*_j [(2πj)² L_{jl} T⁽²⁾_l + L_{jl} (2πl)² T⁽²⁾_l],  (S96b)
∂c₁/∂b^r_s = Σ_{l,j} v*_j [L_{j−s,l} + L_{j+s,l} + L_{j,l+s} + L_{j,l−s}] T⁽²⁾_l,  (S96c)
∂c₁/∂b^i_s = i Σ_{l,j} v*_j [L_{j−s,l} − L_{j+s,l} + L_{j,l+s} − L_{j,l−s}] T⁽²⁾_l.  (S96d)


We can simplify further by defining u_j = Σ_l L_{jl} T⁽²⁾_l, and recalling that −2p_j = Σ_l (L^{−†})_{jl} a_l, so that

∂c₁/∂F = −2πi Σ_j j [v*_j u_j − 2p*_j T⁽²⁾_j],  (S97a)
∂c₁/∂D = (2π)² Σ_j j² [v*_j u_j − 2p*_j T⁽²⁾_j],  (S97b)
∂c₁/∂b^r_s = Σ_j v*_j (u_{j−s} + u_{j+s}) − 2 Σ_l (p*_{l+s} + p*_{l−s}) T⁽²⁾_l,  (S97c)
∂c₁/∂b^i_s = i [Σ_j v*_j (u_{j−s} − u_{j+s}) − 2 Σ_l (p*_{l+s} − p*_{l−s}) T⁽²⁾_l].  (S97d)

Asymptotic limit

Using the continuous Langevin equations, we can reach larger values of σ in our numerical minimization than we could by working with the finite state Markov system. However, we still seek an asymptotic formula that will allow us to extend to arbitrarily large values of σ, beyond what is numerically possible.

In the formal limit of infinite precision, a system would have D = 0, F > 0, together with a and b as δ-functions, so that a trajectory would enter A at a point specified by a, travel deterministically with velocity F, and leave at the point specified by b after spending exactly t = 1 in A. Based on intuition from this infinite precision system, we will derive an asymptotic limit as Var t_A → 0. Specifically, for analytic tractability we take a(x) = a₀δ(x − x₀) + a₁δ(x − x₁) and b(x) = b₀δ(x − x₀) + b₁δ(x − x₁), allowing us to minimize over a system specified by a finite number of variables rather than arbitrary functions a(x), b(x). Whilst we expect, and observe numerically, the functions a(x) and b(x) to be sharply peaked in the limit, initially assuming δ-functions rather than finite width Gaussians or arbitrary peaked functions may affect terms in the asymptotic expansion. However, after performing this minimization, we find that a and b do not appear at leading order, and hence we expect the effect of changing from a δ-function to a sharply peaked Gaussian to enter only at terms beyond the leading two-term asymptotic expansion derived here.

The problem is then as follows,

min σ[D, F, a₀, a₁, b₀, b₁] = F²/2D + Σ_{i=0,1} [p(x_i) b_i log(b_i/a_i) + (1/2) a_i log(a_i/b_i)]  subject to  (S98a)
⟨t²⟩_A = a₀ T⁽²⁾(x₀) + a₁ T⁽²⁾(x₁),  (S98b)
L² T⁽²⁾ = 2,  (S98c)
a₀ + a₁ = 1,  (S98d)
∫ p dx = 1/2,  (S98e)
L* p + (1/2) a = 0,  (S98f)

where the presence of δ-functions leads to jump conditions for the differential equations. For instance, the equation for p becomes D∂²_x p − F∂_x p = 0 away from the x_i, where p is everywhere continuous with a discontinuous first derivative satisfying the jump conditions D[∂_x p]^{x_i⁺}_{x_i⁻} − b_i p(x_i) + (1/2)a_i = 0 for i = 0, 1. Solving these equations to find p is possible analytically, and moreover it is possible to eliminate a₀, a₁ from the equations, but the resulting expressions are complex. We therefore provide a Mathematica notebook¹, which contains explicit

1 https://github.com/Dom-Skinner/EntropyProductionFromWaitingTimes


[Figure S4: plot of waiting time variance Var(t_A) against the (normalized) entropy production rate σ, comparing the full Langevin system, the two-term asymptotic, the one-term asymptotic, and the δ-function Langevin system.]

FIG. S4. Comparison of asymptotic bounds and numerical minimization for the continuous Langevin system. The one-term, Var t_A ∼ 1/σ (grey), and two-term, Var t_A ∼ 1/σ + 4 log σ/σ² (black), asymptotic results are shown. Minimizing over the full Langevin system (purple), with a, b arbitrary functions, shows the two-term asymptotic result is achievable, although computational constraints prevent extending this curve to larger σ. Performing a numerical minimization with a, b as δ-functions, as was assumed to derive the asymptotics, also agrees with the asymptotics in the limit, but takes longer to converge (green).

expressions for p, a₀, a₁, and hence σ and ⟨t²⟩_A, in terms of the variables b_i, F, D, x_i. Taking the limit D → 0, and exploring how the remaining parameters scale, we find the asymptotic relation

⟨t²⟩_A ∼ 1 + 1/σ + 4 log σ/σ²  (S99)

provides the most precision for a given σ in the σ → ∞ limit. We see that the numerical minimization agrees well with this asymptotic formula for values of σ as low as 20, Fig. S4. We can interpret the resulting system as essentially the continuous Langevin system which saturates the TUR bound, but with an additional cost of having to determine exactly when one rotation has occurred before subsequently changing to macro-state B [S11, S18]. However, this additional cost is of order O(log σ/σ²), and so is not leading order in the σ → ∞ limit.
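The two-term relation (S99) is monotonic in the asymptotic regime, so it can also be inverted numerically to map an observed waiting time variance to an entropy production rate. A minimal sketch, assuming ⟨t⟩_A = 1 so that Var t_A = ⟨t²⟩_A − 1, and valid only for σ well inside the asymptotic regime (σ ≳ 20, per Fig. S4):

```python
import math

def var_from_sigma(sigma):
    """Two-term asymptotic (Eq. S99): Var(t_A) = <t^2>_A - 1
    ~ 1/sigma + 4 log(sigma)/sigma^2, for <t>_A = 1."""
    return 1.0 / sigma + 4.0 * math.log(sigma) / sigma**2

def sigma_from_var(var, lo=10.0, hi=1e9):
    """Invert var_from_sigma by bisection; the map is monotonically
    decreasing on [lo, hi], so the bracket shrinks onto the root."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if var_from_sigma(mid) > var:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Dropping the 4 log σ/σ² correction recovers the cruder one-term estimate σ ≈ 1/Var t_A.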

FURTHER EXAMPLES FROM THE LITERATURE

In addition to the examples investigated in the main text, there are a large number of examples in the literature which measure non-equilibrium waiting time distributions, from which we can bound the entropy production rate. We include a number of examples in Table S1, where we have computed σT directly from experimental waiting time distribution histograms. Where both states of the two macro-state system have non-equilibrium waiting time distributions, we selected A as the one giving the largest σT bound, and where ⟨t⟩_B is not explicitly given, we use an order of magnitude estimate.
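The histogram moments entering Table S1 can be formed as follows; this is an illustrative helper (the name `normalized_moments` is ours), which computes ⟨t⟩_A and the normalized ratio ⟨t²⟩_A/⟨t⟩²_A from binned waiting time data, the quantity then fed into the limiting curve to obtain the σT bound:

```python
import numpy as np

def normalized_moments(bin_centers, counts):
    """First moment <t>_A and normalized second moment <t^2>_A/<t>_A^2 of a
    waiting-time histogram, e.g. digitized from an experimental figure."""
    w = np.asarray(counts, dtype=float)
    w /= w.sum()                       # normalize histogram weights
    t = np.asarray(bin_centers, dtype=float)
    t1 = (w * t).sum()                 # <t>_A
    ratio = (w * t**2).sum() / t1**2   # <t^2>_A / <t>_A^2
    return t1, ratio
```

For an exponential waiting time distribution the ratio equals 2; the values in Table S1 all lie below 2, signaling non-equilibrium dynamics.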

[S1] J. M. Horowitz and T. R. Gingrich, Nat. Phys. 16, 15 (2020).
[S2] H. Ge and H. Qian, Phys. Rev. E 87, 062125 (2013).
[S3] D. J. Skinner and J. Dunkel, Proc. Natl Acad. Sci. U.S.A. 118, e2024300118 (2021).
[S4] G. Rubino and B. Sericola, J. Appl. Prob. 26, 744 (1989).
[S5] Y. Tu, Proc. Natl Acad. Sci. U.S.A. 105, 11737 (2008).


TABLE S1. Further examples and σT bound.

System | ⟨t⟩_A | ⟨t⟩_B | ⟨t²⟩_A/⟨t⟩²_A | σT | Reference
Swim-turn dynamics:
  Pseudomonas putida | 1.5 s | 0.13 s | 1.4 | 4.6 k_B/s | Fig. 4(b) in Ref. [S19]
  Chlamydomonas | 2.0 s | 11.2 s | 1.2 | 1.4 k_B/s | Fig. 4(d) in Ref. [S20]
Swarm reversal dynamics:
  Myxococcus xanthus | 520 s | ≈ 0 s | 1.2 | 0.035 k_B/s | Fig. S1 in Ref. [S21]
  Myxococcus xanthus collective waves | 310 s | ≈ 0 s | 1.3 | 0.042 k_B/s | Fig. 2(a) in Ref. [S22]
  Paenibacillus dendritiformis | 20 s | ≈ 4 s | 1.3 | 0.43 k_B/s | Fig. 3(b) in Ref. [S23]
Migrating breast cancer cell repolarization time | 8.3 × 10³ s | ≈ 10⁴ s | 1.4 | 5 × 10⁻⁴ k_B/s | Fig. 4(b) (inset L170) in Ref. [S24]
Visual perception switching | 4.7 s | 4.7 s | 1.5 | 0.78 k_B/s | Fig. 3(a)(i) in Ref. [S25]
Animal and insect flight duration:
  Uria lomvia | 800 s | ≈ 500 s | 1.7 | 2.6 × 10⁻³ k_B/s | Fig. 4 in Ref. [S26]
  Bombus terrestris | 2.5 s | ≈ 5 s | 1.7 | 0.38 k_B/s | Fig. 5d in Ref. [S27]
Escherichia coli flagellar rotation direction switching | 0.020 s | 0.019 s | 1.9 | 33 k_B/s | Fig. 4(a, inset middle) in Ref. [S28]
Bacillus subtilis gene state switching | 1.2 × 10⁴ s | 9.5 × 10⁴ s | 1.1 | 3.0 × 10⁻⁴ k_B/s | Fig. 2(d, f) in Ref. [S29]
Binding-unbinding times of kinesin-1 molecular motor | 0.075 s | 0.019 s | 1.3 | 120 k_B/s | Fig. 2(d) in Ref. [S30]
RNA polymerase cluster lifetime | 5.1 s | ≈ 500 s | 1.6 | 9.8 × 10⁻³ k_B/s | Fig. 2(e) in Ref. [S31]

[S6] N. G. Van Kampen, Stochastic Processes in Physics and Chemistry, Vol. 1 (Elsevier, 1992).
[S7] MATLAB, 9.7.0.1190202 (R2019b) (The MathWorks Inc., Natick, Massachusetts, 2019).
[S8] Z. Ugray, L. Lasdon, J. Plummer, F. Glover, J. Kelly, and R. Martí, INFORMS J. on Computing 19, 328 (2007).
[S9] U. Seifert, Ann. Rev. Cond. Matter Phys. 10, 171 (2019).
[S10] S. E. Harvey, S. Lahiri, and S. Ganguli, arXiv:2002.10567.
[S11] A. C. Barato and U. Seifert, Phys. Rev. X 6, 041053 (2016).
[S12] U. Seifert, Rep. Prog. Phys. 75, 126001 (2012).
[S13] V. Y. Chernyak, M. Chertkov, and C. Jarzynski, J. Stat. Mech. 2006, P08001 (2006).
[S14] D. M. Busiello, J. Hidalgo, and A. Maritan, New J. Phys. 21, 073004 (2019).
[S15] T. Herpich, K. Shayanfard, and M. Esposito, Phys. Rev. E 101, 022116 (2020).
[S16] J. M. Horowitz, J. Chem. Phys. 143, 044111 (2015).
[S17] G. A. Pavliotis, Stochastic Processes and Applications, Vol. 60 (Springer, 2014).
[S18] A. N. Pearson, Y. Guryanova, P. Erker, E. A. Laird, G. A. D. Briggs, M. Huber, and N. Ares, Phys. Rev. X 11, 021029 (2021).
[S19] M. Theves, J. Taktikos, V. Zaburdaev, H. Stark, and C. Beta, Biophys. J. 105, 1915 (2013).
[S20] M. Polin, I. Tuval, K. Drescher, J. P. Gollub, and R. E. Goldstein, Science 325, 487 (2009).
[S21] Y. Wu, A. D. Kaiser, Y. Jiang, and M. S. Alber, Proc. Natl Acad. Sci. U.S.A. 106, 1222 (2009).
[S22] O. Sliusarenko, J. Neu, D. R. Zusman, and G. Oster, Proc. Natl Acad. Sci. U.S.A. 103, 1534 (2006).
[S23] A. Be'er, S. K. Strain, R. A. Hernandez, E. Ben-Jacob, and E.-L. Florin, J. Bacteriol. 195, 2709 (2013).
[S24] F. Zhou, S. A. Schaffer, C. Schreiber, F. J. Segerer, A. Goychuk, E. Frey, and J. O. Radler, PLOS ONE 15, e0230679 (2020).
[S25] K. Krug, E. Brunskill, A. Scarna, G. M. Goodwin, and A. J. Parker, Proc. Royal Soc. B 275, 1839 (2008).
[S26] K. H. Elliott, R. D. Bull, A. J. Gaston, and G. K. Davoren, Behav. Ecol. Sociobiol. 63, 1773 (2009).
[S27] N. E. Raine and L. Chittka, Entomol. Gen. 29, 179 (2007).
[S28] F. Bai, R. W. Branch, D. V. Nicolau, T. Pilizota, B. C. Steel, P. K. Maini, and R. M. Berry, Science 327, 685 (2010).
[S29] T. M. Norman, N. D. Lord, J. Paulsson, and R. Losick, Nature 503, 481 (2013).
[S30] H. Isojima, R. Iino, Y. Niitani, H. Noji, and M. Tomishige, Nat. Chem. Biol. 12, 290 (2016).
[S31] I. I. Cisse, I. Izeddin, S. Z. Causse, L. Boudarene, A. Senecal, L. Muresan, C. Dugast-Darzacq, B. Hajj, M. Dahan, and X. Darzacq, Science 341, 664 (2013).
