statistical mechanics and the evolution of polygenic

15
Copyright Ó 2009 by the Genetics Society of America DOI: 10.1534/genetics.108.099309 Statistical Mechanics and the Evolution of Polygenic Quantitative Traits N. H. Barton* ,†,1 and H. P. de Vladar *Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JN, United Kingdom, Institute of Science and Technology Austria, 3400 Klosterneuberg, Austria and Theoretical Biology Group, University of Groningen, 9751 NN, Haren, The Netherlands Manuscript received December 2, 2008 Accepted for publication December 13, 2008 ABSTRACT The evolution of quantitative characters depends on the frequencies of the alleles involved, yet these frequencies cannot usually be measured. Previous groups have proposed an approximation to the dynamics of quantitative traits, based on an analogy with statistical mechanics. We present a modified version of that approach, which makes the analogy more precise and applies quite generally to describe the evolution of allele frequencies. We calculate explicitly how the macroscopic quantities (i.e., quantities that depend on the quantitative trait) depend on evolutionary forces, in a way that is independent of the microscopic details. We first show that the stationary distribution of allele frequencies under drift, selection, and mutation maximizes a certain measure of entropy, subject to constraints on the expectation of observable quantities. We then approximate the dynamical changes in these expectations, assuming that the distribution of allele frequencies always maximizes entropy, conditional on the expected values. When applied to directional selection on an additive trait, this gives a very good approximation to the evolution of the trait mean and the genetic variance, when the number of mutations per generation is sufficiently high (4Nm . 1). We show how the method can be modified for small mutation rates (4Nm / 0). We outline how this method describes epistatic interactions as, for example, with stabilizing selection. P REDICTING the evolution of quantitative charac- ters from first principles poses a formidable chal- lenge. When multiple loci contribute to a quantitative character z, the effects of selection, mutation, and drift are difficult to predict from the observed values of the trait; this is true even in the simplest case of additive effects. The fundamental problem is that the distribu- tion of the trait depends on the ‘‘microscopic details’’ of the system, namely the frequencies of the genotypes con- tributing to the trait. In an asexual population, long- term evolution depends on the fittest genotypes, which may currently be very rare. In sexual populations—the focus of this article—new phenotypes are generated by recombination in a way that depends on their genetic basis. If selection is not too strong, we can assume Hardy–Weinberg proportions and linkage equilibrium (HWLE): this is a substantial simplifica- tion, which we make throughout. Even then, however, we must still know all the allele frequencies, and the effects of all the alleles on the trait, to predict the evolution of a polygenic trait. In this article, we seek to predict the evolution of quantitative traits without following all the hidden variables (i.e., the allele frequencies) that determine its course. For this purpose, several simplifications have been proposed. The central equation in quantitative genetics is that the rate of change of the trait mean equals the product of the selection gradient and the additive genetic variance (Lande 1976). This simple prediction can be surprisingly accurate, since the genetic variance often remains roughly constant for some tens of gen- erations (Falconer and Mackay 1996; Barton and Keightley 2002). However, we have no general un- derstanding of how the genetic variance evolves or, indeed, what processes are responsible for maintaining it (Bu ¨ rger et al. 1989; Falconer and Mackay 1996; Barton and Keightley 2002). Even if we take the simplest view, that variation is maintained by the opposition between mutation and selection, the long- term dynamics of the genetic variance still depend on the detailed distribution of effects of mutations. Lande (1976), following Kimura (1965), approxi- mated the distribution of allelic effects at each locus as a Gaussian distribution. However, this is accurate only when many alleles are available at each locus and when mutation rates are extremely high (Turelli 1984). Barton and Turelli (1987) assumed that loci are close to fixation, but again this approximation has limited application: in particular, it cannot apply when one allele substitutes for another. Some progress has been made by describing a polygenic system by the moments of the trait distribution (Barton 1986; Barton and Turelli 1987, 1991; Turelli and Barton 1990). For additive traits, a closely related description in terms of the cumulants is a more natural way to represent the effects of selection (Bu ¨ rger 1991, 1993; Turelli and 1 Corresponding author: Institute of Science and Technology Austria, Am Campus 1, 3400 Klosterneuberg, Austria. E-mail: [email protected] Genetics 181: 997–1011 (March 2009)

Upload: others

Post on 09-Nov-2021

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Statistical Mechanics and the Evolution of Polygenic

Copyright � 2009 by the Genetics Society of AmericaDOI: 10.1534/genetics.108.099309

Statistical Mechanics and the Evolution of Polygenic Quantitative Traits

N. H. Barton*,†,1 and H. P. de Vladar‡

*Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JN, United Kingdom, †Institute of Science and TechnologyAustria, 3400 Klosterneuberg, Austria and ‡Theoretical Biology Group, University of Groningen, 9751 NN, Haren, The Netherlands

Manuscript received December 2, 2008Accepted for publication December 13, 2008

ABSTRACT

The evolution of quantitative characters depends on the frequencies of the alleles involved, yet thesefrequencies cannot usually be measured. Previous groups have proposed an approximation to thedynamics of quantitative traits, based on an analogy with statistical mechanics. We present a modifiedversion of that approach, which makes the analogy more precise and applies quite generally to describethe evolution of allele frequencies. We calculate explicitly how the macroscopic quantities (i.e., quantitiesthat depend on the quantitative trait) depend on evolutionary forces, in a way that is independent of themicroscopic details. We first show that the stationary distribution of allele frequencies under drift,selection, and mutation maximizes a certain measure of entropy, subject to constraints on the expectationof observable quantities. We then approximate the dynamical changes in these expectations, assumingthat the distribution of allele frequencies always maximizes entropy, conditional on the expected values.When applied to directional selection on an additive trait, this gives a very good approximation to theevolution of the trait mean and the genetic variance, when the number of mutations per generation issufficiently high (4Nm . 1). We show how the method can be modified for small mutation rates (4Nm /0). We outline how this method describes epistatic interactions as, for example, with stabilizing selection.

PREDICTING the evolution of quantitative charac-ters from first principles poses a formidable chal-

lenge. When multiple loci contribute to a quantitativecharacter z, the effects of selection, mutation, and driftare difficult to predict from the observed values of thetrait; this is true even in the simplest case of additiveeffects. The fundamental problem is that the distribu-tion of the trait depends on the ‘‘microscopic details’’ ofthe system, namely the frequencies of the genotypes con-tributing to the trait. In an asexual population, long-term evolution depends on the fittest genotypes, whichmay currently be very rare. In sexual populations—thefocus of this article—new phenotypes are generatedby recombination in a way that depends on theirgenetic basis. If selection is not too strong, we canassume Hardy–Weinberg proportions and linkageequilibrium (HWLE): this is a substantial simplifica-tion, which we make throughout. Even then, however,wemust still knowall theallele frequencies,andtheeffectsof all the alleles on the trait, to predict the evolution of apolygenic trait. In this article, we seek to predict theevolution of quantitative traits without following allthe hidden variables (i.e., the allele frequencies) thatdetermine its course.

For this purpose, several simplifications have beenproposed. The central equation in quantitative genetics

is that the rate of change of the trait mean equals theproduct of the selection gradient and the additivegenetic variance (Lande 1976). This simple predictioncan be surprisingly accurate, since the genetic varianceoften remains roughly constant for some tens of gen-erations (Falconer and Mackay 1996; Barton andKeightley 2002). However, we have no general un-derstanding of how the genetic variance evolves or,indeed, what processes are responsible for maintainingit (Burger et al. 1989; Falconer and Mackay 1996;Barton and Keightley 2002). Even if we take thesimplest view, that variation is maintained by theopposition between mutation and selection, the long-term dynamics of the genetic variance still depend onthe detailed distribution of effects of mutations.

Lande (1976), following Kimura (1965), approxi-mated the distribution of allelic effects at each locus as aGaussian distribution. However, this is accurate onlywhen many alleles are available at each locus and whenmutation rates are extremely high (Turelli 1984).Barton and Turelli (1987) assumed that loci are closeto fixation, but again this approximation has limitedapplication: in particular, it cannot apply when oneallele substitutes for another. Some progress has beenmade by describing a polygenic system by the momentsof the trait distribution (Barton 1986; Barton andTurelli 1987, 1991; Turelli and Barton 1990). Foradditive traits, a closely related description in terms ofthe cumulants is a more natural way to represent theeffects of selection (Burger 1991, 1993; Turelli and

1Corresponding author: Institute of Science and Technology Austria, AmCampus 1, 3400 Klosterneuberg, Austria.E-mail: [email protected]

Genetics 181: 997–1011 (March 2009)

Page 2: Statistical Mechanics and the Evolution of Polygenic

Barton 1994; Rattray and Shapiro 2001). Thesetransformations are exact and quite general: they pro-vide a natural description of selection and recombina-tion and extend to include the dynamics of linkagedisequilibria as well as allele frequencies. A moment-based description provides a general framework forexact analysis of models with a small number of genesand for approximating the effects of indirect selection(Barton 1986; Barton and Turelli 1987; Lenormand

and Otto 2000; Kirkpatrick et al. 2002; Roze andBarton 2006); for some problems, simply truncatingthe higher moments or cumulants can give a good ap-proximation (Turelli and Barton 1990; Rouzine et al.2007). However, results are sensitive to the choice ofapproximation for higher moments, and so the approx-imation is to this extent arbitrary. The equations mustbe truncated ‘‘by hand,’’ guided by mathematical trac-tability rather than biological accuracy.

The fundamental problem is to find a way to ap-proximate the ‘‘hidden variables’’ (in this context, theallele frequencies), but we cannot hope to do this incomplete generality. Even with simple directional selec-tion on a trait, the pattern of allele frequencies dependson the frequencies of favorable alleles that may be ex-tremely rare ðp>1Þ and that will take �ð1=sÞlog ð1=pÞð Þgenerations to reach appreciable frequency. Thus,undetectably rare alleles can shape future evolution,without much affecting the current state. Figure 1 showsan example where the trait mean changes as threealleles sweep to fixation at different times and rates. Bychoosing initial frequencies and allelic effects appro-priately, we could produce arbitrary patterns of traitevolution. We can hope to make progress only insituations where the underlying allele frequencies canbe averaged over some known distribution, rather thantaking arbitrary values.

Despite this fundamental difficulty, progress can bemade in two ways. First, we can include random driftand follow the distribution of allele frequencies, ratherthan the deterministic evolution of a single popula-tion. Then, we can hope that the distribution of allelefrequencies, conditional on the observed trait values,will explore the space of possible states in a predictableway. Second, we can allow selection to act only on the

observed traits and assume that the distribution of allelefrequencies spreads out to follow the stationary distri-bution generated by such selection. That makes it muchharder (and perhaps impossible) for populations toevolve into an arbitrary state with unpredictable andidiosyncratic properties. There is an analogy here withclassical thermodynamics, in which molecules mightstart in a special state, such that after some time theyconcentrate in a surprising way: all the gas might rushto one corner, for example. However, if all states withthe same energy are equally likely, this is extremelyimprobable.

An approach along these lines was proposed byPrugel-Bennett and Shapiro (1994, 1997), Rattray

(1995), and Prugel-Bennett (1997), as a generalframework for approximating the dynamics of poly-genic systems. Prugel-Bennett and Shapiro (1994,1997), Rattray (1995), and Prugel-Bennett (1997)drew on analogies with statistical physics to propose anelegant approach to approximating polygenic systems.The distribution of microstates is chosen as that whichmaximizes an entropy measure (see also van Nimwegen

and Crutchfield 2000; Prugel-Bennett 2001; Rogers

2003). Prugel-Bennett and Shapiro (1994, 1997),Rattray (1995), and Prugel-Bennett (1997) definean entropy that is maximized by a multinomial allelefrequency distribution, which is sharply peaked aroundequal allele frequencies. Thus, the procedure is toassume maximum polymorphism, subject to con-straints on the trait distribution. This averaging yieldsa generating function from which observables (e.g.,mean values, variance, and higher moments, as wellas correlations) can be obtained. This leads to goodapproximations for a variety of problems, including mul-tiplicative selection, pairwise epistasis, and stabilizingselection (Rattray 1995); it describes the dynamicsof an arbitrary number of observable variables and pro-vides a consistent way to deal with hidden microscopicvariables.

In this article, we use procedures analogous tostatistical thermodynamics, but adapted to populationgenetics. First, we use an information entropy measure,SH, which is derived from population genetic consid-erations and ensures an exact solution at statisticalequilibrium. This measure, which is proportional tothe quantity H, defined by Boltzmann (1872), wasproposed by Iwasa (1988) and independently by Sella

and Hirsh (2005); see N. H. Barton and J. B. Coe

(unpublished data). Second, we choose to follow a set ofobservable quantities that includes all those acted ondirectly by mutation and selection. This reveals a naturalcorrespondence between the observables and the evo-lutionary forces that act on them, which is analogous toextensive and intensive variables in thermodynamics(N. H. Barton and J. B. Coe, unpublished data). Thesetwo innovations allow us to set out the method in a verygeneral way.

Figure 1.—The frequency of favorable alleles at three loci(left) and the mean of an additive trait, �z (right). Initial fre-quencies are 10�5, 10�6, 10�7, and effects on the trait are 1, 2,12 , so that �z ¼ p1 1 2p2 1 1

2 p3; the selection gradient is b ¼ 0.01(i.e., fitness is eb�z).

998 N. H. Barton and H. P. de Vladar

Page 3: Statistical Mechanics and the Evolution of Polygenic

Throughout, we make the usual approximations ofpopulation genetics, that populations are at HWLE andthat drift, mutation, and selection are weak. Linkageequilibrium is justified if selection and drift are not onlyweak (s; 1=2N >1) but also weak relative to recombina-tion (s; 1=2N >r). We also assume only two alleles ateach locus. These assumptions allow us to describepopulations solely in terms of allele frequencies at n lociðp~¼ p1; . . . ; pnÞ and to use the continuous-time diffu-sion approximation.

We begin by analyzing the stationary distribution,showing the analogy with thermodynamics. Iwasa

(1988) showed that the free fitness, which is the sumof the log mean fitness and the information entropySH (logð �W Þ1 ð1=2N ÞSH ), always increases through timeand reaches a maximum at the classic stationary dis-tribution of allele frequencies under mutation, selection,and drift,

cðp~Þ ¼ 1

Z�W 2N

Yni¼1

ðpiqiÞ4N m�1; ð1Þ

where pi (i¼ 1, . . . , n) are the allele frequencies at n loci,qi ¼ 1 � pi, N is the number of diploid individuals, �W isthe mean fitness, and m is the mutation rate (Wright

1937). The normalizing constant Z plays a key role; it isanalogous to the partition function in statistical me-chanics and acts as a generating function for thequantities of interest, in the sense that its derivativesgive the expectations of the macroscopic quantities(Barton 1989). We then show how the rates of changeof expectations of observable quantities can be approx-imated by averaging over this stationary distribution c.The crucial assumption here is that the distribution ofallele frequencies always has the form of Equation 1.This is accurate provided that the system evolves asa result of slow changes in the parameters, so that ithas time to approach the stationary state. By analogywith thermodynamics, such changes are termed revers-ible (Ao 2008; N. H. Barton and J. B. Coe, unpublisheddata).

After setting out the method in a general way, weapply it to directional selection on an additive trait. Inthis simple case, we can give closed-form expressions forthe generating function Z and hence for observablessuch as the expectations of the trait mean, genotypicvariance, genetic variability, and so on. We then showthat our approximation to the allele frequency distri-bution gives a good approximation to the dynamicalchange in the trait distribution, even when selectionchanges abruptly. However, the method works onlyfor high mutation rates (4Nm . 1) and breaks downwhen 4Nm , 1. Nevertheless, we show how the methodcan be adapted to the case where 4Nm is small. Finally,we outline the application to cases such as stabiliz-ing selection, where there are epistatic interactions forfitness.

GENERAL ANALYSIS

Defining entropy: The key concept is of an entropy,SH, which measures the deviation of the populationfrom a base distribution f—in this case, the densityunder drift alone. Entropy always increases as the pop-ulation converges toward f under drift. With selec-tion and mutation, a ‘‘free energy’’—the sum of theentropy and a potential function—always increases(Iwasa 1988). We show that the stationary distributionmaximizes SH subject to constraints of the expectedvalue of a set of observable quantities. Thus, the dynamicsof these quantities can be approximated by assuming thatthe entropy is always maximized, conditioned on theirvalues.

There is a wide range of definitions, interpretations,and generalizations of entropy (e.g., Renyi 1961; Wehrl

1978; Tsallis 1988); these have been applied to bi-ological systems in various ways. Iwasa (1988) intro-duced the concept of entropy into population genetics,for a diallelic system of one locus under reversible mu-tation and with arbitrary selection; he also considered aphenotypic model of quantitative trait evolution. Iwasa

(1988) used an information entropy, also known as arelative entropy (Gzyl 1995, Chap. 3; Georgii 2003),and defined as

SH ½c�[�ð

c logc

f

� �dp~: ð2Þ

This is a function of the probability distribution ofallele frequencies, c, that evaluates the average entropyof a function with respect to a given base distribution, f,integrated over all possible allele frequencies, denotedby dp~¼ ðdp1dp2 . . . dpnÞ. It can be thought of asthe expected log-likelihood of f, given samples drawnfrom a distribution c; it has a maximum at c ¼ f, whenSH[f] ¼ 0. We denote it by a subscript H because it isessentially the same as the measure introduced byBoltzmann (1872) in his H-theorem.

The variation of SH with respect to small changes inc is

dSH ½c� ¼ �ð

l 1 logc

f

� �� �dcdp~; ð3Þ

where l is a Lagrange multiplier associated with thenormalization condition

Ðc dp~¼ 1 (N. H. Barton and

J. B. Coe, unpublished data) (see supplemental infor-mation A). Note that because c is normalized,

Ðdcdp~¼

0. With no constraints other than this normalization,setting dSH¼ 0 implies that the entropy is at an extremeonly if c ¼ f; this is a unique maximum.

We are interested in a set of observable quantities, Aj,which are functions of the allele frequencies in apopulation. These might, for example, describe the dis-tribution of a quantitative trait—for example, its meanand variance. We need to find the distribution of allele

Approximating Polygenic Evolution 999

Page 4: Statistical Mechanics and the Evolution of Polygenic

frequencies, c, that maximizes the entropy, SH[c], givenconstraints on the expected values of these observables,ÆAj æ:

ÆAj æ ¼ð

Aj cdp~: ð4Þ

With these constraints, the extremum of Equation 3 iscalculated by including the Lagrange multipliers asso-ciated with the Aj’s, defined for convenience as �2Naj:

ðl 1 log

c

f

� ��X

j

2N ajAj

!dcdp~¼ 0: ð5Þ

At the extremum, the term in parentheses shouldbe zero. This implies that the distribution that maxi-mizes entropy, subject to constraints, is the Boltzmanndistribution

cME ¼f

Zexp

Xj

2N ajAj

" #¼ f

Zexp 2N~a � A~

h i; ð6Þ

where we have expressed the Lagrange multiplier l asZ¼ exp(l) and choose Z to normalize the distribution.We show later that Z is a generating function for themoments of the observables, Aj. It will play a major rolein our calculations:

Z ¼ð

f exp 2N ð~a � A~Þh i

dp~: ð7Þ

We show that under directional selection and muta-tion the aj can be identified with the set of selectioncoefficients and mutation rates and the Aj with thequantities on which selection and mutation act (e.g.,trait mean and genetic variability). The potential func-tion

Pj Aj aj ¼ ~a � A~ consists of the log-mean fitness,

logð �W Þ, plus a term representing the effect of muta-tion, 2m

Pj logðpjqjÞ. Then, Equation 6 gives the

classical stationary density of Equation 1. (Note thatalthough ~a � A~ must equal the potential function,which includes all evolutionary processes, apart fromdrift, we still have some freedom to separate this intocomponents in a variety of ways. For example, di-rectional selection on a set of traits could be repre-sented by almost any linear basis. Nevertheless, therewill usually be a natural set of components thatrepresent different evolutionary processes. In addi-tion, we are free to include additional observables thatare not necessarily under selection, and so have ai ¼ 0.These extra degrees of freedom will improve the ac-curacy of our dynamical approximations.)

Note that there is an alternative measure of entropy,SV, defined by the log density of states that are con-sistent with macroscopic variables ÆA~æ (Barton 1989).

N. H. Barton and J. B. Coe (unpublished data) discussthe relation between SV and SH and show that these twomeasures converge when the distribution clusters closeto its expectation.

The generating function, Z: The normalizing con-stant Z, which is a function of ~a, acts as a generatingfunction for quantities of interest. Differentiating withrespect to (w.r.t.) 2N~a we find that

@logðZÞ@ð2N ajÞ

¼ ÆAj æ: ð8Þ

Differentiating w.r.t. population size,

@ logðZÞ@ð2N Þ ¼ Æ~a � A~æ: ð9Þ

Differentiating again w.r.t. the ~a gives the covariancebetween fluctuations in the A~:

@2logðZÞ@ð2N ajÞ@ð2N akÞ

¼ CovðAj ;AkÞ[ Cj ;k : ð10Þ

This covariance matrix, which we denote C, plays animportant role in the dynamical approximation (Le

Bellac et al. 2004, p. 64).Analyzing the dynamics: As the system moves away

from stationarity, it will not in general follow precisely thedistribution that maximizes entropy. [This can be seen bysubstituting the maximum entropy distribution fromEquation 6 with time-varying parameters ~aðtÞ as a trialsolution to the diffusion equation]. However, the distri-bution of microscopic variables may nevertheless stayclose to a maximum entropy distribution (Nicolis andPrigogine 1977; De Groot and Mazur 1984; Goldstein

and Lebowitz 2004). Our key assumption is that themacroscopic variables change slowly enough that thesystem is always close to a local equilibrium.

We show that the Lagrange multipliers,~a, correspondto forces that act on the observables, A~: directionalselection acts on the trait mean, mutation on a measureof genetic diversity U (see below), and so on. Crucially,we assume that changes occur solely through changes inthe parameters ~a; arbitrary perturbations that act di-rectly on the allele frequencies could have arbitraryeffects (as, for example, in Figure 1).

Assume that changes in allele frequency are deter-mined by a potential function ~a � A~, which can be writ-ten as a sum of components akAk. [In physics, energyacts as a potential; in population genetics, mean fitnessplays an analogous role; it defines an adaptive landscapesuch that allele frequencies and their means change atrates proportional to the fitness gradient (Wright

1967; Lande 1976).] Our method works only for systemswhose dynamics can be described by a potential in thisway (although see Ao 2008). In an infinitesimal time dt,the mean and mean-square changes are

1000 N. H. Barton and H. P. de Vladar

Page 5: Statistical Mechanics and the Evolution of Polygenic

Ædpiæ ¼piqi

2

@ð~a � A~Þ@pi

; ð11aÞ

Ædpidpj æ ¼ 0 for i 6¼ j ; ð11bÞ

Ædp2i æ ¼ piqi

2N: ð11cÞ

The first equation, for Ædpæ, is just Wright’s (1937)formula for selection, modified to include mutation.The variance of allele frequency fluctuations, dp2h i, isthe standard formula for random drift. Under thediffusion approximation, this leads to the stationarydistribution of Equation 1, provided that the base dis-tribution is defined as

f ¼Yni¼1

piqi

!�1

: ð12Þ

Under the diffusion approximation, the rate ofchange of ÆAj æ is

@ÆAj æ@t¼Xn

i¼1

@Aj

@piÆdpiæ 1

1

2

Xn

i¼1

Xn

k¼1

@2Aj

@pi@pkÆdpidpkæ

¼X

k

Bj ;kak 11

2NVj ; ð13Þ

where

Bj ;k ¼Xn

i¼1

@Aj

@pi

piqi

2

@Ak

@pi

* +and Vj ¼

Xn

i¼1

piqi

2

@2Aj

@p2i

* +:

ð14Þ

This relationship is exact, provided that the matrix Band the vector V are evaluated at the current distributionof allele frequencies. Equation 13 can also be deriveddirectly, by making a diffusion approximation to multi-variate observables, where the deterministic terms areai ¼

Pkð@Aj=@piÞðpiqi=2Þak , and the diffusion terms are

bi ¼ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffipiqi=2N

p(Ewens 1979; Gardiner 2004). If our

system is described by only one observable, we directlyrecover the formuladerived byEwens (1979,pp. 136–137).

The local equilibrium approximation: In general, asthe system moves away from stationarity, it will notprecisely follow the distribution that maximizes entropy.However, the distribution of microscopic variables maynevertheless stay close to a maximum entropy distribu-tion if the macroscopic variables change slowly enoughsuch that the system remains close to a local equilibriumat all times (Prigogine 1949; Klein and Prigogine

1953; Nicolis and Prigogine 1977; De Groot andMazur 1984; Goldstein and Lebowitz 2004).

We now approximate Bj,k and Vj by B*i;k, Vj*, assumingthe distribution in Equation 6 is evaluated at ~a*. Weknow that at the stationary state, under parameters ~a*,

expectations are constant, and so from Equation 13,Pk B*j ;kak* 1 ð1=2N ÞVj* ¼ 0. Therefore

@ÆAj æ@t�X

k

B*j ;kðak � a*kÞ: ð15Þ

The matrix Bj,k is closely related to the additivegenetic covariance matrix. Making the link withquantitative genetics is not quite straightforward,because the Aj are arbitrary functions of the allelefrequencies and need not be the means of actual traitscarried by individuals. Nevertheless, if we do regardthem as the means of some quantity, then @Aj/@pi istwice the average effect of alleles at locus i. [Since weassume HWLE, average effect is equal to averageexcess (Falconer and Mackay 1996)]. Therefore,Bj,k is the expected additive genetic covariance be-tween Aj and Ak, the expectation being taken over thedistribution of allele frequencies. Moreover, if the ak

contribute to the log-mean fitness (rather than to thecomponent of the potential that describes mutation),then they can be interpreted as selection gradients inthe usual way. Equation 15 thus gives the rates ofchange of the expected trait means as the product ofthe expected additive genetic covariance, and thedifference between the actual selection gradient, ak,and the gradient that would give stationarity at thecurrent expectations, a*k . This interpretation becomesclearer when we consider specific examples, below.

In the theory of nonequilibrium thermodynamics,equations similar to Equation 15 are called phenome-nological equations (van Kampen 1957; De Groot andMazur 1984, Chap. IV). They were first postulated asapproximations to processes that are close to equilib-rium. In such cases, the variables a* represent thedeviation from an equilibrium defined by a. Theseequations are valid as long as a local equilibrium exists,and (as suggested by Equation 13) it holds in generalthat Bk,j ¼ Bj,k (Onsager 1931; Prigogine 1949).

For theoretical purposes, we can follow either theexpectations ÆAj æ themselves or the parameters a*k thatwould give those expectations at stationarity. In numericalcalculations, the latter is more convenient, because thatavoids calculating the a*j from the ÆAj æ (a tricky inverseproblem). The rates of change of the a*k are related to therates of change of the ÆAj æ via the matrix @ÆAj æ=@a*k . Now,since ÆAj æ ¼ @ logðZÞ=@ 2N a*k

� �, we have

@ÆAj æ=@a*k ¼ 2N @2logðZÞ=@ð2N a*jÞ@ð2N a*kÞh i

¼ 2NCjk*:

Thus, the relation between the ÆAj æ and the ak* is viathe covariance of fluctuations, C*j ;k (Equation 10). Inmatrix notation (equivalent to De Groot and Mazur

1984, p. 36),

Approximating Polygenic Evolution 1001

Page 6: Statistical Mechanics and the Evolution of Polygenic

@~a*

@t� 1

2NC*�1 � B* � ð~a�~a*Þ; ð16Þ

where C is the matrix of covariances of fluctuations inthe A~, and B is analogous to the additive geneticcovariance matrix.

Both of these are evaluated at the stationary distribu-tion defined by ~a*. For given ~a*, we find the ÆA~æ byintegrating using the density in Equation 6, or byapplication of Equation 8.

DIRECTIONAL SELECTION, MUTATION, AND DRIFT

Analysis: The stationary distribution: We now apply thismethod to a quantitative trait under directional selec-tion, mutation, and drift. We first define a measure ofgenetic variability (N. H. Barton and J. B. Coe, un-published data),

U ¼ 2Xn

i¼1

logðpiqiÞ; ð17Þ

which is 2n times the log-geometric mean heterozygosityacross loci (plus a constant); n is the number of loci. Therate of change of pi due to symmetric mutation ismðqi � piÞ ¼ ðpiqi=2Þð@ðmU Þ=@piÞ, as required by Equa-tion 11a. Under our assumption of linkage equilibrium,the rate of change of pi due to selection is ðpiqi=2Þ=ð@logð �W Þ=@piÞ (Equation 11a). The log mean fitness,logð �W Þ, is a natural potential for the system and will beexpressed as a sum of components A~ �~a, where the~a area set of selection coefficients. We deal with the verysimplest case of exponential (directional) selection, butnote that the derivation applies to any form of selectionfor which a potential function can be defined—mostobviously, the case where genotypes have fixed fitnesses(see supplemental information A and D). If individualswith trait value z have fitness ebz, then to leading order inb, the mean fitness is �W ¼ e b�z. Wright’s equilibrium densitycan then be written in the form of Equation 6, with A~ ¼f�z;U g and ~a ¼ fb;mg:

c ¼ f

Ze2N ðb�z1mU Þ: ð18Þ

Thus, the stationary distribution under mutation,selection, and random drift is given by maximizing theentropy subject to constraints on the expected geneticdiversity ÆU æ and the expected trait mean Æ�zæ. Theentropy is defined by Equation 2, with baseline distri-bution f ¼

Qni¼1ðpiqiÞ�1. Then, Equation 18 is the sta-

tionary distribution and is equal to Equation 1.We have shown that a population evolving under

mutation, multiplicative selection, and drift will con-verge to a stationary distribution that has maximumentropy, SH, given the expected trait mean and geneticdiversity. As we will see below, other forms of selectioncan be represented by introducing other observables.

Each constrained observable will be conjugated with anatural variable: in this example, the expected mean Æ�zæcorresponds to the strength of directional selection b,and the expected diversity ÆU æ to the mutation rate m.Information about the full distribution of the observ-ables is contained in the normalizing constant Z, whichis a generating function that depends only on the naturalvariables aj. In the next section, we calculate an explicitexpression for it.

The generating function for an additive trait: We have notyet made any assumptions about the genetic basis of thetrait, z; in general, there might be arbitrary dominanceand epistasis. We now assume that it is additive, withlocus i having effect gi,

z ¼Xn

i¼1

giðXi 1 X *i � 1Þ; ð19Þ

where Xi and Xi* represent the allelic states (labeled 0 or1) of the two copies of each of the n loci. With additivity,exponential selection on the trait corresponds tomultiplicative selection on the underlying loci. If weaverage over the population, where pi represents thefrequency of the Xi ¼ 1 allele, and qi ¼ 1 � pi, then themean and genetic variance are

�z ¼Xn

i¼1

giðpi � qiÞ; vz ¼ 2Xn

i¼1

g2i piqi : ð20Þ

More than two alleles could be allowed, but only forspecial mutation rates that give detailed balance (that is,there must be no flux of probability in the stationary state).

The normalization function Z can now be calculatedexplicitly, using Equation 7. In this simple case ofdirectional selection on an additive trait, the integrandseparates out as a product over allele frequencies, and so

Z ¼ð

exp 2N ðb�z 1 mU Þ½ �Yni¼1

piqi

!�1

dp~ ð21aÞ

¼Yn

i¼1

ð1

0e2N b giðp�qÞðpqÞ4N m�1dp

� �ð21bÞ

¼Yni¼1

ffiffiffiffipp

21�8N mG½4N m�0F11

21 4N m; ðN bgiÞ2

� �� �

ð21cÞ

¼Yni¼1

ðffiffiffiffippð4N bgiÞð1=2Þ�4N mG½4N m�I4N m�ð1=2Þð2N bgiÞÞ;

ð21dÞ

where G(�) and 0F1(�, �) are the gamma and the regular-ized confluent hypergeometric functions, respectively.We have also given an equivalent form, in terms of themodified Bessel function of order n, In(�).

1002 N. H. Barton and H. P. de Vladar

Page 7: Statistical Mechanics and the Evolution of Polygenic

Finding the expectations ÆUæ,Æ�zæ: The expectations,variances, and covariances of �z and U can be calculatedeither by direct integration or by taking derivatives oflog(Z) w.r.t. b and m (Equations 8 and 10). Explicitformulas are given in supplemental information A.

Figure 2 shows how the expected values change for arange of mutation rates and selection pressures, for apopulation of individuals with n loci of equal effect, gi¼1. As selection becomes strong relative to mutation, theallele X ¼ 1 tends to fixation, and �zh i tends to n (topright of Figure 2). As mutation becomes strong relativeto drift, allele frequencies tend to 1

2 , and �zh i tends tozero (top left of Figure 2). The expected diversity, ÆU æ,increases with mutation rate (bottom left of Figure 2)and decreases slightly with the strength of selection(bottom right of Figure 2). Figure 2 compares thediffusion approximation with the Wright–Fisher modelfor N¼ 100. There is close agreement for �zh i for all 4Nm

and for ÆU æ when 4Nm . 1. In the discrete model,ÆU æ must be calculated excluding the fixed classes, sinceU would otherwise be infinite. This has negligible effectwhen 4Nm . 1 because fixation is unlikely. However,when 4Nm , 1, there is a substantial probability of beingfixed, even when fixed classes must be dropped. Thus,ÆU æ depends on population size and differs substantiallyfrom the diffusion approximation (compare bottomseries of dots with bottom curve in Figure 2, bottomright). The stationary density is still close to the diffusionapproximation for polymorphic classes, and so for verylarge N, when the probability of actually being fixedbecomes small ð�

Ð 1=2N0 p4N m�1dp>1Þ, ÆU æ in the dis-

crete Wright–Fisher model does converge to the diffu-sion approximation. However, for population sizes in thehundreds, there is still a very large discrepancy. Weconsider the implications of small 4Nm for the maximumentropy method below.

For an additive trait, and equal allelic effects, thedistribution of allele frequencies is the same at eachlocus, and so this simple case is essentially a single-locusanalysis. However, this is no longer the case when weallow unequal allelic effects; more generally, if there isepistasis for fitness, the allele frequency distributions ateach locus are no longer independent, and if there isepistasis for the trait, we can no longer treat macro-scopic variables as sums over loci.

Covariances of fluctuations, C, and additive geneticvariance, B: To approximate the dynamics, we needthe covariances of fluctuations, C, and the additivegenetic covariance, B, defined above. The matrix C,which gives the variances and covariance of U and �z, iscalculated by taking derivatives of the generatingfunction (Equation 10; supplemental information A,Equations A3–A6).

The additive genetic covariance matrix B is definedin Equation 13, in terms of the derivatives @Aj/@pi. Forthe observables fU ;�zg, these are 2 qi � pið Þ= piqið Þ; 2f g.Using the relation ðqi � piÞ2 ¼ 1� 4piqi ,

B ¼Xn

i¼1

piqi

2

@U@pi

2@U@pi

@�z@pi

@U@pi

@�z@pi

@�z@pi

2

0@

1A* +

¼Xn

i¼1

2 1piqi� 4

2giðqi � piÞ

2giðqi � piÞ 2g2i piqi

!* +: ð22Þ

Note that B�z;�z ¼Pn

i¼1ðpiqi=2Þ @�z=@pi

� �2D E

¼Pni¼1 2g2

i piqi

� �is just the expected genetic variance for

the trait z, vzh i, consistent with our interpretation of B asa genetic covariance matrix. For this model, B has asimple form:

B ¼2ð2n 1 4N bÆ�zæÞ

4N m� 1 �2Æ�zæ�2Æ�zæ 2ðN mÞÆ�zæ

N b

!: ð23Þ

Remarkably, B depends only on �zh i and not directly onthe distribution of allelic effects, g. Note that theexpected genetic variance Ævzæ ¼ 2ðN m=N bÞÆ�zæ, evenwith unequal allelic effects. This can be understood byseeing that the rates of change of Æ�zæ due to mutation,�2m �zh i, and due to selection, bÆvzæ, must balance atstatistical equilibrium. In the limit where selectionbecomes weak, both Æ�zæ and Nb tend to zero, and theexpected genetic variance tends to a definite limit.

The coefficient BU,U includes the expectation of1=ðpiqiÞ, which diverges when 4Nm , 1. Because therate of change depends on BU ;U ðm*� mÞ (Equation 23),that implies that m* must be held fixed at its actual value(i.e., m* ¼ m). In effect, therefore, ÆU æ can no longer be

Figure 2.—Dependence of �zh i; ÆU æ on Nm, Nb. The solidcurves show the diffusion approximation, while the dotsshow exact values for the Wright–Fisher model with N ¼100. The plots against Nm (left column) show Nb ¼ 0.1, 1,10; those against Nb (right column) show Nm ¼ 0.1, 1, 10.Agreement between discrete and continuous models is closefor �zh i and for 4Nm . 1, but the diffusion approximationfails to predict ÆU æ when 4Nm ¼ 0.1 (bottom series of dotsat bottom right). (For the discrete model, ÆU æ is calculatedomitting fixed classes.)

Approximating Polygenic Evolution 1003

Page 8: Statistical Mechanics and the Evolution of Polygenic

included in the approximation. We discuss the implica-tions of this constraint below.

Approximating the dynamics: Evolution of the expecta-tions: We can now use Equation 15 to approximate therates of change of the expectations ÆU æ; �zh i:

d

dt

ÆU æÆ�zæ

� ��

2ð2n 1 4N bÆ�zæÞ4N m� 1 �2Æ�zæ�2Æ�zæ 2N m

N bÆ�zæ

!m� m*

b� b*

� �: ð24Þ

These equations are proportional to the differencebetween the actual parameters {m, b} and the parame-ters that would give a stationary distribution with thecurrent expectations, fm*;b*g. To iterate these recur-sions, we would need to find fm*;b*g from ÆU æ; Æ�zæ,which is troublesome. It is more straightforward to workwith the rates of change of fm*;b*g, which are found bymultiplying the rates of change of the expectations(Equation 24) by the inverse of the covariance offluctuations, C (see Equation 16 and supplementalinformation A, Equation A6). However, because Cdepends on the allelic effects in a complex way (seesupplemental information A, Equations A3–A5), thefull dynamics do depend on the distribution of alleliceffects, gi.

In the following sections we test the accuracy of thislocal equilibrium approximation against two situations:an abrupt change in b or m or a sinusoidal change in m

or b. An abrupt change seems the strongest test of ourapproximation, while a sinusoidal change allows us tofind how the accuracy of the approximation decreases aschanges become faster. For the moment, we focus onhigh numbers of mutations (4Nm . 1). We begin byconsidering the case of equal effects, where the distri-butions at all loci are the same. We also discuss resultsfor the case where most loci have small effect, but somehave large effect: the patterns are similar to thesymmetric case of equal effects, and so we detail themin supplemental information C. Throughout, we com-pare the approximation with numerical solutions of thediffusion equation: these are close to solutions of thediscrete Wright–Fisher model provided that 4Nm . 1(Figure 3).

Equal allelic effects: If all loci have equal effects on thetrait, and if selection acts only on the trait, and not onthe individual genotypes, then under directional selec-tion the distribution of allele frequencies will be thesame at each locus and will be independent across loci.Thus, we need follow only a single distribution, whosetime evolution is given either by numerical solution ofthe diffusion equation or as an expansion of eigenvec-tors (Crow and Kimura 1970, p. 396). However, themaximum entropy approximation is still nontrivial,even in this highly symmetric case, since it approximatesthe full distribution by a few degrees of freedom, such as{Æ2 logðpqÞæ, Æp � qæ}. Also, note that with other forms ofselection, the allele frequency distribution is not in-

dependent across loci: for example, with stabilizingselection populations cluster around states where thesum of allele frequencies is close to the optimum.

Abrupt change in Nb: First, assume that a system is atequilibrium with evolutionary forces b0 and m0. Theseforces are then abruptly changed to new values b and m,and the system moves toward its new stationary state.Figure 3 shows that for moderately high mutation rates(Nm ¼ 0.6), and for an abrupt change of selection fromNb ¼ �2 to 12, the approximation is extremelyaccurate, as compared with the numerical solutions ofthe diffusion equation. The expected genetic diversity,ÆU æ, increases as the allele frequencies pass throughintermediate values, but returns to its original value asÆ�zæ moves from �2 to 12 (top left). This transientincrease in diversity is mainly due to the change in meanallele frequencies: there is only a small transient changein m* (bottom left). The distribution of allele frequen-cies predicted by the approximation is always close tothe actual distribution (not shown).

For a lower mutation rate of Nm ¼ 0.3, close to thecritical value of N m ¼ 1

4 , the effective mutation ratehardly changes: it is held close to the actual value ofNm ¼ 0.3 (Figure 4, bottom left). The approximation isstill accurate, although there is an appreciable discrep-ancy in ÆU æ (Figure 4, top left). For a still lower mutationrate of Nm¼ 0.1, below the threshold where BU,U diverges,m* must necessarily be held equal to the current mutationrate, m (Figure 5, top left). Then, there is a poor fit to thetransient increase in expected diversity, ÆU æ, but thedynamical approximation to Æ�zæ remains accurate (Figure5, top right). (Because m* must be held fixed at itsactual value when 4Nm , 1, ÆU æ is not now included

Figure 3.—(Top) Calculation of genetic variability ÆU æ(left) and trait mean Æ�zæ (right) and over time, with Nm ¼0.6 as Nb changes from�2 to 12 at time t¼ 0. The horizontallines show the stationary values. The solid curves show the ap-proximation, and the dashed curves show numerical solutionsto the diffusion equation; these are not distinguishable onthis scale. (Bottom) Changes over time in the parametersm* (left) and b* (right), calculated using the approximationof Equation 16. Time is scaled to 2N generations.

1004 N. H. Barton and H. P. de Vladar

Page 9: Statistical Mechanics and the Evolution of Polygenic

in the approximation, which therefore now dependson fitting one variable, rather than two.)

Abrupt change in Nm: Figure 6 shows the effects of anabrupt change in mutation rate from Nm ¼ 0.3 to 1.Here, the approximation does poorly when mutation rateincreases abruptly, even when 4Nm is always .1 (Figure 6,left side). It does perform better when the mutation ratedecreases abruptly, however (t . 5 in Figure 6).

Fluctuating selection: As well as examining the effects ofan abrupt change in selection, we have also looked atthe effects of oscillating selection. If fluctuations aresufficiently slow, then the maximum entropy approxi-mation converges to the exact solution. At the otherextreme, when fluctuations are rapid, populations ex-perience an average selective force and behave as ifthere was a constant selective pressure. The approxima-tion is accurate over the whole range of fluctuation fre-quencies (supplemental information B).

Unequal allelic effects: So far, we have assumed equalallelic effects. This ensures that the allele frequencydistribution is the same at each locus, so that we areessentially analyzing a single-locus problem. This is notentirely trivial, since we are approximating the full allelefrequency distribution by two variables, �zh i; ÆU æf g.However, we now turn to the more challenging case ofunequal allelic effects at n loci: now, we are summarizingn distinct distributions by two variables. We do, however,assume that the allelic effects are known.

We draw allelic effects at 10 loci from a gammadistribution, with mean 1 and standard deviation 1

2 :

gi ¼ f1:69; 1:47; 1:15; 1:05; 1:04; 1:03; 1:01; 0:81; 0:500; 0:401g:ð25Þ

The maximum range of the trait is 6P

i gi ¼ 10:15,and the maximum genetic variance is vmax ¼12

P10i¼1 g2

i ¼ 11:66. Twenty-five percent of this is contrib-uted by the locus of largest effect, and 54% by the largestthree loci.

Figure 7 shows the response of the mean and thegenetic variance, as selection changes from Nb ¼ �2 to12, with Nm ¼ 0.3 throughout: the approximationmatches well. There is a transient increase in the geneticvariance as allele frequencies pass through intermediatevalues. In Figure 7, the shift is by 3.08 genetic standarddeviations.

In supplemental information C, we show how theaccuracy of the approximations diminishes as Nm

approaches 14 , in a similar way to Figures 3–5.

LOW MUTATION RATES: 4Nm , 1

Failure of the maximum entropy approximation:When the number of mutations produced per genera-tion is small (4Nm , 1), populations are likely to be closeto fixation. The diffusion approximation still workssurprisingly well: it predicts the allele frequency distri-bution accurately even adjacent to the boundariesðp ¼ 1=2N ; 1� 1=2N Þ. The maximum entropy ap-proximation also makes accurate predictions for thechange in trait mean, provided that the mutation rate iskept fixed (Figure 5, top right). However, the approx-imation does not allow changes in m* when 4Nm , 1.Formally, the coefficient BU,U (Equation 23) diverges,which implies that the effective mutation rate mustalways be held equal to the actual mutation rateðm* ¼ mÞ. Thus we lose 1 d.f. from the dynamics. Whatcauses this pathological behavior?

The key point is that near the boundary, the allelefrequency distribution changes on a much faster time-scale than in the center: the characteristic timescale ofrandom drift is determined by the number of copies ofthe allele in question. Thus, the shape of the distributionat the center and the shape at the edge are uncoupled, sothat it may be impossible to adequately approximate thewhole distribution as being close to the stationary state.Near the boundaries, selection is negligibly slow relative

Figure 4.—The accuracy of the approximation for Nm ¼0.3. Nb changes from �0.7 to 10.7 at time t ¼ 0. Otherwise,details are as in Figure 3.

Figure 5.—The accuracy of the approximation for Nm ¼0.1. Nb changes from �0.7 to 10.7 at time t ¼ 0. The effectivemutation rate, Nm*, is held fixed at Nm (bottom left). Other-wise, details are as in Figure 3.

Approximating Polygenic Evolution 1005

Page 10: Statistical Mechanics and the Evolution of Polygenic

to mutation and drift, and the allele frequency distribu-tion rapidly takes the form p4Nm�1, even while the bulk ofthe distribution remains unchanged (Figure 8). Forexample, suppose that 4Nm changes from ,1 to .1. Thedensity at the boundaries immediately falls to zero, andthe distribution takes on a two-peaked shape that cannotbe approximated by any of the family of stationarydistributions. Conversely, when 4Nm falls below thethreshold, small singularities immediately develop atthe boundaries, representing fixed populations, but ittakes a long time for the bulk of the population toapproach fixation. This asymmetry explains why themaximum entropy approximation is much more accu-rate when 4Nm falls than when it rises (Figure 6, t . 5).

We can gain some insight by analyzing the limit of4Nm / 0, when populations are almost always fixed forone of the 2n genotypes. With directional selection, theprobability of fixation of one or the other allele is inde-pendent across loci and equals Pi ¼ 1= 1 1 expð�4N bgiÞð Þ,where gi is the effect of alleles at the ith locus. Pop-ulations will jump from fixation for ‘‘0’’ to ‘‘1’’ as aresult of the fixation of favorable mutations, at a rate4N mbgi= 1� exp �4N bgið Þð Þ, and in the opposite di-rection due to fixation of deleterious alleles, at a ratethat is slower by a factor expð�4N bgiÞ. In this simplecase, it is easy to write down the dynamics at each locus,

dPi

dt¼ 4N mbgi

Qi

1 1 e�4N bgi� Pie

�4N bgi

1 1 e�4N bgi

� �

¼ 4N mbgi

ðPi � PiÞðPi � QiÞ

; where Pi ¼1

1 1 e�4N bgi;

ð26Þ

noting that this does have a sensible limit as b / 0: @tP¼m(1 � 2P), which is correct for neutral alleles. The traitmean changes as

dÆ�zædt¼ 4N mb

Xn

i¼1

2g2i

ðPi � PiÞðPi � QiÞ

: ð27Þ

The maximum entropy approximation simplifies theproblem by assuming that the Pi always follow astationary distribution, determined by a single parame-ter b*, with Pi ¼ 1= 1 1 exp �4N b*gið Þð Þ. Thus, pro-vided we know the allelic effects, we can deduce the Pi

from the observed �zh i, without knowing the distributionat the n loci individually. From Equation 24, assumingthat m ¼ m*, we have

dÆ�zædt¼ 2

N m

N b* Æ�zæðb� b*Þ: ð28Þ

This can be understood by seeing that at equilibrium,selection must balance mutation, so that b*gi piqih i ¼m pi � qih i at each locus if the Pi follow a stationarydistribution with parameter b*. The rate of change ofthe trait mean is

Xi

2ðbg2i Æpiqiæ� mgiÆpi � qiæÞ

¼X

i

2m

b* giÆpiqiæ� 2mÆ�zæ ¼ 2mÆ�zæb

b*� 1

� �;

equal to Equation 28.It is easy to show that the maximum entropy approx-

imation, Equation 28, converges to the exact solution,Equation 27, for small Nbgi; this is confirmed by Figure9, for Nb¼ 0.2, in an example with equal effects, gi¼ 1.However, for stronger selection (Nb ¼ 2, thick lines inFigure 9), the maximum entropy approximation under-estimates the initial rate of increase. That is because theapproximation is that the initial state, in which Pi¼ 0.02at all loci, is caused by strong selection against the 1allele; such selection would necessarily cause low stand-ing variation, and so the prediction is for a slow responsewhen the direction of selection is reversed. However, assoon as selection is reversed, populations fix new favor-able mutations at a rate that is independent of theprevious standing variation. Thus, the method that led

Figure 6.—The mutation rate increases abruptly fromNm ¼ 0.3 to Nm ¼ 1 at t ¼ 0 and then changes back att ¼ 5; throughout, Nb ¼ 1. The horizontal dashed lines showvalues at the stationary states, the dashed curves show numer-ical solutions of the diffusion equation, and the solid curves inthe top row show the approximation.

Figure 7.—The accuracy of the approximation with un-equal allelic effects, g, given by Equation 25. Nb changes from�2 to 12 at time t ¼ 0; Nm ¼ 0.5 throughout. Otherwise, de-tails are as in Figure 3.

1006 N. H. Barton and H. P. de Vladar

Page 11: Statistical Mechanics and the Evolution of Polygenic

to Equation 24, which was developed for polymorphicpopulations, fails as 4Nm / 0.

Maximum entropy for 4Nm / 0: When mutation israre, populations are almost always fixed for one of the2n genotypes, and an ensemble of populations (orequivalently, the probability distribution of a singlepopulation) evolves as a result of jumps betweengenotypes, mediated by fixation of single mutations.The stationary distribution is proportional to �W 2N ,multiplied by a factor that reflects the pattern ofmutation rates (Iwasa 1988; Sella and Hirsh 2005);this can be derived as the limit of Equation 1 for small4Nm (N. H. Barton and J. B. Coe, unpublished data).We can go further and apply the maximum entropymethod to this process. This gives an approximation forthe dynamics of macroscopic quantities such as Æ�zæ, sothat we do not need to follow the full distribution acrossthe 2n genotypes. In the simplest case of directionalselection on an additive trait, with equal allelic effects,this gives no benefit, since the distribution of fixationprobability is independent across loci and, moreover, isthe same at each locus: the problem therefore involvesjust a single variable, P. However, with unequal effects,the maximum entropy approximation does give a usefulsimplification, since we do not need to follow theindividual Pi. With epistasis for fitness or for the trait,the advantage would be greater, since we would thenavoid following the full probability distribution, acrossthe 2n genotypes. (Note that in the limit of 4Nm / 0, the

model applies regardless of the pattern of recombina-tion, because only one locus evolves at a time.)

We now apply the maximum entropy approximationto directional selection on an additive trait, assumingthat 4Nm / 0, but allowing for unequal allelic effects, gi.(This is distinct from the previous section, since we nowapply maximum entropy to the limiting system, ratherthan apply the limit of 4Nm / 0 to the full maximumentropy approximation.) The system is described bya single local variable, b*, defined implicitly byÆ�zæ ¼

Pi gitanh 2N b*gi½ �; the assumption is that at each

locus, ðPi � QiÞ ¼ tanh 2N b*gi½ �, as if the ensemble wereat a local stationary state under a selection gradient b*.Thus

dÆ�zædt¼X

i

2gi

dPi

dt

¼X

i

2gi 4N mbgi

Qi

1 1 e�4N bgi� Pie

�4N bgi

1 1 e�4N bgi

� �� �

¼ 4N mbX

i

g2i 1� tanh 2N b*gi½ �

tanh 2N bgi½ �

� �:

ð29ÞIt is easier to work in terms of b*. Multiplying by

db*=d�z, we obtain a closed equation for b*:

db*

dt¼ 2mb

Pi g2

i 1� tanh 2N b*gi½ �tanh 2N bgi½ �

P

i g2i 1� tanh 2N b*gi½ �2� � : ð30Þ

When selection is weak 2N b*gi>1ð Þ, Equation 29simplifies to 4N m

Pi g2

i b� b*ð Þ. Since 2N b*P

i g2i �

�zh i in this limit, this converges to 4N mP

i g2i

� �b� 2m �zh i.

The exact solution, Equation 27, converges to the same

Figure 9.—Comparison between the exact solution (Equa-tion 27) and the maximum entropy approximation (Equation28), in the limit of low mutation rates (4Nm / 0). Initially, theprobability that a locus is fixed for the ‘‘1’’ allele is P ¼ 0.02 atall loci, so that Æ�zæ=n ¼ �0:99; all alleles have effect g ¼ 1. Se-lection Nb¼ 0.2 or Nb¼ 2 is then applied, and the trait meanshifts to a new equilibrium, in which a fraction P ¼ 1/(1 1exp(�4Nb)) of loci are fixed for the 1 allele. When selectionis weak (Nb ¼ 0.2), the maximum entropy approximation isbarely distinguishable from the exact solution. However, whenselection is strong (Nb¼ 2), the maximum entropy approxima-tion (dashed lines) underestimates the initial rate of change.

Figure 8.—Failure of the maximum entropic distributionof allele frequencies at the borders for changing mutationrates. Top panel, dotted curves: the genuine distribution,given by the diffusion equation; solid curves show the maxi-mum entropic distribution. The bulk of the distributions iswell approximated initially (curves toward the right; t ¼ 0)and close to equilibrium (bell-shaped curves; t ¼ 5). Bottomleft panel: the max-entropic distribution at different times(from t ¼ 0 to t ¼ 5, top to bottom) near the edge p ¼ 1 in-correctly predicts that there is no fixation, compared the dif-fusion equation solutions at different times (from t¼ 0 to t¼ 5,top to bottom) near the edge p ¼ 1, which shows that some ge-notype fixes (bottom right panel). Numerics are as in Figure 6.

Approximating Polygenic Evolution 1007

Page 12: Statistical Mechanics and the Evolution of Polygenic

limit with weak selection. This is as expected, since whenselection is weak, the population approaches a mutation–drift equilibrium, with genetic variance 4N m

Pi g2

i

� �;

the trait mean then changes at a rate ð4N mP

i g2i Þb due

to selection and �2mÆ�zæ due to mutation.Figure 10 shows an example where selection is strong,

changing abruptly from Nb ¼ �4 to 14. The effects of10 loci are drawn from a gamma distribution, as inEquation 25. The predictions for the mean are in-distinguishable (Figure 10, left). There are substantialerrors in the predictions for the underlying allelefrequencies, with the rate of change of alleles of smalleffect being greatly overestimated (Figure 10, bottomcurve at right) and that of alleles of large effect beingslightly underestimated. However, these errors almostprecisely cancel in their effects on the mean.

DISCUSSION

The maximum entropy approximation: A fundamen-tal aim of quantitative genetics is to understand theevolution of the phenotype, without knowing the un-derlying distribution of all possible gene combinations.Assuming linkage equilibrium simplifies the problem,which then depends only on the allele frequencies ratherthan on the full distribution of genotypes. However, if weinclude random drift, as well as selection and mutation, afull description of the stochastic dynamics requires thedistribution of allele frequencies—a formidable task. Weknow that in general, we cannot predict phenotypicevolution without knowing the frequencies of all therelevant alleles: the future response to selection maydepend on the frequencies of alleles that are currently sorare that they have negligible effect on the phenotypeand so are essentially unpredictable. To avoid thisdifficulty, we make the key assumption that selectionand mutation act only on observable quantities. Then,the distribution of allele frequencies tends toward astationary state that depends only on those forces. Ifselection could instead act on individual alleles, it couldsend the population into arbitrary states by picking outparticular alleles (e.g., Figure 1). Selection on individualalleles would be analogous to Maxwell’s demon, whichperturbs individual gas molecules to generate improb-able states that violate the laws of classical thermody-namics (Leff and Rex 2003).

Populations tend toward a stationary state that max-imizes entropy—that is, the distribution of allele fre-quencies spreads out as widely as possible, conditionalon the average values of the quantities that are acted onby selection and mutation. The maximum entropyapproximation to the dynamics amounts to assumingthat the allele frequency distribution always maximizesentropy, given the current values of the observedvariables, even though those variables may be changing.This approximation converges to the exact solution

when changes in mutation and selection (~a) are slow.However, we find that even if selection abruptly changesin direction, predictions for the trait mean are re-markably accurate.

The analogy between the population genetics ofquantitative traits and statistical mechanics is intrigu-ing. As well as suggesting methods for approximatingphenotypic evolution, it also helps us to better un-derstand the scope of statistical mechanics, by showingthat it does not depend on physical principles such asconservation of energy (Ao 2008). Selection can be seenas generating information, by picking out the best-adapted genotypes from the vast number of possibilities,despite the randomizing effect of genetic drift. This isanalogous to the way that a physical system does usefulwork, despite the tendency for entropy to increase. Suchissues are discussed by N. H. Barton and J. B. Coe

(unpublished data) and by H. P. de Vladar and I. Pen

(unpublished data). Here, we concentrate on the use ofmaximum entropy as an approximation procedure.

Provided that the number of mutations, 4Nm, isconstant, and not too small, the method accuratelypredicts the evolution of the trait mean—even whenallelic effects vary across loci, and even when selectionchanges abruptly. This accuracy even when parameterschange rapidly is surprising, because the underlyingallele frequencies may not be well predicted (e.g., Figure10). Indeed, Prugel-Bennett, Rattray, and Shapiro makeaccurate predictions even though they use an arbitraryentropy measure that does not ensure convergence tothe correct stationary distribution. Although we believethat our entropy measure is the most natural forquantitative genetic problems, and it does guaranteeconvergence to the stationary state, it may be that themaximum entropy approximation is an efficientmethod for reducing the dimensionality of a dynamicalsystem, even when an unnatural measure is used.

Figure 10.—The maximum entropy approximation (Equa-tion 29), made assuming that populations jump betweenfixed states, gives an accurate prediction for the change inmean (left): this is indistinguishable from the exact solution(Equation 27). The population is initially at equilibrium withdirectional selection Nb ¼ �4; selection then changes signabruptly. Allelic effects are given by Equation 25. Predictionsfor the underlying allele frequencies are less accurate. (Right)Allele frequencies at the locus with the strongest effectg1 ¼ 1:69ð Þ, with intermediate effect g5 ¼ 1:04ð Þ, and with

the weakest effect g1 ¼ 0:401ð Þ, reading left to right. Solidlines show the maximum entropy approximation (Equation29), and dashed lines show the exact solution (Equation 27).

1008 N. H. Barton and H. P. de Vladar

Page 13: Statistical Mechanics and the Evolution of Polygenic

The maximum entropy method predicts the full allelefrequency distribution from just a few quantities, such asthe expected trait mean, Æ�zæ. We do still need to know thegenetic basis of the trait—for an additive trait, we mustknow the allelic effects. We could hardly expect topredict the evolution of phenotype without knowinganything about its genetic basis. However, we couldapply the method knowing just the distribution of alleliceffects, which could be estimated in a number of ways:by detection of QTL, from evolutionary argumentsabout plausible distributions (e.g., Orr 2003), or fromthe distribution of allele frequencies at synonymous andnonsynonymous sites (e.g., Loewe et al. 2006).

Extension to dominance and epistasis: We haveanalyzed only the simplest case, of directional selectionon an additive trait. An extension to allow dominance isstraightforward, since the loci still fluctuate indepen-dently of each other, and the generating function, Z,can still be written as a product of integrals across loci.Extension to more than two alleles is also possible, butonly under the restrictive condition that mutation ratesallow for detailed balance and hence for an explicitpotential function. Similarly, alleles at different loci mayinteract in their effect on the trait. If such epistasisinvolves nonoverlapping pairs of loci, then calculationscan still be made, but require integrals over pairs ofallele frequencies. Although it is beyond the scope ofthis article to present such calculations, it is importantto point out that, despite the technical difficulties, themethod itself is general and as such it does not dependon the selective scheme, epistatic model, number ofalleles, etc.

It is relatively straightforward to allow for stabiliz-ing selection on an additive trait. In this case, allelefrequency distributions at different loci are no longerindependent. However, they are coupled only via asingle variable, the trait mean: if this lies above theoptimum, then all loci experience selection for lower �z,and vice versa. This simple coupling allows explicitsolutions for the stationary distribution and for the rateof jumps between metastable states. These calculationsare given in Barton (1989) and Coyne et al. (1997,Appendix). We outline the maximum entropy approx-imation to the dynamics of stabilizing selection insupplemental information D.

For complex models, involving epistasis betweenlarge numbers of genes, calculation of the maximumentropy approximation (i.e., of the matrices B*, C*) bynumerical integration would not be feasible. They couldstill be calculated by a Monte Carlo method: one wouldfix the parameters ~a* and simulate the distribution todetermine the expectations ÆA~æ. The matrix C* could befound from the covariance of fluctuations, and thematrix B* from Equation 13. The two matrices, B ~a*ð Þand C ~a*ð Þ, would then give the dynamics on thereduced space of~a*; this would be feasible numericallyfor two or three variables. Of course, this approach

involves the same kind of computation as a directsimulation. Our claim is that the reduced dynamics willbe approached, regardless of the initial allele frequencydistribution: the system is expected to move close to thelower-dimensional space defined by the maximumentropy approximation. The implication is that wecould predict the evolution of the expectations ÆA~æ bya closed set of equations, without knowing the actualallele frequency distribution. This will require that weknow the genetic basis and mutability of the trait andthat selection acts only on that trait.

Low mutation rates (4Nm , 1): We describe mutationand selection by using a potential functionmU 1 log �Wð Þ, where U ¼ 2

Pi log piqið Þ, and include

the variable Uh i together with selected variables such asthe expectation of the trait mean, �zh i. However, thisapproach fails to describe the effects of changes inmutation rate when 4Nm , 1, because then populationsare likely to be close to fixation, in which case U diverges.The fundamental problem is that the distribution at theboundaries changes rapidly as mutation rate changes,while the bulk of the distribution does not. We can,however, extend the method to the case where 4Nm isvery small, because then populations jump betweenfixation for one or the other genotype, through thesubstitution of single mutations. This limit is in factmore general, in that it applies even with linkage or withasexual reproduction. It could be extended to give amore accurate approximation for appreciable 4Nm, bycalculating the probability of a jump between states ofnear fixation, taking into account the polymorphism atother loci (see Barton 1989).

Long-term response to selection: A basic and long-standing puzzle in quantitative genetics is the success ofartificial selection: in moderately large populations,traits respond steadily to selection for $100 genera-tions, with little change in additive genetic variance andoften with concordance between replicates (Barton

and Keightley 2002). This is surprising, because thegenetic variance is expected to change as alleles sweepthrough the population. However, if the distribution ofallele frequencies is proportional to (pq)4Nm�1, as weassume, and if 4Nm is small, then the additive geneticvariance is expected to stay constant for long periodsunder directional selection. This is because the baselinedistribution f(p)¼ (pq)�1 is uniform when transformedto a logit scale [i.e., f(z) ¼ constant for z ¼ log(p/q)].Since log(p/q) increases linearly with time under di-rectional selection, that implies that the increase ingenetic variance due to rare alleles increasing to becomecommon is precisely balanced by the decrease due tocommon alleles approaching fixation. Thus, the re-sponse to standing variation is expected to continuesteadily at a rate d �zh i=dt ¼ 4N mb

Pi g2

i for �(1/s)log (2N) generations, whereas if alleles were typicallypolymorphic (as would be the case if 4Nm . 1), it wouldcontinue for only �(1/s) generations. Of course, the

Approximating Polygenic Evolution 1009

Page 14: Statistical Mechanics and the Evolution of Polygenic

response will continue indefinitely as a result of newmutation, at just the same rate. This is because variationis initially maintained in a balance between mutationand drift; the genetic variance is not affected by di-rectional selection, and so the rate of response stays thesame even as it shifts from alleles that were originallypresent to new mutations.

The stationary density under mutation, selection, anddrift has been exploited before to help understand theevolution of quantitative traits (e.g., Keightley and Hill

1987; Keightley 1991). In this article, we have shownthat the dynamics of polygenic traits can be accuratelyapproximated by assuming that the underlying distri-bution of allele frequencies always takes this stationaryform. We are now starting to get detailed estimates ofthe distribution of allele frequencies and of alleliceffects on traits and on fitness (e.g., Loewe et al. 2006;Boyko et al. 2008): it may be that we will soon be able touse such data to apply the methods developed here tonatural and artificial populations.

We are grateful to Ellen Baake for helping to initiate this project andfor her comments on this manuscript. We also thank Michael Turellifor his comments on the manuscript and I. Pen for discussions andsupport in this project. This project was a result of a collaborationsupported by the European Science Foundation grant ‘‘Integratingpopulation genetics and conservation biology.’’ N.B. was supported bythe Engineering and Physical Sciences Research Council (GR/T11753and GR/T19537) and by the Royal Society.

LITERATURE CITED

Ao, P., 2008 Emerging of stochastic dynamical equalities and steadystate thermodynamics from Darwinian dynamics. Commun.Theor. Phys. 49: 1073–1090.

Barton, N. H., 1986 The maintenance of polygenic variationthrough a balance between mutation and stabilizing selection.Genet. Res. 49: 157–174.

Barton, N. H., 1989 The divergence of a polygenic system understabilizing selection, mutation and drift. Genet. Res. 54: 59–77.

Barton, N. H., and P. D. Keightley, 2002 Understanding quanti-tative genetic variation. Nat. Rev. Genet. 3: 11–21.

Barton, N. H., and M. Turelli, 1987 Adaptive landscapes, geneticdistance, and the evolution of quantitative characters. Genet.Res. 49: 157–174.

Barton, N. H., and M. Turelli, 1989 Evolutionary quantitative ge-netics: How little do we know? Annu. Rev. Genet. 23: 337–370.

Barton, N. H., and M. Turelli, 1991 Natural and sexual selectionon many loci. Genetics 127: 229–255.

Boltzmann, L., 1872 Further studies on the thermal equilibrium ofgas molecules. Sitzungsber. Akad. Wiss. Wien II 66: 275–370.

Boyko, A., S. Williamson, A. Indap and J. Degenhardt,2008 Assessing the evolutionary impact of amino acid muta-tions in the human genome. PLoS Genet. 4: e1000083.

Burger, R., 1991 Moments, cumulants and polygenic dynamics.J. Math. Biol. 30: 199–213.

Burger, R., 1993 Predictions of the dynamics of a polygenic char-acter under directional selection. J. Theor. Biol. 162: 487–513.

Burger, R., G. P. Wagner and F. Stettinger, 1989 How much her-itable variation can be maintained in finite populations by muta-tion selection balance. Evolution 43: 1748–1766.

Coyne, J. A., N. H. Barton and M. Turelli, 1997 A critique ofWright’s shifting balance theory of evolution. Evolution 51: 643–671.

Crow, J. F., and M. Kimura, 1970 An Introduction to Population Genet-ics Theory. Harper & Row, New York.

De Groot, S. R., and P. Mazur, 1984 Non-Equilibrium Thermodynam-ics. Dover, New York.

Ewens, W. J., 1979 Mathematical Population Genetics. Springer-Verlag,Berlin.

Falconer, D. S., and T. F. C. Mackay, 1996 Introduction to Quantita-tive Genetics, Ed. 4. Longmans Green, Essex, UK.

Gardiner, C., 2004 Handbook of Stochastic Methods. Springer-Verlag,Berlin.

Georgii, H., 2003 Probabilistic aspects of entropy, pp. 37–56 in En-tropy, edited by A. Greven. Princeton University Press, Princeton, NJ.

Goldstein, S., and J. Lebowitz, 2004 On the (Boltzmann) entropyof non-equilibrium systems. Physica D 193: 53–66.

Gzyl, H., 1995 The maximum entropy method, pp. 27–40 in Serieson Advances in Mathematics for Applied Sciences, Vol. 29. WorldScientific, Singapore.

Iwasa, Y., 1988 Free fitness that always increases in evolution. J.Theor. Biol. 135: 265–281.

Keightley, P. D., 1991 Genetic variance and fixation probabilitiesat quantitative trait loci in mutation-selection balance. Genet.Res. 58: 139–144.

Keightley, P. D., and W. G. Hill, 1987 Directional selection andvariation in finite populations. Genetics 117: 573–582.

Kimura, M., 1965 A stochastic model concerning the maintenanceof genetic variability in quantitative characters. Proc. Natl. Acad.Sci. USA 54: 731–736.

Kirkpatrick, M., T. Johnson and N. Barton, 2002 General modelsof multilocus evolution. Genetics 161: 1727–1750.

Klein, G., and I. Prigogine, 1953 Sur la mecanique statistique desphenomenes irreversibles I. Physica 19: 74–88.

Lande, R., 1976 Natural selection and random genetic drift in phe-notypic evolution. Evolution 30: 314–334.

Le Bellac, M., F. Mortessagne and G. G. Batrouni, 2004 Equi-librium and Non-Equilibrium Statistical Thermodynamics. CambridgeUniversity Press, Cambridge, UK.

Leff, H. S., and A. F. Rex, 2003 Maxwell’s Demon II. Institute of Phys-ics, Bristol, UK.

Lenormand, T., and S. Otto, 2000 The evolution of recombinationin a heterogeneous environment. Genetics 156: 423–438.

Loewe, L., B. Charlesworth, C. Bartolome and V. Noel,2006 Estimating selection on nonsynonymous mutations. Ge-netics 172: 1079–1092.

Nicolis, G., and I. Prigogine, 1977 Self Organization in Non-Equilibri-um Systems: From Dissipative Structures to Order Through Fluctuations.John Wiley & Sons, New York.

Onsager, L., 1931 Reciprocal relations in irreversible processes.I. Phys. Rev. 37: 405–426.

Orr, H. A., 2003 The distribution of fitness effects among beneficialmutations. Genetics 163: 1519–1526.

Prigogine, I., 1949 Le domaine de validite de la thermodynamiquedes phenomenes irreversibles. Physica 15: 272–284.

Prugel-Bennett, A., 1997 Modelling evolving populations. J. Theor.Biol. 185: 81–95.

Prugel-Bennett, A., 2001 Modeling crossover-induced linkage ingenetic algorithms. Evol. Comp. 5: 376–387.

Prugel-Bennett, A., and J. L. Shapiro, 1994 An analysis of geneticalgorithms using statistical mechanics. Phys. Rev. Lett. 72: 1305–1309.

Prugel-Bennett, A., and J. Shapiro, 1997 The dynamics of a geneticalgorithm for simple random Ising systems. Physica D 104: 75–114.

Rattray, M., 1995 The dynamics of a genetic algorithm under sta-bilizing selection. Complex Syst. 9: 213–234.

Rattray, M., and J. Shapiro, 2001 Cumulant dynamics of a popu-lation under multiplicative selection, mutation, and drift. Theor.Popul. Biol. 60: 17–32.

Renyi, A., 1961 On measures of entropy and information, pp. 547–561 in Proceedings of the Fourth Berkeley Symposium on Mathematicsand Statistical Probabilities, Vol. 1. University of California Press, Ber-keley, CA.

Rogers, A., 2003 Phase transitions in sexual populations subject tostabilizing selection. Phys. Rev. Lett. 90: 158103.

Rouzine, I., E. Brunet and C. Wilke, 2007 The traveling-wave ap-proach to asexual evolution: Muller’s ratchet and speed of adap-tation. Theor. Popul. Biol. 73: 24–46.

Roze, D., and N. H. Barton, 2006 The Hill–Robertson effect andthe evolution of recombination. Genetics 173: 1793–1811.

Sella, G., and A. E. Hirsh, 2005 The application of statistical physicsto evolutionary biology. Proc. Natl. Acad. Sci. USA 102: 9541–9546.

1010 N. H. Barton and H. P. de Vladar

Page 15: Statistical Mechanics and the Evolution of Polygenic

Tsallis, C., 1988 Possible generalization of Boltzmann-Gibbs statis-tics. J. Stat. Phys. 52: 479–487.

Turelli, M., 1984 Heritable genetic variation via mutation-selectionbalance: Lerch’s zeta meets the abdominal bristle. Theor. Popul.Biol. 25: 138–193.

Turelli, M., and N. H. Barton, 1990 Dynamics of polygenic char-acters under selection. Theor. Popul. Biol. 38: 1–57.

Turelli, M., and N. H. Barton, 1994 Genetic and statistical-analysesof strong selection on polygenic traits: What, me normal? Genetics138: 913–941.

van Kampen, N. G., 1957 Derivation of the phenomenological equa-tions from the master equation. Physica 101: 707–719.

van Nimwegen, E., and J. Crutchfield, 2000 Optimizing epochalevolutionary search: population-size independent theory. Comp.Methods Appl. Mech. Eng. 9: 171–194.

Wehrl, A., 1978 General properties of entropy. Rev. Mod. Phys. 50:221–260.

Wright, S., 1937 The distribution of gene frequencies in popula-tions. Proc. Natl. Acad. Sci. USA 23: 307–320.

Wright, S., 1967 ‘‘Surfaces’’ of selective value. Proc. Natl. Acad. Sci.USA 58: 165–172.

Communicating editor: J. Wakeley

Approximating Polygenic Evolution 1011