

Copyright © 1991 by the Genetics Society of America

Natural and Sexual Selection on Many Loci

N. H. Barton*,1 and Michael Turelli†

*Department of Genetics and Biometry, University College, London NW1 2HE, England, and †Department of Genetics and Center for Population Biology, University of California, Davis, California 95616

Manuscript received April 2, 1990
Accepted for publication September 27, 1990

ABSTRACT

A method is developed that describes the effects on an arbitrary number of autosomal loci of selection on haploid and diploid stages, of nonrandom mating between haploid individuals, and of recombination. We provide exact recursions for the dynamics of allele frequencies and linkage disequilibria (nonrandom associations of alleles across loci). When selection is weak relative to recombination, our recursions provide simple approximations for the linkage disequilibria among arbitrary combinations of loci. We show how previous models of sex-independent natural selection on diploids, assortative mating between haploids, and sexual selection on haploids can be analyzed in this framework. Using our weak-selection approximations, we derive new results concerning the coevolution of male traits and female preferences under natural and sexual selection. In particular, we provide general expressions for the intensity of linkage-disequilibrium induced selection experienced by loci that contribute to female preferences for specific male traits. Our general results support the previous observation that these indirect selection forces are so weak that they are unlikely to dominate the evolution of preference-producing loci.

Two questions have recently generated much debate among evolutionary biologists: how is quantitative variation maintained, and how do mating preferences evolve? Both have been investigated using various simplifying assumptions. Quantitative characters are usually assumed to be determined additively, as the sum of effects of many genes, and to have a normal (Gaussian) distribution of breeding values (BULMER 1980). The evolution of mating preferences has been analyzed with two classes of models. Several simple genetic models assume that each trait is governed by a single diallelic locus, and that selection acts on haploids (e.g., KIRKPATRICK 1982a, 1986a,b; SEGER 1985; POMIANKOWSKI 1988). Quantitative-genetic models, on the other hand, assume polygenic inheritance. They are made tractable either by ignoring the underlying genetics and assuming that the distribution of breeding values remains constant and Gaussian (e.g., LANDE 1981; LANDE and KIRKPATRICK 1988), or by following the distributions of allelic effects at individual loci under the assumption that the multivariate distribution of effects across loci is Gaussian (e.g., LANDE 1981; KIRKPATRICK 1986a).

Such simplifications may mislead. More general analyses that allow interactions between large numbers of loci with no assumptions about the number of alleles, or their effects, are required to determine whether conclusions reached under simplifying genetic assumptions are robust. This is especially desirable when data come from whole genomes, rather than from individual loci. The obvious example is quantitative genetics, where the basic genetic information comes from phenotypic correlations among relatives, and where we have no a priori reason to suppose that the underlying variation is strictly additive. Even when additivity is assumed, it is generally not known how robust the conclusions are to assumptions about the number of alleles at individual loci or their distributions of effects. A similar difficulty arises in arguments based on the “genetic load,” which use the distribution of net fitness to set constraints on selection at individual loci. Such arguments are relevant to the evolution of mating preferences, because selection on female preferences for “good genes” depends on the additive genetic variance in fitness (CHARLESWORTH 1987).

1 Present address: Institute of Cell, Animal and Population Biology, University of Edinburgh, King's Buildings, Edinburgh EH9 3JT, Scotland.

Genetics 127: 229-255 (January, 1991)

Some analyses have shown that similar conclusions emerge from simple haploid models, fixed-parameter quantitative genetic models, Gaussian allelic models and simulations of actual multilocus diploid inheritance (e.g., FELSENSTEIN 1979; KIRKPATRICK 1982b). The aim of this paper is to provide a framework for analytical comparison of alternative genetic assumptions. We extend our previous treatment of selection on polygenic characters (TURELLI and BARTON 1990) by including sexual selection for haploids and by relaxing the assumptions of additivity and frequency-independent fitnesses. We provide exact recursions that describe the evolution of allele frequencies at individual loci and also of all associations among loci.


[Figure 1: diagram with three panels. Haploid life cycle: newly produced haploids, viability selection, mating, diploids, viability and fecundity selection, meiosis (with recombination). Diploid life cycle: newly produced haploids, random mating, zygotes, viability selection, adults, meiosis and recombination. Both life cycles: in generation t, frequencies $f(x,x^*) = f(x)f(x^*)$ are changed by $W(x,x^*)$ to $f'(x,x^*)$; recombination then gives $f''(x)$ in generation t + 1.]

FIGURE 1. Illustration of the alternative life cycles. The bottom portion of the illustration indicates the features of the alternative life cycles that are captured by the fitness function $W(x, x^*)$. The frequencies of haploid and diploid genotypes are given below the stages of each life cycle to which they apply. The symbol $f''(x)$ at the bottom right denotes the frequency of haploid genotype x in the newly produced products of meiosis that form generation t + 1. The symbols X and X* denote arbitrary haploid genotypes. See the text for additional information and explanation of the symbols.

These recursions lead to explicit approximations for linkage disequilibria of all orders. The usefulness of such approximations is illustrated by various examples involving selection and assortative mating. We focus on general results concerning the evolution of mating preferences which do not depend on restrictive assumptions about the number of loci or the distributions of allelic effects.

MODEL AND GENERAL RESULTS

Genotype frequencies: Our analysis treats two distinct life cycles (see Figure 1). Both involve discrete generations; and in each, the genotype frequencies of the haploid products of meiosis suffice to describe the evolution of the population.

Haploid life cycle: To simplify the analysis of sexual selection, we consider a dioecious haploid organism. The justification for this simplification is that our analysis captures the central role played by linkage disequilibrium in sexual selection models, but bypasses the additional mathematical complications that arise from nonrandom mating among diploids. After viability selection, which may be sex-dependent, haploid individuals mate (i.e., fuse), in general nonrandomly, to produce monoecious diploids that do not mate. These diploids may be subject to both viability and fecundity selection that depends separately on the male and female genotypes that came together. The diploids undergo meiosis (including recombination) to produce the next generation of haploid individuals. To ensure that the genotype frequencies among the newly produced haploid products of meiosis suffice to describe the dynamics, we require that the loci considered be autosomal and that nonrandom mating occur between haploids rather than diploids. This haploid model is a multilocus generalization of the two-locus model used by KIRKPATRICK (1982a) to analyze sexual selection. The robustness of conclusions reached from haploid sexual selection models has recently been criticized (e.g., CURTSINGER and HEISLER 1988; TOMLINSON 1988), but it remains unclear whether diploid inheritance produces qualitative changes in the dynamics or equilibria apart from allowing the maintenance of variation via overdominance of various forms (cf. KIRKPATRICK 1988; CURTSINGER and HEISLER 1989; GOMULKIEWICZ and HASTINGS 1990, and our discussion of multilocus models below). Our analysis can be generalized to diploid sexual selection, but this generalization is sufficiently complex to require a separate treatment.

Diploid life cycle: Another life cycle that is consistent with our analysis is sex-independent viability selection on random mating diploids. Again we require that the loci considered be autosomal. This diploid model is the usual multilocus viability selection model. A model that included sexual selection, fertility selection or sex-dependent viability selection on diploids would be a straightforward (if complicated) extension: one would follow the frequency of diploid genotypes, and the contributions of pairs of diploid genotypes to the next generation.

TABLE 1

Glossary of notation

Symbol: Usage

$a_{U,V}$: coefficient of association between the sets U in females and V in males
$\tilde{a}_{U,V}$: symmetrized association coefficient between sets U and V, $\tilde{a}_{U,V} = (a_{U,V} + a_{V,U})/2$
$a_{U,\emptyset}$: coefficient of selection for the set of loci U in females
$a_{\emptyset,V}$: coefficient of selection for the set of loci V in males
$\tilde{a}_{U,\emptyset}$: symmetrized selection coefficient for the set of loci U, $\tilde{a}_{U,\emptyset} = (a_{U,\emptyset} + a_{\emptyset,U})/2$
$C_U$: expectation of the product of deviations over loci in the set U, $E(\zeta_U)$; if the indices in U are all distinct, this measures linkage disequilibrium
[symbol illegible]: value of $E(\zeta_U)$ assuming global linkage equilibrium
$C_{U,V}$: association between loci in the set (U,V) before selection: $\hat{E}(\zeta_U \zeta_V^*)$
$C'_{U,V}$: moments of allelic effects across loci after selection, but before recombination, with allelic effects in each sex measured as deviations from the preselection means, $m_i$
[symbol illegible]: moments of allelic effects across loci after selection and recombination, with allelic effects in each sex measured as deviations from the preselection means, $m_i$
$\hat{E}$: expectation over a diploid population, formed by randomly uniting newly produced haploids
$E$: expectation over the haploid population before selection
$f(x)$: frequency of newly produced haploid products of meiosis with genotype x
$f'(x,x^*)$: frequency of diploids after selection, but before recombination
$i, j, k, \ldots$: label individual loci
$m_i$: mean value of $X_i$ over the haploid population before selection: $E(X_i)$
$m_{i,\emptyset}$: mean effect of locus i in females before selection: $\hat{E}(X_i) = m_i$
$m_{\emptyset,i}$: mean effect of locus i in males before selection: $\hat{E}(X_i^*) = m_i$
$m'_{i,\emptyset}$: mean effect of locus i in females after selection
$m'_{\emptyset,i}$: mean effect of locus i in males after selection
$p_i$: frequency of allele $P_i$ at locus i in the diallelic case
$q_i$: $1 - p_i$ in the diallelic case
$r_N$: total frequency of recombination events that disrupt the set N
$r_{S,T}$: frequency of recombination events that partition S + T into disjoint sets S and T
$S, T, U, V, \ldots$: label sets of loci: e.g., $S = \{iij\}$
$U + V$: concatenation of the elements in sets U and V, e.g., if $U = \{ii\}$ and $V = \{ij\}$, $U + V = \{iiij\}$
$U - V$: set obtained by deleting the elements of V from U, e.g., if $U = \{iiijk\}$ and $V = \{ij\}$, $U - V = \{iik\}$
$|U|$: number of elements in U, e.g., $|\{iij\}| = 3$
$W(x,x^*)$: relative contribution of genotype $(x,x^*)$ to the next generation
$\bar{W}$: mean fitness: $\hat{E}(W)$
$X$: a vector denoting a haploid genotype: $(X_1, X_2, \ldots, X_n)$
$X_i$: variable indicating a haploid genotype at locus i; when used to describe events within a generation, it refers to a haploid genotype at locus i in females (for the haploid model) or one that is maternally derived (for the diploid model)
$X_U$: product of $X_i$ over the set U, $\prod_{i \in U} X_i$
$X_i^*$: a haploid genotype at locus i in males (for the haploid model) or one that is paternally derived (for the diploid model)
$\Delta C_N$: change in the central moment $C_N$ between generations
[symbol illegible]: first-order terms in $\Delta C_N$: see Equations 20, 21 and 22
$\zeta_i$: deviation of locus i from its mean in pre-selection haploids: $X_i - m_i$; or the deviation of locus i from the mean in maternal genes: $X_i - m_{i,\emptyset}$ (the context will indicate which of these definitions is appropriate)
$\zeta_i^*$: deviation of locus i from the mean in paternal genes: $X_i^* - m_{\emptyset,i}$
$\zeta_U$: product of $\zeta_i$ over the set U, $\prod_{i \in U} \zeta_i$

Basic notation: We follow the effects of selection, mating (nonrandom for haploids), and recombination on an arbitrary number of loci. The methods and notation used to analyze both life cycles are essentially the same as those used by TURELLI and BARTON (1990) to model natural selection on an additive polygenic character (Table 1). However, it is useful here to express the effects of selection in a slightly different way. The state of the ith locus in a newly produced haploid product of meiosis is denoted by the random variable $X_i$, and the overall haploid genotype is denoted by the vector $X = (X_1, X_2, \ldots, X_n)$. Lowercase letters will be used to indicate specific genotypes. For diallelic loci, as examined in KIRKPATRICK'S (1982a) haploid model of sexual selection discussed below, we will denote the alternative alleles at the ith locus by $P_i$ and $Q_i$. For diallelic loci, it is often convenient to let $X_i$ be an indicator variable for the presence of a specific allele (e.g., $X_i = 1$ if the allele $P_i$ is present, and 0 if $Q_i$ is present). For a continuum of alleles, as used in LANDE'S (1981) additive polygenic model, it is natural to let $X_i$ denote the contribution of this locus to a quantitative character under natural or sexual selection. However, the machinery below does not require additivity; $X_i$ is an arbitrary label for the allelic state. Multiple alleles could be represented either as a special case of the continuum-of-alleles model or, more naturally, by replacing each $X_i$ by a vector, whose vth element is 1 if the vth allele is present, and 0 otherwise. The derivation of the basic equations will not depend on any restrictive assumptions about the distribution of X. We will indicate below how various expressions are affected by different assumptions concerning the number of alleles.

The frequency of newly produced haploid products of meiosis with genotype x is denoted $f(x)$. Because we assume that loci are not sex-linked, $f(x)$ is the same for both sexes. The mean value of $X_i$ is

$$m_i = E(X_i) = \sum_{x} f(x)\, x_i, \tag{1}$$

where E denotes the expectation over the whole haploid population (for continuous X, the sum is replaced by an integral). If there are two alleles at each locus, indicated by $X_i = 1$ or 0, then $m_i$ is just the frequency of $P_i$ (denoted $p_i$ or $f(P_i)$). In a continuum-of-alleles model, $m_i$ denotes the average contribution of locus i. Denote the deviation of the state of the ith locus from its average by

$$\zeta_i = X_i - m_i. \tag{2}$$

Linkage disequilibria of all orders are quantified by the central moments

$$C_{ijkl\ldots} = E(\zeta_i \zeta_j \zeta_k \zeta_l \cdots). \tag{3}$$

For diallelic loci with indicator $X_i$, $C_{ij} = f(P_iP_j) - p_ip_j$ and $C_{ijk} = f(P_iP_jP_k) - p_iC_{jk} - p_jC_{ik} - p_kC_{ij} - p_ip_jp_k$ are the standard measures of two-locus and three-locus disequilibrium. In the diallelic case, such disequilibria, involving only nonrepeated indices, together with the allele frequencies $p_i$, suffice to describe the haploid genotype frequencies. With multiple alleles, higher-order moments involving repeated indices within loci (e.g., $C_{ii}$, $C_{iii}$, ...) and among loci (e.g., $C_{iij}$ and $C_{jij}$) are also needed.
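As an illustration of definitions (1) to (3), the moments can be computed directly from a table of haploid genotype frequencies. The following Python sketch is not part of the original analysis: the frequencies are made up for three diallelic loci, and `expect`, `central_moment` and `joint_freq` are hypothetical helper names. It compares the central moments with the closed-form two-locus and three-locus expressions given above.

```python
from itertools import product
from math import prod

# Hypothetical haploid genotype frequencies f(x) for three diallelic loci.
# Each genotype x is a tuple of indicators: x[i] = 1 for allele P_i, 0 for Q_i.
raw = {x: 1 + x[0] + 2 * x[1] * x[2] + 3 * x[0] * x[1] * x[2]
       for x in product((0, 1), repeat=3)}
total = sum(raw.values())
f = {x: w / total for x, w in raw.items()}


def expect(g):
    """E[g(X)] over the haploid population, as in equation (1)."""
    return sum(fx * g(x) for x, fx in f.items())


# With indicator coding, m_i = E(X_i) is the allele frequency p_i.
p = [expect(lambda x, i=i: x[i]) for i in range(3)]


def central_moment(loci):
    """C_U = E(product of zeta_i over U), with zeta_i = X_i - m_i (equation 3)."""
    return expect(lambda x: prod(x[i] - p[i] for i in loci))


def joint_freq(loci):
    """Frequency of haploids carrying allele P_i at every locus in `loci`."""
    return expect(lambda x: prod(x[i] for i in loci))


c01 = central_moment((0, 1))
c012 = central_moment((0, 1, 2))

# Closed forms quoted in the text for two- and three-locus disequilibria.
c01_closed = joint_freq((0, 1)) - p[0] * p[1]
c012_closed = (joint_freq((0, 1, 2))
               - p[0] * central_moment((1, 2))
               - p[1] * central_moment((0, 2))
               - p[2] * c01
               - p[0] * p[1] * p[2])

print(f"C_01  = {c01:.6f} (closed form {c01_closed:.6f})")
print(f"C_012 = {c012:.6f} (closed form {c012_closed:.6f})")
```

The two numbers on each printed line should agree to rounding error; any other frequency table that sums to one could be substituted for `raw`.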

Because we will be dealing with interactions between many loci, some simplifying notation is useful (summarized in Table 1). Individual loci will be denoted by lowercase letters (i, j, k, l, etc.), and sets of loci will be denoted by uppercase letters (S, T, U, V, etc.). Sets may contain repeated indices, e.g., $U = \{iij\}$. $|U|$ denotes the number of elements in U. The union of two sets U and V, understood as the concatenation of all indices in U and V, is denoted $U + V$; similarly, $U - V$ denotes the set obtained by deleting the elements of V from U. Thus, if $U = \{iijk\}$ and $V = \{ij\}$, $U + V = \{iiijjk\}$, $U - V = \{ik\}$, $U - i = \{ijk\}$, and $|U + V| = 6$. The product of $\zeta_i$ over a set of loci U is denoted

$$\zeta_U = \prod_{i \in U} \zeta_i. \tag{4}$$

Thus, $C_U = E(\zeta_U)$ and $C_i = 0$. Sets may be empty; products over the empty set are defined to be 1, so that $C_\emptyset = 1$. Terms of the form $C_U$ in which U contains repeated indices play an important role in the dynamical equations. With a finite number of alleles at each locus, many higher-order disequilibria involving repeated indices can be reduced to functions of lower-order disequilibria. For example, in the diallelic case with $X_i = 0$ or 1, $X_i^2 = X_i$, so that terms of the form $\zeta_i^2$ can be reduced to linear functions of $\zeta_i$. For instance, $\zeta_i^2 = p_iq_i - (2p_i - 1)\zeta_i$, where $q_i = 1 - p_i$. More generally, we have the reduction formula

$$C_{U+ii} = E(\zeta_i \zeta_i \zeta_U) = E\{[p_iq_i - (2p_i - 1)\zeta_i]\,\zeta_U\} = p_iq_iC_U - (2p_i - 1)C_{U+i} \tag{5}$$

(see (A8) in BARTON 1986a).

The moments $m_i$ and $C_U$ describe the frequencies of haploid genotypes before selection: changes across generations can be described completely by this set of variables, but changes within generations generally cannot. In our haploid model, selection may act differently on the two sexes, and selection and nonrandom mating will generate deviations from Hardy-Weinberg proportions among mated pairs as well as associations between alleles at distinct loci derived from different haploid individuals. Similarly, sex-independent viability selection on diploids will often generate deviations from Hardy-Weinberg proportions among adults and associations between alleles derived from different haploid gametes. These associations would have no effect if diploids in both life cycles did not undergo meiosis, but instead simply dissociated into the haploid genomes that joined to form them. However, recombination and segregation transform associations between loci from different gametes into linkage disequilibria between loci on the same gamete, while also breaking down the disequilibrium within gametes. We therefore need an extended set of notation and coefficients to describe changes that occur within generations.
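The reduction formula (5) for diallelic loci can be checked numerically. The sketch below is an illustration only: the genotype frequencies are invented, sets of loci are represented as Python tuples that may repeat an index (so that `U + (i,)` plays the role of U + i), and the function names `C` and `reduced` are hypothetical.

```python
from itertools import product
from math import prod

# Hypothetical frequencies for three diallelic loci (indicator coding, X_i in {0, 1}).
raw = {x: 1 + x[0] + 2 * x[0] * x[1] + x[1] * x[2]
       for x in product((0, 1), repeat=3)}
total = sum(raw.values())
f = {x: w / total for x, w in raw.items()}

# Allele frequencies p_i = E(X_i).
p = [sum(fx * x[i] for x, fx in f.items()) for i in range(3)]


def C(U):
    """Central moment C_U; U is a tuple of locus indices and may repeat an index.

    The empty tuple gives C = 1, matching the convention for the empty set.
    """
    return sum(fx * prod(x[i] - p[i] for i in U) for x, fx in f.items())


def reduced(U, i):
    """Right-hand side of (5): p_i q_i C_U - (2 p_i - 1) C_{U+i}."""
    return p[i] * (1 - p[i]) * C(U) - (2 * p[i] - 1) * C(U + (i,))


# Pairs (U, i); the left-hand side of (5) is C_{U+ii}.
cases = [((1, 2), 0), ((1,), 0), ((), 2), ((0, 1), 1)]
results = [(C(U + (i, i)), reduced(U, i)) for U, i in cases]

for (U, i), (lhs, rhs) in zip(cases, results):
    print(f"U={U}, i={i}: C_(U+ii)={lhs:.8f}, reduced form={rhs:.8f}")
```

The case with U empty recovers $C_{ii} = p_iq_i$, and the last case shows that the identity also holds when U already contains the repeated locus.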

In the haploid model, we will use $X_i$ to denote the genotype of a female at locus i and let $X_i^*$ denote the genotype of a male at locus i. In the diploid model, $X_i$ and $X_i^*$ denote the alleles at locus i inherited maternally and paternally, respectively. Using $f'$ to denote frequencies after (viability and fecundity) selection (see Figure 1), we will generally have $f'(x,x^*) \neq f'(x)f'(x^*)$, where x refers to the female genotype, x* to the male. The effects of selection will be described by moments such as $m'_{i,\emptyset} = E'(X_i)$, $m'_{\emptyset,i} = E'(X_i^*)$, and $C'_{U,V} = E'(\zeta_U \zeta_V^*)$; the expectations here are taken over the selected population of diploids, with frequencies $f'(x,x^*)$. The exact definition used for these moments after selection is clarified below. For our diploid viability selection model, we have $C_{U,V} = C_UC_V$ among the zygotes, because they are produced by random mating. As discussed in the next section, we will define fitnesses in our haploid model by referring to a hypothetical set of diploids produced by random union of newly produced haploids, so that again $C_{U,V} = C_UC_V$ before viability and fecundity selection. Because selection and mate preferences may differ between the two sexes in our haploid model, $m'_{i,\emptyset}$ and $C'_{U,V}$ will not equal $m'_{\emptyset,i}$ and $C'_{V,U}$ in general.

Definition of selection coefficients: For both life cycles considered, the key quantity is a generalized fitness measure, $W(x,x^*)$, that describes the relative contribution of diploids with genotype $(x,x^*)$ to the next generation. $W(x,x^*)$ need not be symmetric. For the diploid life cycle, $W(x,x^*)$ is simply the viability of $(x,x^*)$ zygotes, whose frequency is $f(x)f(x^*)$ under our assumption of random mating. For the haploid life cycle with sexual selection, we will define $W(x,x^*)$ so that it includes the effects of viability selection on haploid and diploid phases, fertility interactions between x and x*, and nonrandom union of haploid individuals. All the complications of natural and sexual selection are absorbed into this function, which is defined by letting the relative number of all haploid offspring produced by matings between females with genotype x and males with genotype x* be $W(x,x^*)f(x)f(x^*)$. Using this definition, the fraction of all haploid offspring produced by $(x,x^*)$ matings is $W(x,x^*)f(x)f(x^*)/\bar{W}$, where $\bar{W} = \sum_{x,x^*} W(x,x^*)f(x)f(x^*) \equiv \hat{E}[W(X,X^*)]$. Here $\hat{E}$ denotes expectation with respect to a hypothetical diploid population formed at the beginning of the current generation by random union of the haploid gametes produced by the previous generation. Fitness is defined in this way to simplify the derivation of equations describing the effects of selection. We can use exactly the same notation in the diploid viability selection model, but there $\hat{E}$ denotes expectation with respect to the zygote frequencies that are actually produced by random mating. For the remainder of our general theoretical development, we will not refer specifically to the diploid life cycle, because the same mathematical treatment applies. For heuristic purposes, we will describe some of the notation introduced below in terms of the haploid life cycle with sexual selection. The EXAMPLES presented later will remove any ambiguity about their interpretation in the simpler context of diploid viability selection.

The response to selection can be simply derived by writing the fitness $W(x,x^*)$ in terms of a particular set of selection coefficients. To define these, we express the relative fitnesses $W/\bar{W}$ as a function of the deviations of allelic effects, $\zeta_i$ and $\zeta_i^*$, in the following form:

$$\frac{W}{\bar{W}} = 1 + \sum_{U} a_{U,\emptyset}(\zeta_U - C_U) + \sum_{V} a_{\emptyset,V}(\zeta_V^* - C_V) + \sum_{U,V} a_{U,V}(\zeta_U - C_U)(\zeta_V^* - C_V), \tag{6}$$

where the sums are over all nonempty U and V and the coefficients $a_{U,V}$ are in general complicated functions of the haploid genotype frequencies. Note that this expansion reproduces $\hat{E}(W) = \bar{W}$. With discrete alleles, such an expression can always be written down by enumerating the fitnesses of each genotype. For example, consider a single selected locus with two alleles. If genotypes (Q,Q), (Q,P), (P,Q) and (P,P) have fitnesses $W_{00}$, $W_{01}$, $W_{10}$ and $W_{11}$, we can express the fitness of an arbitrary genotype by

$$W(X,X^*) = (1 - X)(1 - X^*)W_{00} + (1 - X)X^*W_{01} + X(1 - X^*)W_{10} + XX^*W_{11}, \tag{7}$$

where X and X* are 0 if allele Q is present and 1 if allele P is present. The coefficients $a_{U,V}$ are obtained by substituting $X = \zeta + p$ and $X^* = \zeta^* + p$ and identifying terms. In this example,

$$\begin{aligned}
a_{1,\emptyset}\bar{W} &= q(W_{10} - W_{00}) + p(W_{11} - W_{01}),\\
a_{\emptyset,1}\bar{W} &= q(W_{01} - W_{00}) + p(W_{11} - W_{10}),\\
a_{1,1}\bar{W} &= W_{00} - W_{01} - W_{10} + W_{11}, \text{ and}\\
a_{U,V} &= 0 \text{ for all other } U \text{ and } V,
\end{aligned} \tag{8}$$

with $\bar{W} = q^2W_{00} + pq(W_{01} + W_{10}) + p^2W_{11}$. This shows that the selection coefficients defined by (6) will generally depend on the composition of the population. More complex examples involving phenotypic fitness functions and multiple alleles are described in the EXAMPLES section. This procedure for calculating selection coefficients can be automated; for instance, Taylor series representations of the selection coefficients can be derived and implemented as Mathematica (WOLFRAM 1988) notebooks.
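The one-locus coefficients in (8) can be checked by substituting them back into the expansion (6). The following Python sketch is an illustration under stated assumptions: the fitness values and allele frequency are arbitrary, `W[x][xs]` holds the fitness of a pair with female allele x and male allele xs (0 for Q, 1 for P), and the function names are hypothetical.

```python
from math import isclose


def selection_coefficients(W, p):
    """Mean fitness and the three nonzero coefficients of (8) for one diallelic locus."""
    q = 1 - p
    w_bar = q * q * W[0][0] + p * q * (W[0][1] + W[1][0]) + p * p * W[1][1]
    a_female = (q * (W[1][0] - W[0][0]) + p * (W[1][1] - W[0][1])) / w_bar
    a_male = (q * (W[0][1] - W[0][0]) + p * (W[1][1] - W[1][0])) / w_bar
    a_assoc = (W[0][0] - W[0][1] - W[1][0] + W[1][1]) / w_bar
    return w_bar, a_female, a_male, a_assoc


def expansion(x, xs, p, a_female, a_male, a_assoc):
    """Right-hand side of (6) for one locus, where C_1 = 0 so zeta_U - C_U = zeta."""
    z, zs = x - p, xs - p
    return 1 + a_female * z + a_male * zs + a_assoc * z * zs


# Hypothetical fitnesses W00, W01, W10, W11 and allele frequency p.
W = [[1.0, 0.9], [1.2, 1.5]]
p = 0.3
w_bar, a_female, a_male, a_assoc = selection_coefficients(W, p)

rebuilt = {(x, xs): expansion(x, xs, p, a_female, a_male, a_assoc)
           for x in (0, 1) for xs in (0, 1)}

for (x, xs), value in rebuilt.items():
    print(f"W{x}{xs}/W_bar = {W[x][xs] / w_bar:.6f}, expansion = {value:.6f}")
```

Because the coefficients depend on p through both the numerators and the mean fitness, rerunning the sketch with a different `p` changes all three coefficients even though the fitnesses are fixed, which is the frequency dependence noted in the text.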

The coefficient $a_{U,\emptyset}$ in (6) describes selection acting on the combination of loci U in females. As shown below, the coefficient $a_{U,V}$ describes selection favoring associations between the sets of loci U in females and V in males. For convenience, we refer to these coefficients as association coefficients between the sets U and V to distinguish them from the forces of selection acting within haploid genomes, namely $a_{U,\emptyset}$ and $a_{\emptyset,V}$. Analyzing sexual selection on diploids would require considering selection coefficients acting on four sets of loci: maternally and paternally inherited genomes in both females and males.

Selection may act to create associations in several ways. A coefficient $a_{U,U}$ may be caused by haploid individuals mating assortatively for genotypes at loci in the set U, while $a_{U,V}$ may be due to a preference of females determined by loci in U for males with specific genotypes in V. Associations are also produced by non-additive selection on diploids, because (6) defines fitness interactions (measured by the association coefficients) as deviations from additivity, rather than from multiplicative independence. As shown by (8), dominance in fitness at locus i contributes to $a_{i,i}$. Additive epistasis between loci i and j could select for an association of alleles inherited from the same parent ($a_{ij,\emptyset}$), as well as from different parents ($a_{i,j}$). Even with random union of haploid individuals, selection on the haploid stage would, with the definition used here, produce weak selection for associations (see EXAMPLES). This is because with random union, the frequency of mating pairs is the product of the genotype frequencies after selection. Therefore, if these frequencies have been changed by selection at the haploid stage (as measured by $a_{U,\emptyset}$ and $a_{\emptyset,V}$), their product will contain terms (proportional to $a_{U,\emptyset}a_{\emptyset,V}$) that contribute to the association term $a_{U,V}$.
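The last point can be made concrete for a single diallelic locus. The sketch below assumes, purely for illustration, that selection acts only on haploids (allele P changes female fitness by a factor 1 + s and male fitness by a factor 1 + t) and that union is random, so that W(x,x*) is the product of the two haploid fitnesses. Under these assumptions the one-locus formulas in (8) give an association coefficient equal to the product of the two within-sex coefficients. The parameter values and function name are hypothetical.

```python
from math import isclose


def selection_coefficients(W, p):
    """Mean fitness and the coefficients of (8) for one diallelic locus."""
    q = 1 - p
    w_bar = q * q * W[0][0] + p * q * (W[0][1] + W[1][0]) + p * p * W[1][1]
    a_female = (q * (W[1][0] - W[0][0]) + p * (W[1][1] - W[0][1])) / w_bar
    a_male = (q * (W[0][1] - W[0][0]) + p * (W[1][1] - W[1][0])) / w_bar
    a_assoc = (W[0][0] - W[0][1] - W[1][0] + W[1][1]) / w_bar
    return w_bar, a_female, a_male, a_assoc


# Hypothetical haploid selection: s acts on females, t on males; p is the frequency of P.
s, t, p = 0.2, -0.1, 0.4
w_female = [1.0, 1.0 + s]
w_male = [1.0, 1.0 + t]

# Random union after haploid selection: pair fitness is the product of haploid fitnesses.
W = [[w_female[x] * w_male[xs] for xs in (0, 1)] for x in (0, 1)]

w_bar, a_female, a_male, a_assoc = selection_coefficients(W, p)

print(f"a_female = {a_female:.6f}, expected s/(1+sp) = {s / (1 + s * p):.6f}")
print(f"a_male   = {a_male:.6f}, expected t/(1+tp) = {t / (1 + t * p):.6f}")
print(f"a_assoc  = {a_assoc:.6f}, product a_female*a_male = {a_female * a_male:.6f}")
```

The association coefficient here is second order in s and t, consistent with the description of this effect as weak.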

Generally, it is simplest to write the expansion (6) in terms of some minimal set of selection coefficients that suffice to describe the model and selection regime. In particular, for diallelic loci we can use the reduction formula that produces (5) to express the fitness function so that it is linear in each $\zeta_i$ and $\zeta_i^*$. Thus, for diallelic loci, we can choose to consider only selection coefficients for sets U and V that do not contain repeated indices. However, for some purposes it may be helpful to retain “unnecessary” coefficients; and this does not affect the calculations below. For instance, the term $a_{ii,\emptyset}$ can describe stabilizing selection on the ith locus, so it may be useful to retain this coefficient even for the diallelic case in which the identity $\zeta_i^2 = p_iq_i - \zeta_i(2p_i - 1)$ shows that it can be absorbed into the coefficient $a_{i,\emptyset}$. Similarly, for many models the fitness expansion can be written more simply by letting the sum over U and V include both orderings U,V and V,U where U ≠ V, and counting all permutations of indices (e.g., $\{ijk\}$ and $\{ikj\}$) separately, even though each of these coefficients is identical. The only requirement in our analysis is that the same conventions be used for defining the selection coefficients and carrying out the calculations described below.

The method of TURELLI and BARTON (1990) for analyzing frequency-independent viability selection on diploids can be extended to cover the frequency-dependent selection schemes described above. This is possible because these forms of selection alter the frequencies of diploids in an easily described manner (see Eq. 9 below). Sexual selection on haploids will often distinguish between cis and trans diploid combinations, whereas sex-independent selection on diploids generally does not. Although the mathematical description of selection is not affected, sex-specific genotype frequencies after selection lead to some complications in adapting the method of TURELLI and BARTON (1990), which is based on "selection gradients": derivatives of log mean fitness with respect to the chosen variables. The relationship between the alternative approaches is described in APPENDIX A.

Selection and recombination: We proceed in three steps. We first consider the changes in genotype frequency produced by selection and show how these change the sex-specific means of allelic effects and the moments of products of deviations across loci. To facilitate the analysis of recombination, these deviations are computed relative to the initial means before selection. Second, we consider the effects of mating and recombination on these (noncentral) moments. The final step is to account for the effects of changes in the mean on the values of the central moments $C_N$, defined in relation to the means in the new haploid population.

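The change in genotype frequency caused by selection is simply

$$\Delta_s f(x,x^*) = f(x,x^*)\Big[\sum_U a_{U,\emptyset}(\zeta_U - C_U) + \sum_V a_{\emptyset,V}(\zeta_V^* - C_V) + \sum_{U,V} a_{U,V}(\zeta_U - C_U)(\zeta_V^* - C_V)\Big]. \tag{9}$$

Hence, using (9) and the relations $m_i = \sum_{x,x^*} x_i f(x,x^*)$ and $\sum_{x,x^*} \Delta_s f(x,x^*) = 0$, we see that

$$\Delta_s m_{i,\emptyset} = \sum_{x,x^*} x_i[f(x,x^*) + \Delta_s f(x,x^*)] - m_i = \sum_{x,x^*} x_i \Delta_s f(x,x^*) = \sum_{x,x^*} \zeta_i \Delta_s f(x,x^*) = \sum_U a_{U,\emptyset} C_{U+i}, \tag{10a}$$

and

$$\Delta_s m_{\emptyset,i} = \sum_V a_{\emptyset,V} C_{V+i}. \tag{10b}$$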

The sums are taken over the same sets $U$ and $V$ used to define the selection coefficients in (6).

Changes in linkage disequilibria can be calculated similarly. However, to describe the effects of recombination, it is useful to measure the deviations of allelic effects in both males and females after selection relative to a reference point that is the same for both sexes. For simplicity, we will measure the moments after selection relative to the means $m_i$ before selection. Unless the means are not changed by selection, these moments will not be central moments of the multivariate distribution produced by selection. Let $C'_{S,T}$ denote moments of allelic effects after selection, but before recombination, with allelic effects in each sex measured as deviations from the initial $m_i$. Taking the expectation with respect to the diploid genotype frequencies after selection ($f'(x,x^*) = f(x,x^*) + \Delta_s f(x,x^*)$), we have

$$\Delta_s C_{S,T} \equiv C'_{S,T} - C_{S,T} = \sum_{x,x^*} \zeta_S \zeta_T^* \Delta_s f(x,x^*). \tag{11}$$

Substituting Equation 9 for $\Delta_s f(x,x^*)$ produces

$$\Delta_s C_{S,\emptyset} = \sum_U a_{U,\emptyset}(C_{S+U} - C_S C_U), \tag{12a}$$

$$\Delta_s C_{\emptyset,T} = \sum_V a_{\emptyset,V}(C_{T+V} - C_T C_V), \quad\text{and} \tag{12b}$$

$$\Delta_s C_{S,T} = \sum_U a_{U,\emptyset}(C_{S+U} - C_S C_U)C_T + \sum_V a_{\emptyset,V}C_S(C_{T+V} - C_T C_V) + \sum_{U,V} a_{U,V}(C_{S+U} - C_S C_U)(C_{T+V} - C_T C_V) \tag{12c}$$

for $|S|, |T| > 0$.

Equation 12a shows that linkage disequilibria between maternal loci ($C_{S,\emptyset}$) can be produced only by selection coefficients on maternal loci, $a_{U,\emptyset}$. In any model where the marginal fitnesses of both sexes are the same for all genotypes, the frequencies of haploid genotypes are not changed by selection: $a_{U,\emptyset}$, $a_{\emptyset,V}$ are all zero, and only the association coefficients $a_{U,V}$ contribute. For example, we show below that assortative mating between genotypes in the set $U$ generates associations $a_{U,U}$, and associations involving subsets of $U$, but it does not cause any direct selection (i.e., $a_{V,\emptyset} = a_{\emptyset,V} = 0$ for all $V$). A preference of females carrying specific alleles in $U$ for males carrying alleles in $V$ contributes to associations between these sets of genes, $a_{U,V}$, and may also generate sexual selection on males for the set $V$. Many models of phenotypic assortment also produce both direct selection and nonzero association coefficients. Note that coefficients of association, $a_{U,V}$, do not create linkage disequilibria within gametes (Eqs. 12a,b): they can only build up associations between different gametes (Eq. 12c). These are transformed into linkage disequilibria by recombination.
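As a numerical illustration of Equation 10a, the following minimal sketch compares the change in female allele frequencies obtained by brute-force enumeration with the sum $\sum_U a_{U,\emptyset}C_{U+i}$. It assumes two diallelic loci coded 0/1 and uses only the marginal female fitness $1 + \sum_U a_{U,\emptyset}(\zeta_U - C_U)$, because the male and association terms in (9) average to zero over males. The function names and the coefficient values are hypothetical and chosen only for illustration.

```python
def central_moment(f, p, loci):
    """E[prod_{i in loci} (X_i - p_i)] under the haploid distribution f.
    Indices in `loci` may repeat (e.g. (0, 0, 1))."""
    total = 0.0
    for x, fx in f.items():
        term = fx
        for i in loci:
            term *= x[i] - p[i]
        total += term
    return total


def female_mean_change(f, a_direct):
    """Return (direct, eq10a): the change in female allele frequencies after
    selection, by enumeration and from the sum in Eq. 10a.

    f        : dict mapping genotype tuples to frequencies (sums to 1)
    a_direct : dict mapping index tuples U to coefficients a_{U,0}
               (illustrative values, not taken from the paper)
    """
    n = len(next(iter(f)))
    p = [sum(fx * x[i] for x, fx in f.items()) for i in range(n)]

    def marginal_fitness(x):
        # 1 + sum_U a_{U,0} (zeta_U - C_U)
        w = 1.0
        for U, a in a_direct.items():
            zeta = 1.0
            for i in U:
                zeta *= x[i] - p[i]
            w += a * (zeta - central_moment(f, p, U))
        return w

    direct = [sum(fx * marginal_fitness(x) * x[i] for x, fx in f.items()) - p[i]
              for i in range(n)]
    eq10a = [sum(a * central_moment(f, p, U + (i,)) for U, a in a_direct.items())
             for i in range(n)]
    return direct, eq10a


f = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
a_direct = {(0,): 0.05, (1,): -0.02, (0, 1): 0.03}
direct, eq10a = female_mean_change(f, a_direct)
print(direct, eq10a)
```

The two lists agree to rounding error for any normalized distribution `f`, which reflects the fact that (10a) is an identity rather than an approximation.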

Next, we consider the effects of mating and recombination. Let $m'_{i,\emptyset}$ and $m'_{\emptyset,i}$ denote the sex-specific means after selection and let $m''_i$ denote the mean of $X_i$ among newly formed haploids in the next generation. Because recombination does not affect allele frequencies and because half of the alleles at each locus are contributed by each sex,

$$m''_i = (m'_{i,\emptyset} + m'_{\emptyset,i})/2. \tag{13}$$

It is easy to describe the effects of recombination on products across loci as long as the same reference point has been used to measure deviations in each sex. A recombination event splits the genome into two disjoint sets. Let the frequency of recombination events that partition the loci into the sets $S$ and $T$ be $r_{S,T}$. By symmetry, it suffices to consider only one of $r_{S,T}$ and $r_{T,S}$. Thus, for two loci with recombination rate $r$, the relevant partitions have probabilities $r_{\emptyset,\{1,2\}} = 1 - r$ and $r_{\{1\},\{2\}} = r$. Although $S$ and $T$ may involve repeated indices, these do not affect the recombination rates, e.g., $r_{\{iij\},\{kk\}} = r_{\{ij\},\{k\}}$. We will also follow the convention that permutations of the sets $S$ and $T$ will not be distinguished, e.g., $r_{\{ij\},\{k\}}$ is the frequency of all recombination events in which loci $i$ and $j$ are derived from one haploid genome and $k$ is derived from the other. (This notation, which is consistent with that in TURELLI and BARTON (1990), is a simplified version of that used by HASTINGS (1986) and BARTON (1986a). $r_{S,T}$ is equivalent to their $r_{S|T}$. CHRISTIANSEN (1988) gives a more formal review of the effects of recombination on multilocus systems.) Following the convention used in (11), let $C''_N$ denote moments of allelic effects after both selection and recombination, with allelic effects measured as deviations from the means, $m_i$, before selection. For each partition $\{S,T\}$ of the set $N$ (i.e., $S + T = N$), half of the resulting gametes are derived from recombination events that bring the loci in set $S$ from the paternal genome and $T$ from the maternal; the other half comes from the reciprocal partition, where $S$ comes from the maternal genome, and $T$ from the paternal. Thus, summing over all nontrivial recombination partitions $\{S,T\}$ of the set $N$ (i.e., $S + T = N$, with $S$ and $T$ disjoint, $S \neq \emptyset$, and $T \neq \emptyset$),

$$C''_N = (1 - r_N)\frac{C'_{N,\emptyset} + C'_{\emptyset,N}}{2} + \sum_{S+T=N} r_{S,T}\frac{C'_{S,T} + C'_{T,S}}{2}, \tag{14}$$

where $r_N$ is the total frequency of recombination events that disrupt the set $N$ (i.e., $r_N = \sum_{S+T=N} r_{S,T}$, with $S$ and $T$ both nonempty) and $1 - r_N = r_{\emptyset,N}$. This slight modification of Equation 2.19 of TURELLI and BARTON (1990) allows for the possibility that selection acts differently on female and male loci, so that $C'_{S,T} \neq C'_{T,S}$. We have specified the reference points $m_i$ for the calculation of $C'_{S,T}$, but (14) applies regardless of the reference points, provided that the same ones are used in both sexes.
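The partition frequencies $r_{S,T}$ can be enumerated explicitly in simple cases. The sketch below assumes loci arranged in order on one chromosome with independent crossovers between adjacent loci (no interference); that assumption, the function names, and the example rates are illustrative and are not part of the general framework, which takes the $r_{S,T}$ as given.

```python
from itertools import combinations


def partition_rates(adjacent_r):
    """Map each unordered partition {S, T} of loci 0..n-1 to r_{S,T}.

    adjacent_r[j] is the recombination rate between loci j and j+1.
    ASSUMPTION: crossovers in different intervals are independent.
    Locus 0 is always placed in S so each unordered partition appears once;
    the entry with empty T is the non-recombinant class r_{0,N}.
    """
    n = len(adjacent_r) + 1
    everything = frozenset(range(n))
    rates = {}
    for k in range(n):
        for rest in combinations(range(1, n), k):
            S = frozenset((0,) + rest)
            T = everything - S
            prob = 1.0
            for j, r in enumerate(adjacent_r):
                # an interval contributes r if its flanking loci are split
                split = (j in S) != ((j + 1) in S)
                prob *= r if split else 1.0 - r
            rates[(S, T)] = prob
    return rates


def total_disruption(rates):
    """r_N: total frequency of nontrivial partitions (T nonempty)."""
    return sum(prob for (S, T), prob in rates.items() if T)


three = partition_rates([0.1, 0.2])
for (S, T), prob in sorted(three.items(), key=lambda kv: sorted(kv[0][0])):
    print(sorted(S), sorted(T), round(prob, 4))
print("r_N =", total_disruption(three))
```

With two loci this reduces to $r_{\emptyset,\{1,2\}} = 1 - r$ and $r_{\{1\},\{2\}} = r$, and for any number of loci the frequencies sum to one, so that $1 - r_N = r_{\emptyset,N}$.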

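Combining the effects of recombination and selection, we obtain

$$\Delta m_i \equiv m''_i - m_i = (\Delta_s m_{i,\emptyset} + \Delta_s m_{\emptyset,i})/2 \tag{15a}$$

and

$$C''_N - C_N = -\sum_{S+T=N} r_{S,T}(C_N - C_S C_T) + (1 - r_N)\frac{\Delta_s C_{N,\emptyset} + \Delta_s C_{\emptyset,N}}{2} + \sum_{S+T=N} r_{S,T}\frac{\Delta_s C_{S,T} + \Delta_s C_{T,S}}{2}. \tag{15b}$$

Here, and throughout the manuscript, we follow the convention used in (14). Sums involving recombination are over nontrivial recombination partitions; $r_{S,T}$ is the total frequency of recombination events that split the set $N$ into $S$ and $T$, regardless of the configuration of the other loci; and $r_N$ is as in (14). The recombination events $r_{S,T}$ and $r_{T,S}$ have been lumped together; sums are taken over only one of these orderings, and the coefficient $r_{S,T}$ includes both. Substituting the selection Equations 10 and 12 gives

$$\Delta m_i = \sum_U \tilde{a}_{U,\emptyset} C_{U+i} \tag{16}$$

and

$$C''_N - C_N = -\sum_{S+T=N} r_{S,T}(C_N - C_S C_T) + \sum_U \tilde{a}_{U,\emptyset}\Big[(1 - r_N)(C_{N+U} - C_N C_U) + \sum_{S+T=N} r_{S,T}(C_{S+U}C_T + C_S C_{T+U} - 2C_S C_T C_U)\Big] + \sum_{U,V} \tilde{a}_{U,V}\sum_{S+T=N} r_{S,T}(C_{S+U} - C_S C_U)(C_{T+V} - C_T C_V), \tag{17}$$

with

$$\tilde{a}_{U,\emptyset} = \tilde{a}_{\emptyset,U} = (a_{U,\emptyset} + a_{\emptyset,U})/2 \quad\text{and}\quad \tilde{a}_{U,V} = (a_{U,V} + a_{V,U})/2. \tag{18}$$

These equations show that the dynamics depend only on the average selection over the two sexes, denoted $\tilde{a}_{U,V}$. Equation 16 shows that allele frequencies are not affected by selection favoring associations between haploid genotypes (i.e., terms of the form $\tilde{a}_{U,V}$ with both $U$ and $V$ nonempty). The sums over $U$ and $V$ in (16) and (17) must follow the same conventions used in (6) to define the selection coefficients.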

Finally, in order to compute $\Delta C_N$, the change in the central moments $C_N$ between two successive generations of newly formed haploids, we must take into account the changes in the means. In the new generation, the moments are calculated relative to $m''_i = (m'_{i,\emptyset} + m'_{\emptyset,i})/2 = m_i + \Delta m_i$. Let $\Delta f(x)$ denote the change in the multivariate distribution of haploid genotypes produced by one generation of selection and recombination; then

$$\Delta C_N = \sum_x \prod_{i\in N}(x_i - m''_i)[f(x) + \Delta f(x)] - C_N = \sum_x \prod_{i\in N}(\zeta_i - \Delta m_i)[f(x) + \Delta f(x)] - C_N. \tag{19}$$

From the definitions, $C''_N - C_N = \sum_x \zeta_N \Delta f(x)$. Later in this section, we will expand (19) completely.

However, we first derive an expression, denoted $\hat{\Delta}C_N$, that ignores all products of the form $\Delta m_i\Delta m_j$ and $\Delta m_i\Delta f(x)$ in (19) (i.e., it ignores terms that are second-order in selection, recombination, or both). This leading-order term is a good approximation when changes are slow, at least after an initial period during which potentially large initial values of linkage disequilibrium are reduced to values consistent with the intensity of selection (see APPENDIX B). Moreover, the exact expressions that apply for arbitrarily strong selection and recombination can be written using only these leading-order terms as shown in (22) below. Expanding the product in (19) and using (17), we have

$$\hat{\Delta}C_N = \sum_x \zeta_N\Delta f(x) - \sum_{i\in N} C_{N-i}\Delta m_i = C''_N - C_N - \sum_{i\in N} C_{N-i}\Delta m_i \tag{20}$$

and

$$\hat{\Delta}C_N = -\sum_{S+T=N} r_{S,T}(C_N - C_S C_T) + \sum_U \tilde{a}_{U,\emptyset}\Big[(1 - r_N)(C_{N+U} - C_N C_U) + \sum_{S+T=N} r_{S,T}(C_{S+U}C_T + C_S C_{T+U} - 2C_S C_T C_U) - \sum_{i\in N} C_{N-i}C_{U+i}\Big] + \sum_{U,V} \tilde{a}_{U,V}\sum_{S+T=N} r_{S,T}(C_{S+U} - C_S C_U)(C_{T+V} - C_T C_V). \tag{21}$$

The sums are taken over sets $U,V$ as in (6) and over nontrivial recombination partitions $\{S,T\}$ as in (14). We show below how to calculate the changes in the disequilibria exactly, taking into account all of the higher-order terms ignored in $\hat{\Delta}C_N$.

The first sum in (21) describes the effect of recombination in the absence of selection. The sum over $U$ describes the effect of direct selection on genotype frequencies and provides a first-order correction for changes in the mean. The sum over $U$ and $V$ describes the interaction between recombination and selection: combinations of haploid genotypes that produce an excess of offspring will produce an excess of the corresponding recombinants. It is this final term that can transform mating preferences (as expressed by $a_{U,V}$) into linkage disequilibria ($C_N$). It shows that selection favoring associations only affects linkage disequilibria if there is recombination. This is because an excess of alleles in the set $U$ on one chromosome associated with alleles in $V$ on the other will only alter the joint frequency of alleles on the same chromosome (and, hence, $C_N$) if recombination can bring the two sets together (cf. KIRKPATRICK 1982a).

If the means are not changing ($\Delta m_i = 0$), we can ignore the change in reference point that complicates (19), and the leading-order expression, $\hat{\Delta}C_N$, is exactly $\Delta C_N$. In general, if selection is weak (i.e., if the maximum of all $|a_{U,V}|$ is small) and the pairwise recombination rates are not too small, the terms ignored in $\hat{\Delta}C_N$ will be negligible after a short initial period during which recombination will eliminate most of the linkage disequilibrium that may be initially present. This weak-selection approximation, which applies when means change slowly, will be discussed further below. An exact expression for $\Delta C_N$, including all powers of $\Delta m_i$, can be found by extending (A1.20) of TURELLI and BARTON (1990) as follows. Expanding the product in (19), and using (20) and $\Delta m_i = \sum_x \zeta_i\Delta f(x)$,

$$\Delta C_N = \sum_{S\subseteq N} \hat{\Delta}C_S \prod_{i\in N-S}(-\Delta m_i) - \sum_{S\subseteq N}(|S| - 1)\,C_{N-S}\prod_{i\in S}(-\Delta m_i). \tag{22}$$

The sum is over all nonempty subsets $S \subseteq N$ (including $S = N$), with the convention that if $|N| = n$, there are $n$ subsets of size $n - 1$, $\binom{n}{2}$ of size $n - 2$, etc., even though many of these may be identical. For instance, if $N = \{iiii\}$, there are four subsets $\{iii\}$. In our weak selection approximation, the leading-order term is $\hat{\Delta}C_N$, as given by (21); this corresponds to the term $S = N$ in the first sum of (22). With weak selection, the corresponding term in the second sum, the product of $(-\Delta m_i)$ over the set $N$, is the contribution of highest order in $a = \max_{U,V}\{|a_{U,V}|\}$. Expression (22) applies for arbitrarily strong selection, and it shows that the exact dynamics of linkage disequilibria can be obtained in terms of first-order expressions (21). If $|N| = 2$, (22) reduces to

$$\Delta C_{ij} = \hat{\Delta}C_{ij} - \Delta m_i\Delta m_j, \tag{23}$$

as given in TURELLI and BARTON (1990).

When there are two alleles at each locus and we use $X_i = 0$ or 1, the sums in (16) and (21) need only be taken over sets without repeated indices. This is because $\zeta_i^2$ can be reduced to a linear function of $\zeta_i$ (see Eq. 5). Thus, when defining the selection coefficients $a_{U,V}$ for diallelic loci, it is unnecessary to include $U$ and $V$ with repeated indices, because these would all be zero once the reduction is performed. This reduction of the fitness function need not be carried out, and coefficients corresponding to repeated indices (e.g., $iijk$) could be included without error. However, the equations for terms such as $C_{iijk}$ would be redundant, because (5) shows that such moments can always be expressed in terms of the lower moments $C_{ijk}$, $C_{jk}$ and allele frequencies. The key point is that the expressions derived for the $a_{U,V}$ depend on how the fitness function is expressed, and these coefficients will depend on whether or not the reduction noted above has been carried out. With more than two alleles at each locus, coefficients involving repeated indices are needed; and in the continuum-of-alleles model, all possible sets must be included in the summations. For example, in the continuum model $a_{ii,\emptyset}$ measures the strength of disruptive or stabilizing selection on locus $i$.
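The two-locus identity (23) can be checked numerically without specifying any model of selection. The sketch below takes two arbitrary haploid distributions, before and after one generation, and compares the exact change in the covariance with the leading-order change corrected by $\Delta m_i\Delta m_j$. For $|N| = 2$ the terms $C_{N-i}\Delta m_i$ in (20) vanish because first central moments are zero, so $\hat{\Delta}C_{ij} = C''_{ij} - C_{ij}$. The function names and the example frequencies are hypothetical.

```python
def mean(f, i):
    """Mean of X_i under the two-locus haploid distribution f."""
    return sum(fx * x[i] for x, fx in f.items())


def moment_about(f, ref):
    """E[(X_0 - ref_0)(X_1 - ref_1)] under f, for a given reference point."""
    return sum(fx * (x[0] - ref[0]) * (x[1] - ref[1]) for x, fx in f.items())


def pairwise_change(f_old, f_new):
    """Return (exact, via_eq23) for the change in C_ij between generations."""
    m_old = [mean(f_old, 0), mean(f_old, 1)]
    m_new = [mean(f_new, 0), mean(f_new, 1)]
    dm = [m_new[0] - m_old[0], m_new[1] - m_old[1]]
    c_old = moment_about(f_old, m_old)
    # exact change: central moment about the NEW means minus the old one
    exact = moment_about(f_new, m_new) - c_old
    # leading-order change: new moment about the OLD means (C'') minus C
    leading = moment_about(f_new, m_old) - c_old
    return exact, leading - dm[0] * dm[1]


f_old = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
f_new = {(0, 0): 0.25, (0, 1): 0.2, (1, 0): 0.15, (1, 1): 0.4}
print(pairwise_change(f_old, f_new))
```

Both returned values coincide for any pair of normalized distributions, however large the change in the means.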

Approximations for weak selection: Finally, we can find an explicit approximation for the linkage disequilibria when the selection and association coefficients are weak relative to recombination. We will demonstrate how to approximate linkage disequilibria to leading order in the intensity of selection. As above, we will describe the strength of selection by $a$, which denotes the maximum of the $|a_{U,V}|$. The argument is considered in detail in Appendix 2 of TURELLI and BARTON (1990), and is only outlined here. In APPENDIX B, we discuss the time scale over which these approximations become accurate, given an arbitrary set of initial values. Let $C_N$ denote the moment of interest and let $\mathcal{C}_N$ denote its value at linkage equilibrium (e.g., if $i$, $j$ and $k$ are distinct, $\mathcal{C}_{ijk} = 0$ and $\mathcal{C}_{iijjkkk} = C_{ii}C_{jj}C_{kkk}$). If selection is weak, disequilibria will become small and the change in disequilibria, $\Delta(C_N - \mathcal{C}_N)$, will be dominated by two types of terms: first, the term $-r_N(C_N - \mathcal{C}_N)$, which represents the breakdown of disequilibria by recombination; and second, a driving term, denoted $g_N$, which represents the net increase in disequilibria due to selection. This term depends in a complicated way on the moments $m_i$ and $C_U$; under weak selection, however, it can be approximated by the expression, denoted $\hat{g}_N$, that the selection term in (21) would take in the absence of disequilibria. The population will rapidly reach a "quasi-equilibrium" in the sense of NAGYLAKI (1976), with $C_N - \mathcal{C}_N \approx \hat{g}_N/r_N$; it will then respond more slowly to the weak selection, as the within-locus moments that enter $\hat{g}_N$ change. In the EXAMPLES, we will show how this technique can be extended to find more accurate approximations that include second-order terms.

Consider the case where the set $N$ contains no repeated indices: other cases can be dealt with in the same way, though we have not been able to find a simple form for the general result. For any partition $\{S,T\}$ of $N$, products such as $C_S C_T$ can be neglected if selection is weak, because products of disequilibria will be proportional to products of selection coefficients. Thus, the leading term in the first summation of (21) is simply $-r_N C_N$, as noted above. The summations in (21) that describe the effects of selection and association ($\tilde{a}_{U,\emptyset}$, $\tilde{a}_{U,V}$) will be dominated by terms that depend only on within-locus moments, rather than on the small disequilibria. At "quasi-linkage equilibrium" (abbreviated QLE), (21) reduces to

$$0 = -r_N C_N + (1 - r_N)\sum_U \tilde{a}_{U,\emptyset}\,\mathcal{C}_{N+U} + \sum_{U,V} \tilde{a}_{U,V}\sum_{S+T=N} r_{S,T}\,\mathcal{C}_{S+U}\,\mathcal{C}_{T+V}, \tag{24}$$

where $\mathcal{C}_U$ denotes the linkage equilibrium value of $C_U$. (APPENDIX B discusses the time-scale over which the terms ignored in (21) will be of order $a^2$.) Note that $\mathcal{C}_{N+U}$ is nonzero when $U$ is any permutation of $N$, and also when the elements of $N$ are a subset of the elements of $U$, but $U$ does not include any nonrepeated indices for loci not represented in $N$. For example, if $N = \{ijk\}$, the terms in the first summation corresponding to $U = \{ijk\}$, $\{iijk\}$, $\{ijjk\}$, $\{ijkll\}$, etc. all contribute, but $U = \{jk\}$ and $\{ijkl\}$ do not.

Explicit approximations for all the linkage disequilibria can be derived for diallelic loci. If the reduction formula applied in (5) is used to define the selection coefficients, we need only consider sets such as $U = \{ijkl\}$ that do not contain repeated indices. We will follow the convention that permutations of $U$ are treated separately in defining the selection coefficients. In this case, $\mathcal{C}_{N+U}$ is nonzero only when the elements of $U$ and $N$ are identical, and $\mathcal{C}_{S+U}\mathcal{C}_{T+V}$ is nonzero only when the elements of $S$ and $U$ are identical and the elements of $T$ and $V$ are identical (where $S + T = N$). Both these terms reduce to $\mathcal{C}_{N+N} = \prod_{i\in N} p_i(1 - p_i) \equiv p_Nq_N$ (from Eq. 5). Thus,

$$C_N = \Big[\,|N|!\,\tilde{a}_{N,\emptyset}(1 - r_N) + \sum_{S+T=N} |S|!\,|T|!\,\tilde{a}_{S,T}\,r_{S,T}\Big]\,p_Nq_N/r_N + O(a^2). \tag{25}$$

Substituting these expressions into (16), we obtain

$$\Delta p_i = \tilde{a}_{i,\emptyset}\,p_iq_i + \sum_{U \neq i} \tilde{a}_{U,\emptyset}\Big[\,|U+i|!\,\tilde{a}_{U+i,\emptyset}(1 - r_{U+i}) + \sum_{S+T=U+i} |S|!\,|T|!\,\tilde{a}_{S,T}\,r_{S,T}\Big]\,p_{U+i}q_{U+i}/r_{U+i} + O(a^3). \tag{26}$$

In this equation, the outer summation extends over all sets $U$ that do not contain repeated indices. Equation 25 shows that linkage disequilibrium between a set of loci, $N$, may be generated by selection favoring that combination ($\tilde{a}_{N,\emptyset}$) and also by association coefficients that bring components of that combination together ($\tilde{a}_{S,T}$ with $S + T = N$), followed by recombination that assembles the complete combination ($r_{S,T}$). Allele frequencies change primarily as a result of selection acting directly on the locus ($\tilde{a}_{i,\emptyset}\,p_iq_i$), and secondarily through linkage disequilibria with other selected combinations of loci.

EXAMPLES

Additive, multilocus viability selection on haploids: To illustrate how the selection coefficients can be calculated, let the viability of a haploid individual be

$$W(X) = 1 + \sum_i s_iX_i. \tag{27}$$

Applying the definitions, we have

$$W(X,X^*) = \Big(1 + \sum_i s_iX_i\Big)\Big(1 + \sum_j s_jX_j^*\Big) = \Big[1 + \sum_i s_i(\zeta_i + m_i)\Big]\Big[1 + \sum_j s_j(\zeta_j^* + m_j)\Big] \tag{28}$$

and $\bar{W} = (1 + \sum_i s_im_i)^2$. Identifying the coefficients in (6), we have

$$a_{i,\emptyset} = a_{\emptyset,i} = \tilde{a}_{i,\emptyset} = \frac{s_i}{\sqrt{\bar{W}}}, \tag{29a}$$

$$a_{i,j} = a_{j,i} = \tilde{a}_{i,j} = \frac{s_is_j}{\bar{W}}, \quad\text{and} \tag{29b}$$

$$\tilde{a}_{U,\emptyset} = \tilde{a}_{\emptyset,V} = \tilde{a}_{U,V} = 0 \tag{29c}$$

if $U$ or $V$ has two or more elements.

Like our one-locus example (7), this shows that even when viabilities are constant, the selection coefficients defined by (6) are frequency-dependent: they are inversely proportional to the mean fitness, $\bar{W}$. Moreover, viability selection on the haploid phase generates weak selection for associations ($\tilde{a}_{i,j}$), because the association coefficients are defined as deviations from additive independence, rather than from multiplicative independence.
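The identification of coefficients in (29) can be confirmed by brute force. The following sketch, with hypothetical function names and illustrative values of $s_i$ and $m_i$, evaluates the relative fitness of every pair of haploid genotypes at three diallelic loci both directly from (28) and from the expansion $1 + \sum_i a_{i,\emptyset}\zeta_i + \sum_j a_{\emptyset,j}\zeta_j^* + \sum_{i,j} a_{i,j}\zeta_i\zeta_j^*$, and reports the largest discrepancy.

```python
from itertools import product


def additive_coefficients(s, m):
    """Coefficients of Eq. 29 for haploid viabilities W(X) = 1 + sum_i s_i X_i.

    s : selection coefficients per locus (illustrative values)
    m : allele frequencies (means of X_i), identical in both sexes
    """
    w_hap = 1.0 + sum(si * mi for si, mi in zip(s, m))     # mean haploid fitness
    w_bar = w_hap ** 2                                      # mean fitness of pairs
    a_single = [si / w_hap for si in s]                     # a_{i,0} = a_{0,i}
    a_pair = [[si * sj / w_bar for sj in s] for si in s]    # a_{i,j}
    return w_bar, a_single, a_pair


def max_expansion_error(s, m):
    """Largest |W/Wbar - expansion| over all pairs of 0/1 genotypes."""
    w_bar, a_single, a_pair = additive_coefficients(s, m)
    n = len(s)
    worst = 0.0
    for x in product((0, 1), repeat=n):
        for y in product((0, 1), repeat=n):
            direct = ((1 + sum(si * xi for si, xi in zip(s, x)))
                      * (1 + sum(sj * yj for sj, yj in zip(s, y)))) / w_bar
            zx = [xi - mi for xi, mi in zip(x, m)]
            zy = [yi - mi for yi, mi in zip(y, m)]
            expanded = (1.0
                        + sum(a * z for a, z in zip(a_single, zx))
                        + sum(a * z for a, z in zip(a_single, zy))
                        + sum(a_pair[i][j] * zx[i] * zy[j]
                              for i in range(n) for j in range(n)))
            worst = max(worst, abs(direct - expanded))
    return worst


print(max_expansion_error([0.1, -0.05, 0.2], [0.3, 0.6, 0.5]))
```

The discrepancy is at the level of floating-point rounding, and the coefficients shrink as the allele frequencies of favored alleles rise, which is the frequency dependence noted in the text.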

Natural selection on diploids: We will switch briefly from the simplified haploid life cycle used to consider nonrandom mating to the standard diploid model with sex-independent viability selection. The previous example shows that selection may produce association coefficients ($a_{U,V}$), even with random mating. However, if fitness does not depend on whether the alleles concerned are in cis or trans combinations, the form of these associations is constrained. An important class of examples involves sex-independent viability selection for a quantitative character determined by multiple loci without dominance or epistasis.

As a simple example, consider Gaussian stabilizing selection on an additive polygenic trait. In this case,

$$W(X,X^*) = \exp\Big\{-\frac{[\sum_i (X_i + X_i^*) - \theta]^2}{2V_s}\Big\}, \tag{30}$$

where $\theta$ denotes the optimal phenotype and $V_s$ measures the intensity of stabilizing selection. It is convenient to scale the genetic effects and $V_s$ by initial phenotypic variance. Assuming that stabilizing selection is weak and that the population mean is not too far from the optimum, we can approximate (30) to leading order by

$$W(X,X^*) = 1 - \frac{[\sum_i (\zeta_i + \zeta_i^*) + \bar{z} - \theta]^2}{2V_s} + O(s^2), \tag{31a}$$

where

$$\bar{z} = 2\sum_i m_i, \quad\text{and}\quad s = \max\Big(\frac{1}{V_s}, \frac{(\bar{z} - \theta)^2}{V_s}\Big) \ll 1. \tag{31b}$$

To calculate the selection coefficients as defined by (6), we will follow the convention of distinguishing sets such as $\{ij\}$ and $\{ji\}$ that are permutations of each other. From the symmetry of the additive model, it follows that for any number of alleles, the strength of selection at each individual locus, denoted $\tilde{a}_{z,\emptyset}$, and the strength of selection on all pairs of loci, denoted $\tilde{a}_{z,z}$ and $\tilde{a}_{zz,\emptyset}$, can be expressed as

$$\tilde{a}_{z,\emptyset} \equiv \tilde{a}_{i,\emptyset} = a_{i,\emptyset} = a_{\emptyset,i} = -\frac{\bar{z} - \theta}{V_s} + O(s^2); \tag{32a}$$

$$\tilde{a}_{z,z} \equiv \tilde{a}_{i,j} = a_{i,j} = 2a_{ij,\emptyset} = 2a_{\emptyset,ij} = 2\tilde{a}_{zz,\emptyset} = -\frac{1}{V_s} + O(s^2); \tag{32b}$$

and

$$a_{U,V} = \tilde{a}_{U,V} = O(s^2) \quad\text{if } |U| + |V| > 2. \tag{32c}$$

Hence at QLE, (24) yields the leading-order approximation

$$C_{ij} = -\frac{C_{ii}C_{jj}}{r_{ij}V_s}. \tag{33}$$

TABLE 2

Relative fitnesses produced by assortment with respect to a single locus $a$, with two alleles; female genotype is $Q_a$ or $P_a$ ($X_a = 0$ or 1), male genotype is $Q_a$ or $P_a$ ($X_a^* = 0$ or 1)

| Female | Male $X_a^* = 0$ | Male $X_a^* = 1$ |
|---|---|---|
| $X_a = 0$ | $1 - \alpha + \alpha/q_a$ | $1 - \alpha$ |
| $X_a = 1$ | $1 - \alpha$ | $1 - \alpha + \alpha/p_a$ |

$W(X_a, X_a^*) = (1 - X_a)(1 - X_a^*)[(1 - \alpha) + \alpha/q_a] + (1 - X_a)X_a^*(1 - \alpha) + X_a(1 - X_a^*)(1 - \alpha) + X_aX_a^*[(1 - \alpha) + \alpha/p_a] = 1 + \alpha(X_a - p_a)(X_a^* - p_a)/(p_aq_a)$.

Consequences of assortative mating: The rest of our examples involve nonrandom mating and assume the haploid life cycle. The simplest model treats the multilocus consequences of assortment at a single diallelic locus, labeled $a$. We assume that a proportion $1 - \alpha$ of individuals mate at random, while a proportion $\alpha$ mate with their own genotype at locus $a$ (Table 2). Because a fraction $(1 - \alpha)p_a^2 + \alpha p_a$ of the mating pairs consists of genotypes $(P_a, P_a)$, the relative fitness of this combination is $(1 - \alpha) + \alpha/p_a$. By writing out the relative contributions of each of the four pairs of genotypes in this way, one finds that $W(X,X^*) = 1 + \alpha\zeta_a\zeta_a^*/(p_aq_a)$ and $\bar{W} = 1$. Hence, the only nonzero selection coefficient is $a_{a,a} = \alpha/(p_aq_a)$. Because all genotypes have the same marginal fitnesses, allele frequencies do not change ($\Delta m_i = 0$). However, the association coefficient $a_{a,a}$ does influence the linkage disequilibria. From Equation 22,

$$\Delta C_N = -\sum_{S+T=N} r_{S,T}(C_N - C_S C_T) + \frac{\alpha}{p_aq_a}\sum_{S+T=N} r_{S,T}\,C_{S+a}\,C_{T+a}. \tag{34a}$$

For example,

$$\Delta C_{ai} = -r_{a,i}C_{ai} + r_{a,i}\frac{\alpha}{p_aq_a}C_{aa}C_{ia} = -r_{a,i}(1 - \alpha)C_{ai}. \tag{34b}$$

The rate at which linkage disequilibria are broken up by recombination is reduced, because there is a deficit of heterozygotes at locus $a$. With complete assortment ($\alpha = 1$), linkage disequilibria involving $a$ are not broken down at all. Similarly, if loci $i$ and $j$ are both in linkage disequilibrium with $a$, then the dynamics of $C_{ij}$ will also be affected by the assortment.
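Equation 34b can be checked against an explicit simulation of the haploid life cycle. The sketch below is one possible implementation, assuming two diallelic loci (the assortment locus $a$ at index 0 and a second locus $i$ at index 1), pair frequencies $f(x)f(x^*)W(X_a,X_a^*)$ with $W$ taken from Table 2, and meiosis in which a gamete is recombinant with probability $r$. Function names and parameter values are illustrative.

```python
def next_generation(f, alpha, r):
    """One generation of assortment at locus a (index 0) followed by
    recombination at rate r with locus i (index 1).
    f maps all four genotypes (x_a, x_i) to their frequencies."""
    p = f[(1, 0)] + f[(1, 1)]
    q = 1.0 - p
    new = {g: 0.0 for g in f}
    for x, fx in f.items():
        for y, fy in f.items():
            # Table 2 fitness: 1 + alpha (X_a - p)(X_a* - p) / (p q)
            w = 1.0 + alpha * (x[0] - p) * (y[0] - p) / (p * q)
            pair = fx * fy * w
            new[x] += pair * (1 - r) / 2              # non-recombinant gametes
            new[y] += pair * (1 - r) / 2
            new[(x[0], y[1])] += pair * r / 2         # recombinant gametes
            new[(y[0], x[1])] += pair * r / 2
    return new


def disequilibrium(f):
    """Pairwise linkage disequilibrium C_ai = f(1,1) - p_a p_i."""
    pa = f[(1, 0)] + f[(1, 1)]
    pi = f[(0, 1)] + f[(1, 1)]
    return f[(1, 1)] - pa * pi


f0 = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
alpha, r = 0.3, 0.2
f1 = next_generation(f0, alpha, r)
print(disequilibrium(f1), disequilibrium(f0) * (1 - r * (1 - alpha)))
```

The simulated disequilibrium after one generation equals $[1 - r(1 - \alpha)]C_{ai}$, and the allele frequencies are unchanged, as expected when all marginal fitnesses equal one.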

This model can be extended to multiple loci in various ways. For example, one might assume assortment for a phenotype determined multiplicatively by a set of diallelic loci, $U$: the phenotype is expressed only when all these loci carry the $P$ allele, corresponding to $X_i = 1$ for each $i$ in $U$. Thus, the phenotype can be written $X_U \equiv \prod_{i\in U} X_i$. For example, with two loci, $a$ and $b$, $X_{ab} = X_aX_b$ (Table 3). The frequency of this phenotype is $\psi_{ab} = E[(\zeta_a + p_a)(\zeta_b + p_b)] = p_ap_b + C_{ab}$, which depends on linkage disequilibrium as well as on allele frequencies. As before, because the marginal fitnesses are equal, no sexual selection is generated: $a_{ab,\emptyset} = 0$, etc. The model generates association coefficients $a_{ab,ab} = \alpha/[\psi_{ab}(1 - \psi_{ab})]$, but also produces associations between subsets of $ab$: $a_{a,ab} = \alpha p_b/[\psi_{ab}(1 - \psi_{ab})]$, $a_{a,a} = \alpha p_b^2/[\psi_{ab}(1 - \psi_{ab})]$ and $a_{a,b} = \alpha p_ap_b/[\psi_{ab}(1 - \psi_{ab})]$. This illustrates two points: first, multilocus models will usually involve coefficients that depend on both allele frequencies and linkage disequilibria; and second, selection for a set of loci $U$ will usually also involve selection on subsets of $U$. Because this model of assortment favors associations between loci $a$ and $b$ ($a_{a,b}$, $a_{a,ab}$, $a_{b,ab}$, $a_{ab,ab}$), it generates linkage disequilibrium between these loci. Substitution into (21) shows that the equilibrium value of $C_{ab}$ is independent of recombination, and is given by a quadratic equation. When assortment is weak ($\alpha \ll 1$), $C_{ab} \approx a_{a,b}\,p_aq_ap_bq_b = \alpha p_aq_ap_bq_b/(1 - p_ap_b)$.

TABLE 3 [table body not recovered from the source; surviving column headings: Male $X_a^*X_b^* = 0$ and $X_a^*X_b^* = 1$, with rows indexed by the female phenotype]

Now, consider an additive polygenic trait: the female phenotype is $z = \sum_i X_i$, and the male phenotype is $z^* = \sum_i X_i^*$. (We neglect environmental variation; it would not be difficult to include it, but it is easier to work directly with fitness as a function of genotype.) A simple model for nonrandom mating, which is analogous to the "absolute" preference model of LANDE (1981), has pairs form with a probability that falls away with the phenotypic difference between them, so that $W(z,z^*) = \exp[-\alpha(z - z^*)^2/2]$, say. This is tractable when $z$ follows a normal distribution, but we cannot in general make that assumption. In the framework set out here, the Gaussian assortment model leads to an infinite series of coefficients $a_{i,j}$, $a_{ij,kl}$, etc. We therefore use a quadratic model instead: $W(z,z^*) = 1 - \alpha(z - z^*)^2/2$. This cannot be exactly correct when the range of $z$ is large ($\approx 1/\sqrt{\alpha}$), since $W$ must be positive. However, as in our stabilizing selection example (31), the quadratic model can be seen as an approximation to the Gaussian model for weak assortment ($\alpha\,\mathrm{Var}(z) \ll 1$). Taking the covariance of $z$ and $z^*$ shows that the correlation between mates (which is what would usually be measured) is $\alpha\,\mathrm{Var}(z)$ to leading order in $\alpha$.

The coefficients of selection and association can be found by expanding $W$ in powers of $\zeta_i$ and identifying coefficients:

$$\frac{W}{\bar{W}} = 1 - \frac{\alpha}{2}\{(z - z^*)^2 - 2\mathrm{Var}(z)\} + O(\alpha^2) = 1 - \frac{\alpha}{2}\Big\{\sum_i\sum_j(\zeta_i\zeta_j - C_{ij}) - 2\sum_i\sum_j \zeta_i\zeta_j^* + \sum_i\sum_j(\zeta_i^*\zeta_j^* - C_{ij})\Big\} + O(\alpha^2), \tag{35}$$

where $\mathrm{Var}(z) = \sum_{i,j} C_{ij}$.

Hence to leading order, $a_{ij,\emptyset} = a_{\emptyset,ij} = -\alpha/2$, and $a_{i,j} = \alpha$ for all $i$ and $j$ that contribute to $z$. This fitness function implies that rare phenotypes are less fit; hence, there is stabilizing selection on all pairs of loci in both sexes ($a_{ij,\emptyset}$, $a_{\emptyset,ij} = -\alpha/2$).

When assortment is weak ($\alpha\,\mathrm{Var}(z) \ll 1$), Equation 24 shows that

$$C_{ij} = \Big[-\frac{\alpha(1 - r_{ij})}{r_{ij}} + \alpha\Big]C_{ii}C_{jj}. \tag{36}$$

The first term represents the negative disequilibria favored by stabilizing selection, and the second term the positive associations caused by assortment. Unless the loci are unlinked ($r_{ij} = 1/2$), the first term dominates, and linkage disequilibria reduce the variance.

This model, though simple, involves both assortment and sexual selection against rare phenotypes of either sex. There has been considerable discussion of models of "pure" assortment, in which all genotypes have the same marginal fitnesses (see FELSENSTEIN, 1981a, for a review). Such models can only produce association coefficients $a_{U,V}$: selection coefficients $a_{U,\emptyset}$, $a_{\emptyset,V}$ will contribute $a_{U,\emptyset}(\zeta_U - C_U)$, $a_{\emptyset,V}(\zeta_V^* - C_V)$ to the marginal fitness of females and males, and so must in general be zero. Then, the means and moments of individual loci ($m_i$, $C_{ii}$, etc.) do not change: the only effect is on linkage disequilibria.

TABLE 4

Relative fitnesses produced by KIRKPATRICK'S (1982a) model of sexual and natural selection

| Female | Male $X_t^* = 0$ | Male $X_t^* = 1$ |
|---|---|---|
| $X_p = 0$ | $1/(1 - sp_t)$ | $(1 - s)/(1 - sp_t)$ |
| $X_p = 1$ | $1/[z(1 - sp_t)]$ | $(1 + \alpha)(1 - s)/[z(1 - sp_t)]$ |

$W(X_p, X_t^*) = (1 - X_p)(1 - X_t^*)/(1 - sp_t) + (1 - X_p)X_t^*(1 - s)/(1 - sp_t) + X_p(1 - X_t^*)/[z(1 - sp_t)] + X_pX_t^*(1 + \alpha)(1 - s)/[z(1 - sp_t)]$

In the special case where assortment is based on an additive polygenic character, $z$, the relative fitness $W(z,z^*)/\bar{W}$ can be written as $a_{z,z}(z - \bar{z})(z^* - \bar{z}) + a_{zz,zz}(z - \bar{z})^2(z^* - \bar{z})^2 + \cdots$; thus, $a_{i,j} = a_{z,z}$ and $a_{ij,kl} = a_{zz,zz}$ for all loci $i,j,k,l$ that affect the character. Consider first the quadratic model, where only $a_{z,z}$ contributes: this gives a correlation between mates of $a_{z,z}\mathrm{Var}(z)$. The exact equation for the pairwise disequilibrium is

$$\Delta C_{ij} = -r_{ij}C_{ij} + r_{ij}\,a_{z,z}\sum_{k,l} C_{ik}C_{jl}. \tag{37}$$

At equilibrium, $C_{ij} = a_{z,z}\sum_{k,l} C_{ik}C_{jl}$, independent of recombination rates. With the further assumption that all loci have the same variance $C_{ii}$, one can obtain a quadratic equation for the $C_{ij}$ (which are all equal) in terms of $C_{ii}$. The result can be written in terms of the correlation between mates, $\rho = a_{z,z}\mathrm{Var}(z)$:

$$\mathrm{Var}(z) = \frac{V_{A,LE}}{1 - \rho(1 - 1/n)}, \tag{38}$$

where $V_{A,LE}$ is the genic variance, $\sum_i C_{ii}$. This is identical with FELSENSTEIN'S (1981a) Equation 9b, except that our formula relates to a haploid model (and so contains $n$ rather than $2n$ loci), and does not include environmental variance. This derivation applies regardless of the linkage relations. However, it does depend on there only being coefficients $a_{z,z}$. In general, there might be nonquadratic assortment, giving terms such as $a_{zz,zz}$. Though one can easily show that in the limit of a large number of loci, (38) applies for any model of pure assortment, higher-order terms will alter the variance when there are a finite number of loci. Such effects were not considered by FELSENSTEIN (1981a). (See FRANK and SLATKIN (1990) for an alternative analysis of assortative mating on a quantitative trait at mutation-selection equilibrium. Their "covariance" approach is discussed below.)
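The relation between (37) and (38) can be illustrated by iterating the pairwise recursion to equilibrium. The sketch below assumes $n$ loci with identical variances $C_{ii}$ (held fixed, since pure assortment leaves within-locus moments unchanged), identical pairwise disequilibria $C_{ij} = d$, and a common pairwise recombination rate; the function names and the parameter values are hypothetical.

```python
def equilibrium_disequilibrium(n, c_ii, a_zz, r=0.5, generations=2000):
    """Iterate Eq. 37 for n exchangeable loci, starting from linkage
    equilibrium, and return the pairwise disequilibrium d = C_ij."""
    d = 0.0
    for _ in range(generations):
        row_sum = c_ii + (n - 1) * d          # sum_k C_ik for any locus i
        d += -r * d + r * a_zz * row_sum ** 2
    return d


def variance_check(n, c_ii, a_zz, r=0.5):
    """Return (Var(z) from the recursion, Var(z) predicted by Eq. 38)."""
    d = equilibrium_disequilibrium(n, c_ii, a_zz, r)
    var_z = n * c_ii + n * (n - 1) * d        # sum of all C_ij
    rho = a_zz * var_z                        # correlation between mates
    predicted = n * c_ii / (1.0 - rho * (1.0 - 1.0 / n))
    return var_z, predicted


print(variance_check(n=10, c_ii=0.1, a_zz=0.2))
print(variance_check(n=10, c_ii=0.1, a_zz=0.2, r=0.05))
```

Changing `r` alters only the rate of approach: both calls converge to the same variance, in line with the statement that the equilibrium is independent of recombination rates.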

Sexual selection due to female choice: two loci: KIRKPATRICK (1982a) set out a simple model in which natural selection against a male trait is opposed by sexual selection caused by female preferences for that trait. Males who carry a deleterious allele at locus $t$ (labeled by $X_t^* = 1$) have their fitness reduced by a factor $1 - s$. Females who carry a preference allele at locus $p$ (labeled by $X_p = 1$) prefer to mate with males who carry the deleterious allele by a factor of $1 + \alpha$. (We denote the preference factor by $1 + \alpha$, rather than using KIRKPATRICK'S $a_2$, to avoid confusion with our selection coefficients.)

This model is summarized in Table 4. By writing the relative fitnesses as a function of rp and $, and identifying terms, one can show that the only non- zero coefficients are

$$a_{p,t} = b_t \quad\text{and}\quad a_{0,t} = b_t p_p - \frac{s}{1 - s p_t}, \qquad (39)$$

where

$$b_t = \frac{\alpha(1-s)}{(1 + s^* p_t)(1 - s p_t)}$$

and

$$s^* = \alpha(1-s) - s.$$


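Substituting into (16), (21) and (23) gives

$$\Delta p_t = \tilde a_{0,t}\,C_{tt} = \tilde a_{0,t}\,p_t q_t, \qquad (40a)$$

$$\Delta p_p = \tilde a_{0,t}\,C_{pt}, \qquad (40b)$$

and

$$\Delta C_{pt} = -rC_{pt} + \tilde a_{0,t}(1-r)(1-2p_t)C_{pt} + r\,\tilde a_{p,t}\,(p_t q_t p_p q_p + C_{pt}^2) - \tilde a_{0,t}^2\,C_{pt}\,p_t q_t, \qquad (40c)$$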

where $r = r_{p,t}$, $\tilde a_{0,t} = a_{0,t}/2$ and $\tilde a_{p,t} = a_{p,t}/2$. These are exactly equal to Equation 1 of KIRKPATRICK (1982a). However, writing them in terms of selection coefficients clarifies the structure of the model and facilitates analytical approximations of its dynamics. Note that female preferences have two effects that both enter through the coefficient $b_t$: they generate associations between the loci ($a_{p,t} = b_t$), and they exert sexual selection for the trait ($a_{0,t} = b_t p_p - [s/(1 - sp_t)]$). Although both coefficients depend on the frequency of the male trait, $p_t$, the frequency of the preference allele, $p_p$, only affects $a_{0,t}$, through the term $b_t p_p$. Equation 40b shows that the female preference locus evolves only when it is in linkage disequilibrium with the male trait locus.
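The recursions (40) can be checked against a direct bookkeeping of haplotype frequencies. The following is a minimal sketch, not code from KIRKPATRICK (1982a) or from the present analysis: it assumes the life cycle described above (viability selection on males only, $X_p = 1$ females preferring $X_t = 1$ males by a factor $1 + \alpha$ with every female mating once, then meiosis with recombination rate r), and compares one generation of change with the expressions for $\Delta p_p$, $\Delta p_t$ and $\Delta C_{pt}$ written out in `predicted_changes`. The form of the $\Delta C_{pt}$ line, including the $C_{pt}^2$ term, is our reading of (40c); all function names and parameter values are illustrative.

```python
from itertools import product


def step(freq, s, alpha, r):
    """One generation of the two-locus haploid model.
    freq maps (X_p, X_t) -> haplotype frequency, identical in both sexes."""
    wbar = sum(f * (1 - s) ** xt for (_, xt), f in freq.items())
    # Males after viability selection against X_t = 1.
    males = {g: f * (1 - s) ** g[1] / wbar for g, f in freq.items()}
    pt_m = sum(f for (_, xt), f in males.items() if xt == 1)
    new = dict.fromkeys(freq, 0.0)
    for (fem, ff), (mal, fm) in product(freq.items(), males.items()):
        # X_p = 1 females weight X_t = 1 males by 1 + alpha (normalised per female).
        pref = (1 + alpha) ** mal[1] / (1 + alpha * pt_m) if fem[0] == 1 else 1.0
        pair = ff * fm * pref
        # Meiosis: parental haplotypes with prob. 1 - r, recombinants with prob. r.
        for p_src, t_src, prob in ((fem, fem, (1 - r) / 2), (mal, mal, (1 - r) / 2),
                                   (fem, mal, r / 2), (mal, fem, r / 2)):
            new[(p_src[0], t_src[1])] += pair * prob
    return new


def moments(freq):
    """Return (p_p, p_t, C_pt) for a table of haplotype frequencies."""
    pp = freq[(1, 0)] + freq[(1, 1)]
    pt = freq[(0, 1)] + freq[(1, 1)]
    return pp, pt, freq[(1, 1)] - pp * pt


def predicted_changes(pp, pt, C, s, alpha, r):
    """Changes predicted by Eqs. 39-40 (our reading of 40c)."""
    s_star = alpha * (1 - s) - s
    b_t = alpha * (1 - s) / ((1 + s_star * pt) * (1 - s * pt))
    a0 = (b_t * pp - s / (1 - s * pt)) / 2   # a~_{0,t} = a_{0,t} / 2
    ap = b_t / 2                             # a~_{p,t} = a_{p,t} / 2
    qp, qt = 1 - pp, 1 - pt
    d_pt = a0 * pt * qt
    d_pp = a0 * C
    d_C = (-r * C + a0 * (1 - r) * (1 - 2 * pt) * C
           + r * ap * (pt * qt * pp * qp + C * C) - a0 * a0 * C * pt * qt)
    return d_pp, d_pt, d_C


# Arbitrary starting point in linkage disequilibrium (hypothetical values).
s, alpha, r = 0.4, 2.0, 0.1
freq = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
pp, pt, C = moments(freq)
pp1, pt1, C1 = moments(step(freq, s, alpha, r))
exact = (pp1 - pp, pt1 - pt, C1 - C)
formula = predicted_changes(pp, pt, C, s, alpha, r)
print(exact)
print(formula)
```

The two printed triples should agree to rounding error if the reading of (40) used here is correct.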

This structure can be generalized to describe arbitrary models of female preference. Suppose that females with one allele at a preference locus have one set of preferences for males (i.e., $W(X,X^*) = h_0(X^*)$ if $X_p = 0$), whereas females with the other allele ($X_p = 1$) prefer different male genotypes (i.e., $W(X,X^*) = h_1(X^*)$ if $X_p = 1$). The functions $h_i$ include both selection on males before they enter the mating pool and sexual selection by the females. In general, the $h_i$ may depend on the frequency of male genotypes; however, they must be independent of $p_p$. Since the marginal fitness of females is constant, the $h_i$ must be normalized so that the expectation of $h_i$ across males is constant. We can therefore write $h_i$ as $1 + \sum_U b_{i,U}(\zeta_U^* - C_U)$, where $b_{i,U}$ represents the preference of females with $X_p = i$ for genotypes at the loci in the set U. Overall,

$$W(X,X^*) = 1 + \sum_U [(1 - X_p)b_{0,U} + X_p b_{1,U}](\zeta_U^* - C_U) = 1 + \sum_U [(q_p - \zeta_p)b_{0,U} + (p_p + \zeta_p)b_{1,U}](\zeta_U^* - C_U).$$

Hence, $a_{p,U} = b_{1,U} - b_{0,U} \equiv b_U$, and $a_{0,U} = b_{0,U} + b_U p_p$, a simple generalization of Equation 39. The consequences of this general structure for female preference will be explored in a later paper.

When sexual and natural selection are both weak ($s, \alpha \ll 1$), (39) simplifies to

$$a_{p,t} = \alpha \quad\text{and}\quad a_{0,t} = \alpha p_p - s. \qquad (41)$$

Discarding terms of order $s^2$, $\alpha s$, and $\alpha^2$ in (40) yields

$$\Delta p_p = \frac{\alpha p_p - s}{2}\,C_{pt},$$

and

The factors of two in these equations arise because each gene is expressed in only one of the two sexes: thus, $\tilde a_{0,t} = a_{0,t}/2$ and $\tilde a_{p,t} = a_{p,t}/2$. The population will quickly reach quasi-linkage equilibrium (QLE), in which $C_{pt} = (\alpha/2)\,p_p q_p p_t q_t$. At QLE, we have

(43)

If sexual selection outweighs natural selection ($\alpha p_p > s$), the deleterious allele at the t locus will be fixed; if not, it will be lost. In this weak-selection limit, the line of equilibria in $(p_p, p_t)$-space discussed by KIRKPATRICK (1982a) becomes vertical, and the male trait cannot be polymorphic (except when $\alpha p_p = s$). For there to be a non-vertical line of polymorphic equilibria, the selection coefficient $a_{0,t}$ must depend on $p_t$ such that $a_{0,t} = 0$ over a range of values of $p_p$. In KIRKPATRICK’S model, the coefficients are only weakly frequency-dependent, so that the range of $p_p$ over which the male trait can be polymorphic is narrow: $s/[\alpha(1 - s)] < p_p < s(1 + \alpha)/\alpha$. Even with strong selection and preference, such as the parameter values used by KIRKPATRICK (1982a) ($s = 0.4$, $\alpha = 2$), the preference allele frequency must be between 0.333 and 0.6 for polymorphism in $p_t$. Thus, Equations 42 give a reasonable description of the dynamics even when coefficients are large: for most initial conditions, the male trait will be fixed or lost.

When a substitution occurs at the t locus, the frequency of the preference allele changes as a result of its association with the directly selected allele. The net change in preference caused by the change at the trait locus can be found by assuming that the process is approximately continuous, and integrating. Equation 43 gives

Hence, as $p_t$ changes from $p_t(0)$ to $p_t(\infty)$, the allele frequency at the preference locus changes to

The same method can be applied even when selection and preference are strong. This requires three approximations: that linkage disequilibrium is close to equilibrium (QLE), even when allele frequencies are changing (i.e., $\Delta C_{pt} = 0$); that $C_{pt}$ is small enough that terms of the form $C_{pt}^k$ for $k \ge 2$ may be ignored; and that the preference allele frequency changes approximately continuously in time. We begin with the exact


[Figure 2: change in preference allele frequency plotted against the initial frequency of the preference allele; vertical axis from -0.2 to 0.2.]

FIGURE 2. A test of the quasi-linkage equilibrium approximation, using KIRKPATRICK’S (1982a) model of sexual selection by female choice (see Eq. 49). Positive values show the increase in frequency of the preference allele ($\Delta p_p$) caused by an increase in the frequency of the male trait from a low value ($p_t = 0.001$); this is plotted against the initial frequency of the preference, $p_p$. Negative values show the decrease in the preference when the male trait decreases from near fixation ($p_t = 0.999$). Natural selection against the trait is $s = 0.4$, and the preference for it is $\alpha = 2$ (corresponding to KIRKPATRICK’S $a_2 = 3$). The solid curve shows the predicted change in preference, calculated assuming quasi-linkage equilibrium. This prediction is indistinguishable from exact results, indicated by dots, for recombination rates $r = 0.5$ and $r = 0.1$. The exact results are based on iterating the dynamic equations beginning in linkage equilibrium ($C_{pt} = 0$), with $p_p$ initially at 0.05, 0.1, ..., 0.95. The exact results deviate from the QLE prediction only when linkage is very tight ($r = 0.01$; dots away from the solid curve). When the preference allele is rare ($p_p < s/(\alpha(1 - s)) = 0.333$), the male trait cannot increase. When the preference is sufficiently common ($p_p > s\sqrt{1+\alpha}\,/\,[s\sqrt{1+\alpha} + \alpha(1 - s) - s] = 0.464$ under the QLE approximation), the male trait goes to fixation. In between ($0.333 < p_p < 0.464$), the male trait and female preference move together onto the line of polymorphic equilibria. The population behaves in a similar way when the male trait decreases from a high value; there, the trait becomes polymorphic when $0.464 < p_p < 0.6 = s(1 + \alpha)/\alpha$.

expression

Keeping only the leading-order terms, we obtain the QLE approximation

$$C_{pt} = \tilde a_{p,t}\,p_p q_p p_t q_t. \qquad (47)$$

Substituting into (46) and proceeding as in (44), we have

Hence,

A substitution at the trait locus ($p_t(0) = 0$, $p_t(\infty) = 1$) multiplies $p_p/q_p$ by $\sqrt{(1 + s^*)/(1 - s)} = \sqrt{1+\alpha}$. Loss of the deleterious allele ($p_t(0) = 1$, $p_t(\infty) = 0$) divides $p_p/q_p$ by the same factor. Equation 49 also approximates the net change in preference when the male trait moves to a polymorphic equilibrium from low or high frequency (Figure 2).
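The integration behind this factor can be sketched as follows. This is a reconstruction from Equations 39, 40a, 40b and 47 rather than a quotation of the original displays; it treats $p_p$ and $p_t$ as changing continuously and uses only the QLE value of $C_{pt}$.

```latex
\begin{align*}
\frac{dp_p}{dp_t} &\approx \frac{\Delta p_p}{\Delta p_t}
  = \frac{\tilde a_{0,t}\,C_{pt}}{\tilde a_{0,t}\,p_t q_t}
  = \tilde a_{p,t}\,p_p q_p
  = \tfrac{1}{2}\,b_t\,p_p q_p, \\
b_t &= \frac{\alpha(1-s)}{(1+s^* p_t)(1-s p_t)}
  = \frac{s^*}{1+s^* p_t} + \frac{s}{1-s p_t}, \\
\left[\ln\frac{p_p}{q_p}\right]_{0}^{\infty}
  &= \tfrac{1}{2}\int_{p_t(0)}^{p_t(\infty)} b_t\,dp_t
  = \tfrac{1}{2}\ln\!\left[
    \frac{\bigl(1+s^* p_t(\infty)\bigr)\bigl(1-s\,p_t(0)\bigr)}
         {\bigl(1+s^* p_t(0)\bigr)\bigl(1-s\,p_t(\infty)\bigr)}\right].
\end{align*}
```

For $p_t(0) = 0$ and $p_t(\infty) = 1$ the bracket equals $(1+s^*)/(1-s) = 1+\alpha$, since $1 + s^* = (1-s)(1+\alpha)$, so the odds $p_p/q_p$ are multiplied by $\sqrt{1+\alpha}$. When $s$ and $\alpha$ are small, $b_t \approx \alpha$ and the multiplier reduces to $\exp\{(\alpha/2)[p_t(\infty) - p_t(0)]\}$.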

Comparison with the exact results, obtained by iterating the recursions (40), shows that this QLE approximation is remarkably accurate, even when selection and preference are strong (e.g., $s = 0.4$, $\alpha = 2$; see Figure 2). With unlinked loci, the change in preference frequency differs from that predicted by (49) by at most 3.3%. Even with recombination $r = 0.1$, the greatest error is no more than 10.5%. When the preference and trait loci are tightly linked, associations between alleles on maternal and paternal chromosomes are only slowly converted into linkage disequilibria between alleles on the same chromosome: the disequilibrium only slowly approaches its “quasi-equilibrium” value, and the net effect on the preference is reduced. However, even when recombination is as low as $r = 0.01$, the effect is only halved (Figure 2).

This example shows that strong selection on, and preference for, a male trait only leads to a moderate change in the preference ($p_p(\infty) - p_p(0) = 0.136$ at most for $s = 0.4$ and $\alpha = 2$). This change is closely approximated by assuming QLE. In a subsequent paper, we will show how this can be extended to more general models of female preference.
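The numbers quoted for this example can be reproduced with a short calculation. The sketch below is illustrative and rests on stated assumptions: the recursion in `iterate_exact` is our reading of (40), including a $C_{pt}^2$ term in the disequilibrium equation; `qle_final_pp` applies the QLE rule that the odds $p_p/q_p$ are multiplied by $\sqrt{(1+s^*p_t(\infty))(1-sp_t(0))/[(1+s^*p_t(0))(1-sp_t(\infty))]}$; and the starting values ($p_p = 0.7$, $p_t = 0.001$, $r = 0.5$) are chosen only as one point of the kind plotted in Figure 2. Function names are ours.

```python
import math


def coefficients(pp, pt, s, alpha):
    """Return (a~_{0,t}, a~_{p,t}) from Eq. 39, halved for sex-limited expression."""
    s_star = alpha * (1 - s) - s
    b_t = alpha * (1 - s) / ((1 + s_star * pt) * (1 - s * pt))
    return (b_t * pp - s / (1 - s * pt)) / 2, b_t / 2


def iterate_exact(pp, pt, s, alpha, r, generations=5000):
    """Iterate the two-locus recursions starting from linkage equilibrium."""
    C = 0.0
    for _ in range(generations):
        a0, ap = coefficients(pp, pt, s, alpha)
        qp, qt = 1 - pp, 1 - pt
        d_pt = a0 * pt * qt
        d_pp = a0 * C
        d_C = (-r * C + a0 * (1 - r) * (1 - 2 * pt) * C
               + r * ap * (pt * qt * pp * qp + C * C) - a0 * a0 * C * pt * qt)
        pp, pt, C = pp + d_pp, pt + d_pt, C + d_C
    return pp, pt


def qle_final_pp(pp0, pt0, pt1, s, alpha):
    """QLE prediction for p_p after the trait moves from pt0 to pt1."""
    s_star = alpha * (1 - s) - s
    factor = math.sqrt((1 + s_star * pt1) * (1 - s * pt0)
                       / ((1 + s_star * pt0) * (1 - s * pt1)))
    odds = pp0 / (1 - pp0) * factor
    return odds / (1 + odds)


s, alpha = 0.4, 2.0
root = math.sqrt(1 + alpha)
lower = s / (alpha * (1 - s))                        # trait cannot increase below this
middle = s * root / (s * root + alpha * (1 - s) - s)  # QLE threshold for fixation
upper = s * (1 + alpha) / alpha                      # upper end of the polymorphic range

pp_exact, pt_exact = iterate_exact(0.7, 0.001, s, alpha, r=0.5)
pp_qle = qle_final_pp(0.7, 0.001, 1.0, s, alpha)
print(lower, middle, upper, upper - middle)
print(pp_exact - 0.7, pp_qle - 0.7)
```

Under the QLE rule a population that starts exactly at `middle` ends at `upper` after a full substitution, so `upper - middle` gives the largest change in preference for these parameters.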

Sexual selection due to female choice: two polygenic traits: LANDE’S (1981) model of sexual selection involves two additive polygenic traits: trait y is expressed in females and determines the preference for trait z* in males. The female phenotype y is measured in units of the male trait; a y female prefers males with z* near y. Here we will analyze a haploid version of this model. We will first use our notation to produce general recursions for the evolution of the traits without making any assumptions about the number of loci or alleles or the distributions of allelic effects. We will then use this framework to rederive the basic results of LANDE (1981) by imposing his assumption that the distributions of allelic effects across loci are multivariate Gaussian. Finally, we present a more general analysis that makes no assumptions about the distributions of allelic effects, but does assume that selection and preferences are weak so that we can apply our QLE approximations for linkage disequilibria.

Following LANDE (1981), we assume that the male trait is under Gaussian stabilizing selection towards an optimum, $\theta$: the viability of z* males is proportional to $\exp[-(z^* - \theta)^2/2\omega^2]$. Females have an “absolute” preference for males with z* close to their y: the chance that a y female will mate with a z* male is proportional to $\exp[-(z^* - y)^2/2\nu^2]$. (LANDE (1981) also modeled “relative” and “psychophysical” models of female preference; these gave similar results, and are not analyzed here.) From Equation 4 of LANDE (1981), the relative contribution of matings between y and z* is

$$\frac{W(y,z^*)}{\bar W} = \frac{\exp\left[-\dfrac{(z^* - \theta)^2}{2\omega^2} - \dfrac{(z^* - y)^2}{2\nu^2}\right]}{\displaystyle\int \psi(z^*)\exp\left[-\dfrac{(z^* - \theta)^2}{2\omega^2} - \dfrac{(z^* - y)^2}{2\nu^2}\right]dz^*}, \qquad (50)$$

where $\psi(z^*)$ denotes the distribution of the male trait. The denominator is a normalizing term which ensures that all females leave the same number of offspring. When selection and preference are weak ($\omega^2 \gg \mathrm{Var}(z)$, $\nu^2 \gg \sqrt{\mathrm{Var}(z)\mathrm{Var}(y)}$), the relative fitnesses can be accurately approximated by a quadratic form:

The second term represents stabilizing selection on the male trait, caused by both natural and sexual selection. The third term represents directional selection on z* toward a value that is a compromise between the optima favored by natural and sexual selection. The fourth term generates associations between the female preference and the male trait.

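If both traits are additive, then $y - \bar y = \sum_{i \in Y} \zeta_i$ and $z^* - \bar z = \sum_{i \in Z} \zeta_i^*$. Here, Y denotes the set of loci that contribute to the female preference, and Z refers to the set of loci that contribute to the male trait; following LANDE (1981), we assume that these do not overlap. We have not explicitly included environmental effects in our model. However, if we assume that they are Gaussian, they can be absorbed into the parameters $\omega^2$ and $\nu^2$. By expanding in powers of $\zeta$ and $\zeta^*$, and identifying terms, one finds that to leading order the non-zero symmetrized coefficients are

$$\tilde a_{0,i} = \frac{1}{2}\left[\frac{\theta - \bar z}{\omega^2} + \frac{\bar y - \bar z}{\nu^2}\right] \quad \text{for } i \in Z, \qquad (52a)$$

$$\tilde a_{0,ij} = -\frac{1}{4}\left[\frac{1}{\omega^2} + \frac{1}{\nu^2}\right] \quad \text{for } i, j \in Z, \text{ and} \qquad (52b)$$

$$\tilde a_{i,j} = \frac{1}{2\nu^2} \quad \text{for } i \in Y,\ j \in Z. \qquad (52c)$$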

The expression for $\tilde a_{0,ij}$ relies on our convention of keeping track of $\zeta_i\zeta_j$ and $\zeta_j\zeta_i$ separately for $i \ne j$. The factors of two arise because each trait is expressed only in one sex. Selection and association coefficients involving more than two indices can be expressed as products of the coefficients in (52). The contributions of these higher-order terms are considered below.
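The quadratic approximation to the Gaussian model can be sketched by expanding the exponent of the mating weight about the trait means. The expansion below is a reconstruction under the weak-selection assumptions stated above, not a quotation of the original display; it writes $\delta z = z^* - \bar z$ and $\delta y = y - \bar y$, drops terms that depend on y alone (they cancel in the per-female normalization), and keeps terms up to second order.

```latex
\begin{align*}
-\frac{(z^*-\theta)^2}{2\omega^2} - \frac{(z^*-y)^2}{2\nu^2}
 &= \text{const}
   + \left[\frac{\theta-\bar z}{\omega^2} + \frac{\bar y-\bar z}{\nu^2}\right]\delta z
   - \frac{1}{2}\left[\frac{1}{\omega^2}+\frac{1}{\nu^2}\right]\delta z^2
   + \frac{\delta y\,\delta z}{\nu^2}
   + (\text{terms in } \delta y \text{ alone}), \\
\frac{W(y,z^*)}{\bar W}
 &\approx 1
   - \frac{1}{2}\left[\frac{1}{\omega^2}+\frac{1}{\nu^2}\right]
     \left(\delta z^2 - \mathrm{Var}(z)\right)
   + \left[\frac{\theta-\bar z}{\omega^2} + \frac{\bar y-\bar z}{\nu^2}\right]\delta z
   + \frac{1}{\nu^2}\,\delta y\,\delta z .
\end{align*}
```

Read in this order, the second term is stabilizing, the third is directional toward a compromise between $\theta$ and $\bar y$, and the fourth couples preference to trait. Halving each coefficient, because each trait is expressed in one sex only, gives a stabilizing coefficient of $-[1/(4\nu^2) + 1/(4\omega^2)]$, which matches the leading-order value quoted for the stabilizing selection coefficient later in this section.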

Substituting into (16), (21) and (23) gives general equations for changes in the means and covariances of the various loci: these equations do not depend on any assumptions about the numbers of alleles or their distribution of effects. We also introduce the effects of mutation by adding a term $u_i$ to the equation for $\Delta m_i$ and $u_{ii}\delta_{ij}$ to $\Delta C_{ij}$, where $\delta_{ij} = 1$ if $i = j$ and 0 otherwise. This approximation is accurate provided that terms of order $u \times a$ can be ignored, and provided that mutation is weak relative to recombination ($u_{ii} \ll r$). Letting $r_{ij}$ denote the recombination rates between loci i and j, we have, to leading order in $\tilde a$,

$$\Delta m_i = \tilde a_{0,z}\sum_{k \in Z} C_{ik} + \tilde a_{0,zz}\sum_{k,l \in Z} C_{ikl} + u_i \qquad (53a)$$

and

$$\Delta C_{ij} = -r_{ij}C_{ij} + (1 - r_{ij})\left[\tilde a_{0,z}\sum_{k \in Z} C_{ijk} + \tilde a_{0,zz}\sum_{k,l \in Z}\left(C_{ijkl} - C_{ij}C_{kl}\right)\right] + r_{ij}\,\tilde a_{y,z}\sum_{k \in Y}\sum_{l \in Z}\left(C_{ik}C_{jl} + C_{il}C_{jk}\right) + u_{ii}\delta_{ij}. \qquad (53b)$$

If the distribution of allelic effects is multivariate Gaussian, and if mutation is symmetric ($u_i = 0$), as assumed by LANDE (1981), then $C_{ijk} = 0$ and $C_{ijkl} - C_{ij}C_{kl} = C_{ik}C_{jl} + C_{il}C_{jk}$, so these expressions become

$$\Delta m_i = \tilde a_{0,z}\,C_{iZ} \qquad (54a)$$

and

$$\Delta C_{ij} = -r_{ij}C_{ij} + 2(1 - r_{ij})\,\tilde a_{0,zz}\,C_{iZ}C_{jZ} + r_{ij}\,\tilde a_{y,z}\left(C_{iY}C_{jZ} + C_{iZ}C_{jY}\right) + u_{ii}\delta_{ij}. \qquad (54b)$$

Here, the subscripts Y and Z denote sums over all loci in Y, and all loci in Z, respectively: thus, $C_{ZZ}$ is the variance of z*, $C_{YZ}$ is the covariance between y and z*, and so on. There is a line of equilibria for the character means, along which natural and sexual selection balance to give $\tilde a_{0,z} = 0$. Away from this line, trait and preference evolve together, along lines of slope $\Delta\bar y/\Delta\bar z = C_{YZ}/C_{ZZ}$; this slope is the regression of y on z*. The line of equilibria will be stable if its slope is steeper than the lines along which trait and preference coevolve: $(1 + \nu^2/\omega^2) > C_{YZ}/C_{ZZ}$. Otherwise, trait and preference diverge from the line, leading to the indefinite exaggeration or diminishment of the male trait away from its optimum (LANDE 1981). For two-locus models, stability depends in the same way on the relative slopes of the two lines (SEGER 1985).

LANDE (1981) found the ratio $C_{YZ}/C_{ZZ}$ by assuming that the means change slowly and setting $\Delta C_{ii}$ to zero. In our notation, this gives $C_{iZ} = \sqrt{-u_{ii}/2\tilde a_{0,zz}}$. Summing over all $i \in Y$, and then over all $i \in Z$, gives $C_{YZ}/C_{ZZ} = U_Y/U_Z$, where $U_S = \sum_{i \in S}\sqrt{u_{ii}}$ (LANDE’S Eq. 16b). This result is independent of the parameters describing recombination, selection, and preference. It suggests that if the net mutation rates to trait and preference are similar, then changes in the male trait will cause substantial changes in the preference. However, if $U_Y/U_Z$ is too large, the line of equilibria becomes unstable.

One can complete this Gaussian analysis by summing (54b) over $i \in Y$ and $j \in Z$. For simplicity, we assume an equilibrium and unlinked loci. Then

$$0 = -C_{YZ} + 2\tilde a_{0,zz}\,C_{YZ}C_{ZZ} + \tilde a_{y,z}\left(C_{YY}C_{ZZ} + C_{YZ}^2\right). \qquad (55)$$

(Summing (54b) over $i, j \in Y$ and $i, j \in Z$ would give equations for the genic variances, $\sum_{i \in Y} C_{ii}$ and $\sum_{i \in Z} C_{ii}$.) Substituting for $C_{YZ}$ and $C_{ZZ}$ gives the variance in preference:

The middle term is the reduction in mean fitness caused by stabilizing selection on the male trait, and will usually be small. Under our assumption that preferences are weak, the third term is also small, so that $C_{YY}$ must always be close to the approximation given by the first term in (56).

LANDE’S (1981) analysis predicts both the effect of changes in the mean of the male trait on the mean female preference ($\Delta\bar y/\Delta\bar z$), and also the variance in preference maintained at an equilibrium between mutation and induced stabilizing selection ($C_{YY}$). His analysis is based on the assumption that the distribution of allelic effects is multivariate Gaussian. However, this assumption cannot easily be justified (TURELLI 1984). A different, and more general, approach assumes only that selection and preference are weak. Then, linkage disequilibria can be approximated by their QLE values, without any further assumptions about the distribution of allelic effects. We assume that mutation does not alter the means, so that $u_i = 0$. We begin by summing (53a) over preference and trait loci to obtain general approximations for the changes in means:

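$$\Delta\bar z = \tilde a_{0,z}\,C_{ZZ} + \tilde a_{0,zz}\,C_{ZZZ} + O(\tilde a^2) \qquad (57a)$$

and

$$\Delta\bar y = \tilde a_{0,z}\,C_{YZ} + \tilde a_{0,zz}\,C_{YZZ} + O(\tilde a^3). \qquad (57b)$$

To clarify the subsequent analysis, we will use bold subscripts to indicate loci that contribute to the male trait. $C_{YZ}$ and $C_{YZZ}$ in (57) can be approximated using the QLE approximations for $C_{i\mathbf{k}}$, $C_{i\mathbf{kl}}$ and $C_{i\mathbf{kk}}$ given in Table 5; this produces

$$\frac{\Delta\bar y}{\Delta\bar z} = \tilde a_{y,z}\,C_{YY} + O(\tilde a^2). \qquad (58)$$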

Equation 58 generalizes LANDE’S result concerning the coevolution of female preferences and male trait values to arbitrary distributions of allelic effects; and, like LANDE’S (1981) analysis, it applies to arbitrary numbers of loci: note, for example, that it includes the two-locus result presented in (44). If selection acts directly on the female preference-determining loci, there will usually be an additional term in (57b). In general, this will reduce the line of equilibria to discrete points [as noted by LANDE (1981), KIRKPATRICK (1982a), POMIANKOWSKI (1988) and BULMER (1989)]; and (58) will no longer apply. The incorporation of biased mutation (i.e., nonzero $u_i$ in 53a) has similar consequences (BULMER 1989). Thus, the existence of lines of equilibria is not a robust feature of sexual selection models.

The lines of equilibria discussed by LANDE (1981) and KIRKPATRICK (1982a) correspond to $\tilde a_{0,z} = 0$. Equations 57 show that $\tilde a_{0,z} = 0$ is generally not sufficient to characterize equilibria, but this condition does suffice for both LANDE’S quantitative genetic model and KIRKPATRICK’S haploid model, because each implies that $\tilde a_{0,zz}C_{ZZZ} = 0$. However, as noted by R. GOMULKIEWICZ (personal communication), this occurs for different reasons in these two models. In LANDE’S model, it arises from the assumption that the distributions of breeding values for both the male trait and female preference are Gaussian, and hence not skewed, i.e., $C_{ZZZ} = 0$. In contrast, for KIRKPATRICK’S diallelic haploid model, only directional selection is possible for the male trait, and so $\tilde a_{0,zz} = 0$. Thus, the agreement of these two very different genetic models concerning the existence of a line of equilibria is somewhat misleading. Our more complete analysis shows that the conditions for equilibria are generally more complex. Equation 57 suggests that multiple lines of equilibria may coexist [corresponding to alternative equilibria with different amounts of skew (cf. BARTON 1986b)] or that none may exist (because of constraints connecting the values of the means, variances and skews), depending on the details of the underlying genetics. Thus, some of the differences observed by CURTSINGER and HEISLER (1988) between their simple diploid models and KIRKPATRICK’S (1985) haploid and quantitative genetic models may not be attributable to diploidy per se or dominance, but rather to the intermediate complexity of their genetic model. These subtleties merit further analysis. For simplicity, we will restrict attention below to polygenic systems in which $C_{ZZZ} = 0$ at equilibrium, so that $\tilde a_{0,z} = 0$ does define a line of equilibria.

TABLE 5

Quasi-linkage equilibrium (QLE) approximations for various linkage disequilibria in LANDE’S (1981) model of sexual selection on a polygenic trait

[Table body, with columns “QLE approximation” and “Loci involved”, is not recoverable from this copy.]

These approximations are derived using the methods described for weak selection (see (24)). For simplicity, they ignore the contributions of mutation. Bold subscripts refer to loci that affect the male trait, z*, rather than the female preference, y. All expressions are accurate to first order in a, unless otherwise stated. Different subscripts denote distinct loci (e.g., $C_{ikl}$ does not include $C_{ikk}$ as a special case).

From (58) we see that $\Delta\bar y/\Delta\bar z$ can be rewritten as $(\rho/2)\sqrt{C_{YY}/C_{ZZ}}$, where $\rho$ is the correlation between y and z* across mating pairs. Since the correlation must be less than 1, the genetic variance in female preference must be larger than the variance in the male trait for there to be substantial coevolution of preference with male trait. In the present model, this is likely: the only force that balances recurrent mutation by reducing the variance in y arises from the weak correlation between y and z.

We can take the analysis further by finding the equilibrium variance in female preference, $C_{YY}$. The increase in variance due to mutation is balanced by (induced) stabilizing selection on the preference; the latter arises from stabilizing selection on the trait loci, which are correlated with the preference loci as a result of the association term $\tilde a_{y,z}$. To simplify, we begin by ignoring mutation, and just calculate the changes due to natural and sexual selection. We also assume that the male trait is on the line of equilibria (thus, $\Delta\bar z = 0$) and that many loci contribute to the male trait. From the results in TURELLI and BARTON (1990), we know that with Gaussian fitness functions, no skew will be generated by selection at equilibrium if many loci contribute to the character (i.e., $C_{ZZZ} = 0$). Thus, from (57a), $\Delta\bar z = 0$ implies $\tilde a_{0,z} = 0$. We will show that regardless of the distribution of allelic effects, the effective stabilizing selection on the preference locus is proportional to $\tilde a_{0,zz}\tilde a_{y,z}^2$; this implies that if preferences are weak ($\tilde a_{y,z} \ll 1$), the variance in preference will become very large.

Since the following calculations will involve terms of order $\tilde a^3$, we must extend the approximation of the Gaussian fitness model to include higher-order terms: the quadratic approximation given by (51) will no longer suffice. These higher-order terms are most easily calculated by deriving Taylor series representations for the selection and association coefficients. Expanding (50) reveals new coefficients; with the assumption that $\tilde a_{0,z} = 0$, the relevant terms are $\tilde a_{0,zzzz} = \tilde a_{0,zz}^2 + O(\tilde a^3)$, $\tilde a_{yy,zz} = \tilde a_{y,z}^2 + O(\tilde a^3)$, and $\tilde a_{y,zzz} = 2\tilde a_{y,z}\tilde a_{0,zz} + O(\tilde a^3)$. In addition, with $\tilde a_{0,z} = 0$, the stabilizing selection coefficient, $\tilde a_{0,zz}$, can be refined to $\tilde a_{0,zz} = \tilde a^{(0)}_{0,zz} - 2C_{ZZ}(\tilde a^{(0)}_{0,zz})^2 + C_{YY}\tilde a_{y,z}^2$, where $\tilde a^{(0)}_{0,zz} \equiv -[1/(4\nu^2) + 1/(4\omega^2)]$; but the second-order terms in $\tilde a_{0,zz}$ do not contribute to the results below.

The change in variance at a preference locus is:

$$\begin{aligned}
\Delta C_{ii} &= \tilde a_{0,zz}\sum_{k,l \in Z}\left(C_{iikl} - C_{ii}C_{kl}\right) + \tilde a_{0,zzzz}\sum_{k,l,m,n \in Z}\left(C_{iiklmn} - C_{ii}C_{klmn}\right) + \cdots \\
&= \tilde a_{0,zz}\sum_{k \in Z}\left(C_{iikk} - C_{ii}C_{kk}\right) + \tilde a_{0,zz}\sum_{k \ne l \in Z}\left(C_{iikl} - C_{ii}C_{kl}\right) \\
&\quad + \tilde a_{0,zzzz}\sum_{k,l,m,n \in Z}\left(C_{iiklmn} - C_{ii}C_{klmn}\right) + O(\tilde a^4)
\end{aligned} \qquad (59)$$

for $i \in Y$. There are contributions from stabilizing selection on individual loci ($\tilde a_{0,kk}$), and on pairs of loci ($\tilde a_{0,kl}$); these depend on the four-way disequilibria $C_{iikk} - C_{ii}C_{kk}$ and $C_{iikl} - C_{ii}C_{kl}$. There is also a small contribution ($O(\tilde a^3)$) from the $\tilde a_{0,zzzz}$ terms, which arise from the difference between the Gaussian and quadratic models.

Applying the leading-order QLE approximations in Table 5 to the terms inside the summations in (59), we find that the leading term is contributed by $C_{iikk}$:

$$\begin{aligned}
\Delta C_{ii} &= \tilde a_{0,zz}\sum_{k \in Z}\left(C_{iikk} - C_{ii}C_{kk}\right) \\
&= \tilde a_{0,zz}\,\tilde a_{y,z}\,C_{iii}\sum_{k \in Z} C_{kkk} + O(\tilde a^3) \\
&= \tilde a_{y,z}\,C_{iii}\,\Delta\bar z + O(\tilde a^3).
\end{aligned} \qquad (60)$$

Since we have assumed that the male trait is on the line of equilibria ($\Delta\bar z = 0$), this shows that the variance in preference changes slowly, at a rate of order $\tilde a^3$.

To calculate the third-order terms, we proceed in the same way, by writing out the full expressions for $\Delta(C_{iikk} - C_{ii}C_{kk})$ and $\Delta(C_{iikl} - C_{ii}C_{kl})$, substituting into these the QLE approximations (to first order in a) given in Table 5, and thus obtaining QLE expressions for $C_{iikk} - C_{ii}C_{kk}$ and $C_{iikl} - C_{ii}C_{kl}$ to second order in a:

Note that because $\tilde a_{0,zzzz}$ is second order, it suffices to use the first-order approximations from Table 5 for $C_{iiklmn} - C_{ii}C_{klmn}$ in (59). These expressions can now all be substituted into (59) to give $\Delta C_{ii}$ to third order in a. However, the result is much simplified if we assume that either the distribution of effects at each preference locus is not skewed ($C_{iii} = 0$), or more generally, that the distribution of the trait, z, has zero third and fifth moments ($\sum_k C_{kkk}$, $\sum_k C_{kkkkk} = 0$). Note that even though $\sum_k C_{kkk} = 0$, there may be skew at the individual loci: $C_{kkk} \ne 0$. For example, with two alleles at each locus, some loci will be near fixation for + alleles, and some for - alleles; though there may be no skew overall, $C_{kkk} = -p_k q_k(p_k - q_k)$ at each locus (see BARTON 1986a). (Actually, it is not quite sufficient to assume that the third and fifth moments of z are zero, since the recombination fractions enter the sums. However, this will not alter the results if variation is distributed evenly over the chromosomes.)

With these assumptions, the only terms that contribute to $\Delta C_{ii}$ are the first part of (61b), proportional to $\tilde a_{y,z}^2$, and the terms proportional to $\tilde a_{yy,zz}$, which are also proportional to $\tilde a_{y,z}^2$:

where

The coefficient $\psi$ depends on the distribution of allelic effects at each trait locus, and on the linkage relations between preference and trait loci. However, it will be of order 1, and does not depend strongly on the underlying genetics. For example, with a large number of unlinked loci, the first term is negligible, and $r_{kl}/r_{ikl} = 2/3$; thus, $\psi = 4/3$.

The variance in preference decreases as if stabilizing selection of strength $\tilde a_{0,yy} = \tilde a_{0,zz}\,\tilde a_{y,z}^2\,\psi\,C_{ZZ}^2$ acted directly on the preference. By balancing this effective stabilizing selection against mutation, $u_{ii}$, we can approximate the equilibrium variance in preference. This requires some assumption about the distribution of allelic effects at the individual preference loci, so that we can express $C_{iiii}$ in terms of $C_{ii}$. First, suppose that this distribution is Gaussian, so that $C_{iiii} = 3C_{ii}^2$. Solving for $C_{ii}$ at each locus, and summing, we find that to leading order,

$$C_{YY} = \frac{U_Y}{U_Z\,\tilde a_{y,z}\sqrt{\psi}} \qquad (63a)$$

and

$$C_{ZZ} = \frac{U_Z}{\sqrt{-2\tilde a_{0,zz}}}, \qquad (63b)$$

where $U_S = \sum_{i \in S}\sqrt{u_{ii}}$. Applying (58),

$$\frac{\Delta\bar y}{\Delta\bar z} = \frac{U_Y}{U_Z\sqrt{\psi}}. \qquad (64)$$

These expressions are similar to LANDE’S (1981) results and (56), differing only by the factor of $\sqrt{\psi}$ to leading order. The difference arises because LANDE assumed that the distribution of allelic effects across all loci was multivariate Gaussian; whereas here, we have only assumed that the distribution at each preference locus is Gaussian. The QLE analysis which led to the expressions in Table 5 shows that sexual selection generates fourth and higher moments across loci that differ from the multivariate Gaussian values, and which depend on recombination rates. Like LANDE’S result, (64) shows that there can be substantial coevolution of preference with trait if the mutabilities of the two sets of loci are comparable.

Alternatively, we can calculate the variance in preference on the assumption that each locus is close to fixation, so that the fourth moment is proportional to the variance. Specifically, we assume that $C_{iiii} = C_{ii}\alpha_i^2$, where $\alpha_i^2$ is the variance of mutational effects at locus i and is assumed to be much larger than $C_{ii}$. Under this assumption, $C_{iiii} - C_{ii}^2 \approx C_{ii}\alpha_i^2$. This is the “rare alleles” approximation used by BARTON and TURELLI (1987); it gives the same results as the “House of Cards” analysis of models with a continuum of alleles (TURELLI 1984), and as WRIGHT’S (1935) model of diallelic loci (BARTON 1986b). To leading order, the “rare alleles” analysis produces

$$C_{YY} = -\frac{\mu_Y\,\tilde a_{0,zz}}{\mu_Z^2\,\tilde a_{y,z}^2\,\psi} \qquad (65a)$$

and

$$C_{ZZ} = -\frac{\mu_Z}{\tilde a_{0,zz}}, \qquad (65b)$$

where $\mu_S = \sum_{i \in S}\mu_i$, and $\mu_i = u_{ii}/\alpha_i^2$, the mutation rate at locus i. Applying (58),

$$\frac{\Delta\bar y}{\Delta\bar z} = -\frac{\mu_Y\,\tilde a_{0,zz}}{\mu_Z^2\,\tilde a_{y,z}\,\psi}. \qquad (66)$$

Again, there can be substantial coevolution of preference with trait. In fact, the ratio $\Delta\bar y/\Delta\bar z$ is much larger under the “rare alleles” than under the Gaussian allelic approximation, since $\mu_Z$ is likely to be small. As in the Gaussian result (64), the reason why changes in the trait can have a large effect on the preference even when preferences are weak is that the variance in preference ($C_{YY}$) builds up to a large equilibrium value under the pressure of mutation (cf. NICHOLS and BUTLIN 1989). In the “rare alleles” case, the degree of coevolution (given by Eq. 66) actually increases when the preferences become weaker, because $C_{YY}$ is inversely proportional to the square of the preference. As the preference coefficient $\tilde a_{y,z}$ tends to zero, the variance in preference $C_{YY}$ increases, until the “rare alleles” approximation breaks down, and $C_{YY}$ approaches its maximum value under the pressure of recurrent mutation.
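The contrast between the two approximations can be made concrete with a small calculation. The sketch below is illustrative only: it assumes the forms $\Delta\bar y/\Delta\bar z = U_Y/(U_Z\sqrt{\psi})$ for Gaussian allelic effects and $\Delta\bar y/\Delta\bar z = -\mu_Y\tilde a_{0,zz}/(\mu_Z^2\tilde a_{y,z}\psi)$ for rare alleles, which are our reading of (64) and (66), takes $\tilde a_{0,zz} = -[1/(4\nu^2) + 1/(4\omega^2)]$ and $\tilde a_{y,z} = 1/(2\nu^2)$ as assumed coefficient values, and uses hypothetical parameters. The function names are ours.

```python
import math


def gaussian_ratio(U_Y, U_Z, psi):
    """Degree of coevolution with Gaussian allelic effects (our reading of Eq. 64)."""
    return U_Y / (U_Z * math.sqrt(psi))


def rare_alleles_ratio(mu_Y, mu_Z, a0_zz, a_yz, psi):
    """Degree of coevolution under the rare-alleles approximation (our reading of Eq. 66)."""
    return -mu_Y * a0_zz / (mu_Z ** 2 * a_yz * psi)


# Hypothetical parameters: weak stabilizing selection and weak preference.
nu2, omega2 = 50.0, 50.0
a0_zz = -(1 / (4 * nu2) + 1 / (4 * omega2))   # assumed stabilizing coefficient
a_yz = 1 / (2 * nu2)                          # assumed association coefficient
psi = 4 / 3                                   # many unlinked loci

ratio_gauss = gaussian_ratio(U_Y=1.0, U_Z=1.0, psi=psi)
ratio_rare = rare_alleles_ratio(mu_Y=1e-3, mu_Z=1e-3, a0_zz=a0_zz, a_yz=a_yz, psi=psi)
print(ratio_gauss, ratio_rare)
```

With equal mutabilities at the two sets of loci, the Gaussian form gives a ratio of order one, whereas the rare-alleles form scales as $1/\mu_Z$ and so becomes very large when per-locus mutation rates are small. Values this large lie outside the range in which the weak-selection expansion can be trusted.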

Because the stabilizing selection induced on the preference loci is so weak, this model predicts that the variance of preference will be large. However, if it becomes too large, the approximation that terms of order $a^4$ and higher are negligible breaks down. More important, the model itself becomes biologically implausible. The reason is that if most females prefer males far more extreme than any actually present in the population, then the Gaussian preference model implies that they will exert unreasonably strong preferences. We can measure the typical strength of preference by taking the ratio between the probability that a female with preference one standard deviation from the mean ($y = \bar y + \sqrt{C_{YY}}$) will mate with males one standard deviation above the mean ($z^* = \bar z + \sqrt{C_{ZZ}}$) to the probability that they will mate with males one standard deviation below the mean ($z^* = \bar z - \sqrt{C_{ZZ}}$). This ratio is $\exp[\tilde a_{y,z}(C_{YY} - C_{ZZ})] \approx \exp(\tilde a_{y,z}C_{YY}) = \exp(\Delta\bar y/\Delta\bar z)$ for $C_{YY} \gg C_{ZZ}$. Thus, strong coevolution between preference and trait implies that most females exert strong preferences, even when the coefficient $\tilde a_{y,z}$ is small. This relation is useful, because it relates the degree of coevolution ($\Delta\bar y/\Delta\bar z$) to a measure of the degree of preference exerted by females, a quantity that could in principle be measured.

A separate problem, pointed out by LANDE (1981) and KIRKPATRICK (1 985, Appendix), is that when the variance in female preference becomes large, the net selection on the male trait becomes disruptive instead of stabilizing: variances at both sets of loci then di- verge. This divergence can be explained using the present notation by considering the difference be- tween the quadratic approximation (5 l ) and the Gaus- sian model (50). T o second order in a, i i ~ , ~ ~ = (z&,,( 1 - 2i&,Czz) + C Y Y i ~ , r , where G,,, = -[ 1/(4v2) + 1/(4w2)]. (These additional terms do not affect (62) to leading order). The first term is always negative, representing stabilizing selection, but the second term is positive. Thus, when the variance in preference becomes larger than C y y = (z&/ii&, selection becomes disruptive.

(This argument is only approximate, because the expansion in powers of $a$ will break down as $C_{yy}$ becomes large. However, calculations using the alternative approximation that trait and preference are normally distributed lead to a similar threshold for $C_{yy}$.)

For the Gaussian model, (63a) shows that the condition for stability is U& < U&. In the “rare alleles” approximation, the condition is py < pz&i$,,,. Because the a’s are assumed small, this condition is more stringent than in the Gaussian case. It would be useful to compare simulations where the preference is determined by the sum of a set of biallelic loci, with simulations where the preference depends on loci with roughly Gaussian effects: the arguments here suggest that the former case will give greater variance in preference, and so will be less stable.

Relation to other quantities: The effects of selection and nonrandom mating are absorbed into the joint fitness of the pair of haploid genotypes, $W(X,X^*)$. The method described above relies on expanding $W(X,X^*)/\bar W$ in powers of the allelic states ($\zeta_U$). This gives a series of selection coefficients and association coefficients (the coefficients $a_{U,V}$ of Equation 6). The relation between this method and the “selection gradient” approach developed in BARTON and TURELLI (1987) and TURELLI and BARTON (1990) is set out in detail in APPENDIX A. In this section, we discuss the relation between the selection and association coefficients and other descriptions of selection: the covariance between fitness and the character, average excess, and average effect.

PRICE’S method: the covariance with fitness: PRICE (1970) showed that the evolution of any inherited character, $z$, can be divided into two parts: one arising from the covariance between the character and fitness (representing natural selection), and the other arising from changes during transmission (representing mutation, recombination, etc.):

$\bar w\,\Delta\bar z = \mathrm{Cov}(w,z) + E(w\,\Delta z)$. (67)

There are various ways of carrying out this partition. In the present context, it is simplest to follow particular haploid genotypes, $X$. Here, $w$ refers to the mean fitness of some genotype; this is an average over the two sexes, and is $w(X) = [\sum_{Y^*} f(X,Y^*)W(X,Y^*) + \sum_{Y} f(Y,X^*)W(Y,X^*)]/2$ in our notation. The covariance and the expectation in (67) are taken over the population of haploid genotypes $X$ immediately after meiosis, including both sexes; $\Delta z$ is the expected change in the character due to mutation and recombination. FRANK and SLATKIN (1990) apply PRICE’S formula to the moments of allelic effects, $\zeta_S$, and derive results for mutation and stabilizing selection on an additive polygenic trait. It is easy to relate this approach to our general formulas. The covariance between fitness and $\zeta_S$ is

$\mathrm{Cov}(w,\zeta_S) = \sum_U \bar a_{U,\emptyset}(C_{S+U} - C_S C_U)$. (68)

(Here, we use the symmetrized coefficient $\bar a_{U,\emptyset}$, because the covariance is taken over the two sexes.) Equation 68 is identical to the average of (12a) and (12b), which give the change in $C_S$ as a result of selection alone. Most of the complexity of multilocus selection theory arises from recombination, whose effects are represented by $E(w\,\Delta z)$ in (67); also, additional terms are needed to take account of changes in the central moments due to changes in the mean, a case not dealt with by FRANK and SLATKIN (1990). Though PRICE’S formula is elegant, it does not circumvent the fundamental complexity of multilocus problems.
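As a minimal numerical sketch of the selection term in (67), assuming perfect transmission (so that $E(w\,\Delta z)$ vanishes), the following Python fragment checks that the change in the mean of a character equals $\mathrm{Cov}(w,z)/\bar w$. The genotype frequencies, fitnesses and character values are made-up illustrative numbers, and the function name is hypothetical rather than part of the method described here.

```python
import numpy as np

def price_selection_term(freq, fitness, trait):
    """Compare the change in the trait mean under selection with Cov(w, z) / wbar.

    Assumes perfect transmission, so the E(w * dz) term of Price's
    equation is zero (an assumption made for this illustration only).
    """
    freq = np.asarray(freq, dtype=float)
    fitness = np.asarray(fitness, dtype=float)
    trait = np.asarray(trait, dtype=float)

    wbar = np.sum(freq * fitness)                      # mean fitness
    zbar = np.sum(freq * trait)                        # trait mean before selection
    cov_wz = np.sum(freq * fitness * trait) - wbar * zbar

    freq_after = freq * fitness / wbar                 # frequencies after selection
    delta_zbar = np.sum(freq_after * trait) - zbar
    return delta_zbar, cov_wz / wbar

# Hypothetical two-locus haploid genotypes 00, 01, 10, 11;
# the character counts the '1' alleles carried by each genotype.
freq = [0.4, 0.1, 0.2, 0.3]
fitness = [1.0, 1.1, 1.05, 1.3]
trait = [0, 1, 1, 2]

delta_zbar, cov_over_wbar = price_selection_term(freq, fitness, trait)
```

The two returned values agree for any choice of frequencies and fitnesses; all of the multilocus complexity discussed in the text sits in the transmission term that this sketch sets to zero.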

Average excess: For simplicity, we consider only the diallelic case. The average excess fitness of allele ‘1’ at locus $i$ is $E[X_i(W(X,X^*) - \bar W)/\bar W]/E(X_i) = \Delta p_i/p_i = (\sum_U \bar a_{U,\emptyset}C_{U+i})/p_i$. The average excess of combinations of alleles ($\zeta_S\zeta^*_T$, say) can be written in a similar way:

$E[\zeta_S\zeta^*_T(W(X,X^*) - \bar W)/\bar W]/E(\zeta_S\zeta^*_T) = (\sum_{U,V} a_{U,V}(C_{U+S} - C_SC_U)(C_{V+T} - C_TC_V))/(C_SC_T)$,

the sum including $U = \emptyset$ and $V = \emptyset$.

Average effect: The average effect of a substitution at locus $i$ ($A_i$) is the regression of fitness on the state of that locus (FALCONER 1985). Minimizing $E\{[(W(X,X^*) - \bar W)/\bar W - \sum_i A_i\zeta_i]^2\}$ gives $A_i = \sum_j C^{-1}_{ij}\Delta p_j$. Here, $C^{-1}_{ij}$ is the inverse of the matrix of pairwise linkage disequilibria. The average effects of combinations of alleles are given by more complicated expressions, which depend on the inverse of matrices involving higher order disequilibria.
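The two quantities can be illustrated with a small numerical sketch for two diallelic loci in haploids. This is an illustration under stated assumptions, not the authors' computation: the genotype frequencies and fitnesses are invented, fitness is taken to depend on a single haploid genotype, and the function name is hypothetical. The average excess is computed as $\Delta p_i/p_i$ and the average effect by solving the linear system built from the matrix of pairwise second moments.

```python
import numpy as np

def excess_and_effect(X, f, W):
    """Average excess and average effect for diallelic loci (states 0/1).

    X : genotypes, one row per genotype, one column per locus
    f : genotype frequencies (sum to 1)
    W : genotype fitnesses
    """
    wbar = f @ W
    rel = (W - wbar) / wbar                  # relative fitness deviation
    p = f @ X                                # allele frequencies of allele '1'
    delta_p = (f * rel) @ X                  # change in p caused by selection
    excess = delta_p / p                     # average excess of allele '1'

    zeta = X - p                             # deviations of allelic states from their means
    C = (zeta * f[:, None]).T @ zeta         # pairwise second central moments
    effect = np.linalg.solve(C, delta_p)     # regression coefficients of fitness on state
    return p, delta_p, excess, effect

# Hypothetical genotypes 00, 01, 10, 11 with made-up frequencies and fitnesses.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
f = np.array([0.4, 0.1, 0.2, 0.3])
W = np.array([1.0, 1.1, 1.05, 1.3])

p, delta_p, excess, effect = excess_and_effect(X, f, W)

# Cross-check: a frequency-weighted least-squares regression gives the same effects.
zeta = X - p
rel = (W - f @ W) / (f @ W)
effect_ls, *_ = np.linalg.lstsq(np.sqrt(f)[:, None] * zeta, np.sqrt(f) * rel, rcond=None)
```

When the loci are in linkage disequilibrium, the off-diagonal entries of the moment matrix are nonzero, so the average effect at a locus differs from its average excess scaled by the within-locus variance.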

DISCUSSION

Most characters of evolutionary and economic importance have complex genetic bases whose details are unknown. Both evolutionists and applied geneticists want to know the extent to which the dynamics of quantities such as character means and variances, which are often both easily measurable and themselves the objects of greatest interest, can be understood without knowing the underlying genetic details. A natural approach to this problem is to contrast the results from alternative idealized genetic models of the same phenomenon (e.g., FELSENSTEIN 1979, 1981a; KIRKPATRICK 1982b). A major obstacle to this undertaking is the complexity of multilocus selection. Previous approaches have relied on: statistical analyses without a firm genetic basis, unsupported genetic assumptions, computer simulations, or detailed analyses of particular selection schemes that are amenable to mathematical analysis but difficult to relate to selection on phenotypes. Here we have presented a new approach and illustrated its range of applications.

Once the fitnesses of multilocus genotypes have been expressed in the form (6), our Equations 16, 21 and 22 provide an exact and complete, albeit complex, description of the resulting dynamics. Any pattern of viability selection on autosomal loci can be expressed in this form. It also encompasses fertility selection and non-random mating among haploids.

A major advantage of this formulation is that it leads to explicit approximations for linkage disequilibria, at least when selection on individual loci is weak. The resulting “quasi-linkage equilibrium” (QLE, sensu NAGYLAKI 1976) expressions are probably the most useful product of our approach. They can be used to approximate multilocus dynamics in terms of the dynamics at individual loci. Unfortunately, neither our approach nor any other can simplify the analysis of these within-locus dynamics without making specific genetic assumptions. However, our method generates equations that indicate how the resulting dynamics and equilibria depend on these assumptions. Both the analysis above of KIRKPATRICK’S (1982a) sexual selection model and the additive polygenic selection analysis presented in TURELLI and BARTON (1990) show that the QLE approximations can be extremely accurate, even when selection is fairly strong. As shown in APPENDIX A, the analysis presented here is equivalent to the selection gradient approach of TURELLI and BARTON (1990). The gradient formulation may be more convenient for analyzing selection on additive polygenic traits or for analyses involving changes of variables. However, the approach presented here will often be simpler to apply and is more easily modified to incorporate nonrandom mating and non-additive genetic interactions.

Our most substantial examples involve the joint action of sexual and natural selection in governing the coevolution of male traits and female preferences. For KIRKPATRICK’S (1982a) two-locus haploid model, we were able to approximate analytically the change in allele frequency at a locus which determines female preferences caused by the evolution of a locus encoding a male trait subject to both natural and sexual selection. Our analysis also indicates that the conditions for both loci to reach polymorphic equilibria are quite restrictive. A limitation of this two-locus model is that its restricted range of genetic variability obscures some of the evolutionary consequences of its biological assumptions.

These consequences are clarified in the polygenic model developed by LANDE (1981) to explore FISHER’S (1958) “runaway” process of sexual selection. LANDE’S (1981) treatment has two parts: a phenotypic analysis that treats the critical quantitative genetic parameters as constants, and a genetic analysis that describes the dynamics and equilibria of these parameters under selection and mutation. LANDE’S phenotypic analysis relies on the traditional quantitative genetic assumption that the distribution of breeding values (which involve sums of effects across all relevant loci) is Gaussian. His genetic analysis relies on the much more restrictive and questionable assumption that the distributions of allelic effects at individual loci are Gaussian (cf. TURELLI 1984, 1988; BARTON and TURELLI 1989). An alternative simplification of the multilocus dynamics is the QLE approximation. This leads to general recursions that reproduce the key features of LANDE’S (1981) results when the Gaussian allelic assumption is imposed; the recursions also include KIRKPATRICK’S (1982a) model as a special case.

As noted below Equation 57, our general analysis shows that the similarity of the lines of equilibria obtained from LANDE’S (1981) and KIRKPATRICK’S (1982a) very different genetic models is rather misleading. Genetic models of intermediate complexity, and greater realism, can in principle produce multiple lines or none. This deserves further analysis. However, our general recursions make it clear why, when these lines of equilibria exist, they are a structurally unstable feature of the model that disappears when direct selection is applied to female preferences or when mutation has a directional component. Although we obtain a general description of the ratio of changes in coevolving male and female traits in terms of a measurable index of female choosiness, the equilibrium genetic variances for both preference and trait and the amount of coevolution depend critically on the genetic details.

An important conclusion from our genetic analysis is that, in the absence of direct selection on female preferences, the genetic correlations built up by nonrandom mating produce very weak induced stabilizing selection on the female preference loci. This induced stabilizing selection is proportional to $a_{\emptyset,zz}\,a_{y,z}^2\,C_{zz}^2$, where $C_{zz}$ denotes the variance of the male trait, and $a_{\emptyset,zz}$ and $a_{y,z}$ are small parameters that describe the intensity of stabilizing selection on the male trait and the degree of female choosiness, respectively. Thus, this model predicts extremely large equilibrium variances in female preferences, which in turn lead to rapid coevolution. The critical dependence of the equilibria and dynamics on this extremely small parameter combination further illustrates the structural instability of this model. As previously noted by LANDE (1981), KIRKPATRICK (1982a), LEIGH (1986), POMIANKOWSKI (1988) and BULMER (1989), this very weak indirect selection on female preferences will be overwhelmed by any direct force, such as selection or directional mutation, that affects preferences. Our analysis clarifies the nature of this instability by providing tractable expressions for the evolutionarily relevant selection parameters.

The approach described above can be usefully applied to a wide range of multilocus problems. In particular, it can easily encompass other evolutionary forces that affect the coevolution of male traits and female preferences; these will be explored in a subsequent paper. It also provides a complete genetic analysis of the dynamics of polygenic traits under selection that can test and possibly refine the traditional statistical approach. Extending the recursions described here to include sexual selection on diploids, sex linkage, and modifiers of recombination rates is a straightforward, if tedious, task; the incorporation of genetic drift is discussed in Appendix 1 of TURELLI and BARTON (1990). Without modification, our methods can be used to analyze, for example, the relation between the net distribution of fitness and selection on individual loci, the hitchhiking effects of multilocus selection, the dynamics of nonadditive polygenic traits under selection, and general consequences of selection on traits governed by many loci. For other possible applications, see FELSENSTEIN (1981b).

BENGT OLLE BENGTSSON, JERRY COYNE, RICHARD GOMULKIEWICZ, CHUCK LANGLEY, LINDA PARTRIDGE, MONTY SLATKIN, and NEAL TAYLOR gave helpful comments on preliminary versions, which are gratefully acknowledged. We are particularly indebted to DICK HUDSON, MARK KIRKPATRICK and TOM NAGYLAKI for helping to eliminate many obscurities. This work was supported by the Science and Engineering Research Council (GR/C/91529, GR/E/08507), the National Science Foundation (BSR-8866548), the Institute of Theoretical Dynamics and the Center for Population Biology at the University of California, Davis, and by the Swedish Natural Science Research Council.

LITERATURE CITED

BARTON, N. H., 1986a The effects of linkage and density-dependent regulation on gene flow. Heredity 57: 415-426.

BARTON, N. H., 1986b The maintenance of polygenic variation through a balance between mutation and stabilizing selection. Genet. Res. 47: 209-216.

BARTON, N. H., and M. TURELLI, 1987 Adaptive landscapes, genetic distance and the evolution of quantitative characters. Genet. Res. 49: 157-174.

BARTON, N. H., and M. TURELLI, 1989 Evolutionary quantitative genetics: how little do we know? Ann. Rev. Genet. 23: 337-370.

BULMER, M. G., 1980 The Mathematical Theory of Quantitative Genetics. Clarendon Press, Oxford.

BULMER, M. G., 1989 Structural instability of models of sexual selection. Theor. Popul. Biol. 25: 195-206.

CHARLESWORTH, B., 1987 The heritability of fitness, pp. 21-40 in Sexual Selection: Testing the Alternatives, edited by J. W. BRADBURY and M. B. ANDERSON. John Wiley, Chichester, U.K.

CHRISTIANSEN, F. B., 1988 The effect of population subdivision on multiple loci without selection, pp. 71-85 in Mathematical Evolutionary Theory, edited by M. W. FELDMAN. Princeton University Press, Princeton.

CURTSINGER, J. W., and I. L. HEISLER, 1988 A diploid “sexy son” model. Am. Nat. 132: 437-453.

CURTSINGER, J. W., and I. L. HEISLER, 1989 On the consistency of sexy-son models: a reply to Kirkpatrick. Am. Nat. 134: 978-981.

FALCONER, D. S., 1985 A note on Fisher’s ‘average effect’ and ‘average excess’. Genet. Res. 46: 337-347.

FELSENSTEIN, J., 1979 Excursions along the interface between disruptive and stabilizing selection. Genetics 93: 773-795.

FELSENSTEIN, J., 1981a Continuous-genotype models and assortative mating. Theor. Popul. Biol. 19: 341-357.

FELSENSTEIN, J., 1981b Bibliography of Theoretical Population Genetics. Dowden, Hutchinson & Ross, Stroudsburg, P.A.

FISHER, R. A., 1958 The Genetical Theory of Natural Selection, Second Edition. Dover, N.Y.

FRANK, S. A., and M. SLATKIN, 1990 The distribution of allelic effects under mutation and selection. Genet. Res. 55: 111-117.

GOMULKIEWICZ, R. S., and A. HASTINGS, 1990 Ploidy and evolution by sexual selection: a comparison of haploid and diploid female choice models near fixation equilibria. Evolution 44: 757-770.

HASTINGS, A., 1986 Multilocus population genetics with weak epistasis. II. Equilibrium properties of multilocus selection: what is the unit of selection? Genetics 112: 157-171.

KIRKPATRICK, M., 1982a Sexual selection and the evolution of female choice. Evolution 36: 1-12.

KIRKPATRICK, M., 1982b Quantum evolution and punctuated equilibria in continuous genetic characters. Am. Nat. 119: 833-848.

KIRKPATRICK, M., 1985 Evolution of female choice and male parental investment in polygynous species: the demise of the “sexy son.” Am. Nat. 125: 788-810.

KIRKPATRICK, M., 1986a The handicap mechanism of sexual selection does not work. Am. Nat. 127: 222-240.

KIRKPATRICK, M., 1986b Sexual selection and cycling parasites: a simulation study of Hamilton’s hypothesis. J. Theor. Biol. 119: 263-272.

KIRKPATRICK, M., 1988 Consistency of genetic models of the sexy son: reply to Curtsinger and Heisler. Am. Nat. 132: 609-610.

LANDE, R., 1981 Models of speciation by sexual selection on polygenic traits. Proc. Natl. Acad. Sci. USA 78: 3721-3725.

LANDE, R., and M. KIRKPATRICK, 1988 Ecological speciation by sexual selection. J. Theor. Biol. 133: 85-98.

LEIGH, E. G., 1986 Ronald Fisher and the development of evolutionary theory. 1. The role of selection. Oxford Surv. Evol. Biol. 3: 187-223.

NAGYLAKI, T., 1976 The evolution of one- and two-locus systems. Genetics 83: 583-600.

NICHOLS, R. A., and R. K. BUTLIN, 1989 Does runaway sexual selection work in finite populations? J. Evol. Biol. 2: 299-313.

POMIANKOWSKI, A. P., 1988 The evolution of female mate preferences for male genetic quality. Oxford Surv. Evol. Biol. 5: 136-184.

PRICE, G. R., 1970 Selection and covariance. Nature 227: 520-521.

ROUGHGARDEN, J., 1979 Theory of Population Genetics and Evolutionary Ecology: An Introduction. Macmillan, N.Y.

SEGER, J., 1985 Unifying genetic models for the evolution of female choice. Evolution 39: 1185-1193.

TOMLINSON, I. P. M., 1988 Diploid models of the handicap principle. Heredity 60: 283-293.

TURELLI, M., 1984 Heritable genetic variation via mutation-selection balance: Lerch’s zeta meets the abdominal bristle. Theor. Popul. Biol. 25: 138-193.

TURELLI, M., 1988 Population genetic models for polygenic variation and evolution, pp. 601-618 in Proceedings of the Second International Conference on Quantitative Genetics, edited by B. S. WEIR, E. J. EISEN, M. M. GOODMAN and G. NAMKOONG. Sinauer, Sunderland, M.A.

TURELLI, M., and N. H. BARTON, 1990 Dynamics of polygenic characters under selection. Theor. Popul. Biol. 38: 1-57.

WOLFRAM, S., 1988 Mathematica. Addison Wesley, Redwood City, C.A.

WRIGHT, S., 1935 Evolution in populations in approximate equilibrium. J. Genet. 30: 257-266.

WRIGHT, S., 1969 Evolution and the Genetics of Populations. II. The Theory of Gene Frequencies. University Chicago Press, Chicago.

Communicating editor: R. R. HUDSON

APPENDIX A

Relation of selection coefficients to selection gradients: Here we show how the basic selection recursions (10) and (12) can be derived by extending the selection gradient analysis of TURELLI and BARTON (1990). Because selection generates associations between alleles from different gametes ($C_{U,V}$), we must define the selection gradients favoring such associations, as well as those affecting individual loci and sets of loci from the same gamete. The definition requires that the mean fitness be written as $\bar W = \sum_{x,x^*} W(x,x^*)f(x,x^*)$, where the genotype frequencies $f(x,x^*)$ are implicit functions of the moments $m_{i,\emptyset}$, $m_{\emptyset,i}$ and $C_{U,V}$ described in the text. The selection gradients are defined by

and

These gradients are evaluated at the beginning of the generation, when $f(x,x^*) = f(x)f(x^*)$, $m_{i,\emptyset} = m_{\emptyset,i}$ and $C_{U,V} = C_UC_V$; they are computed by ignoring the fact that the fitnesses $W(x,x^*)$ may themselves change with genotype frequencies. [This convention rests on the fact that Equation 2.3 of TURELLI and BARTON (1990) remains valid with frequency-dependent selection, if we ignore the dependence of $W(x,x^*)$ on $f(y,y^*)$ when we compute $\partial\log\bar W/\partial f(y,y^*)$; see WRIGHT (1969), ROUGHGARDEN (1979, Ch. 4).] Imagine that $W(x,x^*)$, the relative contribution of genotype $(x,x^*)$ to the next generation, is held constant at the value produced by the current population composition. Then, the mean fitness that the population would have at other genotype frequencies (represented by $m_{i,\emptyset}$, $m_{\emptyset,i}$, $C_{U,V}$) can be calculated using these fixed values. This point is elaborated below when we show how to relate the selection gradients defined by (A.1) to the selection coefficients defined in the text.

The central idea of the selection gradient analysis, developed in BARTON and TURELLI (1987) and TURELLI and BARTON (1990), is to describe the effects of selection on various functions of the distribution of allelic effects. This involves transforming the basic equation for selection response, (9), from one set of variables to another. TURELLI and BARTON (1990) analyzed sex-independent selection on additive polygenic traits in diploid populations. Thus, we did not encounter the problems produced by recombination acting on genotypes with sex-specific distributions after selection; and we worked with sex-independent means and central moments throughout our calculations. Nevertheless, the general equations that we derived for changes in the means and central moments under selection apply even if fitness is sex-dependent, i.e., $W(x,x^*) \ne W(x^*,x)$. With sex-dependent selection, the central moments after selection are defined using the sex-specific means after selection, i.e., $m'_{i,\emptyset}$ and $m'_{\emptyset,i}$. We avoided using these central moments in the text to simplify the analysis of recombination. Thus, two routes are available for showing that both the selection gradient approach and the selection coefficient approach lead to the same recursions. We can either apply the analysis of TURELLI and BARTON (1990) to the variables used in this paper or apply the selection coefficient technique to analyze the changes in the central moments caused by selection. The latter is simpler.

Here we will ignore convention (11) and define $\Delta_s C_{S,T}$ as the change caused by selection in the central moment $C_{S,T}$, defined with respect to the sex-specific means. Thus,

$\Delta_s C_{S,T} = \sum_{x,x^*} \prod_{i\in S}(x_i - m'_{i,\emptyset}) \prod_{j\in T}(x^*_j - m'_{\emptyset,j})\,[f(x,x^*) + \Delta f(x,x^*)] - C_{S,T}$

$= \sum_{x,x^*} \prod_{i\in S}(x_i - m_i - \Delta_s m_{i,\emptyset}) \prod_{j\in T}(x^*_j - m_j - \Delta_s m_{\emptyset,j})\,[f(x,x^*) + \Delta f(x,x^*)] - C_{S,T}$. (A.2)

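Combining (12) with the argument used in deriving (21) from (19), we see that this definition of $\Delta_s C_{S,T}$ leads to

$\Delta_s C_{S,T} = \sum_U a_{U,\emptyset}(C_{S+U} - C_SC_U)C_T$

$+ \sum_V a_{\emptyset,V}C_S(C_{T+V} - C_TC_V)$

$+ \sum_{U,V} a_{U,V}(C_{S+U} - C_SC_U)(C_{T+V} - C_TC_V)$ (A.3)

$- \sum_{i\in S} C_{S-i,T}\,\Delta_s m_{i,\emptyset} - \sum_{j\in T} C_{S,T-j}\,\Delta_s m_{\emptyset,j} + O(a^2)$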

for all $S$ and $T$, where sums over null sets are zero and the selection coefficients are defined as in (6). This differs from (12) only in the appearance of the terms involving the changes in the means; these enter because we are analyzing central moments. As shown in (22), the $O(a^2)$ terms can all be written explicitly as functions of the leading-order terms given in (A.3). Thus, to show that the selection gradient method produces the same recursions, it suffices to show that the leading-order terms agree with (A.3) and (10). The derivation of these selection equations presented in the text requires only that the sums in (A.3) be consistent with those used to define the selection coefficients in (6). However, to be consistent with the conventions in TURELLI and BARTON (1990), we assume that reduction formulas like (5) have not been used in defining the selection coefficients in the treatment below.

The derivations in TURELLI and BARTON (1990) followed central moments throughout the life cycle. Although we emphasized selection on additive polygenic characters, our basic recursions do not assume this. In general, Equations 2.5 and A1.6 of TURELLI and BARTON (1990) give

$\Delta_s m_{i,\emptyset} = \sum_{U,V} g(\{i,\emptyset\},\{U,V\})\,L_{U,V}$ (A.4a)

and

$\Delta_s C_{S,T} = \sum_{U,V} g(\{S,T\},\{U,V\})\,L_{U,V} + O(L^2)$. (A.4b)

The notation here has been simplified slightly from TURELLI and BARTON (1990). The sum over $U$, $V$ includes the cases $\{i,\emptyset\}$ and $\{\emptyset,i\}$; $\{U,V\}$ is counted separately from $\{V,U\}$ for $V \ne U$, since selection pressures may differ between the sexes. Permutations of indices are also distinguished: for example, the sum includes all the permutations $\{ijk, lm\}$, $\{kji, ml\}$, $\{kij, ml\}$, etc. separately. This convention is a matter of choice: since $L_{\{ijk\},\{lm\}} = L_{\{ikj\},\{ml\}}$, etc., one could choose instead to merge all these permutations into one term. However, it is easier to describe the summations if permutations are distinguished; we therefore follow this convention throughout this APPENDIX (see TURELLI and BARTON 1990, Eq. A1.12). The matrix $g$ that connects selection response with selection gradients in (A.4) is:

$g(\{i,\emptyset\},\{j,\emptyset\}) = C_{ij,\emptyset}$, (A.5a)

$g(\{i,\emptyset\},\{\emptyset,j\}) = 0$, (A.5b)

$g(\{i,\emptyset\},\{U,V\}) = C_{U+i,V} - \sum_{j\in U} C_{U-j,V}C_{ij}$ for $|U+V| > 1$ (A.5c)

and

$g(\{S,T\},\{U,V\}) = \cdots - \sum_{j\in U} C_{S+j,T}C_{U-j,V} - \sum_{j\in V} C_{S,T+j}C_{U,V-j}$

$+ \sum_{j\in S}\sum_{k\in U} C_{S-j,T}C_{U-k,V}C_{jk} + \sum_{j\in T}\sum_{k\in V} C_{S,T-j}C_{U,V-k}C_{jk}$ (A.5d)

for $|U+V|, |S+T| > 1$ (cf. (A1.6) of TURELLI and BARTON 1990). Here, $|U|$ is the number of elements in $U$. Since these expressions are evaluated at the beginning of the generation, $C_{U,V}$ can be replaced by $C_UC_V$ throughout.

The terms of order $L^2$ in (A.4b) arise because the $C_{U,V}$ are central moments, and so change as the means change. Just as with (A.3), these higher-order terms can be expressed in terms of the leading-order expressions given (see Eq. 22). Hence, it suffices to show that the leading-order terms in (A.3) and (A.4b) agree. If we were to measure moments relative to a fixed mean (or used non-central moments), the higher order terms would not appear, and all the terms in (A.5) that involve sums would disappear (see (A1.5b) of TURELLI and BARTON 1990).

Equations A.4 describe the response to selection in terms of selection gradients, $L_{U,V}$, defined with respect to the effects of changing the distributions of genotype frequencies at specific loci. When fitness depends only on the sum of effects across loci, these selection gradients on the underlying loci correspond directly to selection gradients defined in terms of the moments of the distribution of breeding values: this makes them a convenient representation for describing selection on additive polygenic characters. However, in the more general case considered here, (A.4) can be greatly simplified by changing the representation.

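We define the selection coefficients as follows:

$a_{U,V} = L_{U,V}$ for $|U| > 0$, $|V| > 0$, (A.6a)

$a_{U,\emptyset} = \sum_V C_VL_{U,V}$ for $|U| > 1$, (A.6b)

$a_{\emptyset,V} = \sum_U C_UL_{U,V}$ for $|V| > 1$, (A.6c)

$a_{i,\emptyset} = L_{i,\emptyset} - \sum_{|U|\ge 1} C_Ua_{U+i,\emptyset} = L_{i,\emptyset} - \sum_{|U|\ge 1}\sum_V C_UC_VL_{U+i,V}$, (A.6d)

and

$a_{\emptyset,i} = L_{\emptyset,i} - \sum_{|V|\ge 1} C_Va_{\emptyset,V+i} = L_{\emptyset,i} - \sum_{|V|\ge 1}\sum_U C_VC_UL_{U,V+i}$, (A.6e)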

where the unrestricted sums include the null set. To see that the $a$'s defined by these expressions are the same as those produced by the definition (6), it is important to remember that selection gradients are defined with the fitnesses being held constant while calculating the partial derivatives of $\bar W$ with respect to $f(x,x^*)$, or its surrogates $\{m_{i,\emptyset}$, $m_{\emptyset,i}$ and $C_{U,V}\}$. Thus, to compute $L_{U,V}$ from (6), we ask how the mean fitness would change if the population had different values for the moments but the fitnesses were held fixed at the values produced by the current moments. Denoting the new moments by $\tilde m_{i,\emptyset}$, $\tilde m_{\emptyset,i}$ and $\tilde C_{U,V}$ and the current (fixed) values by $m_i$ and $C_{U,V} = C_UC_V$, the new mean fitness would be proportional to

We can use this expression to compute the selection gradients by taking derivatives with respect to the moments $\tilde m_{i,\emptyset}$, $\tilde m_{\emptyset,i}$ and $\tilde C_{U,V}$, then evaluating the resulting expressions at $\tilde m_{i,\emptyset} = \tilde m_{\emptyset,i} = m_i$ and $\tilde C_{U,V} = C_{U,V} = C_UC_V$. This yields, for example,

$L_{U,\emptyset} = a_{U,\emptyset} - \sum_{|V|>0} a_{U,V}C_V$ (A.7)

for $|U| > 1$, which is equivalent to (A.6b). Some tedious algebra shows that with these definitions, (A.4a) can be rewritten as (10a) and (A.4b) can be rewritten as (A.3).
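The equivalence with (A.6b) can be spelled out in one line. The sketch below assumes that the moment of the null set is $C_\emptyset = 1$ (the expectation of an empty product), which is what allows the unrestricted sum in (A.6b) to absorb the gradient $L_{U,\emptyset}$, and it uses (A.6a) to replace $L_{U,V}$ by $a_{U,V}$ whenever $|V| > 0$.

```latex
\begin{align*}
a_{U,\emptyset}
  &= \sum_{V} C_V L_{U,V}
     && \text{(A.6b), sum including } V = \emptyset \\
  &= L_{U,\emptyset} + \sum_{|V|>0} C_V L_{U,V}
     && \text{assuming } C_\emptyset = 1 \\
  &= L_{U,\emptyset} + \sum_{|V|>0} C_V\, a_{U,V}
     && \text{by (A.6a)} \\
\Longrightarrow\quad
L_{U,\emptyset}
  &= a_{U,\emptyset} - \sum_{|V|>0} a_{U,V}\, C_V .
\end{align*}
```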

APPENDIX B

Time scale for QLE: We will consider the time scale over which the QLE approximations apply. Our analysis is based on the arguments introduced by NAGYLAKI (1976) for two loci and applied by TURELLI and BARTON (1990) to multiple loci. As noted by NAGYLAKI (1976), three periods can be distinguished for two-locus dynamics when selection is weak relative to recombination. There is a brief initial period (proportional to the inverse of $-\ln(1 - r)$, where $r$ is the recombination rate) during which initially large linkage disequilibria, if present, are broken down by recombination much more rapidly than they are changed by selection. After this initial period, the disequilibria change at a slower rate governed by the selection intensity. Over a second period, of length comparable to the first, the disequilibria approach their QLE values. During a third period, when most allele frequency evolution takes place, if the initial values are not near equilibria, linkage disequilibria change very slowly ($\Delta C_N$ is proportional to the square of the selection coefficients); and they remain close to their QLE values. Here we will extend the argument presented in Appendix 2 of TURELLI and BARTON (1990) to estimate the duration of the initial period after which all of the linkage disequilibria are small and change slowly. The length of the second time period can be estimated similarly.
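The separation of time scales can be illustrated with a small simulation. The sketch below is not the authors' general recursion: it assumes a simple two-locus, two-allele haploid life cycle (selection on haploids followed by recombination at rate `r`), with invented selection coefficients `s1`, `s2` and epistasis `eps`. Two populations start with the same allele frequencies but opposite, large linkage disequilibria; after a few multiples of $1/r$ generations their disequilibria are nearly identical, because the initial values have been forgotten.

```python
import numpy as np

def two_locus_haploid(d0, s1=0.01, s2=0.01, eps=0.005, r=0.5, generations=40):
    """Iterate haploid selection then recombination; return D after each generation.

    Gamete order is 00, 01, 10, 11, and both allele frequencies start at 0.5.
    All parameter values are hypothetical choices for illustration.
    """
    x = np.array([0.25 + d0, 0.25 - d0, 0.25 - d0, 0.25 + d0])
    w = np.array([1.0, 1.0 + s2, 1.0 + s1, 1.0 + s1 + s2 + eps])
    sign = np.array([-1.0, 1.0, 1.0, -1.0])
    history = []
    for _ in range(generations):
        x = x * w / np.dot(x, w)            # selection on haploid genotypes
        d = x[0] * x[3] - x[1] * x[2]       # disequilibrium after selection
        x = x + r * d * sign                # recombination removes a fraction r of D
        history.append(x[0] * x[3] - x[1] * x[2])
    return np.array(history)

d_pos = two_locus_haploid(0.2)    # large positive initial disequilibrium
d_neg = two_locus_haploid(-0.2)   # large negative initial disequilibrium
gap = np.abs(d_pos - d_neg)       # shrinks rapidly over the first few generations
```

Plotting `d_pos` and `d_neg` against generation shows the rapid initial collapse followed by slow change at a small value set by the epistasis and the allele frequencies.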

As in the text, we define

$a = \max_{U,V}|a_{U,V}|$. (B.1)

We will show that there exists a finite time $T$ such that for all $N$ and $t \ge T$,

$C_N - \hat C_N = O(a)$ (B.2)

and

$\Delta C_N = \tilde\Delta C_N + O(a^2)$, (B.3)

where $\hat C_N$ denotes the linkage equilibrium value of $C_N$ and $\tilde\Delta C_N$ denotes the approximation (20) for $\Delta C_N$. Equation B.3 implies that we can use $\tilde\Delta C_N$ to calculate the QLE approximation for $C_N$ to first order.

We proceed by induction on the size of $N$. The case $|N| = 2$ was treated explicitly by TURELLI and BARTON (1990). Now we assume that (B.2) and (B.3) have been verified for all $N$ satisfying $|N| \le k$, and we will verify them for $|N| = k + 1$. To verify (B.3), first note that (16) implies $\Delta m_i = O(a)$ for all $i$ and $t$. Hence, from (22), we have

$\Delta C_N = \tilde\Delta C_N + \sum_{i\in N}\tilde\Delta C_{N-i}(-\Delta m_i) + O(a^2)$ (B.4)

for all $t$. Equation 21 implies that for all $N$ and $i$,

$\tilde\Delta C_{N-i} = -\sum_{S+T=N-i} r_{S,T}(C_{N-i} - C_SC_T) + O(a)$. (B.5)

For any nontrivial recombination partition $S + T = N - i$, $\hat C_{N-i} = \hat C_S\hat C_T$. Hence, our induction hypothesis concerning (B.2) implies that

$\sum_{S+T=N-i} r_{S,T}(C_{N-i} - C_SC_T) = \sum_{S+T=N-i} r_{S,T}[(C_{N-i} - \hat C_{N-i}) - (C_SC_T - \hat C_S\hat C_T)] = O(a)$ (B.6)

for $t \ge T$. From (B.4), (B.5) and (B.6), we see by induction that (B.3) must hold for all $|N|$.

To verify (B.2), first note that because $\hat C_N$ depends only on within-locus moments, $\Delta\hat C_N = O(a)$ for all $N$ and $t$. From (B.4) and (B.5), we see that for all $t$,

$\Delta(C_N - \hat C_N) = -\sum_{S+T=N} r_{S,T}(C_N - C_SC_T) + g_N(C, t)$

$= -\sum_{S+T=N} r_{S,T}[(C_N - \hat C_N) - (C_SC_T - \hat C_S\hat C_T)] + g_N(C, t)$ (B.7)

$= -r_N(C_N - \hat C_N) + \sum_{S+T=N} r_{S,T}(C_SC_T - \hat C_S\hat C_T) + g_N(C, t)$,

where $C$ denotes a complete set of moments sufficient to calculate all genotype frequencies and $g_N(C, t) = \Delta(C_N - \hat C_N) + \sum_{S+T=N} r_{S,T}(C_N - C_SC_T)$ is a complicated function of these moments that satisfies $g_N(C, t) = O(a)$ for all $t$. Iterating (B.7) yields

$C_N(t) - \hat C_N(t) = (1 - r_N)^t[C_N(0) - \hat C_N(0)]$

$+ \sum_{\tau=1}^{t}(1 - r_N)^{\tau-1}g_N(C, t - \tau)$ (B.8)

$+ \sum_{S+T=N} r_{S,T}\sum_{\tau=1}^{t}(1 - r_N)^{\tau-1}[C_S(t - \tau)C_T(t - \tau) - \hat C_S(t - \tau)\hat C_T(t - \tau)]$.
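The step from (B.7) to (B.8) is the standard solution of a linear recursion with a forcing term. As a hedged check, the sketch below treats the deviation from linkage equilibrium as a scalar `x` obeying `x(t+1) = (1 - r) * x(t) + h(t)`, where `h(t)` lumps together the two forcing terms of (B.7); the function names and the numbers are hypothetical. It confirms that direct iteration matches the closed form, in which the initial value is weighted by `(1 - r) ** t` and the forcing applied `tau` generations ago by `(1 - r) ** (tau - 1)`.

```python
def iterate(x0, r, h):
    """Apply x(t+1) = (1 - r) * x(t) + h(t) for t = 0 .. len(h) - 1."""
    x = x0
    for h_t in h:
        x = (1.0 - r) * x + h_t
    return x

def closed_form(x0, r, h):
    """Closed-form value of x(t) after t = len(h) steps of the same recursion."""
    t = len(h)
    total = (1.0 - r) ** t * x0                       # decaying memory of the initial value
    for tau in range(1, t + 1):
        total += (1.0 - r) ** (tau - 1) * h[t - tau]  # forcing applied tau generations ago
    return total

# Made-up forcing sequence standing in for the O(a) terms of the recursion.
forcing = [0.01, -0.02, 0.005, 0.0, 0.03]
x_iter = iterate(0.2, 0.3, forcing)
x_closed = closed_form(0.2, 0.3, forcing)
```

Because the weights form a geometric series, a forcing term bounded by a constant times $a$ contributes at most that bound divided by `r`, which is why the forcing sums remain of order $a$.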

Our task is to find $T$ such that for $t \ge T$, each of the three terms on the right hand side of (B.8) is $O(a)$. Let $T_k$ denote a value such that (B.2) holds for all $t \ge T_k$ for all $N$ satisfying $|N| \le k$. We now consider $N$ with $|N| = k + 1$. To make the first term on the right hand side of (B.8) small, $T_{k+1}$ must satisfy

$(1 - r_N)^t\,|C_N(0) - \hat C_N(0)| < a$ (B.9)

for $t \ge T_{k+1}$ and all $|N| = k + 1$. Because $g_N(C, t) = O(a)$ for all $t$, the first summation in (B.8) is $O(a)$ for all $t$. The second sum is more difficult. If $t \ge T_k$, $C_S(t)C_T(t) - \hat C_S(t)\hat C_T(t) = O(a)$ by the induction hypothesis. Hence, the second summation in (B.8) can be broken into two parts. For each partition, $S + T = N$,

$\sum_{\tau=1}^{t}(1 - r_N)^{\tau-1}[C_S(t - \tau)C_T(t - \tau) - \hat C_S(t - \tau)\hat C_T(t - \tau)]$

$= \sum_{\tau=1}^{t-T_k}(1 - r_N)^{\tau-1}[C_S(t - \tau)C_T(t - \tau) - \hat C_S(t - \tau)\hat C_T(t - \tau)]$ (B.10)

$+ \sum_{\tau=t-T_k+1}^{t}(1 - r_N)^{\tau-1}[C_S(t - \tau)C_T(t - \tau) - \hat C_S(t - \tau)\hat C_T(t - \tau)]$.

The first summation on the right hand side of (B.10) is $O(a)$ for all $t > T_k$ by the induction hypothesis. To ensure that the second sum is also $O(a)$, $T_{k+1}$ must be chosen so that

$(1 - r_N)^{T_{k+1}-T_k}\max_{t\le T_k-1}\{|C_S(t)C_T(t) - \hat C_S(t)\hat C_T(t)|\} < a$. (B.11)

Because recombination will rapidly reduce large initial values of $|C_S(t)C_T(t) - \hat C_S(t)\hat C_T(t)|$, whereas selection can only increase them by increments that are $O(a)$, it suffices to replace (B.11) by

$(1 - r_N)^{T_{k+1}-T_k}\,|C_S(0)C_T(0) - \hat C_S(0)\hat C_T(0)| < a$. (B.12)

Thus, for each value of $|N|$, we can find a $T$ so that (B.2) holds. Hence, as long as the number of loci and alleles is finite, this recursive procedure will produce a value of $T$ so that (B.2) and (B.3) hold for all $N$ and $t \ge T$. Because higher order disequilibria are reduced by recombination faster than are pairwise disequilibria, we conjecture that the time $T$ is generally determined by the maximum of $1/r_{ij}$ over all pairs of loci.
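Conditions of the form (B.9) and (B.12) translate directly into a waiting time. The sketch below assumes a single recombination rate `r`, an initial deviation `d0` from linkage equilibrium, and a selection scale `a`; the function name and the numbers are illustrative only. It counts the generations needed before `(1 - r) ** t * d0` falls below `a`.

```python
def generations_to_decay(r, d0, a):
    """Smallest t with (1 - r)**t * d0 < a, found by direct iteration.

    r  : recombination rate, 0 < r <= 1
    d0 : magnitude of the initial deviation from linkage equilibrium
    a  : threshold set by the largest selection coefficient
    """
    if not 0.0 < r <= 1.0:
        raise ValueError("r must lie in (0, 1]")
    if a <= 0.0 or d0 < 0.0:
        raise ValueError("a must be positive and d0 non-negative")
    t = 0
    x = d0
    while x >= a:
        x *= 1.0 - r      # one generation of recombination
        t += 1
    return t

t_free = generations_to_decay(0.5, 0.25, 0.01)    # free recombination
t_tight = generations_to_decay(0.1, 0.25, 0.01)   # tighter linkage takes longer
```

The count grows roughly like $\ln(d_0/a)/[-\ln(1 - r)]$, so for small `r` it scales with $1/r$, in line with the conjecture that the most tightly linked pair of loci sets the time scale.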