    Theoretical Population Biology 103 (2015) 2–26

    Social evolution and genetic interactions in the short and long term

    Jeremy Van Cleve
    Department of Biology, University of Kentucky, Lexington, KY 40506, USA

    Article info

    Article history:
    Received 14 October 2014
    Available online 21 May 2015

    Keywords:
    Inclusive fitness
    Fixation probability
    Stochastic stability
    Island model
    Trait substitution sequence
    Cooperation

    Abstract

    The evolution of social traits remains one of the most fascinating and feisty topics in evolutionary biology even after half a century of theoretical research. W.D. Hamilton shaped much of the field initially with his 1964 papers that laid out the foundation for understanding the effect of genetic relatedness on the evolution of social behavior. Early theoretical investigations revealed two critical assumptions required for Hamilton's rule to hold in dynamical models: weak selection and additive genetic interactions. However, only recently have analytical approaches from population genetics and evolutionary game theory developed sufficiently so that social evolution can be studied under the joint action of selection, mutation, and genetic drift. We review how these approaches suggest two timescales for evolution under weak mutation: (i) a short-term timescale where evolution occurs between a finite set of alleles, and (ii) a long-term timescale where a continuum of alleles is possible and populations evolve continuously from one monomorphic trait to another. We show how Hamilton's rule emerges from the short-term analysis under additivity and how non-additive genetic interactions can be accounted for more generally. This short-term approach reproduces, synthesizes, and generalizes many previous results including the "one-third law" from evolutionary game theory and "risk dominance" from economic game theory. Using the long-term approach, we illustrate how trait evolution can be described with a diffusion equation that is a stochastic analogue of the canonical equation of adaptive dynamics. Peaks in the stationary distribution of the diffusion capture classic notions of convergence stability from evolutionary game theory and generally depend on the additive genetic interactions inherent in Hamilton's rule. Surprisingly, the peaks of the long-term stationary distribution can predict the effects of simple kinds of non-additive interactions. Additionally, the peaks capture both weak and strong effects of social payoffs in a manner difficult to replicate with the short-term approach. Together, the results from the short and long-term approaches suggest both how Hamilton's insight may be robust in unexpected ways and how current analytical approaches can expand our understanding of social evolution far beyond Hamilton's original work.

    © 2015 Elsevier Inc. All rights reserved.

    1. Introduction

    The theory of evolution by natural selection as first fully elucidated by Darwin (1859) is so profoundly elegant and comprehensive that truly new additions to theory have been extremely rare. In 1963, W.D. Hamilton began publishing his seminal work on how natural selection can shape social behavior (Hamilton, 1963, 1964a,b), which is often referred to as the theory of either "kin selection" (Maynard Smith, 1964) or "inclusive fitness" (Frank, 2013). It is a tribute to the importance of this work that upon his untimely death in 2000 Hamilton was called "one of the most influential Darwinian thinkers of our time" (Eshel and Feldman, 2001) and "a candidate for the most distinguished Darwinian since Darwin" (Dawkins, 2000).

    E-mail address:[email protected].

    In this article, we will review how the tools of population genetics and evolutionary game theory can be used to formalize Hamilton's insight. We will begin with a summary of classic analyses of Hamilton's approach and will then introduce the population genetic and game theoretic tools that currently provide a complete framework for studying social evolution under weak selection and weak mutation (Lehmann and Rousset, 2014b). Using these tools, we will see how two general timescales for analysis emerge: a short-term timescale where evolution proceeds among a finite set of alleles, and a long-term timescale where populations evolve continuously among a continuum of alleles. These notions of short and long-term derive from a broader attempt to reconcile population genetic methods with evolutionary game theory (Eshel, 1996; Hammerstein, 1996; Weissing, 1996).

    Using the short-term approach, we show how genetic interactions between individuals (e.g., Queller, 1985) can affect selection for cooperation in deme or group-structured populations (Ladret and Lessard, 2007). These results extend previous analyses of stochastic evolution that have shown conditions such as "risk dominance" (Harsanyi and Selten, 1988; Blume, 1993; Kandori et al., 1993) and the "one-third law" (Nowak et al., 2004; Ohtsuki et al., 2007) to be important determinants of evolutionary stability. Using the substitution rate approach to long-term evolution (Lehmann, 2012; Van Cleve and Lehmann, 2013), we describe a diffusion equation that approximates the long-term change in monomorphic trait values. We show how peaks in the stationary distribution of this diffusion capture classic notions of evolutionary and convergence stability. Moreover, the location of these convergence stable states can be calculated using the classic direct-fitness approach of kin selection (Taylor and Frank, 1996; Rousset and Billiard, 2000; Rousset, 2004). Applying this long-term approach to a simple non-additive social interaction, we find, surprisingly, that the long-term analysis can capture these non-additive effects even though the diffusion integrates over only additive interactions. Moreover, the long-term approach appears to reproduce results from some strong selection models, which suggests an unexpected robustness of the long-term diffusion. Together, the results from the short and long-term approaches reveal the usefulness of these approaches for integrating Hamilton's original insight with recent results from population genetics and evolutionary game theory.

    http://dx.doi.org/10.1016/j.tpb.2015.05.002

    1.1. Hamilton's rule

    The core insight in Hamilton's work is often summarized with his eponymous rule (Hamilton, 1964a, 1970): an allele for a social behavior increases in frequency when the inclusive fitness effect is positive, namely

    \[ -c + b r > 0. \tag{1} \]

    In Hamilton's rule (1), $b$ is the increase in fitness (benefit) of a social partner from the behavior of a focal individual, $c$ is the decrease in fitness (cost) of a focal individual that performs the behavior, and $r$ measures genetic relatedness between focal and recipient individuals (Frank, 1998). More generally, $-c$ is called the direct fitness effect and $b$ the indirect fitness effect. Hamilton (1964a) initially emphasized that genetic relatedness is generated by a genealogical process that produces alleles identical by descent (IBD) among a group of socially interacting individuals. Another general definition of genetic relatedness says that it is the regression of the genotypes of social partners on the genotype of the focal individual (Hamilton, 1970; Grafen, 1985). Hamilton's rule crystallized the notion that natural selection depends both on the effect of an individual's genes on its own fitness and also on the indirect effect of those genes on the fitness of social partners. Although Darwin (1859), Fisher (1930), and Haldane (1955), among others, had expressed this idea in relation to how evolution would lead one individual to sacrifice its fitness for another, Hamilton was the first to present a compelling framework applicable to social evolution more generally.

    Within Hamilton's inclusive fitness framework, behaviors that decrease the fitness of a focal individual ($c > 0$) but increase the fitness of social partners ($b > 0$) are "altruistic". Well-known examples of altruism include worker sterility in eusocial insects (Andersson, 1984), stalk cells that give up reproduction to disperse spore cells in Dictyostelium discoideum (Strassmann et al., 2000), and costly human warfare (Hamilton, 1975; Lehmann and Feldman, 2008). Other behaviors can also be classified in Hamilton's framework (Hamilton, 1964a, and Table 2): (i) behaviors are "mutualistic" when they increase the fitness of the focal individual and its social partners, (ii) "selfish" when they increase the fitness of the focal individual at the expense of the fitness of social partners, and (iii) "spiteful" when they decrease the fitness of both the focal individual and its social partners. Although there are other potential definitions of altruism and other behaviors (see Kerr et al., 2004; Bshary and Bergmüller, 2008), Hamilton's classification based on direct and indirect effects has proven useful for distinguishing different kinds of helping behaviors (mutualisms and altruisms) and for showing how different biological mechanisms can promote or inhibit the evolution of these behaviors (Lehmann and Keller, 2006a; West et al., 2007).
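The classification above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical and not from the article, and the sketch simply reads off the signs of the direct effect $-c$ and the indirect effect $b$, then evaluates the left-hand side of Hamilton's rule (1).

```python
def inclusive_fitness_effect(b, c, r):
    """Left-hand side of Hamilton's rule (1): -c + b * r."""
    return -c + b * r


def classify_behavior(b, c):
    """Classify a behavior by the sign of its direct effect (-c)
    and its indirect effect (b), following Hamilton's scheme."""
    direct = -c
    if direct > 0 and b > 0:
        return "mutualistic"
    if direct > 0 and b < 0:
        return "selfish"
    if direct < 0 and b > 0:
        return "altruistic"
    if direct < 0 and b < 0:
        return "spiteful"
    return "no effect on at least one party"


if __name__ == "__main__":
    # A costly helping act (c = 1) giving a benefit b = 2 to a partner.
    print(classify_behavior(b=2.0, c=1.0))
    # Hamilton's rule is satisfied only if relatedness exceeds c / b = 0.5.
    print(inclusive_fitness_effect(b=2.0, c=1.0, r=0.75) > 0)
```

With $b = 2$ and $c = 1$, the behavior is altruistic and is favored under rule (1) only when $r > c/b = 1/2$.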

    Though Hamilton's approach was initially accepted among empiricists (Wilson, 1975) and some theorists (Maynard Smith, 1964; Oster et al., 1977), other theorists were concerned about the generality of the approach due to its emphasis on fitness maximization and optimality modeling (Cavalli-Sforza and Feldman, 1978; Williams, 1981; Karlin and Matessi, 1983). Fitness maximization was viewed as untenable because examples where it is violated are well known (Moran, 1964). Optimality models were additionally viewed with skepticism because, by neglecting gene frequency dynamics, they cannot study genetic polymorphisms; in effect, such models must assume that mutant alleles that invade a population also reach fixation. An initial wave of population genetic studies in response to these concerns showed that Hamilton's rule was generally a correct mutant invasion condition so long as selection is weak and fitness interactions between individuals are additive (Cavalli-Sforza and Feldman, 1978; Wade, 1979; Abugov and Michod, 1981; Uyenoyama and Feldman, 1981; Uyenoyama et al., 1981). However, these models were family-structured, with cooperation occurring between close relatives, and could not address the applicability of Hamilton's rule in populations with more generic structure, such as deme structure in island (Wright, 1931) and lattice models (Kimura and Weiss, 1964; Malécot, 1948, 1967).

    1.2. The Price equation and the individually-based approach

    Part of the difficulty with the population genetic methods used to analyze family-structured models is that they use genotypes as state variables. This quickly increases the dimensionality of the model as the number of loci, family size, or demes increases and makes approximation difficult. An important alternative approach was introduced to population genetics by George Price (Price, 1970, 1972). The core of that approach, the Price equation, uses the distribution of allele frequencies in each individual in the population as the set of state variables and tracks the first population-level moment of this distribution, which is the mean allele frequency. If $\mathbf{p} = (p_1, \ldots, p_{N_T})$ represents the allele frequency distribution for $N_T$ haploid individuals ($p_i = 0$ or $1$ for individual $i$), the Price equation yields

    \[ E[\bar{w}\,\Delta p \mid \mathbf{p}] = \mathrm{Cov}[w_i, p_i] + E[w_i \Delta p_i] \tag{2} \]

    where $E[\bar{w}\,\Delta p \mid \mathbf{p}]$ is the expected change in mean allele frequency $p$ weighted by mean fitness $\bar{w}$ and conditional on $\mathbf{p}$ in the parental generation. The first term on the right hand side, the covariance between individual fitness $w_i$ and allele frequency $p_i$, measures the effect of selection on the change in mean allele frequency in the population. The second term, $E[w_i \Delta p_i]$, measures the effect of non-selective transmission forces, such as mutation and migration (and recombination for changes in genotype frequencies), on the change in mean allele frequency. When selection is the only force on allele frequencies and the population size remains fixed ($\bar{w} = 1$), the Price equation simplifies to

    \[ E[\Delta p \mid \mathbf{p}] = \mathrm{Cov}[w_i, p_i]. \tag{3} \]
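As a small numerical check of the simplified Price equation (3), the sketch below compares the population covariance between fitness and allele frequency with the expected change in mean allele frequency computed directly from fitness-weighted parental contributions. The assumptions are mine, not the article's: transmission is faithful (no mutation or migration), the covariance is the population (not sample) covariance, and the fitness values are made up and chosen to average to one.

```python
import numpy as np


def price_selection_term(w, p):
    """Population covariance Cov[w_i, p_i] across individuals."""
    w = np.asarray(w, dtype=float)
    p = np.asarray(p, dtype=float)
    return np.mean(w * p) - np.mean(w) * np.mean(p)


def expected_delta_p(w, p):
    """Expected change in mean allele frequency when offspring inherit
    the parental allele faithfully and parents contribute in proportion
    to their fitness (an illustrative assumption)."""
    w = np.asarray(w, dtype=float)
    p = np.asarray(p, dtype=float)
    return np.sum(w * p) / np.sum(w) - np.mean(p)


if __name__ == "__main__":
    p = [1, 1, 0, 0]              # haploid allele frequencies, 0 or 1
    w = [1.2, 1.1, 0.9, 0.8]      # hypothetical fitnesses with mean 1
    print(price_selection_term(w, p), expected_delta_p(w, p))
```

Because mean fitness equals one in this example, the two quantities coincide, which is exactly the content of Eq. (3).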

    Calculating higher-order moments of the allele frequency distribution $\mathbf{p}$ is necessary to measure the exact dynamics of the distribution over time; thus, moment-based approaches like the Price equation are not necessarily more tractable than directly tracking genotype frequencies. However, an important observation about moment-based approaches is that they are readily amenable to approximation. When selection is weak relative to other forces such as recombination and migration, a kind of "separation of timescales" occurs where mean allele frequency dynamics converge very slowly and associations between alleles, such as linkage disequilibrium between loci (Barton and Turelli, 1991; Nagylaki, 1993; Kirkpatrick et al., 2002) and $F_{ST}$ between individuals in a deme (Roze and Rousset, 2003; Wakeley, 2003; Rousset, 2006; Roze and Rousset, 2008), converge much more quickly. Because of this separation of timescales, linkage disequilibrium, $F_{ST}$, and other associations will converge to "quasi-equilibrium" (QE) values that are a function of mean allele frequencies. This means that the mean allele frequency dynamics can be expressed as a closed system of equations, which considerably simplifies analysis of multilocus systems or structured populations.

    With respect to social evolution, the QE results for structured populations are particularly useful as they have helped to establish a rigorous basis for kin selection and Hamilton's rule in populations with finite size, localized dispersal, or both (Rousset and Billiard, 2000; Rousset, 2004). Initiated by the seminal work of François Rousset (Rousset and Billiard, 2000; Rousset, 2003), the mean allele frequency dynamics in this approach are calculated under weak selection and can be expressed as functions of $F_{ST}$ and other between-individual genetic associations evaluated under neutrality, when selection is absent. In the simplest cases, this approach shows that Hamilton's rule holds for weak selection and additive genetic interactions in populations with island-type structure (Rousset and Billiard, 2000) and family structure (Roze and Rousset, 2004). More generally, this approach produces analogues of Hamilton's rule where the direction of selection is given by a sum of relatedness coefficients and fitness effects indexed by the spatial distance between a focal individual and its social partners (Rousset and Billiard, 2000; Rousset, 2004; Lehmann and Rousset, 2014b) or by the demographic class (e.g., juvenile vs. adult or worker vs. queen in social insects) of the focal individual and its partners (Rousset and Ronce, 2004; Roze and Rousset, 2004). It is this weak selection and QE approach that we will use to study genetic interactions and their effect on cooperation in deme-structured populations below.

    1.3. Genetic drift, adaptive dynamics, and evolution in the short and long term

    Another difficulty with the early analyses of kin selection and Hamilton's rule in family-structured populations was that those population genetic models could easily produce stable polymorphic equilibria (Uyenoyama and Feldman, 1981; Uyenoyama et al., 1981; Toro et al., 1982), which made general predictions concerning the level of altruism or other social behaviors difficult. Generally, such equilibria are of intrinsic biological and mathematical interest since they illuminate stabilizing selection that can maintain genetic variation in levels of cooperation. In finite populations, however, even alleles under stabilizing selection eventually either go extinct or reach fixation due to genetic drift. If genetic drift is sufficiently strong relative to the rate of mutation $\mu$, then the population will spend most of its time fixed for one of a set of possible alleles generated by mutation. This occurs for large $N_T$ and small $\mu$ when

    \[ N_T \mu \log N_T \ll 1, \tag{4} \]

    which can be arrived at heuristically by using the expected number of alleles in the population in an infinite alleles model under neutrality (Karlin and McGregor, 1967; Wright, 1969; Watterson, 1974, 1975). Moreover, this condition ensures that any mutation that can arise will either fix or go extinct before another mutation arrives (Champagnat, 2006; Champagnat et al., 2006)¹; in other words, there are at most two alleles in the population at one time: a "resident" and a "mutant" that either goes extinct or fixes and becomes the new resident. This process of sequential substitution of alleles is called the "trait substitution sequence" (TSS; Dieckmann and Law, 1996; Champagnat et al., 2001) and is the fundamental biological model of adaptive dynamics (Metz et al., 1996; Geritz et al., 1998; Dercole and Rinaldi, 2008) when population size goes to infinity, $N_T \to \infty$. In addition, the TSS is often the implicit dynamical process behind many phenotypic models of kin selection and evolutionary game theory (Taylor and Frank, 1996; Weissing, 1996; Eshel et al., 1997; Day and Taylor, 1998; Rousset and Billiard, 2000; Wild and Taylor, 2004; Lehmann and Rousset, 2014b) that use an optimality criterion (i.e., fitness maximization) in search of an evolutionarily stable strategy (ESS).
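To get a feel for condition (4), the following sketch evaluates the quantity $N_T \mu \log N_T$ for given parameter values. The function name and the example values are hypothetical; the article gives no numerical threshold, so the sketch only reports the quantity and leaves the judgement of "much less than one" to the reader.

```python
import math


def tss_quantity(n_total, mu):
    """Left-hand side of condition (4): N_T * mu * log(N_T).

    Values much smaller than 1 suggest the weak-mutation (TSS) regime,
    in which at most two alleles segregate at any one time.
    """
    if n_total < 2 or mu < 0:
        raise ValueError("need n_total >= 2 and mu >= 0")
    return n_total * mu * math.log(n_total)


if __name__ == "__main__":
    for mu in (1e-6, 1e-4, 1e-3):
        print(mu, tss_quantity(1000, mu))
```

For a population of 1000 individuals, a per-offspring mutation probability of $10^{-6}$ gives a value well below one, whereas $10^{-3}$ gives a value above one and falls outside the regime.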

    On timescales short compared to those required for generating phenotypic novelty, alleles generated by mutation in the TSS constitute a finite set, and the population jumps between alleles in this set as each allele invades and fixes. Assuming that it is possible to mutate from one allele to any other in the set through a sequence of zero or more intermediates (i.e., the mutation process is irreducible), the short-term process equilibrates to a stationary distribution among the fixation or monomorphic states. This short-term TSS corresponds to "short-term evolution" as defined by Eshel (1991, 1996), where a fixed set of genotypes are allowed to change frequency but new mutations outside this set do not occur. The length of time the population spends fixed for each allele is primarily determined by the likelihood that each allele arises via mutation ($\mu$) and fixes (fixation probability, $\pi$) in populations monomorphic for the other possible alleles (Fudenberg and Imhof, 2006). Assuming weak selection, the QE approach discussed above can be used to calculate fixation probabilities even in spatially or demographically structured populations (Rousset, 2004). Together with the TSS condition (4), this allows a complete description of the stationary distribution of allelic states under the forces of selection, mutation, and genetic drift in the short term.

    For longer timescales, novel phenotypes are possible due to the invasion of mutations outside of a given finite set. For example, gene (Lynch and Conery, 2000) and genome duplication (Thornton, 2001), transposons (Oliver and Greene, 2009), and lateral gene transfer (Husnik et al., 2013) can generate novel physiological and ecological functions not possible with small changes in single genes. These processes suggest that the set of possible phenotypes may have a continuum of values over the long term. Suppose that for phenotype $z$ the probability density of generating a mutant allele of phenotype $z + \delta$ is $u(\delta, z)$. If the support of $u(\delta, z)$ covers a fitness peak (i.e., it is possible to generate a mutant that resides exactly at the peak), then it is possible that the population will not only approach the peak, but will spend most of its time fixed for a phenotype within a small neighborhood of the peak. This long-term TSS corresponds to the definition of "long-term evolution" by Eshel (1991, 1996), where invasion of new genotypes allows the population to approach phenotypic equilibria defined by the classic evolutionarily stable strategy (ESS) condition (Eshel and Feldman, 1984; Liberman, 1988; Hammerstein and Selten, 1994). Without any assumptions on the distribution of mutational effects $u(\delta, z)$, the long-term TSS is described mathematically as a Markov jump process and is given by an integro-differential (master) equation (Champagnat et al., 2001; Champagnat, 2006; Champagnat et al., 2006; Lehmann, 2012). Often for the purpose of tractability, only small mutants are allowed in small intervals of time, which means that $u(\delta, z)$ is narrowly peaked around $z$ and the population cannot make large jumps quickly. This assumption turns the jump process into a diffusion process (Champagnat and Lambert, 2007) that is the stochastic analogue of the deterministic "canonical equation" of adaptive dynamics (Dieckmann and Law, 1996; Champagnat et al., 2001). The long-term TSS diffusion also has a stationary probability density, $\lambda(z)$, and the phenotypes located at peaks in that density correspond to equilibria obtained from classic ESS or adaptive dynamics analyses (Lehmann, 2012; Van Cleve and Lehmann, 2013). Exactly as in the short-term TSS, weak selection can be used to calculate the fixation probabilities that determine $\lambda(z)$. This leads to a long-term stationary distribution of phenotypes that captures selection, mutation, and drift in spatially or demographically structured populations.

    ¹ This condition is very similar to the one obtained in the strong-selection weak-mutation limit of Gillespie (1991, p. 221) and the successional-mutations regime of Desai and Fisher (2007, eq. 1): $N_T \mu \log N_T \omega \ll 1$, where $\omega$ is the strength of selection.

    1.4. Putting it all together: Hamilton and social evolution in the short and long term

    The two main assumptions above, weak selection and the TSS condition (4), allow us to describe the stationary density of transitions between a discrete set of phenotypes in the short term or among a continuum of phenotypes in the long term. When there are only two types in the short term, cooperative and noncooperative, and the population is spatially structured or family structured, Hamilton's rule in Eq. (1) is readily recovered when genetic interactions are additive (Rousset and Billiard, 2000; Roze and Rousset, 2004). As we will see below, this is a result of comparing the stationary density of cooperative versus noncooperative types. Moreover, much of the recent work in evolutionary game theory that focuses on finite populations uses this same short-term TSS model to calculate a stationary distribution of types (e.g., Nowak et al., 2004; Imhof et al., 2005; Fudenberg et al., 2006; Hauert et al., 2007; Ohtsuki et al., 2007; Sigmund et al., 2010). When there is a continuum of levels of cooperation in the long term, $-c + br$ in Hamilton's rule becomes the gradient of the potential function used to solve for the stationary density of the TSS diffusion (Lehmann, 2012). Since $-c + br$ can be thought to measure the change in inclusive fitness for additive genetic interactions (Taylor and Frank, 1996; Rousset and Billiard, 2000; Lehmann and Rousset, 2014b), phenotypes at peaks in inclusive fitness ($-c + br = 0$) correspond to peaks in the stationary density. Thus, the long-term action of natural selection, assuming additivity, leads to a kind of maximization of inclusive fitness, which supports the use of classic inclusive fitness analyses (however, see Lehmann and Rousset, 2014a,b, for difficulties in interpreting this result as broadly justifying inclusive fitness maximization).

    1.5. Genetic interactions and non-additivity

    If one is willing to assume weak selection, weak mutation relative to genetic drift (i.e., the TSS), and additive genetic interactions, then a direct application of Hamilton's rule can be justified using the theoretical work discussed above. However, it is possible to predict short and long-term distributions of types under weak selection and the TSS even when genetic interactions are non-additive. Non-additivity at the genetic level allows for interaction among alleles, within or between individuals. Within individuals, such interactions produce dominance and epistasis, and between individuals they produce scenarios analogous to classic two-player games with pure strategies, such as the Hawk–Dove or Stag-Hunt games. Non-additive interactions are important because they produce frequency dependence in the sign of the change in allele frequency (Eq. (3)) even for weak selection (Lehmann and Keller, 2006b; Ohtsuki, 2012; Van Cleve and Akçay, 2014). In the case of social behavior, this implies that Hamilton's rule becomes frequency dependent and no longer provides an unambiguous prediction of the effect of selection in either the short or the long term. Rather, applying the tools of QE and the TSS for non-additive interactions requires additional terms to account for higher-order genetic associations.

    Once we calculate these additional terms, we determine the effect of non-additive interactions on the short-term stationary distribution of types in a given demographic context. Here, we apply the theory to Wright's island model of population structure where there are $n$ demes or groups each containing $N$ haploid individuals (Wright, 1931) ($N_T = nN$). All groups are connected equally by migration at rate $m$. One of the important features of the Price equation approach is that it allows us to express the genetic associations (e.g., $F_{ST}$) in terms of mean coalescence times. Using results from coalescent theory to calculate the genetic associations, we replicate and generalize well known short-term results from inclusive fitness (the "Taylor cancellation result" (Taylor, 1992a,b; Ohtsuki, 2012)) and evolutionary game theory (the "one-third law" (Nowak et al., 2004) and risk dominance (Harsanyi and Selten, 1988; Blume, 1993; Kandori et al., 1993)). Moreover, we show how changing the competitive environment (i.e., the degree of competition for local resources) changes these well known results, particularly in the presence of non-additive interactions.

    The long-term approach, in contrast, only accounts for additive genetic interactions. Nevertheless, at least in social interactions with simple non-additive payoffs, the long-term approach remarkably reproduces results from the short-term approach that explicitly includes non-additive genetic interactions. We discuss a potential explanation for this power of the long-term approach, which suggests that three-way genetic interactions may be uniquely analytically tractable among possible non-additive interactions. Finally, we show how the long-term approach may capture some strong effects of social payoffs that the short-term approach neglects.

    2. Theory: short-term evolution

    2.1. Weak mutation, the TSS, and evolutionary success

    Consider evolution in a population with total size $N_T$, where the population can be group structured ($n$ groups of size $N$) or otherwise spatially structured with some pattern of migration between spatial locations. For the sake of simplicity, we assume that the population structure is homogeneous so that all individuals have equivalent effects on allele frequency change in the absence of natural selection (Taylor et al., 2011). Examples of homogeneous population structures include the island model and the stepping-stone and lattice models with a uniform migration rate. Inhomogeneous population structures include groups connected by dispersal via heterogeneous graphs, such as "hub and spoke" patterns, and require weighting different classes of individuals by their reproductive value (Rousset and Ronce, 2004; Taylor, 2009; Tarnita and Taylor, 2014).

    Table 1
    Description of symbols.

    Symbol                                                   Description
    $N_T$                                                    Total population size
    $N$                                                      Group (or deme) size
    $n$                                                      Number of groups (or demes)
    $m$                                                      Migration rate
    $M$                                                      Population migration rate, $M = nNm/(n-1)$
    $p_i$ ($p_{gi}$)                                         Frequency of allele $A$ in haploid individual $i$ (living in group $g$)
    $p_g$                                                    Mean frequency of allele $A$ in group $g$
    $p$ ($q$)                                                Mean frequency of allele $A$ ($a$) in the population
    $\mathbf{p} = (p_1, \ldots, p_{N_T})$                    Vector of allele frequencies for the population
    $\mu$                                                    Probability an offspring carries a mutant allele
    $\mu_{A|a}$ ($\mu_{a|A}$)                                Probability an offspring carries allele $A$ ($a$) given a parent carrying allele $a$ ($A$)
    $z_i$ ($z_{gi}$)                                         Phenotype of individual $i$ (living in group $g$)
    $\mathbf{z} = (z_1, \ldots, z_{N_T})$                    Vector of phenotypes for the population
    $\delta$                                                 Phenotypic deviation of mutant phenotype from resident phenotype $z$
    $u(\delta, z)$                                           Distribution of mutant deviations (mutational effects) given a mutation from resident phenotype $z$
    $\sigma^2(z)$                                            Second moment (variance) of the distribution of mutant deviations
    $k(\delta, z)$                                           Instantaneous rate of substitution of a population monomorphic for trait $z$ with one monomorphic for trait $z + \delta$
    $\lambda(z, t)$ ($\lambda(z)$)                           Probability density that the population is monomorphic for trait $z$ at time $t$ ($t \to \infty$)
    $w_i$ ($w_{gi}$)                                         Fitness of individual $i$ (living in group $g$)
    $\bar{w}$                                                Mean fitness in the population
    $b$                                                      Fitness benefit to the focal individual of the expression of allele $A$ in a social partner
    $c$                                                      Fitness cost of the expression of allele $A$ in the focal individual
    $f_{ig}$, $f_g$, $f$                                     Fertility of individual $i$ in group $g$, mean fertility in group $g$, mean fertility in the population
    $\pi_{A|a}$ ($\pi_{a|A}$)                                Probability of a single mutant $A$ ($a$) allele fixing in a population of $N_T - 1$ $a$ ($A$) alleles
    $\pi^{\circ}$ ($\pi^{\circ}(z)$)                         Probability of a neutral allele (in a population resident for phenotype $z$) fixing in the population
    $S(z)$                                                   Derivative of fixation probability with respect to $\delta$ and evaluated at $\delta = 0$; also called the selection gradient
    $\omega$                                                 Selection strength
    $s_{i,k_1 \cdots k_d}$                                   Selection coefficient for individual $i$ of the frequency of $A$ in individuals $k_1, \ldots, k_d$
    $P_i(\mathbf{p})$                                        Sum of selection coefficients for individual $i$ times their respective allele frequency products
    $\mathbf{P}(\mathbf{p}) = (P_1(\mathbf{p}), \ldots, P_{N_T}(\mathbf{p}))$   Vector of sums of selection coefficients and allele frequency products
    $E[T_{i k_1 \cdots k_d}]$                                Expected coalescence time of alleles in individuals $i$ and $k_1$ through $k_d$ under neutrality
    $Q_{ij}$                                                 Probability of identity by descent between alleles in individuals $i$ and $j$
    $r$                                                      Genetic relatedness
    $\kappa$                                                 Scaled relatedness coefficient
    $k$                                                      Selection gradient proportionality constant

    Recall from Section 1.3 that the short-term TSS requires considering only two alleles, which we label $A$ and $a$, where $p_i$ measures the frequency of $A$ in individual $i$ (see Table 1 for a description of symbols used throughout this paper). Suppose that the mutation rate from $A$ to $a$ is $\mu_{a|A}$ and $\mu_{A|a}$ is the rate from $a$ to $A$. We assume $\mu = \max(\mu_{A|a}, \mu_{a|A})$ measures the overall strength of mutation. The weak mutation condition that defines the TSS, condition (4), is derived under the limit as $N_T \to \infty$ and $\mu \to 0$. In this limit, the TSS consists of the population jumping between states fixed for allele $A$ and fixed for allele $a$. To represent the jump process between these two fixation states, we create a Markov

    chain for these two states with the following transition matrix

    \[ \Lambda(\mu) = \begin{pmatrix} 1 - \dfrac{\mu_{a|A}}{\mu}\,\pi_{a|A} & \dfrac{\mu_{a|A}}{\mu}\,\pi_{a|A} \\[2ex] \dfrac{\mu_{A|a}}{\mu}\,\pi_{A|a} & 1 - \dfrac{\mu_{A|a}}{\mu}\,\pi_{A|a} \end{pmatrix} \tag{5} \]

    where $\pi_{A|a}$ and $\pi_{a|A}$ are the probabilities that alleles $A$ and $a$, respectively, reach fixation starting from an initial frequency of $1/N_T$ in a population where the other allele has frequency $1 - 1/N_T$. Rescaling the mutation rates by the overall rate $\mu$ allows a nontrivial stationary distribution of the Markov chain (i.e., the left eigenvector of $\Lambda(\mu)$) as $\mu \to 0$. Inspired by ideas in large deviations theory (Freidlin and Wentzell, 1984), Fudenberg and Imhof (2006) show that the stationary distribution of the TSS as $\mu \to 0$ is simply the stationary distribution of the Markov chain in (5) in the limit as $\mu \to 0$. Thus, instead of having to calculate

    the stationary distribution of the complex stochastic process withmany different possible population states, we need only calculatethe stationary distribution of the much simpler embedded chaincomposed of fixation states. Calculating the stationary distributionusing the embedded chain yields

    =

    A|aA|a

    A|aA|a+ a|Aa|A,

    a|Aa|A

    A|aA|a+ a|Aa|A

    . (6)

    If we are interested only in the effect of selection on the stationary distribution, we can assume that the mutation rates are symmetric, $\mu_{A|a} = \mu_{a|A} = \mu$. In this case, the expected frequency of allele $A$ in the population at stationarity, which we write as $E[p]$, becomes in the limit as the mutation rate goes to zero

    $$\lim_{\mu \to 0} E[p] = \frac{\pi_{A|a}}{\pi_{A|a} + \pi_{a|A}}. \tag{7}$$

    An intuitive condition for the evolutionary success of allele $A$ relative to allele $a$ is that $A$ is more common at stationarity, or

    $$E[p] > \frac{1}{2}. \tag{8}$$

    Using Eq. (7), condition (8) is equivalent to

    $$\pi_{A|a} > \pi_{a|A} \tag{9}$$

    when $\mu \to 0$, which means we need only compare complementary fixation probabilities in order to determine which allele is favored by natural selection (Fudenberg et al., 2006; Allen and Tarnita, 2014). This condition on fixation probabilities is the evolutionary success condition that we will use to derive Hamilton's rule (Eq. (1)).
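    The embedded two-state chain and its stationary distribution can be checked numerically. The following is a minimal sketch, not the author's code: the function names, the state ordering (all-$A$ first, all-$a$ second), and the numerical values are assumptions made for illustration. It builds the transition matrix over the two fixation states, extracts the left eigenvector for eigenvalue one, and compares it with the closed form in Eq. (6).

```python
import numpy as np

def embedded_chain(pi_Aa, pi_aA, mu_Aa, mu_aA, mu):
    """Transition matrix over the two fixation states, ordered (all-A, all-a).

    Mutation rates are rescaled by the overall rate mu, as in the text.
    """
    leave_A = (mu_aA / mu) * pi_aA  # an a mutant appears in an all-A population and fixes
    leave_a = (mu_Aa / mu) * pi_Aa  # an A mutant appears in an all-a population and fixes
    return np.array([[1.0 - leave_A, leave_A],
                     [leave_a, 1.0 - leave_a]])

def stationary_distribution(P):
    """Left eigenvector of P for eigenvalue 1, normalised to sum to one."""
    eigenvalues, eigenvectors = np.linalg.eig(P.T)
    v = np.real(eigenvectors[:, np.argmax(np.real(eigenvalues))])
    return v / v.sum()

def closed_form(pi_Aa, pi_aA, mu_Aa, mu_aA):
    """Eq. (6): weights proportional to mutation rate times fixation probability."""
    w_A, w_a = mu_Aa * pi_Aa, mu_aA * pi_aA
    return np.array([w_A, w_a]) / (w_A + w_a)

# Hypothetical fixation probabilities and a symmetric mutation rate.
pi_Aa, pi_aA, mu = 0.02, 0.005, 1e-4
lam = stationary_distribution(embedded_chain(pi_Aa, pi_aA, mu, mu, mu))
print(lam)  # first entry is the long-run weight on the all-A state
```

    With symmetric mutation the first entry reduces to $\pi_{A|a}/(\pi_{A|a} + \pi_{a|A})$, so allele $A$ is more common at stationarity exactly when $\pi_{A|a} > \pi_{a|A}$.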

    2.2. Fixation probability and the Price equation

    Calculating the fixation probabilities in a model with arbitrarily complex demography or spatial structure can be daunting if not impossible. Thus, our next aim is to show how to connect fixation probabilities to the Price equation, which will make it straightforward to use weak selection and QE results. Suppose that $p(t)$ is the mean frequency of allele $A$ at time $t$. Following recent methods (Rousset, 2003; Lessard and Ladret, 2007; Lehmann and Rousset, 2009), we can write the fixation probability as

    $$\begin{aligned} \pi_{A|a} &= E[p(\infty) \mid p(0)] \\ &= p(0) + E\left[\sum_{t=0}^{\infty} \Delta p(t) \,\middle|\, p(0)\right] \\ &= p(0) + \sum_{t=0}^{\infty} E[\Delta p(t) \mid p(0)] \end{aligned} \tag{10}$$


    where $\Delta p(t) = p(t+1) - p(t)$ and we can exchange the expectation and the infinite sum in the last line because the Markov chain converges in mean (Lessard and Ladret, 2007). Expanding the sum in (10) by conditioning on all possible population states $\mathbf{p}(t)$ yields

    $$\pi_{A|a} = \pi^\circ + \sum_{t=0}^{\infty} \sum_{\mathbf{p}(t)} \Pr[\mathbf{p}(t) \mid p(0)]\, E[\Delta p(t) \mid \mathbf{p}(t)] \tag{11}$$

    where we have used the fact that the fixation probability of a neutral allele, $\pi^\circ$, is its initial frequency $p(0)$. The second term in the sum, $E[\Delta p(t) \mid \mathbf{p}(t)]$, is exactly the left-hand side of the Price equation (3).
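    The decomposition in Eq. (11) can be illustrated on a small example. The sketch below assumes a haploid Wright-Fisher population with constant selection (a hypothetical stand-in, much simpler than the structured models in the text) and computes the fixation probability of a single mutant in two ways: by solving the absorbing Markov chain directly, and by adding to $p(0)$ the expected allele frequency changes conditioned on the population state in each generation.

```python
import numpy as np
from math import comb

def wright_fisher_matrix(N, s):
    """Transition matrix for the number of A copies (0..N) under constant selection s."""
    P = np.zeros((N + 1, N + 1))
    for k in range(N + 1):
        q = k * (1.0 + s) / (k * (1.0 + s) + (N - k))  # sampling probability of A
        for j in range(N + 1):
            P[k, j] = comb(N, j) * q**j * (1.0 - q) ** (N - j)
    return P

def fixation_by_absorption(P):
    """Fixation probability of one mutant from the absorbing-chain linear system."""
    N = P.shape[0] - 1
    transient = P[1:N, 1:N]
    to_fixation = P[1:N, N]
    x = np.linalg.solve(np.eye(N - 1) - transient, to_fixation)
    return x[0]  # start from a single copy of A

def fixation_by_expected_change(P, generations=5000):
    """p(0) plus the sum over time and states of Pr[state] * E[change in p | state]."""
    N = P.shape[0] - 1
    freqs = np.arange(N + 1) / N
    expected_change = P @ freqs - freqs  # E[delta p | k copies of A]
    dist = np.zeros(N + 1)
    dist[1] = 1.0                        # one initial mutant
    total = freqs[1]                     # p(0), the neutral fixation probability
    for _ in range(generations):
        total += dist @ expected_change
        dist = dist @ P
    return total

P = wright_fisher_matrix(N=20, s=0.05)
print(fixation_by_absorption(P), fixation_by_expected_change(P))
```

    Under neutrality ($s = 0$) every conditional expected change vanishes and the sum collapses to $p(0) = 1/N$, which is the role played by $\pi^\circ$ in Eq. (11).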

    2.3. Fixation probability under weak selection

    The most straightforward way to calculate the probability of fixation $\pi_{A|a}$ assuming weak selection is to Taylor expand $\pi_{A|a}$ in terms of a parameter that measures the strength of selection (see Rousset, 2003; Lessard and Ladret, 2007; Lehmann and Rousset, 2009), which we call $\omega$. This expansion is simply

    $$\pi_{A|a} = \pi^\circ + \omega \frac{d\pi^\circ_{A|a}}{d\omega} + O(\omega^2) \tag{12}$$

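    where the derivative $d\pi^\circ_{A|a}/d\omega$ is evaluated under neutrality ($\omega = 0$). Using Eq. (11), we can calculate the derivative of the fixation probability under neutrality as

    $$\frac{d\pi^\circ_{A|a}}{d\omega} = \sum_{t=0}^{\infty} \sum_{\mathbf{p}(t)} \frac{d}{d\omega}\Big( \Pr[\mathbf{p}(t) \mid p(0)]\, E[\Delta p(t) \mid \mathbf{p}(t)] \Big)^{\circ} \tag{13}$$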

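    where the exchange of derivative and the limit is justified provided the derivatives converge uniformly (see the Appendix of Lessard and Ladret, 2007, for such a proof). Expanding the derivative in the sum in (13) using the chain rule yields

    $$\frac{d}{d\omega}\Big( \Pr[\mathbf{p}(t) \mid p(0)]\, E[\Delta p(t) \mid \mathbf{p}(t)] \Big)^{\circ} = \frac{d}{d\omega}\Pr{}^{\circ}[\mathbf{p}(t) \mid p(0)]\; E^{\circ}[\Delta p(t) \mid \mathbf{p}(t)] + \Pr{}^{\circ}[\mathbf{p}(t) \mid p(0)]\, \frac{d}{d\omega} E^{\circ}[\Delta p(t) \mid \mathbf{p}(t)] \tag{14}$$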

    with the symbol $\circ$ indicating evaluation of an expectation or probability in the neutral case when $\omega = 0$. The first term on the right-hand side of (14) is zero since the expected change in allele frequency under neutrality is zero for homogeneous population structures. Simplifying Eq. (13) with this fact yields

    $$\frac{d\pi^\circ_{A|a}}{d\omega} = \sum_{t=0}^{\infty} \sum_{\mathbf{p}(t)} \Pr{}^{\circ}[\mathbf{p}(t) \mid p(0)]\, \frac{d}{d\omega} E^{\circ}[\Delta p(t) \mid \mathbf{p}(t)] \tag{15}$$

    which we can write as

    $$\frac{d\pi^\circ_{A|a}}{d\omega} = \sum_{t=0}^{\infty} E^{\circ}\left[ \frac{d}{d\omega} E^{\circ}[\Delta p(t) \mid \mathbf{p}(t)] \right] \tag{16}$$

    where $E^{\circ}$ implies expectation over the neutral realizations of $\mathbf{p}$ given an initial frequency of $A$ of $p(0)$.

    In order to evaluate the derivative of the fixation probability in Eq. (16), we need the derivative of the expected change in mean allele frequency. This is a quantity that is relatively simple to calculate since all one needs is the first-order term in an expansion of $E[\Delta p(t) \mid \mathbf{p}(t)]$ in terms of selection strength $\omega$. To obtain this expansion, we first expand the fitness of the focal individual $i$ in terms of selection strength. Without loss of generality, the fitness of individual $i$ is

    wi = 1 + NT

    d=1

    NTk1


    In order to better interpret the expression for fixation probability in (21), which contains a difficult infinite sum over time, we follow the argument given in Rousset (2003) and expanded in Lessard and Ladret (2007) and Lehmann and Rousset (2009) that interprets the expected allele frequency products in terms of coalescence probabilities. Recall from the TSS that in a population composed of allele $a$, a single $A$ mutation will arise and either fix or go extinct. In this case, the expected allele frequency product, $E^{\circ}[p_i p_{k_1} \cdots p_{k_d}]$, is the probability that individuals $i$ and $k_1$ through $k_d$ all have allele $A$ at some future time $t$. Going backwards in time, this probability is equivalent to the probability that those lineages coalesce before time $t$, $\Pr[T_{ik_1 \cdots k_d} \le t]$, times the probability that the ancestral lineage is allele $A$, which is the initial frequency $p(0) = \pi^\circ = 1/N_T$. Writing $\Pr[T_{ik_1 \cdots k_d} \le t]$ as $1 - \Pr[T_{ik_1 \cdots k_d} > t]$ and using the fact that the selection coefficients sum to zero (Eq. (18)), Eq. (21) becomes

    A|a =1

    NT

    NT

    NTd=1

    NTi=1

    NTk1


    from evolutionary game theory is an evaluation of condition (28) in a finite population where individuals interact in a pairwise manner and the payoffs from their interactions can produce non-additive genetic interactions (Nowak et al., 2004; Ohtsuki et al., 2007). In effect, the one-third law then yields a measure of the relative stability of a population fixed for allele $a$, but does not provide information about the stability of the population when fixed for allele $A$ (see Fig. 2b in Nowak et al., 2004) unless genetic interactions are additive.

    2.5. Weak effect mutations and continuous phenotypes

    An important case where genetic interactions are additive occurs when the difference between the phenotypes produced by alleles $A$ and $a$ is small and phenotypes are allowed to take a continuum of values. Suppose that the phenotype of type $a$ is $z$ and that of type $A$ is $z + \delta$, where $\delta$ is called the phenotypic deviation. Further, let the phenotype of individual $i$ be $z_i = z + \delta p_i$, and the vector $\mathbf{z} = \mathbf{z}(\mathbf{p}) = (z_1, \ldots, z_{N_T})$ contain the phenotypes for the whole population. Since phenotype is a continuous variable, we assume that the fitness of each individual $i$, $w_i(\mathbf{z})$, is a differentiable function of phenotype (p. 41 in Courant and John, 2000). This also implies, using Eqs. (3) and (11), that the fixation probability of a single mutant with trait $z + \delta$ in a population with resident trait $z$, denoted $\pi(\delta, z)$, is differentiable. When the phenotypic deviation is small (weak effect mutations), we can ignore terms $O(\delta^2)$ and individual fitness can be written as a Taylor series in $\delta$:

    $$w_i(\mathbf{z}) = 1 + \delta \frac{dw_i(\mathbf{z})}{d\delta} + O(\delta^2) = 1 + \delta \sum_{j=1}^{N_T} \frac{\partial w_i(\mathbf{z})}{\partial z_j} p_j + O(\delta^2) \tag{30}$$

    where the derivatives $\partial w_i / \partial z_j$ are evaluated at $\delta = 0$ (written as $\partial w_i^{\circ}(\mathbf{z}) / \partial z_j$), when the population is fixed for allele $a$ and $w_i(\mathbf{z}) = 1$. The sum over $j$ follows from the chain rule because $dz_j/d\delta = p_j$. Comparing Eq. (30) to the expression for fitness when phenotypes are discrete (Eq. (17)) reveals that the phenotypic deviation $\delta$ is analogous to the selection strength $\omega$ and that the derivatives of fitness with respect to phenotype, $\partial w_i^{\circ}(\mathbf{z}) / \partial z_j$, are equivalent to additive selection coefficients. Thus, so-called "$\delta$-weak selection" (Wild and Traulsen, 2007) implies additivity of fitness effects and allele frequency, which is well known in the literature (e.g., Taylor, 1989; Rousset, 2004).

    Applying the fitness function in (30) to the fixation probability Eq. (25) produces

    $$\begin{aligned} \pi_{A|a} = \pi(\delta, z) &= \frac{1}{N_T} - \frac{\delta}{N_T} \sum_{i=1}^{N_T} \sum_{j=1}^{N_T} \frac{\partial w_i^{\circ}(\mathbf{z})}{\partial z_j} \frac{E[T_{ij}]}{N_T} + O(\delta^2) \\ &= \frac{1}{N_T} + \delta \frac{d\pi}{d\delta}(0, z) + O(\delta^2) \end{aligned}$$

    where

    $$S(z) = \frac{d\pi}{d\delta}(0, z), \tag{31}$$

    is often called the phenotypic selection gradient in adaptive dynamics (Geritz et al., 1998; Leimar, 2001), inclusive fitness theory (Lehmann and Rousset, 2010) and quantitative genetics (Lande, 1979). The "gradient" terminology derives from the fact that $S(z)$ measures the direction of selection on the phenotype with respect to fixation probability: a positive (negative) selection gradient implies that mutants with positive (negative) $\delta$ will have a higher (lower) chance of fixing than the resident type. The zeros of the selection gradient, which correspond to extrema of the fixation probability, are candidate evolutionary equilibria. We will show this to be the case once we have described evolution under the long-term TSS in Section 4.

    2.6. Coalescence time and identity by descent

    So far, we have shown how the fixation probability of an allele with effects on social behavior depends on mean coalescence times (Eq. (22)). However, the relatedness term in Hamilton's rule is often expressed as a function of probabilities of identity by descent (Hamilton, 1964a, 1970). Translating between mean coalescence times and probabilities of identity by descent is possible using an argument first presented by Slatkin (1991). Suppose that $Q_{ij}$ represents the expected probability of identity by descent between alleles $i$ and $j$. Since mutations are distributed independently on a neutral genealogy (Wakeley, 2009) and IBD requires that the alleles not mutate before coalescing, the IBD probability can be expressed as

    $$Q_{ij} = \sum_{t=1}^{\infty} (1 - \mu)^{2t} \Pr[T_{ij} = t]. \tag{32}$$

    Since the TSS assumes weak mutation, we can ignore terms $O(\mu^2)$ and rewrite Eq. (32) as

    $$E[T_{ij}] = \lim_{\mu \to 0} \frac{1 - Q_{ij}}{2\mu}. \tag{33}$$
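    A quick numerical check of the relationship between Eqs. (32) and (33) is possible in the simplest case. The sketch below assumes a single panmictic haploid population in which two lineages coalesce with probability $1/N$ per generation, so that $T_{ij}$ is geometric with mean $N$; this distribution and the function names are assumptions for illustration only, since the structured models in the text have more complex coalescence time distributions.

```python
import numpy as np

def ibd_probability(N, mu, t_max=20000):
    """Eq. (32) with a geometric coalescence time, truncated at t_max generations."""
    t = np.arange(1, t_max + 1)
    coalescence = (1.0 / N) * (1.0 - 1.0 / N) ** (t - 1)  # Pr[T = t]
    no_mutation = (1.0 - mu) ** (2 * t)                   # neither lineage mutates
    return np.sum(no_mutation * coalescence)

def coalescence_time_from_ibd(N, mu):
    """Finite-mu version of the right-hand side of Eq. (33)."""
    return (1.0 - ibd_probability(N, mu)) / (2.0 * mu)

for mu in (1e-3, 1e-4, 1e-6):
    print(mu, coalescence_time_from_ibd(N=50, mu=mu))  # approaches N = 50 as mu shrinks
```

    The estimate falls short of $N$ when $\mu$ is not small, which is why the limit in Eq. (33) matters.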

    We use the limit as $\mu \to 0$ in Eq. (33) since we derived the fixation probabilities and their dependence on coalescence time under the TSS assumption that new mutation is not possible until the old mutation either fixes or goes extinct. Eq. (33) only pertains to pairwise coalescence times and IBD probabilities, so we can only apply it to additive genetic interactions. The relationship between three-way (and more generally $d$-way) coalescence times and IBD probabilities is more complex and deserves further study.

    Applying the relationship between pairwise coalescence times and IBD probabilities in (33) to the selection gradient in (31) in the case of $\delta$-weak selection yields

    $$S(z) = \lim_{\mu \to 0} \frac{1}{2\mu N_T^2} \sum_{i=1}^{N_T} \sum_{j=1}^{N_T} \frac{\partial w_i^{\circ}(\mathbf{z})}{\partial z_j} Q_{ij}. \tag{34}$$

    This expression was first obtained by Rousset and Billiard (2000), and an analogous derivation was presented by Rousset (2003). For models with simple population structure (homogeneous structures (Taylor et al., 2011; Ohtsuki, 2012) like the island (Wright, 1931) or stepping-stone (Kimura and Weiss, 1964) models) and simple demography, the IBD probabilities $Q_{ij}$ are relatively easy to obtain in the low mutation limit. The fitness function $w_i(\mathbf{z})$ depends on the nature of the social interactions as well as on the demography and population structure.

    2.7. Inclusive fitness effect and Hamiltons rule

    Essentially, the right-hand side of (34) is a measure of how expression of allele $A$ affects inclusive fitness (Rousset and Billiard, 2000; Rousset, 2004); fitness effects are given by the derivative of the fitness of individual $i$ with respect to the phenotype of individual $j$, and each effect is weighted by the likelihood that individuals $i$ and $j$ share alleles IBD. Applying the evolutionary success condition $S(z) > 0$ to (34) yields a Hamilton-type rule,

    $$\lim_{\mu \to 0} \frac{1}{\mu} \sum_{i=1}^{N_T} \sum_{j=1}^{N_T} \frac{\partial w_i^{\circ}(\mathbf{z})}{\partial z_j} Q_{ij} > 0, \tag{35}$$


    where the population structure and fitness functions have not been specified. In order to obtain the classic form of Hamilton's rule from condition (1), we make some simplifying assumptions about the social interaction and population structure.

    Suppose that there is a homogeneous population structure with $n$ groups each containing $N$ haploid individuals ($N_T = nN$) and equal migration between all groups (i.e., an island model). This implies that we need only track two kinds of IBD probabilities: $Q_0$, which measures the chance that two alleles drawn from different individuals in the same group are IBD, and $Q_1$, which is the probability that two alleles from different groups are IBD. Individuals socially interact within their group, but social effects between groups also occur due to differential productivity of groups (i.e., hard selection (Christiansen, 1975)). Since the fitness derivatives $\partial w_i^{\circ}(\mathbf{z}) / \partial z_j$ are evaluated at $\delta = 0$ where all individuals behave the same (as if they express the $a$ allele), there are only three different fitness derivatives: (i) individuals $i$ and $j$ are the same individual, and $\partial w_i^{\circ}(\mathbf{z}) / \partial z_i$ is the effect of the individual's behavior on itself, which we call the "cost" or $-c$; (ii) individuals $i$ and $j$ live in the same group, and $\partial w_i^{\circ}(\mathbf{z}) / \partial z_j$ is the effect on the individual due to its group mate's behavior, which we call the "benefit" or $b/(N-1)$ (for each of the $N - 1$ group mates); and (iii) individuals $i$ and $j$ live in different groups, where we can set $\partial w_i^{\circ}(\mathbf{z}) / \partial z_j = -(b - c)/((n-1)N)$ since the selection coefficients must sum to zero (Eq. (18)). Putting these expressions into Eq. (35) and simplifying produces

    $$\lim_{\mu \to 0} \frac{1 - Q_1}{\mu} \left( -c + b\, \frac{Q_0 - Q_1}{1 - Q_1} \right) > 0. \tag{36}$$

    Typically, in finite populations ($N_T < \infty$), the IBD probabilities $Q_0$ and $Q_1$ will go to one as the mutation rate goes to zero since one lineage will eventually fix in the population. In these cases, the IBD probabilities can often be expressed as $1 - \mu\, O(1)$ (Rousset, 2004), which suggests that the first term in (36) has a positive limit as $\mu \to 0$. The ratio multiplying the benefit $b$ turns out to be Wright's $F_{ST}$, which will also have a positive limit under low mutation. Setting the relatedness to $r = F_{ST} = \frac{Q_0 - Q_1}{1 - Q_1}$ and simplifying, we then obtain Hamilton's rule for this population

    $$-c + b\, r > 0$$

    where the left-hand side of the inequality is the inclusive fitness effect. Using Slatkin's formula in (33), we can equivalently write relatedness in this context as (Slatkin, 1991)

    $$r = \frac{E[T_{(2)}] - E[T_{(0,1)}]}{E[T_{(2)}]}$$

    where $E[T_{(0,1)}]$ and $E[T_{(2)}]$ are the expected coalescence times between alleles sampled in the same and different groups, respectively.
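    The step from the double sum in Eq. (35) to Hamilton's rule can be verified by brute force. The sketch below is illustrative only: the function names and the numerical values of $Q_0$, $Q_1$, $b$, and $c$ are hypothetical. It builds the full matrices of fitness derivatives and IBD probabilities for an island population and confirms that the weighted sum for a focal individual equals $(1 - Q_1)(-c + b\,r)$ with $r = (Q_0 - Q_1)/(1 - Q_1)$.

```python
import numpy as np

def island_matrices(n, N, b, c, Q0, Q1):
    """Fitness derivatives dw[i, j] and IBD probabilities Q[i, j] for n groups of size N."""
    NT = n * N
    group = np.repeat(np.arange(n), N)
    same_group = group[:, None] == group[None, :]
    same_individual = np.eye(NT, dtype=bool)
    # (ii) group mates contribute b/(N-1); (iii) other groups absorb the remainder.
    dw = np.where(same_group, b / (N - 1), -(b - c) / ((n - 1) * N))
    dw[same_individual] = -c  # (i) effect of the focal individual on itself
    Q = np.where(same_group, Q0, Q1).astype(float)
    Q[same_individual] = 1.0  # an allele is always identical to itself
    return dw, Q

def inclusive_fitness_sum(dw, Q, focal=0):
    """Sum over j of dw[focal, j] * Q[focal, j], the inner sum in Eq. (35)."""
    return float(np.sum(dw[focal] * Q[focal]))

# Hypothetical parameter values.
n, N, b, c, Q0, Q1 = 5, 4, 3.0, 1.0, 0.6, 0.2
dw, Q = island_matrices(n, N, b, c, Q0, Q1)
r = (Q0 - Q1) / (1.0 - Q1)
print(inclusive_fitness_sum(dw, Q), (1.0 - Q1) * (-c + b * r))
```

    Because $1 - Q_1$ is positive, the sign of the sum is the sign of $-c + b\,r$, which is the content of Hamilton's rule in this model.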

    In his derivation of the inclusive fitness effect, Hamilton (1970) began by directly applying the Price equation to the change in the frequency of an allele affecting social behavior. His relatedness coefficient $r$ in that context is a regression coefficient for the frequency of the allele in a focal individual as a function of the frequency of the allele in social partners. This is equivalent to assuming that $E[p_j \mid p_i] = r\, p_i + (1 - r)\, p$ for focal individual $i$, social partner $j$, and population-wide frequency $p$ (Hamilton, 1970; Frank, 1998; Rousset and Billiard, 2000; Rousset, 2004). In an island model like the one used in Eq. (36), this means that a social partner in the group either shares ancestry with the focal individual in proportion to $r$ and has the same allele frequency as the focal or does not share ancestry and has the population-wide frequency. Relatedness using this regression definition is

    $$r = \frac{\mathrm{Cov}[p_i, p_j]}{p(1 - p)}.$$

    Table 2

    Definitions of social behavior using Hamilton's rule (Eq. (1)).

    Effect on focal (−c)    Effect on social partner (b)
                            +              −
    +                       Mutualism      Selfishness
    −                       Altruism       Spite

    The covariance term above can be written as $\mathrm{Cov}[p_i, p_j] = E[p_i p_j] - p^2$, where $E[p_i p_j]$ is the probability that both individuals $i$ and $j$ in the same group have allele $A$. Since $E[p_i p_j] = Q_0\, p + (1 - Q_0)\, p^2$, the regression definition of relatedness becomes $r = Q_0$ in this example. This matches $r = \frac{Q_0 - Q_1}{1 - Q_1}$ from Eq. (36) since the regression assumption implies that individuals in different demes (individuals outside the social interaction) share no ancestry, which means $Q_1 = 0$. One way this can occur is if the number of groups is infinite ($n \to \infty$).

    There are two important points about Hamilton's rule that are illuminated by the derivation in Eq. (36). First, even though relatedness can be defined as a regression coefficient and thus might appear to capture only a statistical snapshot of population structure, the definition above in terms of coalescence times or IBD probabilities emphasizes that relatedness is a multigenerational concept. Genetic associations between alleles are created by the long-term effects of mating, reproduction, dispersal, and survival on genealogy, which can be measured with coalescence times. These associations, through coalescence times or IBD probabilities, will generally depend explicitly on the demography and population structure. More broadly, this also implies that the inclusive fitness effect in Hamilton's rule captures multigenerational effects of selection on allele frequency change.

    The second point is that the cost and benefit terms are effects on fitness as measured over the whole lifecycle. This implies that fitness effects will be functions of demographic parameters (such as population size and migration rate) and of fertility or survival payoffs from the social interaction. In order to determine the signs of $-c$ and $b$, which are required for classifying a behavior as altruistic or not (see Table 2), the demography and the effect of social behavior on both fertility and survival must be specified.

    3. Application: social games in island-type populations

    To recap a bit, we have shown above how weak mutation and the TSS allow a simple criterion for evolutionary success, $\pi_{A|a} > \pi_{a|A}$, in the short term. When selection is weak and genetic interactions are additive, this simplifies the condition for evolutionary success considerably and we recover a measure of inclusive fitness in (34). When non-additive genetic interactions are important and selection is still weak, the evolutionary success condition is given more generally by condition (24).

    There are many biological scenarios where non-additive interactions are important, and the simplest one that is often invoked is a two-player game. Each individual plays one of two pure strategies, "cooperation" or "noncooperation", with a social partner where the payoffs for the game are given in Table 3. When two individuals are noncooperators, they receive no payoff. If one does not cooperate and the other cooperates, the cooperator receives $-C$ and the noncooperator receives $B$. When two individuals cooperate, they each receive payoff $B - C + D$ where $D$ is a measure of non-additivity or synergy. The strategy names and payoffs are inspired by the Prisoner's Dilemma game (Binmore, 2007) where $B$ and $C$ are positive and D 0, though we will allow the parameters to take negative values as well in order to study other games like the Stag-Hunt (Skyrms, 2004). In the simplest case, the strategies are fixed by the genotype of the individual where individuals


    Table 3

    Payoffs for the cooperation ("coop") allele, A, and noncooperation ("noncoop") allele, a, in the social game.

    Focal individual    Social partner
                        A (coop)       a (noncoop)
    A (coop)            B − C + D      −C
    a (noncoop)         B              0

    bearing the $A$ allele always cooperate and individuals bearing $a$ always do not cooperate. This means there is no phenotypic plasticity that might result from changing strategies over repeated interactions (e.g., reciprocity (Trivers, 1971; Axelrod and Hamilton, 1981) or responsiveness (Akçay et al., 2009; Akçay and Van Cleve, 2012)).

    We assume an island-type population structure where $n$ groups of $N$ haploids ($N_T = nN$) are each connected by migration at rate $m$. Generations are non-overlapping. The frequency of the cooperation allele $A$ in individual $i$ in group $g$ is $p_{gi}$, the mean frequency in group $g$ is $p_g$ including individual $i$ and $p_{g \setminus i}$ excluding $i$, and $p$ is the mean frequency in the whole population. Individuals choose social partners at random and the mean payoff from the social interactions in the group determines how each individual's fertility differs from the baseline of one. Given these assumptions, the fertility of individual $i$ in group $g$ is $1 + \omega f_{gi}$ where

    $$f_{gi} = B\, p_{g \setminus i} - C\, p_{gi} + D\, p_{gi}\, p_{g \setminus i} \tag{37}$$

    and $B$ and $C$ are additive and $D$ non-additive in allele frequency. This fertility function represents not only the two-player case but also the $n$-player case within the group when payoff is an additive function of the frequency of other player types in the group (Van Cleve and Lehmann, 2013, Appendix C).
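    The link between the payoff matrix in Table 3 and the fertility effect in Eq. (37) can be made explicit. The sketch below, with hypothetical function names, averages the payoffs over the focal individual's allele and over a partner drawn at random from the rest of the group, and compares the result with Eq. (37).

```python
import math

def expected_payoff(p_focal, p_partner, B, C, D):
    """Average Table 3 payoff when the focal carries A with probability p_focal
    and a random partner carries A with probability p_partner."""
    payoff = {("A", "A"): B - C + D, ("A", "a"): -C,
              ("a", "A"): B,         ("a", "a"): 0.0}

    def prob(p, allele):
        return p if allele == "A" else 1.0 - p

    return sum(prob(p_focal, x) * prob(p_partner, y) * payoff[(x, y)]
               for x in "Aa" for y in "Aa")

def fertility_effect(p_focal, p_partner, B, C, D):
    """Eq. (37): additive terms in B and C plus the synergy term in D."""
    return B * p_partner - C * p_focal + D * p_focal * p_partner

# Hypothetical payoffs; a cooperator (p_focal = 1) in a group with 40% cooperators.
print(expected_payoff(1.0, 0.4, B=2.0, C=1.0, D=0.5),
      fertility_effect(1.0, 0.4, B=2.0, C=1.0, D=0.5))
```

    The synergy $D$ only enters through the product of the two frequencies, which is why it generates non-additive (frequency-product) terms in fitness.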

    3.1. Baseline demography: hard selection

    We begin with a model of hard selection (see Section 2.7) with groups producing different numbers of migrants depending on their composition. The fitness of individual $i$ in group $g$ can be written as

    $$w_{gi} = \frac{(1 - m)(1 + \omega f_{gi})}{(1 - m)(1 + \omega f_g) + m(1 + \omega f_{\setminus g})} + \frac{1}{n - 1} \sum_{k \ne g} \frac{m(1 + \omega f_{gi})}{(1 - m)(1 + \omega f_k) + m(1 + \omega f_{\setminus k})}, \tag{38}$$

    where $1 + \omega f_{\setminus g}$ is the mean fertility in the population excluding group $g$, or to first order in $\omega$ as

    $$w_{gi} = 1 + \omega \left[ f_{gi} - \left( (1 - m)^2 f_g + m(2 - m) f_{\setminus g} + \frac{m^2}{n - 1}\left( f_g - f_{\setminus g} \right) \right) \right] + O(\omega^2), \tag{39}$$

    which takes the same form as Eq. (17). All that remains in order to calculate the fixation probabilities in Eqs. (22) and (23) are the expected coalescence times under neutrality.
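    The first-order expansion of the hard-selection fitness function can be checked numerically. The sketch below is an illustration under stated assumptions: it implements the exact fitness as the philopatric term plus the average over the other groups of the dispersing term, and a linearised version obtained by differentiating that expression at $\omega = 0$ (in particular, the coefficient on the last term is taken to be $m^2/(n-1)$). The array layout, function names, and random fertility effects are hypothetical. If the linearisation is correct, the gap between the two should shrink quadratically with $\omega$.

```python
import numpy as np

def exact_fitness(f, g, i, m, omega):
    """Hard-selection fitness of individual i in group g; f has shape (n, N)."""
    n, _ = f.shape
    group_mean = f.mean(axis=1)
    others_mean = (group_mean.sum() - group_mean) / (n - 1)  # mean excluding each group

    def competition(k):
        return (1 - m) * (1 + omega * group_mean[k]) + m * (1 + omega * others_mean[k])

    focal = 1 + omega * f[g, i]
    philopatric = (1 - m) * focal / competition(g)
    dispersing = sum(m * focal / competition(k) for k in range(n) if k != g) / (n - 1)
    return philopatric + dispersing

def first_order_fitness(f, g, i, m, omega):
    """Linearisation in omega of exact_fitness around omega = 0."""
    n, _ = f.shape
    group_mean = f.mean(axis=1)
    f_g = group_mean[g]
    f_not_g = (group_mean.sum() - f_g) / (n - 1)
    competition = ((1 - m) ** 2 * f_g + m * (2 - m) * f_not_g
                   + m**2 / (n - 1) * (f_g - f_not_g))
    return 1 + omega * (f[g, i] - competition)

def max_error(f, m, omega):
    """Largest gap between exact and linearised fitness over all individuals."""
    n, N = f.shape
    return max(abs(exact_fitness(f, g, i, m, omega) - first_order_fitness(f, g, i, m, omega))
               for g in range(n) for i in range(N))

rng = np.random.default_rng(0)
f = rng.uniform(-1.0, 1.0, size=(4, 3))  # hypothetical fertility effects, 4 groups of 3
for omega in (1e-2, 1e-3):
    print(omega, max_error(f, m=0.2, omega=omega))
```

    A tenfold reduction in $\omega$ should reduce the error roughly a hundredfold, consistent with a remainder of order $\omega^2$.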

    Fortunately, coalescence times in structured populations are a well-studied topic (Notohara, 1990; Herbots, 1997; Wakeley, 1998; Wilkinson-Herbots, 1998) in coalescent theory, where the process often considered is called the "structured coalescent" (Nordborg and Krone, 2002; Nordborg, 2007). As applied to the island model, the structured coalescent usually assumes that the migration rate $m$ is $O(1/N)$ and $nNm/(n-1) \to M$ as $N \to \infty$. This implies that during a small interval of time, either two lineages within a group can coalesce or one lineage can migrate from one group to another, but more than one such event does not occur.

    For two lineages, either both can be in the same group with configuration $(0, 1)$ or each can be in different groups with configuration $(2, 0) = (2)$, where the first element of the configuration is the number of groups with a single lineage, the second element is the number of groups with two lineages, etc. (e.g., see Wakeley, 1998). We denote the expected coalescence times of these configurations by $E[T_{(0,1)}]$ and $E[T_{(2)}]$, respectively. With synergistic payoffs ($D \ne 0$) that create non-additive fitness effects, we also have to track three-lineage samples. For three lineages, there are three possible configurations: $(3)$, $(1, 1)$, and $(0, 0, 1)$. Using the master equation for the continuous-time Markov process that describes coalescence (eq. 2.8 in Notohara, 1990), a system of equations for the five expected coalescence times can be constructed (Wakeley (1998) describes the method nicely); these equations are given in section 5.1 of Ladret and Lessard (2007), and we only provide the solutions here (time in absolute units):

    $$\begin{aligned} E[T_{(0,1)}] &= N_T, \qquad E[T_{(2)}] = N_T\left( 1 + \frac{1}{2M} \right), \\ E[T_{(0,0,1)}] &= N_T\left( \frac{4}{3} + \frac{n - 1}{6n(1 + M)} \right), \\ E[T_{(1,1)}] &= N_T\left( \frac{4}{3} + \frac{1}{2M} - \frac{1}{6n(1 + M)} \right), \\ E[T_{(3)}] &= N_T\left( \frac{4}{3} + \frac{2}{3M} - \frac{1}{6n(1 + M)} \right). \end{aligned} \tag{40}$$

    Note that evaluating a more complex fitness function with higher-order frequency dependence would require calculating expectedcoalescence times for four lineage or larger samples. The numberof configurations and equations grows quickly as the lineagesample size increases, which makes this method cumbersome forcomplex fitness functions. Working on the related problem ofcalculating the total length of the coalescent genealogy, Wakeley(1998) provided a method to calculate expected coalescence timesfor arbitrarily large samples so long as the number of groupsn is large; this suggests that an analogous method might workfor coalescence times that could be used to calculate fixationprobabilities.
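    The expected coalescence times in Eq. (40) are easy to tabulate, and the pairwise times give relatedness directly through $r = (E[T_{(2)}] - E[T_{(0,1)}])/E[T_{(2)}]$. The sketch below uses hypothetical function names and parameter values and encodes the five expressions as read from Eq. (40); it also shows that relatedness in this model reduces to $1/(1 + 2M)$.

```python
def coalescence_times(NT, n, M):
    """Expected coalescence times for the five sample configurations in Eq. (40)."""
    correction = 1.0 / (6.0 * n * (1.0 + M))
    return {
        "(0,1)": NT * 1.0,
        "(2)": NT * (1.0 + 1.0 / (2.0 * M)),
        "(0,0,1)": NT * (4.0 / 3.0 + (n - 1) * correction),
        "(1,1)": NT * (4.0 / 3.0 + 1.0 / (2.0 * M) - correction),
        "(3)": NT * (4.0 / 3.0 + 2.0 / (3.0 * M) - correction),
    }

def relatedness(NT, n, M):
    """r from pairwise coalescence times between and within groups."""
    T = coalescence_times(NT, n, M)
    return (T["(2)"] - T["(0,1)"]) / T["(2)"]

# Hypothetical island population: 10 groups, 1000 haploids in total.
for M in (0.1, 0.5, 5.0):
    print(M, relatedness(NT=1000, n=10, M=M), 1.0 / (1.0 + 2.0 * M))
```

    Relatedness falls from one toward zero as the scaled migration rate $M$ increases, while the within-group pairwise time stays at $N_T$ regardless of $M$.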

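    Applying the fitness function from (39) and the coalescence times in (40) to Eq. (22), we calculate the probability that allele $A$ (the cooperation allele) fixes in a population of $a$ (the noncooperation allele) as

    $$\pi_{A|a} = \frac{1}{N_T} + \omega\left[ -C + D\left( \frac{1}{3} + \frac{1 - \frac{1}{n}}{6(1 + M)} \right) \right] \tag{41}$$

    and the fixation probability of $a$ in a population of $A$ as

    $$\pi_{a|A} = \frac{1}{N_T} + \omega\left[ C - D\left( \frac{2}{3} - \frac{1 - \frac{1}{n}}{6(1 + M)} \right) \right]. \tag{42}$$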

    First derived by Ladret and Lessard (2007, eq. 29), these expressions are correct to first order in selection strength $\omega$ and zeroth order in $O(1/N)$ since our coalescence times assume that $N \to \infty$. These two fixation probabilities are the first main result of this section and produce a few important observations. First, they reproduce the classic "cancellation result" from Taylor (Taylor, 1992a,b; Taylor et al., 2011) for additive genetic interactions ($D = 0$). Taylor's result says that the benefits of cooperation are exactly balanced out by the effect of competition between related individuals within a group (so-called "kin competition" or "local competition") when population structure is homogeneous (Taylor et al., 2011) and generations do not overlap. In our model notation, the cancellation result implies that the benefit to others $B$ will cancel out of the fixation probability expressions; Eqs. (41) and (42)


    Table 4

    Evolutionary success conditions calculated in an infinite island model ($N \to \infty$ and $n \to \infty$) with hard selection. The solid boxed condition is the one-third law (Nowak et al., 2004) and the dashed box condition is the risk dominance condition (Harsanyi and Selten, 1988; Ellison, 1993; Kandori et al., 1993) that holds for all values of $M$.

    show that this indeed does occur. Thus, when interactions are additive ($D = 0$), the cooperation allele $A$ fixes with a probability greater than the neutral probability $\pi^\circ = 1/N_T$ (Eq. (28)) only when there is negative direct cost (i.e., cooperation is directly beneficial). Likewise, $\pi_{a|A}$ in Eq. (42) shows that the noncooperation allele $a$ is advantageous when the cost is positive. In an extension of the cancellation result, Ohtsuki (2012) shows (for $n \to \infty$) that positive synergy that does not change the structure of the game, $0 < D < C$, cannot result in positive selection for cooperation. Eqs. (41) and (42) reproduce this result since even the strongest population structure, $M \to 0$, results in cooperation fixing more likely than chance only when $D > 2C$.

    The expression for $\pi_{A|a}$ in Eq. (41) also produces the "one-third law" from evolutionary game theory (Nowak et al., 2004). The one-third law says that as $N \to \infty$ in a single panmictic population, the cooperation allele $A$ fixes with a probability greater than chance when the mixed strategy equilibrium of the game in Table 3 is less than one third. If an opponent cooperates with probability $z$ and does not cooperate with probability $1 - z$, the mixed strategy equilibrium is the value of $z$ where an individual does equally well against the opponent by either cooperating or not cooperating (Hofbauer and Sigmund, 2003). Using the payoffs from Table 3, the mixed strategy equilibrium equation is $z(B - C + D) - (1 - z)C = zB$, which yields $z^* = C/D$. Thus, the one-third law translates to

    $$D > 3C. \tag{43}$$

    We can immediately recover the one-third law from $\pi_{A|a} > 1/N_T$ by taking the high migration limit $M \to \infty$ in (41), which results in an unstructured population. The complementary condition for fixation of the noncooperation allele $a$, $\pi_{a|A} > 1/N_T$, becomes $D < 3C/2$ in the high migration limit. In contrast, population structure is at its strongest in the low migration limit when $M \to 0$ and when the number of groups is large, $n \to \infty$. Fixation of the cooperation allele becomes easier in this case as $\pi_{A|a} > 1/N_T$ translates to $D > 2C$. Conversely, fixation of the noncooperation allele also becomes easier when population structure is strong since the fixation condition becomes $D < 2C$. These conditions are summarized in Table 4.
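    The limiting conditions discussed above follow mechanically from the two fixation probabilities. The sketch below encodes Eqs. (41) and (42) as read from the text (the function names and numerical values are hypothetical) and recovers the thresholds on $D$ in the high and low migration limits, as well as the fact that the difference $\pi_{A|a} - \pi_{a|A}$ does not depend on $M$.

```python
import math

def structure_term(M, n):
    """The population-structure contribution shared by Eqs. (41) and (42)."""
    return (1.0 - 1.0 / n) / (6.0 * (1.0 + M))

def pi_A_in_a(C, D, M, n, NT, omega):
    """Eq. (41): fixation probability of the cooperation allele A."""
    return 1.0 / NT + omega * (-C + D * (1.0 / 3.0 + structure_term(M, n)))

def pi_a_in_A(C, D, M, n, NT, omega):
    """Eq. (42): fixation probability of the noncooperation allele a."""
    return 1.0 / NT + omega * (C - D * (2.0 / 3.0 - structure_term(M, n)))

def critical_D_for_A(C, M, n):
    """Smallest synergy D for which A fixes more often than a neutral allele."""
    return C / (1.0 / 3.0 + structure_term(M, n))

def critical_D_for_a(C, M, n):
    """Largest synergy D for which a fixes more often than a neutral allele."""
    return C / (2.0 / 3.0 - structure_term(M, n))

C = 1.0
print(critical_D_for_A(C, math.inf, math.inf))  # one-third law, D > 3C
print(critical_D_for_A(C, 0.0, math.inf))       # strongest structure, D > 2C
print(critical_D_for_a(C, math.inf, math.inf))  # D < 3C/2
print(critical_D_for_a(C, 0.0, math.inf))       # D < 2C
```

    Subtracting the two expressions gives $\omega(D - 2C)$ for every $M$ and $n$, so the comparison of complementary fixation probabilities depends only on the payoffs.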

    As discussed in Section 2.4, each fixation condition alone (and, consequently, the one-third law, $\pi_{A|a} > 1/N_T$) is sufficient as a measure of evolutionary success only when genetic interactions are additive. When non-additive or synergistic interactions are included, the condition $\pi_{A|a} > \pi_{a|A}$ (Eq. (9)) should be used. This condition is

    $$D > 2C \tag{44}$$

    using the fixation probabilities in (41) and (42) (see Table 4). Interestingly, this implies that whether the cooperation allele $A$ or the noncooperation allele $a$ is more common at stationarity is independent of the strength of population structure, $M$, and depends only on the payoffs in the social game. This is a generalization of the Taylor cancellation result in the sense that the kind of simple population structure considered here (non-overlapping generations, homogeneous island-type migration, hard selection) is not sufficient for the benefits of cooperation $B$ to affect selection for cooperation. Rather, synergistic effects are important, but they must significantly outweigh the costs in order for cooperation to be more prevalent than noncooperation.

    In fact, once synergistic effects outweigh the costs at all, $D > C$, they change the structure of the social game from a Prisoner's Dilemma where noncooperation is the strictly dominant strategy to a Stag-Hunt or coordination game where both cooperation and noncooperation are Nash equilibria (Binmore, 2007). In coordination games in unstructured populations, resident populations of cooperators and noncooperators are both resistant to invasion by the complementary type when evolution is deterministic (i.e., there is no genetic drift). This implies that whether allele $A$ or $a$ becomes fixed in the population depends on the initial frequency of $A$. When the initial frequency is greater than the mixed strategy equilibrium $z^* = C/D$, selection leads to fixation of the cooperation allele $A$, and fixation of the noncooperation allele $a$ occurs for initial frequencies less than $C/D$. In effect, if the phenotypic space is the probability of cooperation $z$, then the mixed strategy equilibrium is a fitness valley and pure cooperation and noncooperation are fitness peaks (Van Cleve and Lehmann, 2013). The basin of attraction for the cooperation peak in this case would be $(z^*, 1)$, and for the noncooperation peak the basin would be $(0, z^*)$. An intuitive condition for the cooperation peak to be more likely to evolve is that it has the larger basin of attraction under a model of simple deterministic evolution. This condition, which is called "risk dominance" in game theory (Harsanyi and Selten, 1988; Kandori et al., 1993), is equivalent to $1 - z^* > z^*$ or

    $$z^* = \frac{C}{D} < \frac{1}{2}, \tag{45}$$

    which is exactly the condition $\pi_{A|a} > \pi_{a|A}$ in (44). The fact that we can derive the evolutionary success condition for cooperation in a structured population with selection, drift, and mutation from the risk dominance condition in a purely deterministic model with no population structure is another way of describing the generalized Taylor cancellation result introduced above.
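    The basins of attraction in the coordination game can be visualised with a simple deterministic model. The sketch below assumes replicator-type dynamics in an unstructured population, $dz/dt = z(1 - z)(zD - C)$, where $zD - C$ is the payoff difference between cooperating and not cooperating against a partner who cooperates with probability $z$ (from Table 3). The text only says "deterministic evolution", so the choice of the replicator equation, the Euler integration, and the parameter values are all assumptions for illustration.

```python
def replicator_endpoint(z0, C, D, dt=0.01, steps=50000):
    """Euler integration of dz/dt = z (1 - z) (z D - C) from initial frequency z0."""
    z = z0
    for _ in range(steps):
        z += dt * z * (1.0 - z) * (z * D - C)
        z = min(max(z, 0.0), 1.0)  # keep the frequency in [0, 1]
    return z

def is_risk_dominant(C, D):
    """Cooperation is risk dominant when its basin (z*, 1) is the larger one."""
    z_star = C / D
    return (1.0 - z_star) > z_star

C, D = 1.0, 3.0                          # hypothetical payoffs with z* = 1/3
print(replicator_endpoint(0.40, C, D))   # starts above z*, approaches 1
print(replicator_endpoint(0.30, C, D))   # starts below z*, approaches 0
print(is_risk_dominant(C, D))            # True, since D > 2C
```

    With $D = 3C$ the cooperation basin covers two thirds of the unit interval; lowering $D$ below $2C$ makes noncooperation the state with the larger basin.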

    Even though the equivalence between $\pi_{A|a} > \pi_{a|A}$ and risk dominance is proven here for the infinite island model ($n \to \infty$ and $N \to \infty$), it approximately holds for the finite island model as well. In order to show this, we make two observations. First, examining the general condition for $\pi_{A|a} > \pi_{a|A}$ in Eq. (24), we observe that so long as fitness depends on at most three-way genetic interactions ($d = 2$), which is true for the fitness function in (39), all three-way coalescence times cancel out. Thus, we only need exact pairwise coalescence times and do not need exact triplet coalescence times (we elaborate on further ramifications of this fact in the Discussion). Second, as suggested by Ladret and Lessard (2007, p. 416), exact expected coalescence times can be calculated using a discrete-time Markov process that produces the same linear equations in Notohara (1990, p. 66) that were used to generate the structured coalescent results in (40). These equations, given by (I) in Ladret and Lessard (2007), produce the following exact expected coalescence times:

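    $$\begin{aligned} \mathrm{E}\left[T_{(0,1)}\right] &= N_T, \\ \mathrm{E}\left[T_{(2)}\right] &= N_T \left( 1 - \frac{1}{N} + \frac{1}{M(2 - M/N)} \right). \end{aligned} \tag{46}$$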

    Combining the above expressions and the fitness function from (39) with the evolutionary success condition $\pi_{A|a} > \pi_{a|A}$ in Eq. (24), we get

    $$D > 2C + \frac{2(B - C + D)}{N_T}. \tag{47}$$

    Condition (47) contains a single correction to the risk dominance condition, $2(B - C + D)/N_T$, which is due to local competition in a finite population. Compared to the infinite island model, the cooperation allele $A$ is only slightly less prevalent in this case. If the number of groups is infinite or if each group is infinitely large, condition (47) simplifies to risk dominance.

    3.2. Demography and the scale of competition

    Under the baseline demography of hard selection, we obtain the general Taylor cancellation result that removes the additive benefits of cooperation. This cancellation is a function of how demography shapes both the competitive environment and genetic identity within and between groups. Thus, demographies that create a different competitive environment may be more or less conducive toward the evolutionary success of the cooperation allele.

    One of the most common alternative demographies to hard selection is soft selection (Christiansen, 1975), where individuals compete for resources or breeding spots within their group before the migration stage. Thus, each group contributes the same number of individuals to the next generation. Intuitively, this should make it more difficult for the cooperation allele $A$ to succeed since groups with a higher frequency of $A$ will not be more productive than groups with a lower frequency. In this case, the fitness function is

    $$w_{gi} = \frac{1 + f_{gi}}{1 + f_g}, \tag{48}$$

    (Rousset, 2004, p. 125) when the number of groups is large ($n \to \infty$). For additive genetic interactions ($D = 0$) and using the exact pairwise expected coalescence times in (46), the evolutionary success condition $\pi_{A|a} > 1/N_T$ is

    $$-C > \frac{B - C}{N}, \tag{49}$$

    which agrees with previous analyses (Rousset, 2004; Lehmann and Rousset, 2010). In contrast to the case of hard selection, where the evolutionary success condition is $-C > 0$, soft selection increases the strength of local competition so that cooperation is actually selected against in finite groups with a strength proportional to the net benefits, $B - C$. Allowing for non-additive interactions ($D \neq 0$), we calculate the evolutionary success condition $\pi_{A|a} > \pi_{a|A}$ to be

    $$D > 2C + \frac{2(B - C + D)}{N}, \tag{50}$$

    which again has a stronger local competition correction to the risk dominance condition than the hard selection case (Eq. (47)). In fact, since it does not depend on the number of groups $n$, the soft selection correction is not simply a finite population size ($N_T$ [...] $\pi_{A|a} > \pi_{a|A}$ for this case is

    $$D + \frac{B - C + D}{M} \left( 1 - \frac{1}{n} \right) > 2C \tag{52}$$

    for large groups ($N \to \infty$). When migration is high and the population is unstructured ($M \to \infty$), we recover the risk dominance condition from (52). Strong population structure from weak migration, however, very strongly selects for cooperation. In fact, for any level of cost, there is a low enough population migration rate $M$ that results in more cooperation alleles $A$ at stationarity. This strong selection for cooperation is a direct result of the lack of local competition, which allows the benefits of cooperation to accrue to individuals who cooperate with other individuals in their group who tend to share their genes IBD.
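To compare how the three demographies shift the risk dominance threshold, the conditions can be evaluated side by side. This is a small sketch that assumes the conditions take the forms $D > 2C + 2(B - C + D)/N_T$ for hard selection (47), $D > 2C + 2(B - C + D)/N$ for soft selection (50), and $D + (B - C + D)(1 - 1/n)/M > 2C$ for group competition with large groups (52). The function names and the payoff values in the usage example are illustrative assumptions.

```python
def risk_dominance(C: float, D: float) -> bool:
    """Baseline condition D > 2C (no finite-size or structure correction)."""
    return D > 2 * C


def hard_selection(B: float, C: float, D: float, N_total: float) -> bool:
    """Condition (47): the correction scales with total population size N_T."""
    return D > 2 * C + 2 * (B - C + D) / N_total


def soft_selection(B: float, C: float, D: float, N: float) -> bool:
    """Condition (50): the correction scales with group size N only."""
    return D > 2 * C + 2 * (B - C + D) / N


def group_competition(B: float, C: float, D: float, M: float, n: float) -> bool:
    """Condition (52): weak migration (small M) favors cooperation."""
    return D + (B - C + D) / M * (1 - 1 / n) > 2 * C


if __name__ == "__main__":
    B, C, D = 2.0, 1.0, 2.2  # hypothetical payoffs with D slightly above 2C
    print("risk dominance   :", risk_dominance(C, D))
    print("hard selection   :", hard_selection(B, C, D, N_total=1000))
    print("soft selection   :", soft_selection(B, C, D, N=10))
    print("group competition:", group_competition(B, C, D, M=1.0, n=100))
```

With these payoffs, the soft selection correction is large enough to reverse the outcome in groups of ten, whereas group competition can favor cooperation even when $D < 2C$ provided $M$ is small.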

    4. Theory: long-term evolution

    We showed above how the forces of selection, mutation, and genetic drift generate the stationary distribution between two alleles when mutation is weak and evolutionary change follows the TSS. This stationary distribution gave us a measure of evolutionary success of one allele relative to another, which was simply $\pi_{A|a} > \pi_{a|A}$. Under the additional assumption of weak selection, we showed how to calculate this condition (Eq. (24)) for arbitrary non-additive genetic interactions. From this condition, we obtained Hamilton's rule, the one-third law, risk dominance, and generalizations of these conditions. However, this condition only gives the stationary distribution among a fixed set of alleles and thus does not explicitly make predictions for longer timescales when the evolutionary process can sample a continuum of alleles.

    Studying long-term evolution among a continuum of possible alleles requires specifying how those alleles are generated by mutation and how mutations are fixed or lost over the long term due to selection and drift. Just as with the short-term model among a finite set of alleles, we will assume weak mutation so that the population can be described by the TSS and we only need to track the evolution of a population from one fixed, or monomorphic, state to another. Our approach to modeling the long-term process uses the substitution rate approach of Lehmann (Lehmann, 2012; Van Cleve and Lehmann, 2013), which derives from population genetic approaches to adaptation (Gillespie, 1983, 1991) and to kin selection in finite populations (Rousset and Billiard, 2000; Rousset, 2004; Taylor et al., 2007; Taylor and Maciejewski, 2014) and from the adaptive dynamics approach (Metz et al., 1996; Dieckmann and Law, 1996; Champagnat et al., 2001).

    4.1. Substitution rate approach and the TSS diffusion

    The essence of this approach is that, over the long term, the evolutionary process at each point in time can be fully characterized by a substitution or transition rate that measures how likely the population is to move from one monomorphic state to another. We assume that the substitution rate is an instantaneous measure of change, which is justified since organismal life cycles and generation times become very short on the scale of long-term evolution. If $\rho(z_1, t_1 | z_0, t_0)$ is the probability density that a population is monomorphic for trait $z_1$ at time $t_1$ given it was monomorphic for $z_0$ at time $t_0$, then we define the substitution rate as

    $$\lim_{\Delta t \to 0} \frac{\rho(z + \delta, t + \Delta t \,|\, z, t)}{\Delta t} = k(\delta, z),$$

    which is intuitively the rate at which mutations of type $z + \delta$ are produced and fixed in the population of type $z$. The substitution rate is a function of both the mutation rate, $\mu$, and the distribution of mutational effects, $u(\delta, z)$, which represents the probability density that a mutant offspring is of type $z + \delta$ given that its parent is of type $z$. For simplicity, we assume that $\mu$ does not depend on the resident trait in the population, $z$. Using weak mutation and the TSS condition in (4), Champagnat (Champagnat, 2006; Champagnat and Lambert, 2007) showed that the long-term TSS can be characterized as a Markov jump process (continuous in time) with instantaneous jump rate

    $$k(\delta, z) = N_T\, \mu\, u(\delta, z)\, \pi(\delta, z) \tag{53}$$

    (see also Theorem 1.1 in Champagnat and Lambert (2007) and Eq. (2) in Lehmann (2012)). This rate is conceptually analogous to the classic long-term neutral substitution rate $k = N_T \mu (1/N_T) = \mu$ from molecular evolution (Kimura and Ohta, 1971).
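The neutral analogy can be verified numerically: when every mutant fixes with probability $1/N_T$, integrating the jump rate over all mutational effects should return the mutation rate $\mu$. The sketch below assumes a zero-mean Gaussian distribution of mutational effects evaluated at a fixed resident trait; that choice, the parameter values, and the function names are illustrative assumptions rather than part of the model above.

```python
import math


def gaussian_effects(sigma: float):
    """Return a zero-mean Gaussian density u(delta) with standard deviation sigma."""
    def u(delta: float) -> float:
        return math.exp(-0.5 * (delta / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    return u


def jump_rate(delta, N_total, mu, u, fixation_probability):
    """Jump rate of Eq. (53) at a fixed resident trait: N_T * mu * u(delta) * pi(delta)."""
    return N_total * mu * u(delta) * fixation_probability(delta)


def total_substitution_rate(N_total, mu, u, fixation_probability, width, steps=4000):
    """Trapezoid-rule integral of the jump rate over delta in [-width, width]."""
    h = 2.0 * width / steps
    total = 0.0
    for i in range(steps + 1):
        delta = -width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * jump_rate(delta, N_total, mu, u, fixation_probability)
    return total * h


if __name__ == "__main__":
    N_total, mu, sigma = 500, 1e-4, 0.01           # hypothetical values
    neutral = lambda delta: 1.0 / N_total          # neutral fixation probability
    rate = total_substitution_rate(N_total, mu, gaussian_effects(sigma), neutral, 8 * sigma)
    print(rate, mu)
```

Passing a non-neutral `fixation_probability` callable to the same function gives the total rate at which the population leaves its current monomorphic state.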

    Following standard methods in stochastic processes, the Markov jump process representing the TSS with jump rate $k(\delta, z)$ can be represented with the following (forward) master equation (Dieckmann and Law, 1996; Gardiner, 2009; Lehmann, 2012)

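    $$\frac{\partial \rho(z, t)}{\partial t} = \int \left[ k(\delta, z - \delta)\, \rho(z - \delta, t) - k(\delta, z)\, \rho(z, t) \right] d\delta, \tag{54}$$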

    where we write $\rho(z, t) = \rho(z, t | z_0, t_0)$ for simplicity. The master equation captures the intuition that the change in probability density for trait $z$ at time $t$ is equal to the sum of the jumps toward that trait minus the sum of the jumps away from that trait. Without further assumptions about the size of jumps, the master equation is difficult to analyze. A common way to approximate Markov jump processes is to assume that the jumps are small, which turns the discontinuous jump process into a continuous process and turns the master equation into a diffusion equation. Specifically, standard methods (e.g., a Kramers-Moyal expansion (Gardiner, 2009)) generate a diffusion equation by ignoring third-order and higher moments of the jump process. Biologically, we can justify this approximation by assuming that the mutational effects cluster tightly enough around the mean trait value $z$ so that the evolutionary dynamic is affected only by the variance of the distribution of mutational effects and higher order moments can be neglected. The (forward) diffusion equation obtained with these methods for the jump process in (54) is

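    $$\frac{\partial \rho(z, t)}{\partial t} = -\frac{\partial}{\partial z} \left[ a(z)\, \rho(z, t) \right] + \frac{1}{2} \frac{\partial^2}{\partial z^2} \left[ b(z)\, \rho(z, t) \right] \tag{55}$$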

    where

    $$a(z) = \int \delta\, k(\delta, z)\, d\delta \tag{56}$$

    is the drift term and measures the mean jump away from the trait $z$ and

    $$b(z) = \int \delta^2\, k(\delta, z)\, d\delta \tag{57}$$

    is the diffusion term and measures the variance of the jumps away from $z$. The drift and diffusion terms can be simplified by assuming that the fixation probability is differentiable (which is true if fitness is differentiable; see Section 2.5) and approximating the substitution or jump rate by the first-order Taylor series

    $$k(\delta, z) = N_T\, \mu\, u(\delta, z) \left[ \pi^{\circ}(z) + \delta\, S(z) + O\!\left(\delta^2\right) \right] \tag{58}$$

    where $\pi^{\circ}(z) = \pi(0, z)$ and $S(z) = \left. \partial \pi(\delta, z)/\partial \delta \right|_{\delta = 0}$ is the selection gradient. Using the expansion in (58) and assuming that the mutational distribution is symmetric in $\delta$, the drift and diffusion terms become, respectively,

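    $$\begin{aligned} a(z) &= N_T\, \mu \int \delta\, u(\delta, z) \left[ \pi^{\circ}(z) + \delta\, S(z) + O\!\left(\delta^2\right) \right] d\delta \\ &= N_T\, \mu\, S(z) \int \left[ \delta^2 + O\!\left(\delta^3\right) \right] u(\delta, z)\, d\delta \\ &\approx N_T\, \mu\, \sigma^2(z)\, S(z) \end{aligned} \tag{59}$$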

    and

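    $$b(z) = N_T\, \mu \int \delta^2\, u(\delta, z) \left[ \pi^{\circ}(z) + \delta\, S(z) + O\!\left(\delta^2\right) \right] d\delta \approx N_T\, \mu\, \sigma^2(z)\, \pi^{\circ}(z) = \mu\, \sigma^2(z) \tag{60}$$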

    where $\pi^{\circ}(z) = 1/N_T$, $\sigma^2(z) = \int \delta^2 u(\delta, z)\, d\delta$ is the second moment (and variance) of the mutational effects distribution, and third-order and higher moments are neglected. Substituting these expressions for $a(z)$ and $b(z)$ into the diffusion Eq. (55) yields

    $$\frac{\partial \rho(z, t)}{\partial t} = -N_T\, \mu\, \frac{\partial}{\partial z} \left[ S(z)\, \sigma^2(z)\, \rho(z, t) \right] + \frac{\mu}{2} \frac{\partial^2}{\partial z^2} \left[ \sigma^2(z)\, \rho(z, t) \right], \tag{61}$$

    which was derived by Lehmann (2012, eq. 4) and is analogous to the stochastic differential equation derived by Champagnat and Lambert (2007, eq. 3). The diffusion equation in (61) is our main mathematical description of how the TSS evolves over the long term. In a sense, this diffusion equation is a stochastic version of the deterministic canonical equation of adaptive dynamics (Dieckmann and Law, 1996). The first term in (61) measures the deterministic effect of selection on the trait and is the counterpart of the canonical equation of adaptive dynamics (Champagnat and Lambert, 2007). The second term measures the stochastic effect of genetic drift through the neutral fixation rate (Eq. (60)).
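One way to build intuition for the diffusion in (61) is to simulate a sample path of the corresponding Itô stochastic differential equation, $dz = N_T \mu \sigma^2 S(z)\, dt + \sqrt{\mu \sigma^2}\, dW$. The sketch below uses the Euler-Maruyama scheme and assumes a constant mutational variance, a trait confined to $[0, 1]$ with reflecting boundaries, and a user-supplied selection gradient. The linear gradient in the usage example is hypothetical and is not the gradient of the social game.

```python
import math
import random


def simulate_tss_diffusion(gradient, z0, N_total, mu, sigma2, dt, steps,
                           seed=0, lower=0.0, upper=1.0):
    """Euler-Maruyama path for dz = N_T*mu*sigma2*S(z) dt + sqrt(mu*sigma2) dW.

    `gradient` is the selection gradient S(z); sigma2 is assumed constant in z.
    The path is reflected at `lower` and `upper`.
    """
    rng = random.Random(seed)
    z = z0
    path = [z]
    for _ in range(steps):
        drift = N_total * mu * sigma2 * gradient(z)                   # a(z), Eq. (59)
        noise = math.sqrt(mu * sigma2 * dt) * rng.gauss(0.0, 1.0)     # b(z) = mu*sigma2, Eq. (60)
        z = z + drift * dt + noise
        # Reflect back into [lower, upper] until the point is inside.
        while z < lower or z > upper:
            if z < lower:
                z = 2.0 * lower - z
            if z > upper:
                z = 2.0 * upper - z
        path.append(z)
    return path


if __name__ == "__main__":
    # Hypothetical stabilizing gradient with a single zero at z = 0.5.
    path = simulate_tss_diffusion(lambda z: 0.5 - z, z0=0.2, N_total=100,
                                  mu=1e-3, sigma2=0.01, dt=1.0, steps=5000, seed=1)
    print(min(path), max(path), path[-1])
```

Because the noise term does not depend on $N_T$ while the drift does, increasing `N_total` makes the path track the deterministic canonical equation more closely.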

    4.2. Evolutionary success in the long-term TSS

    Just as in the short-term TSS analysis, we will define the evolutionary success for a trait $z$ in the long-term TSS as its stationary probability, or $\rho(z) = \lim_{t \to \infty} \rho(z, t)$. Using the long-term TSS diffusion in (61) and standard methods assuming reflecting boundary conditions (Karlin and Taylor, 1981; Ewens, 2004; Gardiner, 2009), we find that

    $$\rho(z) = \frac{1}{K\, \sigma^2(z)} \exp\left[ 2 N_T\, \Phi(z) \right] \tag{62}$$

    where

    $$\Phi(z) = \int^{z} S(y)\, dy$$

    is the potential function and $K$ is a normalizing constant that ensures $\rho(z)$ integrates to one over its support. The most successful traits will be those that reside at peaks of the stationary distribution, and the least successful will reside at troughs. Obtaining the peaks and troughs, when they do not reside at the boundaries of trait space, requires calculating the extrema of $\rho(z)$, which must satisfy

    $$S(z) = \frac{1}{2 N_T} \frac{d \log \sigma^2(z)}{dz} \tag{63}$$

    evaluated at a candidate extremum $z = z^*$. So long as either population size $N_T$ is very large or the mutational variance $\sigma^2(z)$ does not depend on the resident trait $z$, Eq. (63) becomes

    $$S(z) = 0. \tag{64}$$


    Consequently, in these two cases, the extrema of the stationary distribution are given by the zeros of the selection gradient $S(z)$, which are also the extrema of the fixation probability $\pi(\delta, z)$ with respect to $\delta$. For the remainder of this analysis, we will assume that $d\sigma^2(z)/dz = 0$ and the extrema of the stationary density are zeros of the selection gradient.
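The link between zeros of the selection gradient and peaks of the stationary density can be illustrated on a grid. The sketch below assumes a constant mutational variance (so that it is absorbed into the normalizing constant), a trait space of $[0, 1]$, and a user-supplied selection gradient; the function name and the example gradient are hypothetical.

```python
import math


def stationary_density(gradient, N_total, grid_size=1001, lower=0.0, upper=1.0):
    """Return (grid, density) with density proportional to exp[2 * N_T * Phi(z)].

    Phi is the integral of the selection gradient, computed here with a
    cumulative trapezoid rule. Constant mutational variance is assumed.
    """
    dz = (upper - lower) / (grid_size - 1)
    grid = [lower + i * dz for i in range(grid_size)]
    phi = [0.0]
    for i in range(1, grid_size):
        phi.append(phi[-1] + 0.5 * (gradient(grid[i - 1]) + gradient(grid[i])) * dz)
    shift = max(phi)  # subtract the maximum to avoid overflow in exp
    weights = [math.exp(2.0 * N_total * (p - shift)) for p in phi]
    norm = sum(0.5 * (weights[i - 1] + weights[i]) * dz for i in range(1, grid_size))
    return grid, [w / norm for w in weights]


if __name__ == "__main__":
    # Hypothetical gradient with a convergence-stable zero at z = 0.3.
    grid, density = stationary_density(lambda z: 0.1 * (0.3 - z), N_total=100)
    peak = max(range(len(grid)), key=density.__getitem__)
    print("peak at z =", grid[peak])
```

Reversing the sign of the example gradient turns the interior zero into a trough and moves the peaks to the boundaries of trait space.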

    The zeros of the selection gradient are the candidate evolutionary equilibria obtained using evolutionary game theory, adaptive dynamics, and inclusive fitness theory. A candidate evolutionary equilibrium $z^*$ is called convergence stable when $S(z)$ [...] 0 for the upper boundary. Thus, long-term evolutionary attractors given by convergence-stable equilibria obtained from the selection gradient are generally local maxima of the stationary density of the long-term TSS diffusion (Lehmann, 2012).

    In the simple TSS model above, the selection gradient also determines more generally which phenotypes are more probable than others at stationarity. This is a consequence of our assumption that both total population size $N_T$ and mutational variance $\sigma^2$ are not functions of the resident trait $z$. To see this, consider the relative likelihood of two traits, $z_A$ and $z_a$, at the stationary distribution: trait $z_A$ is more likely than trait $z_a$ when

    $$\frac{\rho(z_A)}{\rho(z_a)} = \exp\left[ 2 N_T \left( \Phi(z_A) - \Phi(z_a) \right) \right] > 1 \tag{65}$$

    or

    $$\Phi(z_A) > \Phi(z_a).$$

    Thus, though total population size and mutational variance may affect how divergent the likelihoods of two traits are at stationarity, the selection gradient alone (through its integral) completely determines which trait is more likely than the other.
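The comparison in (65) reduces to the sign of an integral of the selection gradient between the two traits. The sketch below evaluates that integral with composite Simpson's rule; the function names and the example gradient (a fitness valley at $z = 0.4$) are illustrative assumptions.

```python
def potential_difference(gradient, z_a, z_A, steps=1000):
    """Phi(z_A) - Phi(z_a): integral of S from z_a to z_A (composite Simpson's rule)."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = (z_A - z_a) / steps
    total = gradient(z_a) + gradient(z_A)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * gradient(z_a + i * h)
    return total * h / 3.0


def log_likelihood_ratio(gradient, z_a, z_A, N_total):
    """log[rho(z_A) / rho(z_a)] = 2 * N_T * (Phi(z_A) - Phi(z_a)), as in Eq. (65)."""
    return 2.0 * N_total * potential_difference(gradient, z_a, z_A)


if __name__ == "__main__":
    valley = lambda z: z - 0.4  # hypothetical gradient: valley at 0.4, peaks at 0 and 1
    for N_total in (10, 1000):
        print(N_total, log_likelihood_ratio(valley, 0.0, 1.0, N_total))
```

Population size changes the magnitude of the log ratio but never its sign, which is the point made by the inequality $\Phi(z_A) > \Phi(z_a)$.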

    4.3. Additive genetic interactions and weak versus strong payoffs in the long-term TSS

    Since the diffusion approach to the long-term TSS uses $\delta$-weak selection (see Eq. (58)), we know from Section 2.5 that this approach can only capture additive genetic interactions with respect to fitness. This could suggest that the diffusion equation for the long-term TSS only represents evolution under additive genetic interactions. Intriguingly, this is not true, as we will show in Section 5 where we apply the long-term TSS diffusion to the same social game analyzed using the short-term process in Section 3. Assuming that the payoffs to the social game are small (weak payoffs) and taking a large total population size limit ($N_T \to \infty$), we find that the diffusion model of the long-term TSS can reproduce all the results of the short-term TSS with three-way genetic interactions in a group structured population. This is likely due to the cancellation of three-way coalescence times in the short-term model.

    Moreover, we find for strong payoffs that the diffusion approach appears to reproduce some results generated from another evolutionary game theory analysis that does not assume weak selection. Specifically, Fudenberg et al. (2006) show that the condition for evolutionary success in an unstructured population is less favorable to cooperation than simple risk dominance would suggest, and the long-term TSS diffusion reproduces this effect.

    5. Application: social games in structured populations

    In this section, we apply the long-term TSS diffusion to the same social game as in Section 3 with a similar group-structured population. Our goal is to track the evolution of the continuous trait $z$ that measures the fraction of time that an individual cooperates with social partners living in its own group; the complementary fraction $1 - z$ is the fraction of time the individual does not cooperate. Using the payoffs from Table 3, the fertility of individual $i$ in group $g$ is $1 + f_{gi}$ where

    $$f_{gi} = f_{gi}\left( z_{gi}, z_{g \setminus i} \right) = B\, z_{g \setminus i} - C\, z_{gi} + D\, z_{gi}\, z_{g \setminus i}, \tag{66}$$

    which is analogous to the fertility function in Eq. (37). Following the weak effect mutation model for the short-term TSS, we assume that the phenotype of individual $i$ in group $g$ is $z_{gi} = z + \delta\, p_{gi}$, where $p_{gi}$ is the frequency of the mutant allele in that individual.

    5.1. Selection gradient

    Without explicitly specifying how fertility and survival translate into fitness, we can calculate the selection gradient by assuming a homogeneous group-structured population where dispersal is potentially local so that genetic identity or relatedness can build up. This is the same kind of demography we used in Section 2.7, where we reproduced Hamilton's rule from the selection gradient in Eq. (34). Briefly, this demography ensures that the derivatives of fitness with respect to the trait values of different individuals at neutrality take only three possible values: $\partial w_{gi}(z)/\partial z_{gi}$ for the effect of the focal individual's trait on its own fitness, $\partial w_{gi}(z)/\partial z_{g \setminus i}$ for the effect of the average group trait (excluding the focal) on the focal's fitness, and $\partial w_{gi}(z)/\partial z_{\setminus g}$ for the effect of the average trait in other groups (i.e., all groups except $g$) on the focal's fitness. The latter derivative, $\partial w_{gi}(z)/\partial z_{\setminus g}$, can be written in terms of the former two since the selection coefficients must sum to zero. The IBD probabilities also collapse to three categories: identity with self, which is one; identity with another individual in the group, or $Q_0$; and identity with an individual in another group, or $Q_1$. These facts together allow us to rewrite the selection gradient in Eq. (34) as

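    $$S(z) = \lim_{\mu \to 0} \frac{1 - Q_1}{1 - Q_0} \left[ \frac{\partial w_{gi}(z)}{\partial z_{gi}} + \frac{\partial w_{gi}(z)}{\partial z_{g \setminus i}} \frac{Q_0 - Q_1}{1 - Q_1} \right] \propto -c + b\, r \tag{67}$$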

    where we have used the fact that $1 - Q_0 = 2 N_T \mu + O\!\left(\mu^2\right)$ (eqs. 26 and 46 in Nagylaki, 1983, and eq. 3.68 in Rousset, 2004). The selection gradient in (67) is the same as that derived by Rousset (Rousset and Billiard, 2000; Rousset, 2004) for group structured populations. Using the definitions for terms in Hamilton's rule in Section 2.7, where $-c = \partial w_{gi}(z)/\partial z_{gi}$, $b = \partial w_{gi}(z)/\partial z_{g \setminus i}$, and $r = F_{ST} = (Q_0 - Q_1)/(1 - Q_1)$, it is clear that the selection gradient $S(z)$ is proportional to the inclusive fitness effect, which is a function of the phenotypic trait $z$ and all of the demographic effects that shape $b$, $c$, and $r$.

    Since fitness is a function of fertility, we can write the selection gradient $S(z)$ in Eq. (67) in terms of derivatives of fertility instead of fitness. This will allow us to express $S(z)$ in terms of the payoffs $B$, $C$, and $D$ from the social interaction. First, we assume a group-structured demography where fitness is determined by competition within groups for the $N$ available spots or patches in the next generation. Let $\phi$ be the probability that an offspring will compete for a patch in its natal deme and $1 - \phi$ be the probability that it competes in some other deme. The fitness of a focal individual $i$ in group $g$ is then

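    $$w_{gi} = \frac{\phi \left( 1 + f_{gi} \right)}{\phi \left( 1 + f_g \right) + (1 - \phi) \left( 1 + f_{\setminus g} \right)} + \frac{1}{n - 1} \sum_{k \neq g}^{n} \frac{(1 - \phi) \left( 1 + f_{gi} \right)}{\phi \left( 1 + f_k \right) + (1 - \phi) \left( 1 + f_{\setminus k} \right)} \tag{68}$$

    where $1 + f_{g \setminus i}$ is the average fertility in group $g$ excluding individual $i$ and $f_{\setminus g}$ is the average fertility among all groups excluding $g$. Ignoring terms $O\!\left(\delta^2\right)$ due to the assumption of $\delta$-weak selection allows the average fertilities in the denominator of (68) to be rewritten as the fertility of an individual with the average phenotype: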

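    $$\begin{aligned} w_{gi} = {} & \frac{\phi \left[ 1 + f_{gi}\left( z_{gi}, z_{g \setminus i} \right) \right]}{1 + \phi\, f_{gi}\left( z_g, z_g \right) + (1 - \phi)\, f_{gi}\left( z_{\setminus g}, z_{\setminus g} \right)} \\ & + \frac{1}{n - 1} \sum_{k \neq g}^{n} \frac{(1 - \phi) \left[ 1 + f_{gi}\left( z_{gi}, z_{g \setminus i} \right) \right]}{1 + \phi\, f_{gi}\left( z_k, z_k \right) + (1 - \phi)\, f_{gi}\left( z_{\setminus k}, z_{\setminus k} \right)} + O\!\left(\delta^2\right). \end{aligned} \tag{69}$$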

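    Using Eq. (69), we can write the fitness derivatives in the selection gradient in Eq. (67) as

    $$\frac{\partial w_{gi}(z)}{\partial z_{gi}} = \frac{1}{1 + f_{gi}(z)} \left[ \frac{\partial f_{gi}(z)}{\partial z_{gi}} - \frac{\gamma}{N} \left( \frac{\partial f_{gi}(z)}{\partial z_{gi}} + \frac{\partial f_{gi}(z)}{\partial z_{g \setminus i}} \right) \right] \tag{70}$$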

    and

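    $$\frac{\partial w_{gi}(z)}{\partial z_{g \setminus i}} = \frac{1}{1 + f_{gi}(z)} \left[ \frac{\partial f_{gi}(z)}{\partial z_{g \setminus i}} - \gamma\, \frac{N - 1}{N} \left( \frac{\partial f_{gi}(z)}{\partial z_{gi}} + \frac{\partial f_{gi}(z)}{\partial z_{g \setminus i}} \right) \right] \tag{71}$$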

    where

    $$\gamma = \phi^2 + \frac{(1 - \phi)^2}{n - 1} \tag{72}$$

    is the probability that two offspring born in a group will compete in the same group after dispersal and is a measure of the strength of local competition (also called the scale of competition (Frank, 1998, p. 114)). Using the fitness derivatives in (70) and (71) in Eq. (67) allows us to rewrite the selection gradient as

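    $$S(z) = \frac{k}{1 + f_{gi}(z)} \left( \frac{\partial f_{gi}(z)}{\partial z_{gi}} + \kappa\, \frac{\partial f_{gi}(z)}{\partial z_{g \setminus i}} \right) \tag{73}$$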

    where

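    $$\kappa = \frac{r - \gamma \left( \frac{1}{N} + r\, \frac{N - 1}{N} \right)}{1 - \gamma \left( \frac{1}{N} + r\, \frac{N - 1}{N} \right)} \tag{74}$$

    is the scaled relatedness coefficient and

    $$k = \lim_{\mu \to 0} \frac{1 - Q_1}{1 - Q_0} \left[ 1 - \gamma \left( \frac{1}{N} + r\, \frac{N - 1}{N} \right) \right] \tag{75}$$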

    is a positive coefficient that scales the magnitude of the selection gradient and depends on demographic and competitive effects. We assume that both $\kappa$ and $k$ do not depend on the phenotype $z$, which is true if the demographic variables, such as survival, migration, and population size, do not depend on $z$. Additionally, fertility must also be generally large or Poisson distributed in order to neglect the effect of the trait on demographic stochasticity (Lehmann and Balloux, 2007).
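To make the fertility form of the selection gradient concrete, it can be evaluated for the social game directly from the fertility function in (66). The sketch below assumes that the derivatives are taken at the monomorphic state $z_{gi} = z_{g \setminus i} = z$, giving $\partial f/\partial z_{gi} = -C + Dz$ and $\partial f/\partial z_{g \setminus i} = B + Dz$, and it treats the scaled relatedness and the positive coefficient $k$ as given constants. Function names and payoff values are illustrative assumptions.

```python
def fertility(z_focal, z_group, B, C, D):
    """Fertility effect of Eq. (66): B*z_group - C*z_focal + D*z_focal*z_group."""
    return B * z_group - C * z_focal + D * z_focal * z_group


def selection_gradient(z, B, C, D, kappa, k=1.0):
    """Selection gradient in terms of fertility effects and scaled relatedness."""
    own_effect = -C + D * z       # derivative with respect to the focal's trait
    partner_effect = B + D * z    # derivative with respect to the group's trait
    return k / (1.0 + fertility(z, z, B, C, D)) * (own_effect + kappa * partner_effect)


def interior_zero(B, C, D, kappa):
    """Zero of the gradient, z = (C - kappa*B) / (D*(1 + kappa)), if it lies in [0, 1]."""
    denominator = D * (1.0 + kappa)
    if denominator == 0:
        return None
    z = (C - kappa * B) / denominator
    return z if 0.0 <= z <= 1.0 else None


if __name__ == "__main__":
    B, C, D = 2.0, 1.0, 3.0  # hypothetical payoffs
    print(interior_zero(B, C, D, kappa=0.0))    # reduces to C/D without relatedness
    print(interior_zero(B, C, D, kappa=0.25))
```

With zero scaled relatedness the interior zero is $C/D$, the mixed strategy equilibrium of the coordination game, and positive scaled relatedness shifts it toward zero, enlarging the basin of the cooperation peak.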

    The first term in the parentheses of (73) is the effect of the trait in the focal individual on its own fertility. The second term is the effect of the trait in the group (excluding the focal) on the fertility of the focal weighted by the scaled relatedness coefficient $\kappa$. The scaled relatedness in Eq. (74) is classic relatedness ($r$) reduced by the effect of local competition weighted by the average relatedness of the individuals affected by the local competition ($\gamma (1 + r(N - 1))/N$) and normalized. Thus, scaled relatedness accounts for both relative genetic identity due to genetic relatedness and competitive effects due to demography and finite population size (Lehmann and Rousset, 2010; Van Cleve and Lehmann, 2013; Van Cleve and Akçay, 2014). In general, the scaled relatedness can take a value between $-1$ and 1 depending on the demography (Lehmann and Rousset, 2010). For example, the hard selection demography in Eq. (38) has $\phi = 1 - m$ and from Eq. (74) yields $\kappa = -1/(N_T - 1)$ (Van Cleve and Lehmann, 2013, eq. B.1), soft selection in Eq. (48) has $\phi = 1$ and yields $\kappa = -1/(N - 1)$ (Lehmann and Rousset, 2010, eq. A-8), and group competition from Eq. (51) has $\phi = 1/n$ and yields $\kappa = r$ when $n \to \infty$.
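The soft selection and group competition examples can be reproduced from the definitions of the scale of competition in (72) and scaled relatedness in (74). The sketch below uses `phi` for the probability that an offspring competes in its natal deme and `gamma` for the scale of competition; these names are labels chosen for this illustration. Relatedness `r` is supplied directly and must be less than one, since the demography-specific expression for $r$ is not reproduced here.

```python
def scale_of_competition(phi: float, n: int) -> float:
    """Probability that two offspring born in the same group compete in the same group."""
    return phi ** 2 + (1.0 - phi) ** 2 / (n - 1)


def scaled_relatedness(r: float, gamma: float, N: int) -> float:
    """Relatedness reduced by local competition and normalized, as in Eq. (74)."""
    local_competition = gamma * (1.0 + r * (N - 1)) / N
    if local_competition == 1.0:
        raise ValueError("scaled relatedness is undefined when r = 1 and gamma = 1")
    return (r - local_competition) / (1.0 - local_competition)


if __name__ == "__main__":
    N, n = 10, 50
    # Soft selection: all competition is local (phi = 1).
    for r in (0.0, 0.3, 0.9):
        print("soft", r, scaled_relatedness(r, scale_of_competition(1.0, n), N))
    # Group competition: offspring compete in a random group (phi = 1/n).
    for n_groups in (10, 1000, 10 ** 6):
        gamma = scale_of_competition(1.0 / n_groups, n_groups)
        print("group", n_groups, scaled_relatedness(0.25, gamma, N))
```

Under soft selection the result is $-1/(N - 1)$ whatever the value of $r$, while under group competition the scale of competition equals $1/n$ and scaled relatedness approaches $r$ as the number of groups grows.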

    Compared to the fitness effects and genetic identity formulation of the selection gradient in (67), the selection gradient in (73) partitions terms into fertility effects and scaled relatedness. The former partition is that used to define $b$, $c$, and $r$ in Hamilton's rule, which leads to the definitions for different social behaviors based on fitness effects in Table 2. In the latter partition, all the effects of demography and local competition are encompassed by scaled relatedness and the coefficient $k$, and the effects of the phenotypic trait on the social interaction are isolated in the fertility effects. In so far as we are interested in understanding how the immediate payoffs from social interactions and demography independently contribute to selection on social behavior, the latter partit