

Copyright © 2008 by the Genetics Society of America
DOI: 10.1534/genetics.107.083006

Statistical Power Analysis of Neutrality Tests Under Demographic Expansions, Contractions and Bottlenecks With Recombination

Anna Ramírez-Soriano,* Sebastià E. Ramos-Onsins,† Julio Rozas,† Francesc Calafell*,‡,§,1 and Arcadi Navarro*,‡,§,**,1,2

*Departament de Ciències de la Salut i de la Vida, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain, †Departament de Genètica, Universitat de Barcelona, 08028 Barcelona, Catalonia, Spain, **Institució Catalana de Recerca i Estudis Avançats and Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain, §Population Genomics Node (GNV8), National Institute for Bioinformatics, Spain and ‡CIBER en Epidemiologia y Salud Pública (CIBEREsp), Spain

Manuscript received October 5, 2007
Accepted for publication February 26, 2008

ABSTRACT

Several tests have been proposed to detect departures of nucleotide variability patterns from neutral expectations. However, very different kinds of evolutionary processes, such as selective events or demographic changes, can produce similar deviations from these tests, thus making interpretation difficult when a significant departure from neutrality is detected. Here we study the effects of demography and recombination upon neutrality tests by analyzing their power under sudden population expansions, sudden contractions, and bottlenecks. We evaluate tests based on the frequency spectrum of mutations and the distribution of haplotypes and explore the consequences of using incorrect estimates of the rates of recombination when testing for neutrality. We show that tests that rely on haplotype frequencies (especially Fs and ZnS, which are based, respectively, on the number of different haplotypes and on the r² values between all pairs of polymorphic sites) are the most powerful for detecting expansions on nonrecombining genomic regions. Nevertheless, they are strongly affected by misestimations of recombination, so they should not be used when recombination levels are unknown. Instead, class I tests, particularly Tajima’s D or R2, are recommended.

An increasing number of statistical tests (Tajima 1989a; Fu and Li 1993; Fu 1997; Fay and Wu 2000; Ramos-Onsins and Rozas 2002) have been developed to detect departures of DNA sequence variability from the expectations of the neutral theory of evolution (Kimura 1968). Most of the research in this area is based upon the Wright–Fisher model (Fisher 1930; Wright 1931; Hein et al. 2005), which assumes populations of constant size that are panmictic and nonrecombining. Moreover, the Wright–Fisher model provided the foundation of coalescent theory (Kingman 1982a,b, 2000; Hudson 1990; Donnelly and Tavaré 1995; Fu and Li 1999), which was fundamental for developing neutrality tests and furthering their study. Even though these models are quite amenable to mathematical treatment, analytic derivations are often unattainable, and the significance of departures from neutrality and the statistical power of the tests are estimated by computer simulations based on the coalescent process (Wall 1999; Ramos-Onsins and Rozas 2002; Depaulis et al. 2003).

The detection of departures from the null hypothesis of neutrality points to the violation of one or more of its assumptions. These deviations can be due to selective and/or demographic events. For example, selective sweeps or population growth can produce longer external branches in the genealogy that result in an excess of recent mutations over neutral predictions. In contrast, population subdivision or balancing selection will result in longer internal branches and, consequently, in an excess of old over recent mutations. In summary, different kinds of processes can produce similar genealogies and therefore confound the interpretation of tests.

Much effort has been devoted to ascertaining the power of different statistical tests to reject the null hypothesis of neutrality when it is actually false, as well as to defining properly which tests perform best in each scenario. Ramos-Onsins and Rozas (2002) studied the power of 17 statistical tests under sudden or logistic population-expansion models. The tests were classified in three categories on the basis of the information they used. Class I tests are based on the frequency spectrum of mutations, class II on the haplotype distribution, and class III on the distribution of pairwise differences. Depaulis et al. (2003) studied the power of seven statistics under bottlenecks (both severe and moderate) and hitchhiking with positively selected mutations.

1 These authors contributed equally to this work.
2 Corresponding author: ICREA (Institució Catalana de Recerca i Estudis Avançats), Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Doctor Aiguader 88, 08003 Barcelona, Spain. E-mail: [email protected]

Genetics 179: 555–567 (May 2008)


More recently, the power of several tests has been studied under exponential population growth and bottlenecks (Sano and Tachida 2005) and population structure and hitchhiking (Jensen et al. 2005). However, the effect of intragenic recombination has been considered in only a limited number of studies, and, in particular, the joint effect of recombination and population expansions on the statistical power of neutrality tests has not, to the best of our knowledge, been explored.

The neutral model with no recombination, which is commonly used as the null hypothesis, has larger variance in genealogy length than the same model including recombination (Hein et al. 2005). Such larger variance makes the assumption of no recombination a conservative assumption for many statistical tests. In particular, tests based on the frequency spectrum of mutations are likely to be conservative on recombining regions (Tajima 1989a; Fu and Li 1993; Fu 1996). On the other hand, tests based on haplotype or linkage disequilibrium (LD) are expected to be strongly affected by recombination, since it will break down existing haplotypes and generate new ones, thus decreasing LD. Moreover, as recombination can also smooth the mismatch distribution, it is likely that statistical tests based on this distribution will have little power (Ramos-Onsins and Rozas 2002). Finally, recombination can mimic the effect of some demographic models, such as population growth. It is therefore of general interest to distinguish the individual effects of recombination and population growth on DNA sequence variation and on neutrality tests (Schierup and Hein 2000).

The study of population expansions is also of great interest since their effects on genealogies (and, thus, on many neutrality statistics) are similar to those of other selective or demographic events. Among the former, selective sweeps caused by positively selected variants, as well as background selection against deleterious mutations, lead to an excess of low-frequency variants (Charlesworth et al. 1993; Przeworski 2002). On the other hand, recent bottlenecks can also mimic the effects of an expansion (Tajima 1989a,b), so these phenomena can be quite difficult to disentangle. In spite of such difficulty, considerable progress is being made to distinguish between expansions and selective sweeps (Jensen et al. 2005; Williamson et al. 2005) or between bottlenecks and positive selection (Haddrill et al. 2005).

Here we use coalescent simulations to test the power of 16 statistical tests to detect population expansions, contractions, and bottlenecks under different recombination levels. The selected tests belong to the first two categories described by Ramos-Onsins and Rozas (2002). We pay special attention to the problem of misestimation of recombination rates and study how the use of incorrect recombination rates when simulating neutral samples can affect the power and the false-positive rates of tests. We have found that statistics based on haplotype diversity are the most powerful tests for detecting population expansions on nonrecombining regions. In contrast, since they are very sensitive to recombination, their use should be avoided when there is recombination.

MATERIALS AND METHODS

Statistics: We have considered two classes of statistics: statistics based on the frequency spectrum of mutation (class I) and statistics based on linkage disequilibrium and haplotype distribution (class II). No statistics based on the distribution of pairwise differences (e.g., the mismatch distribution) have been used, as they were shown to perform very poorly in the study by Ramos-Onsins and Rozas (2002). A summary of all statistics can be found in Table 1.

Class I statistics: Class I statistics use information on the frequency of mutations and are based on the differences between estimators of the population mutation rate θ = 4Nμ, where N is the effective population size and μ is the mutation rate. From this class, we present results for Tajima’s D (Tajima 1989a), Fu and Li’s D, F, D*, and F* (Fu and Li 1993), and Fay and Wu’s H (Fay and Wu 2000). We have also included the R2 statistic (Ramos-Onsins and Rozas 2002), which is based on the difference between the number of singletons per sequence and the average number of nucleotide differences.
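As an illustration of how a class I statistic contrasts two estimators of θ, the following sketch computes Tajima’s D from a sample of 0/1-coded haplotypes using the standard formulas of Tajima (1989a). It is not the code used in this study; the function name and the input layout (one list of 0/1 alleles per sequence, with 0 as the ancestral state) are assumptions made for the example.

```python
from math import sqrt


def tajimas_d(haplotypes):
    """Return (theta_W, pi, D) for a list of equal-length 0/1 sequences.

    Hypothetical helper for illustration; requires at least 4 sequences
    and at least one segregating site.
    """
    n = len(haplotypes)
    if n < 4:
        raise ValueError("the variance of D is zero for fewer than 4 sequences")

    # Derived-allele count at every segregating site.
    counts = [sum(col) for col in zip(*haplotypes)]
    counts = [k for k in counts if 0 < k < n]
    s = len(counts)
    if s == 0:
        raise ValueError("no segregating sites")

    # Watterson's estimator: S divided by the harmonic-like constant a1.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    theta_w = s / a1

    # Mean number of pairwise differences: each site with k derived copies
    # separates k * (n - k) of the n * (n - 1) / 2 sequence pairs.
    pi = sum(k * (n - k) for k in counts) / (n * (n - 1) / 2.0)

    # Variance constants from Tajima (1989a).
    b1 = (n + 1.0) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2.0) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    d = (pi - theta_w) / sqrt(e1 * s + e2 * s * (s - 1))
    return theta_w, pi, d
```

A sample in which every mutation is a singleton yields a negative D (an excess of low-frequency variants, as expected after an expansion), whereas a sample in which every mutation is at intermediate frequency yields a positive D.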

Class II statistics: Class II includes statistics based on the haplotype distribution. They are expected to be the most affected by recombination. Within this class, we have studied the following statistics: Fu’s Fs (Fu 1997), the unbiased haplotype diversity estimate Dh (Nei 1987, Equation 8.5), Wall’s B and Q (Wall 1999), Kelly’s ZnS (Kelly 1997), Rozas’ ZA and ZZ (Rozas et al. 2001), and two statistics based on the extended haplotype homozygosity (EHH) (Sabeti et al. 2002).
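Two of the simpler class II statistics can be sketched directly from their definitions: the unbiased haplotype diversity Dh and Kelly’s ZnS, the average r² over all pairs of segregating sites. The code below is a minimal illustration and not the implementation used here; the function names and the 0/1 input coding are assumptions.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    counts = Counter(tuple(h) for h in haplotypes)
    homozygosity = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - homozygosity)


def kelly_zns(haplotypes):
    """Average squared allelic correlation (r^2) over all pairs of segregating sites."""
    n = len(haplotypes)
    sites = [col for col in zip(*haplotypes) if 0 < sum(col) < n]
    if len(sites) < 2:
        raise ValueError("at least two segregating sites are needed")

    r2_values = []
    for a, b in combinations(sites, 2):
        p_a = sum(a) / n
        p_b = sum(b) / n
        p_ab = sum(x * y for x, y in zip(a, b)) / n  # frequency of the 1-1 haplotype
        d = p_ab - p_a * p_b                         # linkage disequilibrium coefficient
        r2_values.append(d * d / (p_a * (1 - p_a) * p_b * (1 - p_b)))
    return sum(r2_values) / len(r2_values)
```

Recombination creates new haplotypes and erodes the correlation between sites, so it raises Dh and lowers ZnS even without any demographic change, which is why these statistics are expected to be sensitive to the assumed recombination rate.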

EHH statistics are a complex family of heuristic methods for which no consensus summary statistic has yet been developed. We have computed two EHH-based statistics by taking the first three SNPs of each sequence as a core haplotype (that is, as the locus of interest) and then considering the distance from each core at which EHH decays to ≤0.5. Two values are given: (1) the EHH average, corresponding to the weighted average for all core haplotypes of the distance at which EHH decays to ≤0.5 and (2) the EHH maximum, the distance corresponding to the core haplotype whose EHH decays to ≤0.5 at the greatest distance. If a simulated segment finishes without EHH reaching a value ≤0.5, taking the chromosome length as L, we arbitrarily consider that the position will be at 2L.

Coalescent simulations: We tested the statistical power of the statistics under different demographic models by running neutral coalescent simulations using the algorithm described by Hudson (1990) and implemented in the ms package (Hudson 2002). This program generates coalescent trees for a given sample size, recombination rate, and a demographic scenario, implementing an infinite-sites mutation model that leads to biallelic sites.

There is an intense debate on how to best perform simulations and, specifically, on the suitability of running coalescent simulations by fixing either the number of segregating sites (S) or the population mutation rate θ = 4Nμ (Hudson 1993; Wall and Hudson 2001; Depaulis et al. 2001, 2005). Conditioning on θ has the disadvantage that its value has to be estimated. Furthermore, even if its true value could be known, it produces broader confidence intervals, thus reducing the power of tests (Depaulis et al. 2005). On the other hand, although forcing a given number of segregating sites in all trees without considering their particularities (such as branch length) is also unrealistic, conditioning on the number of segregating sites has the advantage that S is a


parameter that can be observed in the sample. To solve this problem, several strategies have been proposed to obtain realistic samples conditioning on both θ and S (Hudson 1993; Depaulis et al. 2001, 2003, 2005; Wall and Hudson 2001; Przeworski 2002). Different authors agree that simulated parameters are more accurate if the simulations are conditioned on S taking into account the uncertainty of θ (Tavaré et al. 1997; Pritchard et al. 1999).

However, neutral models obtained by fixing the number of segregating sites (which can be directly obtained from the sample) or by estimating θ from S are still widely used by researchers (Macdonald and Long 2005; Soejima et al. 2005; Stajich and Hahn 2005; Tarazona-Santos and Tishkoff 2005). Taking into account this popularity, we have conditioned our simulations on S and the θ estimator θW after proper validation of this approach (see below). For simulations conditioned on the number of segregating sites, S values were set to 10, 100, and 400. This corresponds to the rounded minimum, average, and maximum segregating sites found in the genes resequenced by SeattleSNPs (http://pga.gs.washington.edu/; Crawford et al. 2005), the largest ongoing human resequencing project, which currently contains sequences of a length of 3.5–71 kb for >300 genes obtained from 23 European-American and 24 African-American individuals. For simulations conditioned on θW (Watterson 1975), θW values correspond to the S values used in the previous simulations. All the values have been fixed assuming a panmictic and stationary neutral population, which could cause incorrect power estimations for statistics, depending on the number of segregating sites.
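The correspondence between S and θW follows from Watterson’s (1975) estimator, θW = S / a_n with a_n = 1 + 1/2 + ... + 1/(n − 1). The short sketch below (function name hypothetical) shows the conversion. Assuming a sample size of n = 100, it gives values matching those listed in the legend of Figure 1 for S = 10, 100, and 400.

```python
def watterson_theta(seg_sites, sample_size):
    """Watterson's estimator of theta for the whole locus: S / sum_{i=1}^{n-1} 1/i."""
    a_n = sum(1.0 / i for i in range(1, sample_size))
    return seg_sites / a_n


# Sample size n = 100 is an assumption taken from the figures of this study.
for s in (10, 100, 400):
    print(s, round(watterson_theta(s, 100), 2))
```

Note that this is θ for the whole locus, not per nucleotide; dividing by the sequence length would give the per-site value.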

To ascertain the validity of our approach, results for simulations fixing S in expansions have been compared with results obtained considering the uncertainty of θ and using the rejection algorithm (Tavaré et al. 1997). Comparisons have been performed for all S values studied and for the minimum (0) and maximum (10⁻⁷) recombination values. Comparison shows that the differences in estimates of nominal rejection level between the two methods are very small. In fact, in 96% of the cases they are <5%, and in no case do they reach values >15%. Moreover, these differences become even smaller with increasing recombination rates (results not shown). In summary, we use a methodology that is accurate for neutral simulations (Depaulis et al. 2001; Wall and Hudson 2001; Ramos-Onsins et al. 2007) and for all our expansion models (Ramos-Onsins et al. 2007; materials and methods). However, our approximate method can produce deviations when computing statistical power under other alternative models, such as the contraction and bottleneck models (Ramos-Onsins et al. 2007). The magnitude of these deviations depends on the

TABLE 1

Definition of the neutrality statistics used

Test | Definition | Reference

Class I
Tajima’s D | Comparison of estimates of the no. of segregating sites and the mean pairwise difference between sequences | Tajima (1989a)
Fu and Li’s D and D* | Comparison of the number of derived singleton mutations and the total number of derived nucleotide variants (the asterisk indicates “without an outgroup”) | Fu and Li (1993)
Fu and Li’s F and F* | Comparison of the number of derived singleton mutations and the mean pairwise difference between sequences (the asterisk indicates “without an outgroup”) | Fu and Li (1993)
Fay and Wu’s H | Comparison of the number of derived segregating sites at low and high frequencies and the number of variants at intermediate frequencies | Fay and Wu (2000)
R2 | Comparison of the difference between the number of singleton mutations and the average number of nucleotide differences | Ramos-Onsins and Rozas (2002)

Class II
Fu’s Fs | Based on Ewens’ sampling distribution, taking into account the number of different haplotypes in the sample | Fu (1997)
Dh | Based on the number of different haplotypes in the sample | Nei (1987)
EHH average | Weighted average for all core haplotypes of the position at which the haplotype homozygosity decays to ≤0.5 | Based on Sabeti et al. (2002)
EHH maximum | The position corresponding to the core haplotype that decays to ≤0.5 at a greater distance | Based on Sabeti et al. (2002)
Wall’s B | Counts the number of pairs of adjacent segregating sites that are congruent (if the subset of the data consisting of the two sites contains only two different haplotypes) | Wall (1999)
Wall’s Q | Adds the number of partitions (two disjoint subsets whose union is the set of individuals in the sample) induced by congruent pairs to Wall’s B | Wall (2000)
Kelly’s ZnS | Average of the squared correlation of the allelic identity between two loci over all pairwise comparisons | Kelly (1997)
Rozas’ ZA | Average of the squared correlation of the allelic identity between two loci over adjacent pairwise comparisons | Rozas et al. (2001)
Rozas’ ZZ | Comparison between ZnS and ZA | Rozas et al. (2001)


particular statistic and the parameters of the model. Deviation from the exact strategy has been evaluated partially, and our results indicate that, in the studied cases, the deviation in the statistical power is not large.

Recombination: Recombination rates were set to r = 10⁻¹⁰, r = 10⁻⁸, and r = 10⁻⁷ per nucleotide pair; as simulations are scaled in units of 4N generations, assuming Ne = 10,000 for humans (Takahata et al. 1995), this would correspond to population recombination rates of R = 4Nr equal to 4 × 10⁻⁶, 4 × 10⁻⁴, and 4 × 10⁻³ per nucleotide, respectively. These values correspond to the rounded minimum, average, and maximum values estimated by Kong et al. (2002) for the human genome. Simulations without recombination were also performed. To calculate the crossover probability, we have assumed sequence lengths of 3000, 21,000, and 72,000 bp, as these lengths correspond to the minimum, average, and maximum rounded lengths of the resequenced human genes in SeattleSNPs, and Ne = 10,000.
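The scaling from per-generation to population recombination rates can be checked with a few lines of arithmetic. The sketch below uses hypothetical function names; the per-locus value assumes that crossovers can occur in each of the L − 1 intervals between adjacent nucleotides, which is an assumption of this example rather than a detail stated in the text.

```python
def population_recombination_rate(r_per_site, n_e=10_000):
    """R = 4 * N * r per nucleotide pair, with N = 10,000 as assumed for humans."""
    return 4 * n_e * r_per_site


def locus_recombination_rate(r_per_site, length_bp, n_e=10_000):
    """Population recombination rate for a whole locus of length_bp nucleotides.

    Assumes L - 1 intervals between adjacent sites (illustrative choice).
    """
    return population_recombination_rate(r_per_site, n_e) * (length_bp - 1)


for r in (1e-10, 1e-8, 1e-7):
    per_site = population_recombination_rate(r)
    per_locus = [locus_recombination_rate(r, length) for length in (3000, 21000, 72000)]
    print(r, per_site, per_locus)
```

With r = 10⁻⁸ and a 21,000-bp region, for example, the per-locus value is about 8.4, so even the average case involves several expected recombination events in the history of a sample.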

Demography: We have simulated four basic demographic models: stationarity, sudden population growth, sudden population contraction, and population bottleneck, with emphasis on the first two cases. All scenarios consider different recombination rates. For each scenario, we ran 10,000 simulations. The significance level was set to 0.05, and the critical value for each statistic was obtained from the empirical distribution of the corresponding neutral model. As mutations were simulated under an infinite-sites model (which implies no recurrent mutation), the outgroup (for those statistics that require it) was set to a string of 0’s, where 0 is the ancestral state as coded by ms.
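The power calculation described here amounts to reading a critical value off the simulated neutral distribution and counting how often the alternative model falls beyond it. The following is a minimal one-tailed sketch under that reading; the function names, the handling of ties, and the exact quantile convention are assumptions of this example.

```python
def critical_value(null_stats, alpha=0.05, tail="left"):
    """Empirical critical value such that about alpha of null replicates lie beyond it."""
    ordered = sorted(null_stats)
    k = int(round(alpha * len(ordered)))  # null replicates allowed in the rejection region
    if tail == "left":
        return ordered[k]
    return ordered[len(ordered) - k - 1]


def empirical_power(null_stats, alt_stats, alpha=0.05, tail="left"):
    """Fraction of alternative-model replicates that fall in the rejection region."""
    crit = critical_value(null_stats, alpha, tail)
    if tail == "left":
        rejected = sum(1 for x in alt_stats if x < crit)
    else:
        rejected = sum(1 for x in alt_stats if x > crit)
    return rejected / len(alt_stats)
```

Applying `empirical_power` with the neutral replicates as both arguments returns roughly the nominal level (0.05), which is a useful sanity check. Feeding it a null distribution simulated with the wrong recombination rate is one way to see how misestimation changes the false-positive rate.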

The sudden population growth model (Rogers and Harpending 1992) assumes that a population of size N0 in equilibrium experienced a sudden growth and reached maximum size (Nmax) Te generations before present. In the simulations, time is scaled in units of 4Ne generations. Changes in population size are performed according to the standard procedures in the ms program. Expansion times have been set to range from 0.0075 to 0.2. In humans, taking again Ne = 10,000 and a generation time of 20 years, the simulated expansion times would range from 6000 to 160,000 years ago. The latter can be taken as the earliest estimate for a human expansion (Jobling et al. 2004). Since some statistics showed significant power at that earlier expansion time, and to produce results applying to other species, some additional expansion times were added at Te = 0.4, Te = 0.6, Te = 0.8, and Te = 1 (equivalent, respectively, to 16,000, 24,000, 32,000, and 40,000 generations and, in human terms, 320,000, 480,000, 640,000, and 800,000 years ago). The degree of expansion (De = Nmax/N0) has been set to De = 10 and De = 100.
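The time conversions above, and one plausible way of expressing the sudden expansion on the ms command line, can be sketched as follows. This is an illustration, not the command lines used in the study. It relies on the documented ms options `-s` (fixed number of segregating sites), `-r` (recombination) and `-eN t x` (set the population size to x times the present-day size at time t in the past). Because ms takes the present-day size as its reference, an expansion of degree De is written here as a past size of 1/De; treating the scaled time Te as relative to the present-day size is an assumption of this sketch.

```python
def scaled_time_to_generations(t_scaled, n_e=10_000):
    """Convert time in units of 4*Ne generations into generations."""
    return 4 * n_e * t_scaled


def generations_to_years(generations, generation_time=20):
    """Convert generations into years (20 years per generation, as in the text)."""
    return generations * generation_time


def ms_sudden_expansion_args(n, reps, seg_sites, t_e, d_e, rho=0.0, nsites=0):
    """Build a hypothetical ms argument list for a sudden expansion.

    rho is the population recombination rate for the whole locus and nsites
    the number of sites between which recombination can occur.
    """
    args = ["ms", str(n), str(reps), "-s", str(seg_sites)]
    if rho > 0:
        args += ["-r", str(rho), str(nsites)]
    # Backward in time, the population shrinks to 1/De of its present size at t_e.
    args += ["-eN", str(t_e), str(1.0 / d_e)]
    return args


print(generations_to_years(scaled_time_to_generations(0.0075)))  # about 6000 years
print(" ".join(ms_sudden_expansion_args(100, 10000, 100, 0.05, 10)))
```

The returned list can be passed to `subprocess.run` if ms is installed; the sketch only builds the arguments and does not call the program.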

The sudden contraction model considers a population of constant size N0, which has instantaneously contracted to a size Nmin; the contraction degree is defined as Dc = Nmin/N0. This model has been simulated as the opposite of our expansion model and thus the same Ne and Te have been used. Contraction degree has been set to Dc = 0.1 and Dc = 0.01. We have performed simulations for S = 10 and S = 100.

The bottleneck model has been simulated as in Voight et al. (2005). It assumes a population of size NA, which has been suddenly reduced to a second size, b × NA for Tdur generations Tstart generations ago, where b is the bottleneck severity. Immediately afterward, the population has instantaneously recovered its original size, NA. As in the expansion model, NA has been set to 10,000. According to the results by Voight et al. (2005), we have simulated five bottleneck severities (b): 0.4, 0.1, 0.05, 0.01, and 0.005, and for each we have simulated durations of 0.01, 0.02, 0.03, and 0.04 (400, 800, 1200, and 1600 generations). Tstart has been set to 0.02, 0.04, 0.08, and 0.12 (800, 1600, 3200, and 4800 generations, respectively). Additional points have been simulated to better illustrate this model in figures.

RESULTS

Exploring parameter space: We have tested the power of 16 statistics under the different conditions of sample size (n), number of segregating sites (S), neutral mutation parameter (θ), demography, and recombination rates. Although a wide range of parameters have been studied, only the most interesting cases are shown. Full results are provided as supplemental data A–D. Supplemental data A and B show all results with correct recombination rates in the neutral model for the expansion model (A) and the contraction and bottleneck (B). Supplemental data C and D show results for mispecified recombination rates in the neutral model for expansions (C) and contractions and bottlenecks (D). As expected, larger sample sizes improve the power of the statistics, so we have plotted all power curves with n = 100; curves for other sample sizes can be found in the supplemental data files. Also, we have fixed the degree of expansion, De, to 10 in all expansion figures, as De = 100 produced uniformly maximal power for all statistics. For figures involving contraction, the degree of contraction, Dc, has been fixed to 0.1, and bottlenecks are shown with a severity of b = 0.01. Other parameter values fixed in some figures have also been chosen to avoid saturation.

The power of statistics has been shown for the tail for which each statistic shows power under its corresponding model. Since a population-growth event will cause an excess of derived low-frequency nucleotide variants and of the number of haplotypes and a reduction of linkage disequilibrium values, this tail corresponds in expansions to the left one for all statistics except for Dh (in the simulations fixing S) and Fay and Wu’s H. The same happens for most bottlenecks, but not for population contractions or recently finished bottlenecks, where tests show power at the right tail of the distribution (except Dh and H). Only some of the statistics have been represented to make graphs clearer. Of all Fu and Li’s statistics, only F* is shown, as it is the one with greatest power (although for simulations based on S, it has the same power as F). However, considering that in the contraction model D and D* are more powerful, D has been chosen for these simulations. Of Wall’s B and Q, we represent only the statistic that shows more power under each circumstance, as with the EHH average and EHH maximum. We also leave ZZ out, as it is simply the difference between ZnS and ZA and therefore can be easily inferred from them. In expansions, only Te from 0.0075 to 0.2 and Te = 1 has been represented, as in general at Te = 0.4 tests have reached a power of ~0.5.

Statistical power for different demographic events: Figure 1 shows the statistical power of the tests under study in the absence of recombination and for different


times since the expansion. Expansion ages range from Te = 0.0075 (300 generations) to Te = 1 (40,000 generations). Our results reproduce those from Ramos-Onsins and Rozas (2002). For all statistics, power increases with the number of segregating sites (Figure 1, A–C), especially in class II statistics, which are based on haplotype structure. In fact, as shown in Ramos-Onsins and Rozas (2002), Fu’s Fs is the best-performing statistic for large sample sizes; however, ZnS and ZA statistics become more powerful when considering 100 or more segregating sites, especially to detect expansions older than Te = 0.05 (2000 human generations). This can be explained by a saturation effect, such as the saturation of the number of haplotypes due to a high number of segregating sites with respect to n (Nei 1987). Figure 1, D–F, shows the effect of increasing θ upon statistical power. Simulations based on θ show the same patterns as when conditioning on S; the main differences reside in the behavior of Dh, which does not show power at the right tail of the distribution (as we would expect; see Table 2), but rather at the left tail. Moreover, the pattern of Dh’s power is also particular, as it has its maximum at very recent times and a decay that becomes more pronounced as θ increases. For very low values of θ, most genealogical trees have zero or only one mutation, which greatly affects the power of any haplotype-based statistic. In both kinds of simulations (fixing either S or θ), the lowest values (S = 10 and θ = 1.93) show power curves that are different from those obtained for a higher polymorphism level, the reason for this behavior being that low polymorphism leads to imprecise statistics.

As for the effect of the elapsed time since the expansion, in both normal and high-polymorphism scenarios most well-performing statistics (statistics with powers that are normally >0.4, which excludes EHH estimators and Fay and Wu’s H) reach their maximum power between Te = 0.04 and Te = 0.05. Exceptions to this rule are Fu and Li’s tests (D, F, D*, and F*; only F* is shown in Figure 1), which have maximum power at shorter times (Te = 0.01–0.02), and ZnS, which does not reach its maximum power until Te = 0.1. Power decays with time since the expansion, but does it differently for class I and class II statistics: while class I statistics decay rapidly after reaching their maximum power, class II statistics have gentler slopes (again considering only well-performing statistics in medium and high-polymorphism scenarios). However, Fu’s Fs and R2 always decay in a very similar way. At Te = 1 all statistics tend to 0.05 for simulations based on S, thus reaching the nominal type I error rate.

Figure 1. Power of the test depending on the time elapsed since the expansion, without recombination. N = 100, De = 10. (A) S = 10. (B) S = 100. (C) S = 400. (D) θ = 1.93. (E) θ = 19.31. (F) θ = 77.26.

As expected (see Table 2), in the sudden contraction model the power of the neutrality tests behaves opposite to that in expansions. The most powerful tests are Fu and Li’s D* and Fs. Maximum power to detect population contractions is influenced mainly by the number of segregating sites and can be found between Tc = 0.1 (S = 100) and Tc = 0.4 (S = 10). More details can be found in the supplemental Results and supplemental Figure 1.

Bottlenecks have different outcomes depending on whether or not several lineages have survived the bottleneck stage without coalescing. Thus, the effects of severity and the age of the perturbation are nonmonotonic and tests have power at both tails of the distribution (see Table 2). For old (Tstart = 0.04–0.08), strong (b = 0.05 or stronger), and long-lasting bottlenecks (after which most lineages will have coalesced), the most powerful statistics are R2, Fs, and ZnS. However, they lack the power to detect bottlenecks that have just finished: in this case, the fact that all sequences have coalesced during the bottleneck produces a very shallow genealogy and thus the statistics will behave similarly. On the contrary, in the case of weak, recent (Tstart = 0.02–0.04), and recently finished bottlenecks, the most powerful tests are class I tests and Fu’s Fs. More details can be found in the supplemental Results and in supplemental Figure 2.

The very different genealogy shapes that can be produced by bottlenecks (Table 2) dramatically affect the variance of the power of certain tests, especially in intermediate bottlenecks, which, depending on the run, can lead to either shallow or deep genealogies. To study this effect, we have calculated the variance of the statistics for each set of bottleneck parameters. We

TABLE 2

Description of tree topologies in different demographic scenarios

Scenario | Characteristics | Tree topology
Neutrality | Trees have regular neutral topologies. | (diagram)
Expansion | Trees have a star-like topology, with long external branches which accumulate a greater number of low-frequency mutations than expected. This leads to a deficit of (a) most class I statistics and (b) LD structure, including an excess of haplotypes. In this case, most statistics are expected to show power at the right tail of the distribution. | (diagram)
Contraction | Trees have long internal branches, which results in a greater number of intermediate-frequency mutations than expected. This leads to an excess of (a) most class I statistics and (b) LD structure that produces a deficit of haplotypes. In this case, most statistics are expected to show power at the left tail of the distribution. | (diagram)
Weak bottleneck | Several lineages can escape the bottleneck without coalescing and thus provide trees with long internal branches. This produces an excess of (a) most class I statistics and (b) LD structure. Most tests show power at the left tail of the distribution. | (diagram)
Strong bottleneck | It is most likely that no lineages survive the bottleneck without coalescing, resulting in star-like trees. This leads to a deficit of (a) most class I statistics and (b) LD structure. Most statistics will show power at the right tail of the distribution. | (diagram)


focused on bottleneck severity and studied how it modifies variance when other parameters are equal (supplemental data E). Variance patterns are quite similar for all simulated values of n and S, and in general take maximum values around severities of b = 0.05, ranging from b = 0.01 to b = 0.1. The severity at which variance is maximal depends mainly on the time of onset and duration of the bottleneck, approaching b = 0.01 in recent and short bottlenecks and b = 0.1 in old and long-lasting ones. Moreover, some tests are more sensitive to changes in the bottleneck parameters Tstart and Tdur. Fu and Li’s tests and Fs tend to reach maximum variance values for less severe bottlenecks later than other tests (that is, for older and more long-lasting bottlenecks), while R2 and Fay and Wu’s H behave in the opposite way. Variance patterns seem to correlate inversely with the power of the statistics seen for the different bottleneck severities, as tests show maximum powers for severities ≥0.05 while maximum variances correspond to severities ≤0.05. This could be due to the fact that less severe bottlenecks leave a much weaker signature.

Recombination: Figure 2 shows the effects of recombination rate on statistical power in population expansions. While for S = 10 recombination has hardly any effect, for higher values of S the power increases dramatically between low recombination levels (0 and 10⁻¹⁰) and high recombination levels (10⁻⁸ and 10⁻⁷). For low recombination values (Figure 2), in most cases power is not affected, although under some circumstances (e.g., Fu and Li’s D and D* for S = 100; data not shown) power can be lower for r = 10⁻¹⁰. In contrast, most statistics improve their power under high recombination, with the exceptions of Dh, EHH, and, for ancient expansions, Fs and ZnS, which tend to lose power in the interval from r = 10⁻⁸ to r = 10⁻⁷. It is noteworthy that, for ancient expansions and high recombination values, Fay and Wu’s H shows increased power, reaching values >0.9 for S = 400 (results not shown). Simulations based on θ show similar patterns to those based on S (results not shown). As for population contractions and bottlenecks, the changes in the power of the tests in the presence of recombination are very similar to those found in expansions (more details in the supplemental Results and supplemental Figures 3 and 4, respectively).

Recombination shuffles nucleotide variation, increasing the number of haplotypes through the creation of new recombinant ones. Thus, it is expected to strongly affect the mean of haplotype-based tests and also to reduce the variance of both kinds of statistics (Figure 3A). Changes in the power curve reflected in Figure 2 may therefore be due to two nonmutually exclusive factors: (i) an actual increase in the amount of information in the sequence that detects population expansions when recombination is acting and (ii) recombination shifting the distribution of neutrality

Figure 2. Power of the test in the expansion model depending on recombination rates. N = 100, De = 10. (A) S = 10, Te = 0.02. (B) S = 100, Te = 0.02. (C) S = 10, Te = 0.15. (D) S = 100, Te = 0.15.


statistics. To assess the relative weights of these two explanations, we have compared the constant-population model without recombination with the constant-population model with high recombination. As shown in Figure 3B (see also supplemental data F), Fs and ZnS are very sensitive to recombination (left tail of the distribution), while Dh and ZZ are sensitive at the right tail. Both EHH estimators are also able to detect recombination (left tail) but with less power (<0.3). Thus, these statistics are liberal for detecting a population expansion acting on a recombining sequence and may often produce false positives.

Figure 3. (A) Distribution of the values of Tajima’s D, Fu and Li’s F, R2, and Kelly’s ZnS under the neutral model and the expansion model at Te = 0.02 without recombination and with r = 10⁻⁷. (B) Power of the tests to detect recombination in the null model for the right and left tails of the distribution. (A and B) S = 100, n = 100.


Unknown or misspecified recombination: Under some circumstances, recombination rates cannot be taken into account when testing for demographic or selective events. Indeed, the default option in Arlequin 3.0 (Excoffier et al. 2005) or DnaSP (Rozas et al. 2003), the two standard software packages for genetic analysis, is to compute the significance of neutrality tests without recombination, although DnaSP will also produce null distributions with any population recombination rate supplied by the user. When estimates of recombination rates are available, it is important to consider that they may be over- or underestimates. For example, Kong et al. (2002) measured recombination rates in human pedigrees at intervals of median length ~350 kb. Considering that it has been estimated that there is a hotspot every 50 kb (Myers et al. 2005), such intervals will most likely contain recombination hotspots, and therefore the recombination rate estimated for the whole interval will be much greater than the real one for most parts of the region. On the other hand, hotspots may also result in recombination reaching saturation and thus in an underestimate of the average recombination rate in the region.

To investigate the potential errors caused by over- or underestimation of recombination, we compared the statistical power of tests when true recombination values were assumed vs. the power of the same tests assuming erroneous rates. Figure 4A shows the difference between the apparent power of a test (that is, comparing the constant-size null hypothesis without recombination with the population-expansion alternative hypothesis using the true recombination value) and its real power (that is, comparing the null hypothesis and the alternative hypotheses using the true recombination value). When actual recombination rates are small (10⁻¹⁰) and thus underestimation is not too serious (e.g., using as a null hypothesis the constant nonrecombining model when the actual rate is r = 10⁻¹⁰), the true and the apparent statistical power are not appreciably different. However, for larger underestimates (when the real rates are r = 10⁻⁸ or 10⁻⁷), class I statistical tests become conservative (with an increase in type II error) whereas some class II statistics (Dh, Fu’s Fs, EHH, and ZnS) become liberal (with an increase in type I error). In population contractions, class I statistics behave as in expansions, while most class II statistics are liberal. As expected, strong bottlenecks tend to behave as expansions while weak ones tend to behave as contractions (see Table 2; more details are in the supplemental Results and supplemental Figures 5 and 6 for contractions and bottlenecks, respectively).
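The comparison between apparent and real power can be sketched as follows. Both powers are rejection rates of the same alternative-model statistics, evaluated against two different null distributions: one simulated with the true recombination rate and one simulated with the misspecified rate. The function names and the left-tail convention are assumptions for illustration.

```python
import numpy as np


def rejection_rate(null_stats, alt_stats, alpha=0.05):
    """Left-tail rejection rate of alt_stats against a simulated null distribution."""
    critical = np.quantile(np.asarray(null_stats, dtype=float), alpha)
    return float(np.mean(np.asarray(alt_stats, dtype=float) < critical))


def recombination_error(null_true_r, null_assumed_r, alt_true_r, alpha=0.05):
    """Apparent power minus real power.

    null_true_r:    null-model statistics simulated with the true recombination rate.
    null_assumed_r: null-model statistics simulated with the misspecified rate.
    alt_true_r:     alternative-model statistics simulated with the true rate.
    """
    real = rejection_rate(null_true_r, alt_true_r, alpha)
    apparent = rejection_rate(null_assumed_r, alt_true_r, alpha)
    return apparent - real
```

A positive difference means the test rejects more often than it should with the wrong null (liberal behavior), and a negative difference means it rejects less often (conservative behavior).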

Figure 4B shows the difference between the apparent power of the tests when assuming an overestimated recombination of 10⁻⁷ on the constant-size null hypothesis and their true power (that is, using true recombination values to compare the null hypothesis and

Figure 4. Error made by statistics when testing for expansions when recombination is under- or overestimated in the null model. S = 100, n = 100, De = 10, Te = 0.15. The cartoon shows how this error is calculated: first, we calculated the real power of the statistic, comparing the null hypothesis with the alternative hypothesis with the same recombination values. Afterward, we calculated the probability of rejecting the null hypothesis when recombination has been erroneously estimated, that is, when the recombination rate used to generate the null hypothesis is different from that of the alternative hypothesis. The error made is the latter (apparent power) minus the real power. (A) The null hypothesis used for the apparent power was produced without recombination. (B) The null hypothesis used for the apparent power has a recombination rate of 10⁻⁷.


alternative hypotheses). The effect of recombination overestimation shows the opposite pattern, and therefore all tests are liberal with the exception of Dh, Fu’s Fs, EHH, and ZnS, which become conservative. The difference in behavior between these class II and class I statistics (Figure 3) may be due to the fact that the former are highly dependent on the actual recombination rates (Wall 2000). As in the case of underestimation of recombination, when recombination is overestimated in a scenario of population contraction, class I tests behave as they do in expansions. This is not the case for class II tests, which become liberal. Again, strong bottlenecks behave similarly to expansions while weak ones behave more like contractions (Table 2; more details are in the supplemental Results and supplemental Figures 5 and 6 for contractions and bottlenecks, respectively).

TABLE 3

Sudden expansion model decision table

Te | No recombination, strong (De = 0.01) | No recombination, weak (De = 0.1) | Recombination (10⁻⁸), strong (De = 0.01) | Recombination (10⁻⁸), weak (De = 0.1)

Small S (~10), small n (~20)
≤0.01 | Fs, R2, D* | Fs, R2, ZnS, D* | R2, F, D*, F* | R2
0.01–0.15 | R2, Fs | Fs, R2, ZnS | R2 | R2
≥0.15 | Fs, R2, ZnS | Fs, R2, ZnS | R2 | R2

Small S (~10), large n (~100)
≤0.01 | Fs, R2, D, F, F* | F and F* | R2, D, F, D*, F* | DF
0.01–0.05 | Fs, R2, D | Fs | R2, D | DF
≥0.05 | Fs | Fs | R2, D | DF

Large S (~100 or more), small n (~20)
≤0.01 | Fs, R2, D, DF, F, D*, F*, ZnS, ZA, Dh[a] | Fs, Dh[a] | R2, D, DF, F, D*, F* | R2
0.01–0.03 | Fs, R2, D, DF, F, D*, F*, ZnS, ZA | Fs | R2, D, DF, F, D*, F* | R2
0.03–0.05 | R2, D, ZnS | R2, ZnS | R2, D, DF, F, D*, F* | R2
>0.05 | ZnS | ZnS | R2 | R2

Large S (~100 or more), large n (~100)
≤0.02 | Fs, R2, D, DF, F, D*, F*, ZnS, ZA | Fs, DF, F, D*, and F* | R2, D, DF, F, D*, F* | DF, F, D*, F*
0.02–0.05 | Fs, R2, D, DF, F, D*, F*, ZnS | Fs | R2, D, DF, F, D*, F* | DF, F, D*, F*
0.02–0.15 | Fs, R2, D, ZnS | Fs, ZnS | R2, D | R2, D
≥0.15 | ZnS | ZnS | R2, D | R2, D

Power is at the left tail of the distribution unless otherwise indicated. DF, Fu and Li’s D.
[a] Power is at the right tail of the distribution.

TABLE 4

Sudden contraction model decision table

Tc | No recombination, strong (Dc = 0.01) | No recombination, weak (Dc = 0.1) | Recombination (10⁻⁸), strong (Dc = 0.01) | Recombination (10⁻⁸), weak (Dc = 0.1)

Small S (~10), small n (~20)
≤0.10 | D* | D* | D* | D*
0.10–0.40 | Fs, D* | Fs, D* | D* | D*
0.40–0.80 | Fs, ZnS, ZA | Fs, ZnS, ZA | D* | D*
>0.80 | Fs, ZnS, ZA, B, Q | Fs, ZnS, ZA, B, Q, Dh[a] | D* | D*

Small S (~10), large n (~100)
≤0.20 | DF | DF | DF | DF
0.20–0.60 | ZnS | ZnS, B | DF | DF
≥0.60 | ZnS, ZA, B, Q, Fs, DF | ZnS, ZA, B, Q, Fs, DF | DF | DF

Large S (~100 or more), small n (~20)
≤0.10 | Fs, Dh[a] | Fs, Dh[a] | DF | DF
0.10–0.40 | Fs, Dh[a] | Fs, Dh[a] | DF, F, F* | DF, F, F*
≥0.40 | Dh[a] | Dh[a] | DF, D* | D*

Large S (~100 or more), large n (~100)
≤0.02 | DF, D* | DF, D* | DF, D* | DF, D*
0.02–0.10 | Fs | Fs | F, F* | F, F*
0.10–0.40 | Fs, Dh[a] | Fs | F, F*, R2, D | F, F*
0.40–0.80 | Fs, ZnS, ZA, B, Q, Dh[a] | ZnS, ZA | DF, D* | DF
>0.80 | Fs, DF, D*, ZnS, ZA, B, Q | ZnS, ZA, B, Q | DF, D* | F, F*

Power is at the right tail of the distribution unless otherwise indicated. DF, Fu and Li’s D.
[a] Power is at the left tail of the distribution.

DISCUSSION

Power of neutrality tests: We have examined the statistical power of a wide range of statistics to detect a sudden population expansion, a sudden contraction, or a bottleneck from DNA polymorphism data. The most powerful tests are those belonging to class II, that is, those based on haplotype frequencies. Within those, Fu’s Fs and ZnS perform best although, for small sample sizes, R2 is also recommended. Class II tests, however, are strongly affected by recombination, particularly Fs and ZnS, which not only lose power under high recombination rates, but also become significant with recombination events when no expansion has taken place and the population has remained constant. For this reason, when recombination is suspected to be acting on a sample, and especially if there is risk of over- or underestimating it, it is not advisable to use class II tests and, thus, R2, Tajima’s D, or Fu and Li’s tests should be used instead. A similar situation, with class II statistics being more powerful than class I, can be found for contractions, although tests have power at the opposite tail of the distribution and they start to perform well at larger times and for larger S. However, in the case of bottlenecks, class I tests are best as a general rule, with the exception of Fs and ZnS in some particular cases.

Combinations of the different tests and the tail at which they show power can therefore provide a rough idea about the demographic event acting over a population and about the time and strength of this event, as well as about other factors such as recombination.

The effect of recombination on the power of the tests: As discussed above, recombination is expected to have some effects on the power of statistical tests based on the allele frequency spectrum and to reduce the power of tests that rely on haplotype-based statistics (see, for example, Quesada et al. 2006 and references therein). We performed a detailed series of simulations suggesting that high recombination rates generally improve the power of most statistical tests, although some class II tests (Dh, EHH, Fs, and ZnS) lose power for the maximum recombination rates considered in this study, especially for old expansions. Furthermore, even when recombination is considered in the null constant-population

TABLE 5

Bottleneck decision table

Tstart | No recombination, strong (b = 0.01) | No recombination, weak (b = 0.1) | Recombination (r = 10⁻⁸), strong (b = 0.01) | Recombination (r = 10⁻⁸), weak (b = 0.1)

Small S (~10), small n (~20)
0.02 | D, F, D*, F*, R2, Fs | D*[a], Fs[a] | D[a], F[a], D*[a], F*[a], R2[a] | D[a], F[a], D*[a], F*[a], R2[a]
0.04 | D, R2, Fs | Fs[a], ZnS[a], ZA[a] | D[a], R2[a] | D[a], F[a], D*[a], F*[a], R2[a]
0.08–0.12 | R2, Fs | D, D*, F*, R2, Fs, ZnS | R2 | D, R2

Small S (~10), large n (~100)
0.02 | D, F, F*, R2, Fs, Dh | D[a], DF[a], Fs[a] | D*, Dh | D[a], DF[a], R2[a]
0.04 | D, R2, Fs | Fs[a], B[a], ZnS[a], ZA[a] | D, D*, R2 | D[a], DF[a,b], R2[a]
0.08 | D, R2, Fs | D, F, R2, Fs, Dh | D, R2 | DF
0.12 | Fs | D, R2, Fs | D, R2 | DF

Large S (~100 or more), small n (~20)
0.02 | D, DF, F, D*, F*, R2, Fs | D*[a], Fs[a] | D[a], DF[a], F[a], D*[a], F*[a], R2[a] | DF[a], F[a], D*[a], F*[a]
0.04 | D, DF, F, D*, F*, R2, Fs, ZnS, ZA | Fs[a,b], B[a], Q[a], ZnS[a], ZA[a] | D[a], DF[a], F[a], D*[a], F*[a], R2[a] | D[a], DF[a], F[a], F*[a], R2[a]
0.08 | D, R2, ZnS, ZA | R2, Fs | D, DF, F, D*, F*, R2 | D, R2
0.12 | R2, ZnS | R2 | D, R2 | D, R2

Large S (~100 or more), large n (~100)
0.02 | D, F, D*, F*, R2 | D*[a], Fs[a] | DF[a], D*[a] | D[a], DF[a,b], F[a,b], D*[a,b], F*[a,b], R2[a]
0.04 | D, DF, F, D*, F*, R2, Fs, ZnS, ZA | Fs[a], B[a], Q[a], ZnS[a], ZA[a] | D[a], DF[a], F[a], D*[a], F*[a], R2[a] | D[a], DF[a,b], F[a,b], D*[a,b], F*[a,b], R2[a]
0.08 | D, DF, F, D*, F*, R2, ZnS, ZA | Fs | D, DF, F, D*, F*, R2 | DF, F, D*, F*, R2
0.12 | D, R2, ZnS | Fs | D, F, F*, R2 | D, F, F*, R2

Power is at the left tail of the distribution unless otherwise indicated. DF, Fu and Li’s D.
[a] Power is at the right tail of the distribution.
[b] Only for finished bottlenecks.


models, those tests have great power to “detect recombination” in a sample. This efficiency in recognizing recombination is easily explained, as recombination modifies the mean and reduces the width of the distribution of class II statistics. Therefore, careful attention should be paid to the interpretation of those tests when recombination is suspected to have shaped the genealogy of the sampled sequences. We have also examined the effect of using mistaken recombination rates, because their estimation can be a problem for most organisms. In the case of humans, actual recombination maps made through the genotyping of 5136 microsatellite markers for 146 families, with a total of 1257 meiotic events, are available at an ~350-kb resolution (Kong et al. 2002). Beyond that level, current efforts are aimed at pinpointing hotspots and recombination deserts as inferred from linkage disequilibrium patterns (McVean et al. 2004; Fearnhead and Smith 2005; Myers et al. 2005). Wall (1999) observed that larger sequence sizes with high recombination decreased the power of tests and suggested that this was due to the difference between the recombination rates of the null and the alternative hypotheses. Our results show that severe over- or underestimations of recombination have large impacts on power. When recombination is underestimated, the power of tests decreases in expansions and bottlenecks, supporting the results in Wall (1999). In contrast, when recombination is overestimated, type I errors increase, leading to a spurious gain in the power of the tests. Since the class II statistics ZnS, Fu’s Fs, Dh, and EHH are very sensitive to recombination, they show the opposite behavior. Considering all recombination results together, a conservative recombination rate should be used in the null hypothesis when using class II statistics. If recombination could be acting over the sample, for example, it would be advisable to use a lower bound or no recombination if there is a deficit of haplotypes, or an upper bound if there is an excess.

Recommendations for test usage arising from our results are summarized in Tables 3–5. We hope that they help to produce guidelines for a rational choice among the wide variety of neutrality tests available.
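As an illustration of how the decision tables might be consulted programmatically, the sketch below encodes one block of Table 3 (sudden expansion, small S and small n) and looks up the recommended tests for a given time since the expansion. The data layout, the function name, and the choice to assign boundary values to the lower interval are assumptions made for this example.

```python
# Block "Small S (~10), small n (~20)" of Table 3, keyed by the upper bound of Te.
# Inner keys are (recombination, strength); this encoding is hypothetical.
EXPANSION_SMALL_S_SMALL_N = [
    (0.01, {
        (False, "strong"): ["Fs", "R2", "D*"],
        (False, "weak"): ["Fs", "R2", "ZnS", "D*"],
        (True, "strong"): ["R2", "F", "D*", "F*"],
        (True, "weak"): ["R2"],
    }),
    (0.15, {
        (False, "strong"): ["R2", "Fs"],
        (False, "weak"): ["Fs", "R2", "ZnS"],
        (True, "strong"): ["R2"],
        (True, "weak"): ["R2"],
    }),
    (float("inf"), {
        (False, "strong"): ["Fs", "R2", "ZnS"],
        (False, "weak"): ["Fs", "R2", "ZnS"],
        (True, "strong"): ["R2"],
        (True, "weak"): ["R2"],
    }),
]


def recommended_tests(te, recombination, strength, table=EXPANSION_SMALL_S_SMALL_N):
    """Return the tests listed for the first row whose upper bound covers te."""
    for upper, row in table:
        if te <= upper:
            return row[(recombination, strength)]
    raise ValueError("te is outside the range covered by the table")


print(recommended_tests(0.005, recombination=False, strength="strong"))
print(recommended_tests(0.5, recombination=True, strength="weak"))
```

The other blocks of Tables 3–5 could be encoded in the same way, with an extra field for the tail at which each test is expected to show power.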

We thank J. Alegre and R. Sangros for their helpful advice and support in programming, Tomas Marques-Bonet for his comments, and G. Berniell for her revision of the manuscript. This research was funded by grant BFU2006-15413-C02-01 to A.N. and BFU2004-02002 to F.C. from the Spanish Ministry of Science and Technology, by the Genoma Espana/Genome Canada joint R+D+I projects, and by the National Institute of Bioinformatics (http://www.inab.org), a platform of Genoma Espana. S.E.R-O. acknowledges the support of Generalitat de Catalunya (Distincio per la Promocio de la Recerca Universitaria) to Montserrat Aguade.

LITERATURE CITED

Charlesworth, B., M. T. Morgan and D. Charlesworth, 1993 The effect of deleterious mutations on neutral molecular variation. Genetics 134: 1289–1303.

Crawford, D. C., D. T. Akey and D. A. Nickerson, 2005 The patterns of natural variation in human genes. Annu. Rev. Genomics Hum. Genet. 6: 287–312.

Depaulis, F., S. Mousset and M. Veuille, 2001 Haplotype tests using coalescent simulations conditional on the number of segregating sites. Mol. Biol. Evol. 18: 1136–1138.

Depaulis, F., S. Mousset and M. Veuille, 2003 Power of neutrality tests to detect bottlenecks and hitchhiking. J. Mol. Evol. 57(Suppl. 1): 190–200.

Depaulis, F., S. Mousset and M. Veuille, 2005 Detecting selective sweeps with haplotype tests, pp. 34–54 in Selective Sweep, edited by D. Nurminsky. Landes Bioscience, Georgetown, TX.

Donnelly, P., and S. Tavare, 1995 Coalescents and genealogical structure under neutrality. Annu. Rev. Genet. 29: 401–421.

Excoffier, L., G. Laval and S. Schneider, 2005 Arlequin ver. 3.0: an integrated software package for population genetics data analysis. Evol. Bioinform. Online 1: 47–50.

Fay, J. C., and C. I. Wu, 2000 Hitchhiking under positive Darwinian selection. Genetics 155: 1405–1413.

Fearnhead, P., and N. G. Smith, 2005 A novel method with improved power to detect recombination hotspots from polymorphism data reveals multiple hotspots in human genes. Am. J. Hum. Genet. 77: 781–794.

Fisher, R. A., 1930 The Genetical Theory of Natural Selection. Clarendon Press, Oxford.

Fu, Y. X., 1996 New statistical tests of neutrality for DNA samples from a population. Genetics 143: 557–570.

Fu, Y. X., 1997 Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147: 915–925.

Fu, Y. X., and W. H. Li, 1993 Statistical tests of neutrality of mutations. Genetics 133: 693–709.

Fu, Y. X., and W. H. Li, 1999 Coalescing into the 21st century: an overview and prospects of coalescent theory. Theor. Popul. Biol. 56: 1–10.

Haddrill, P. R., K. R. Thornton, B. Charlesworth and P. Andolfatto, 2005 Multilocus patterns of nucleotide variability and the demographic and selection history of Drosophila melanogaster populations. Genome Res. 15: 790–799.

Hein, J., M. H. Schierup and C. Wiuf, 2005 Gene Genealogies, Variation and Evolution: A Primer in Coalescent Theory. Oxford University Press, London/New York/Oxford.

Hudson, R. R., 1990 Gene genealogies and the coalescent process, pp. 1–44 in Oxford Surveys in Evolutionary Biology, edited by J. Antonovics and D. Futuyama. Oxford University Press, Oxford.

Hudson, R. R., 1993 The how and why of generating gene genealogies, pp. 23–36 in Mechanisms of Molecular Evolution, edited by N. Takahata and A. G. Clark. Sinauer Associates, Sunderland, MA.

Hudson, R. R., 2002 Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics 18: 337–338.

Jensen, J. D., Y. Kim, V. B. DuMont, C. F. Aquadro and C. D. Bustamante, 2005 Distinguishing between selective sweeps and demography using DNA polymorphism data. Genetics 170: 1401–1410.

Jobling, M. A., M. E. Hurles and C. Tyler-Smith, 2004 Human Evolutionary Genetics: Origins, Peoples and Disease. Garland Science, New York.

Kelly, J. K., 1997 A test of neutrality based on interlocus associations. Genetics 146: 1197–1206.

Kimura, M., 1968 Genetic variability maintained in a finite population due to mutational production of neutral and nearly neutral isoalleles. Genet. Res. 11: 247–269.

Kingman, J. F. C., 1982a On the genealogy of large populations. J. Appl. Probab. 19A: 27–43.

Kingman, J. F. C., 1982b The coalescent. Stoch. Proc. Appl. 13: 235–248.

Kingman, J. F., 2000 Origins of the coalescent: 1974–1982. Genetics 156: 1461–1463.

Kong, A., D. F. Gudbjartsson, J. Sainz, G. M. Jonsdottir, S. A. Gudjonsson et al., 2002 A high-resolution recombination map of the human genome. Nat. Genet. 31: 241–247.

Macdonald, S. J., and A. D. Long, 2005 Identifying signatures of selection at the enhancer of split neurogenic gene complex in Drosophila. Mol. Biol. Evol. 22: 607–619.


McVean, G. A. T., S. R. Myers, S. Hunt, P. Deloukas, D. R. Bentley et al., 2004 The fine-scale structure of recombination rate variation in the human genome. Science 304: 581–584.

Myers, S., L. Bottolo, C. Freeman, G. McVean and P. Donnelly, 2005 A fine-scale map of recombination rates and hotspots across the human genome. Science 310: 321–324.

Nei, M., 1987 Molecular Evolutionary Genetics. Columbia University Press, New York.

Pritchard, J. K., M. T. Seielstad, A. Perez-Lezaun and M. W. Feldman, 1999 Population growth of human Y chromosomes: a study of Y chromosome microsatellites. Mol. Biol. Evol. 16: 1791–1798.

Przeworski, M., 2002 The signature of positive selection at randomly chosen loci. Genetics 160: 1179–1189.

Quesada, H., S. E. Ramos-Onsins, J. Rozas and M. Aguade, 2006 Positive selection versus demography: evolutionary inferences based on an unusual haplotype structure in Drosophila simulans. Mol. Biol. Evol. 23: 1643–1647.

Ramos-Onsins, S. E., and J. Rozas, 2002 Statistical properties of new neutrality tests against population growth. Mol. Biol. Evol. 19: 2092–2100.

Ramos-Onsins, S. E., S. Mousset, T. Mitchell-Olds and W. Stephan, 2007 Population genetic inference using a fixed number of segregating sites: a reassessment. Genet. Res. 89: 231–244.

Rogers, A. R., and H. Harpending, 1992 Population growth makes waves in the distribution of pairwise genetic differences. Mol. Biol. Evol. 9: 552–569.

Rozas, J., M. Gullaud, G. Blandin and M. Aguade, 2001 DNA variation at the rp49 gene region of Drosophila simulans: evolutionary inferences from an unusual haplotype structure. Genetics 158: 1147–1155.

Rozas, J., J. C. Sanchez-DelBarrio, X. Messeguer and R. Rozas, 2003 DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19: 2496–2497.

Sabeti, P. C., D. E. Reich, J. M. Higgins, H. Z. Levine, D. J. Richter et al., 2002 Detecting recent positive selection in the human genome from haplotype structure. Nature 419: 832–837.

Sano, A., and H. Tachida, 2005 Gene genealogy and properties of test statistics of neutrality under population growth. Genetics 169: 1687–1697.

Schierup, M. H., and J. Hein, 2000 Consequences of recombination on traditional phylogenetic analysis. Genetics 156: 879–891.

Soejima, M., H. Tachida, M. Tsuneoka, O. Takenaka, H. Kimura et al., 2005 Nucleotide sequence analyses of human complement 6 (C6) gene suggest balancing selection. Annu. Hum. Genet. 69: 239–252.

Stajich, J. E., and M. W. Hahn, 2005 Disentangling the effects of demography and selection in human history. Mol. Biol. Evol. 22: 63–73.

Tajima, F., 1989a Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123: 585–595.

Tajima, F., 1989b The effect of change in population size on DNA polymorphism. Genetics 123: 597–601.

Takahata, N., Y. Satta and J. Klein, 1995 Divergence time and population size in the lineage leading to modern humans. Theor. Popul. Biol. 48: 198–221.

Tarazona-Santos, E., and S. A. Tishkoff, 2005 Divergent patterns of linkage disequilibrium and haplotype structure across global populations at the interleukin-13 (IL13) locus. Genes Immun. 6: 53–65.

Tavare, S., D. J. Balding, R. C. Griffiths and P. Donnelly, 1997 Inferring coalescence times from DNA sequence data. Genetics 145: 505–518.

Voight, B. F., A. M. Adams, L. A. Frisse, Y. Qian, R. R. Hudson et al., 2005 Interrogating multiple aspects of variation in a full resequencing data set to infer human population size changes. Proc. Natl. Acad. Sci. USA 102: 18508–18513.

Wall, J. D., 1999 Recombination and the power of statistical tests of neutrality. Genet. Res. 74: 65–79.

Wall, J. D., 2000 A comparison of estimators of the population recombination rate. Mol. Biol. Evol. 17: 156–163.

Wall, J. D., and R. R. Hudson, 2001 Coalescent simulations and statistical tests of neutrality. Mol. Biol. Evol. 18: 1134–1135.

Watterson, G. A., 1975 On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7: 256–276.

Williamson, S. H., R. Hernandez, A. Fledel-Alon, L. Zhu, R. Nielsen et al., 2005 Simultaneous inference of selection and population growth from patterns of variation in the human genome. Proc. Natl. Acad. Sci. USA 102: 7882–7887.

Wright, S., 1931 Evolution in Mendelian populations. Genetics 16: 97–159.

Communicating editor: M. Veuille
