

Page 1: Review Information Theory Broadens the Spectrum of Molecular Ecology …chao.stat.nthu.edu.tw/wordpress/paper/127_pdf_appendix.pdf · 2017-12-12 · Review Information Theory Broadens

Review

Information Theory Broadens the Spectrum of Molecular Ecology and Evolution

W.B. Sherwin,1,2,* A. Chao,3 L. Jost,4 and P.E. Smouse5

Information or entropy analysis of diversity is used extensively in community ecology, and has recently been exploited for prediction and analysis in molecular ecology and evolution. Information measures belong to a spectrum (or q profile) of measures whose contrasting properties provide a rich summary of diversity, including allelic richness (q = 0), Shannon information (q = 1), and heterozygosity (q = 2). We present the merits of information measures for describing and forecasting molecular variation within and among groups, comparing forecasts with data, and evaluating underlying processes such as dispersal. Importantly, information measures directly link causal processes and divergence outcomes, have a straightforward relationship to allele frequency differences (including monotonicity that q = 2 lacks), and show additivity across hierarchical layers such as ecology, behaviour, cellular processes, and nongenetic inheritance.

Information Theory Base for Evolutionary Genetics and Ecology

Evolutionary ecology aims to make and test forecasts about the behaviour of variants, at all levels from molecular through species to landscapes, but until recently this field has paid little attention to one of three main ways for doing this. Shannon information (or entropy) theory (see Glossary) is widely used to predict macropatterns from microparts, by methods such as finding the model that maximises the entropy [1], in fields as diverse as community ecology [2] and energy generation, where these methods compare favourably with alternative predictions [3]. Genes obviously carry information (about evolutionary history, recent demography, and possible future trajectories [4,5]), but information theory has rarely been used to investigate molecular evolution [6–9]. Shannon’s information index 1H (=H in [10]), a fundamental component of information theory, is the most commonly used abundance-sensitive measure of species diversity within a community [11]:

$${}^{1}H = -\sum_{i=1}^{S} p_i \ln p_i \qquad \text{(Equation 1)}$$

where p_i is the proportional abundance of the ith species in a community of S different species (Box 1; Box S1, which has definitions of all symbols). 1H also applies to a population containing genetic variants (such as allelic types) [12], and can be thought of as the ability to spell out different messages by rearranging individual alleles (Box 1): with higher diversity, a greater range of messages can be spelt out. More formally, higher 1H means that there is higher uncertainty about what type to expect when we randomly sample ONE single allele (coincidentally, '1' is the superscript for 1H; Box S2).

Historically, molecular ecology has quantified diversity with two alternative entropies: 0H = S − 1; and 2H (Boxes 1, S1). 2H is the chance of choosing TWO different allelic types from the population, called heterozygosity (2H = He, or for species in communities, called

Trends

Diversity of molecules or species is best summarised as a diversity profile.

Such profiles are useful in studies spanning bioinformatics to physical landscapes.

Shannon information is a neglected but particularly informative part of the profile.

Shannon information now has a robust theoretical background for molecular ecology and evolution.

1Evolution and Ecology Research Centre, School of Biological Earth and Environmental Science, University of New South Wales, Sydney, NSW 2052, Australia
2Murdoch University Cetacean Research Unit, Murdoch University, South Road, Murdoch, WA 6150, Australia
3Institute of Statistics, National Tsing Hua University, Hsin-Chu 30043, Taiwan
4EcoMinga Foundation, Via a Runtun, Baños, Tungurahua, Ecuador
5Department of Ecology, Evolution and Natural Resources, School of Environmental and Biological Sciences, Rutgers University, New Brunswick, NJ 08901-8551, USA

*Correspondence: [email protected] (W.B. Sherwin).

948 Trends in Ecology & Evolution, December 2017, Vol. 32, No. 12 https://doi.org/10.1016/j.tree.2017.09.012

© 2017 Elsevier Ltd. All rights reserved.


Glossary

Adaptation: refers to the evolutionary process due to natural selection for organisms that are better at surviving and/or reproducing in a particular environment.

Additivity: refers to a multidimensional table (e.g., dimension 1 is allele type, dimension 2 is location, dimension 3 is an environmental variable, etc). In this type of table, measures such as log-linear contingency χ² (i.e., mutual information) can be partitioned into completely additive sub-investigations, unlike Pearson’s χ².

Allele proportions and frequencies: for conformity to conventions in information and entropy theory, statistics, and all other science (except population genetics), we refer to allele frequencies when dealing with counts ranging from zero to infinity, and to allele proportions when these frequencies have been converted to p_i ranging from zero to unity.

α, β, and γ diversity measures: these indicate within-locality (α) diversity, among-locality differentiation (β) diversity, and total (γ) diversity. In other publications, α values are often assumed to be averaged over many locations. Where an average is made, we indicate this by an overbar, and describe any unequal weighting.

Bottleneck: period of reduced population size, which usually alters entropy and diversity levels away from the previous expectations.

Drift: random genetic drift is caused by the chance nature of transmission of alleles within each family. In a finite population, this chance in transmission results in fluctuations of allele proportions in the entire population. Drift erodes genetic variation summarised by any measure (q = 0, 1, 2).

Effective numbers (or true diversities) qD: conversions of entropies (qH) into qD, the number of equally frequent alleles (or species) that would give the actual observed qH, derived from a possibly uneven array of alleles in the observed sample.

Entropy (qH): Entropies include heterozygosity or Gini–Simpson (2H = He), Shannon (1H), and the number of allelic types minus one

Gini–Simpson and variants) [4,5,13].

$${}^{2}H = 1 - \sum_{i=1}^{S} p_i^{2} \qquad \text{(Equation 2)}$$

With higher diversity, the chance of randomly choosing two different alleles increases, so 2H increases.

To create the q profile or spectrum, one converts (qH = 0H, 1H, 2H, etc) to a common scale of effective numbers (qD = 0D, 1D, 2D, etc), which represent the number of equally frequent alleles that would be needed to yield the observed qH in the sample, which typically contains alleles at unequal frequencies [14–16] (Boxes 1, S1, and S2). The qD measures are sometimes called true diversity, and the use of the qD profile has long been recommended in ecology, because each q value provides different insights into the composition of the diversity [17,18]. For example, higher q values emphasise the more common variants (Box 1). As well as these α-measures for alleles in a single population, each q value has measures of diversification among locations, which are the β-measures (Box 1) that are critical in evolution and conservation [19,20] (later and Box S1). Finally, the total diversity within and among localities is labelled γ diversity.
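The conversion from entropies to effective numbers can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software (Box S6 lists the programs they recommend). The function names `entropies` and `effective_numbers` are hypothetical; the formulas are Equations 1 and 2 together with the conversions given in Box 1 (0D = S, 1D = exp(1H), 2D = 1/(1 − 2H)); no correction for sampling bias is attempted.

```python
import math

def entropies(p):
    """Return (0H, 1H, 2H) for a list of allele proportions that sum to 1."""
    p = [x for x in p if x > 0]                # absent alleles add nothing (0 ln 0 = 0)
    h0 = len(p) - 1                            # 0H = S - 1
    h1 = -sum(x * math.log(x) for x in p)      # Equation 1 (Shannon)
    h2 = 1.0 - sum(x * x for x in p)           # Equation 2 (heterozygosity)
    return h0, h1, h2

def effective_numbers(p):
    """Convert the three entropies to effective numbers (0D, 1D, 2D)."""
    h0, h1, h2 = entropies(p)
    return h0 + 1, math.exp(h1), 1.0 / (1.0 - h2)

# Hypothetical sample: three copies of allele A and one copy of allele T
print(entropies([0.75, 0.25]))
print(effective_numbers([0.75, 0.25]))
```

For an even sample such as `[0.5, 0.5]` all three effective numbers equal 2, whereas for the uneven sample above they fall from 2 at q = 0 to 1.6 at q = 2, which illustrates how higher q values emphasise the more common variants.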

Partial qD profiles are already used in evolution and ecology, to assess community response to changed conditions [19], as well as possible correlations between species and gene diversity [21]. Also, methods to infer selection or population-size changes exploit the way these processes have different effects on number of variants S (q = 0) and heterozygosity (q = 2) [22–24]. Many recent publications include (q = 1) (Box S3), yet few authors [25–28] have exploited systematic q profiles (Box 1), to obtain the power described further later.

The diversity profile is most useful if we can forecast its shape under specified histories of selection, population size, dispersal, etc. We can then either test departures from those predictions and/or estimate forces that underlie the diversity patterns we encounter. Classically, q = 2 theory predicts values of within- and among-population measures, including heterozygosity 2H, Jost-D, and FST, under conditions such as neutrality, selection, subdivision, dispersal, and altered population size [4,5,29]. Below we show that after a slow start [30–34], q = 1 predictive theory is now catching up to q = 2 theory [12,35–37], and 1H has been proposed as a primary measure of evolvability [38,39].

As a reading guide, the initial sections outline how q = 1 molecular information can now predict and measure processes such as adaptation and dispersal (with additional detail in supplements). Those wanting less technical detail might skip directly to the penultimate section evaluating strengths and weaknesses of each element of the q profile, such as the effect of q on sensitivity to rare alleles (Box 1), and the poor performance of some conventional q = 2 measures for analysis of among-population (β) differentiation.

We consider both neutral and adaptive genetic variants, and also discuss haploid, diploid, and clonal organisms. We consider genes (loci) with only two variants (alleles), such as typical single nucleotide polymorphisms (SNPs), as well as multiallelic loci, evolving under either the SMM (stepwise mutation model; some microsatellite loci) or the IAM (infinite alleles model for haplotypes or haptigs [40] of variants linked on the same DNA molecule [5]). We also discuss continuous traits determined by variation of multiple genes.

Measuring and Predicting Shannon’s Information for Neutral Alleles

Within-population (α) equations have recently been derived to predict Shannon entropy 1H and diversity 1D within populations for neutral genetic information (having no effect on adaptation), at equilibrium with constant effective population size Ne and mutation rate μ (Box S1).


The new equations show good fit to both simulated and real data sets for loci evolving under several mutation processes: SNP [36], SMM [12,37], and IAM [12,37]. Boxes S1 and S6 show how to calculate 1H, to compare data with theoretical predictions. For example, in a rainforest tree Elaeocarpus sedentarius, calculated and predicted 1H agree [12,41]. There are many recent examples of the empirical and theoretical use of Shannon measures to assess how molecular diversity is affected by factors such as: endangerment and bottlenecks; subdivision; environmental gradients; pedigree inbreeding; and invasions, expansions, introductions, and host jumps (Box S3). In most cases, measures with different q values yield similar interpretations, but we later discuss informative cases where they differ.

β-Differentiation of molecular diversity among populations, species, and landscapes can be summarised using measures of each order of q. The q = 0 β-measures are based on the proportion of allelic types that are not shared by two populations (Box S1). Jost-D is a q = 2 β-measure, and there are related q = 2 measures such as FST [29,42] and algorithms including STRUCTURE [43] and AMOVA [44] (Box S1). The q = 1 measures were explicitly designed to be hierarchical [1,10], so partitioning of molecular diversity is particularly easy, leading to the differentiation measures mutual information I and Shannon differentiation. These can be derived from a contingency table of differentiation of allele frequency between geographic locations (Boxes 2 and S1) [1,12,33,35–37]. Box 2 shows how dimensions can be added to incorporate diversity within and among different habitats, landscapes, etc. [35,45–47], because log-linear χ² (or I) is completely additive [48]. Hierarchical information-based diversity partitioning, including phylogenetic information, is summarised in the guide and software iDIP, at https://chao.shinyapps.io/iDIP/ [49].

For any pair of populations, lower dispersal, smaller population size, or greater elapsed time since separation will increase molecular differentiation (and hence, mutual information I). At equilibrium, simulation results have been used to derive an inverse relationship between mutual information (I, q = 1) and effective dispersal rate (Nem), over a wide range of effective population sizes (Ne ≥ 10) and dispersal rates (0.001 ≤ m ≤ 0.30, i.e., up to 30% dispersing per generation) [12] (Box S1). This equation can be used to estimate dispersal from genetic data, and outperforms predictions based on q = 2 in simulation studies (Figure 1A), as well as in laboratory colonies of Drosophila with known Nem [12]. The waratah Telopea speciosissima showed a strong negative correlation between FST (q = 2) and Nem estimated from mutual information I (q = 1), as expected because FST and I are each inversely related to Nem [50]. Dispersal can also be assessed using I in clonal species [51], haploids, and for other ploidies such as a three-species hybrid moss Sphagnum × falcatulum [45].

Other equations predicting equilibrium I are based on rigorous theory, rather than simulations, but require knowledge of mutation rate. An equation for biallelic SNPs [36] (Box S1) fits closely to simulation outcomes, as well as to data from laboratory colonies of Drosophila of known Nem (Figure 1B). Also, equations for IAM and SMM loci [37] (Box S1) agree with simulation outcomes, and with observed genetic divergence for SMM (microsatellite) loci in starlings (Sturnus vulgaris) introduced to Australia [37]. There are many other recent examples of the use of molecular mutual information I to investigate hypotheses about temporal or spatial environmental gradients or barriers, in a variety of free-living and parasitic plants, animals, and fungi, plus simulations (Box S3). In most cases, different q values show similar results, but we discuss divergent cases later. Calculations, sampling, and programs are in Box S6.

Using Information Methods to Scan the Genome for Adaptive Innovation

There is a boom in searches for adaptive genomic regions, for evolutionary interest, choosing reintroduction sources for conservation [52] and identifying human disease loci [53,54]. Strategies to detect these regions include evolve–resequence or transplant experiments

(S − 1, or 0H). The H symbol is used for entropy throughout science.

Epigenetics: one type of nongenetic inheritance, due to chemical modification of DNA sequence, for example, methylation, which is sometimes transmissible through at least one generation, and may have phenotypic effects. We do not use the term epigenetics to cover all types of nongenetic inheritance. Frequently confused with epistasis.

Epistasis: functional interactions between multiple loci, at any level from transcription onwards, to affect a measured phenotype. Frequently confused with epigenetics.

Haplotype: collective genetic combination located on a single molecule of DNA, typically including two or more SNPs showing variation of bases between alternative haplotypes.

Hardy–Weinberg equilibrium (HWE): for a single locus, HWE is the condition where the combinations of alleles in diploid genotypes are as expected from random combination of the pool of alleles in the population. HWE can be disrupted by mutation, selection, dispersal, nonrandom mating (including inbreeding), or random genetic drift.

Infinite allele model of mutation (IAM): suitable for long genomic regions with rare base substitutions, so that each mutation creates an allele that has never existed before. This is approximately suitable for protein-coding regions, and much other nonrepetitive DNA.

Information theory: this has multiple strands, but we focus on Shannon’s Information Index 1H, which is a measure of the number of different ways that a group of objects (e.g., individuals of different species, or DNA molecules with different sequences) can be rearranged. The information index is also a measure of surprise or uncertainty, increasing in groups where we are less certain of what we would sample at random.

Linkage disequilibrium (LD): this is when the combinations of alleles at two or more diploid loci depart from those expected by random sampling from the population. The loci might be carried on the same molecule of DNA (true LD), or on different molecules (sometimes called genotypic disequilibrium or GD). LD results from random genetic drift, mutation, selection, dispersal, nonrandom mating (including inbreeding), and clonal or asexual reproduction.

Monotonicity: this means that increases of variable x are either always associated with an increase of y, or always associated with a decrease of y. Weak monotonicity allows plateaus of y.

Neutral: refers to genetic variants that do not affect fitness (survival and/or reproduction), so are not involved in natural selection and adaptation.

Nongenetic inheritance: any type of inheritance that does not rely upon variation of the DNA sequence of A, T, C, and G. Examples include microbiome inheritance, epigenetic inheritance, niche inheritance, etc. All of these can have profound fitness effects.

Replication: this means that a diversity measure increases linearly when equally diverse and completely distinct groups are pooled in equal proportions.

Selection: based on individuals that possess heritable characteristics (e.g., alleles) that give them higher relative survival and/or reproduction in a particular environment. As a result, there will be increased representation of those heritable characteristics in the next generation. Such individuals are said to have higher fitness in that environment.

Stepwise mutation model (SMM): approximation of microsatellite evolution.

Single nucleotide polymorphism (SNP): single base pair locus that varies in the population.

[55], and assessing genomic differentiation over landscapes (landscape genomics) [56]. Selection acts via differential fitness of variants of any heritable characteristic: DNA sequence variation, expression variability due to interaction between environment and multiple genes, or nongenetic inheritance such as epigenetics [57,58]. Partly because of these many modes, there are many signals of selection, but each has a high rate of false positives [59], which can be minimised by using multiple alternative approaches, including q = 1 methods.

For detecting selection, Shannon-based metrics are particularly appropriate, for reasons such as their greater sensitivity (than q = 2) to rare alleles that may be important for conservation management or detecting new (potentially adaptive) alleles [60]. For directional selection, which favours a single advantageous allele [5], Shannon information is proposed as a natural measure of evolvability [38,39]. The ultimate result will be a single allele at 100%, so 1H tends to zero, with dynamics analysed by a logit transformation that is algebraically related to mutual information between the allele distributions before and after selection [39,61] (Box S4). Unlike directional selection, balancing selection maintains equilibrium proportions of two or more alleles, for example due to high heterozygote fitness [5], and the predicted allele proportions can be used to calculate the equilibrium value of Shannon information 1H (Box S4) [35]. In addition, much adaptive variation is based on quantitative traits controlled by multiple loci. Information approaches to directional selection on multilocus traits (Box S4; [62–65]) indicate that such evolution favours gene duplication [66]. The multilocus analogue of balancing selection is called stabilising selection, for which there are also treatments based on information statistics [67].

Detecting when Selection Changes Patterns in either Populations or Expression Profiles

For diploids, departure from single-locus Hardy–Weinberg equilibrium (HWE) due to selection, nonrandom mating and other forces is summarised by FIS in q = 2 form (Box S4) [4,5]. Equivalents for q = 1 include the conventional log-linear-χ² for fit to random-mating expectations (HWE; Box S4), as well as extensions to mixed mating systems, such as mixed selfing and outcrossing, in plants and invertebrates (Boxes S1, S3, and S6) [12,34,68].
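As an illustration of the log-linear χ² fit to HWE mentioned above, the following sketch computes the standard likelihood-ratio statistic G = 2 Σ O ln(O/E) for one biallelic locus. It is a generic textbook calculation and an assumption about what Box S4 contains, not a transcription of it; the function name `hwe_g_statistic` and the genotype counts are hypothetical.

```python
import math

def hwe_g_statistic(n_AA, n_AT, n_TT):
    """Log-linear chi-square (G) for fit of genotype counts to HWE at a biallelic locus."""
    n = n_AA + n_AT + n_TT
    if n == 0:
        raise ValueError("no genotypes sampled")
    p = (2 * n_AA + n_AT) / (2 * n)              # proportion of allele A
    q = 1.0 - p                                  # proportion of allele T
    observed = [n_AA, n_AT, n_TT]
    expected = [n * p * p, 2 * n * p * q, n * q * q]   # random-mating expectations
    # Classes with zero observed count contribute nothing (0 ln 0 = 0)
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)

# Hypothetical sample with a deficit of heterozygotes
print(hwe_g_statistic(40, 20, 40))
```

With three genotype classes and one estimated allele proportion, G is conventionally compared with a χ² distribution on one degree of freedom (critical value about 3.84 at the 5% level), so the heterozygote deficit in the example would be judged a clear departure from HWE.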

Analysis of selection, and other evolutionary processes, requires us to evaluate haplotypes of variants in multiple parts of the genome, including SNPs, insertions, and deletions [40]. Discriminating the individual and combined effects of haplotype elements requires analysis of linkage disequilibrium (LD), which is nonrandom association among such variants, due to physical linkage (true LD; [69]), historical population size and admixture [70], or epistatic selection in which environmental conditions favour particular combinations of variants at multiple positions [7,53,54,71]. LD can be expressed with mutual information I (q = 1) for association between variants at different positions, which minimises false discovery of LD (Boxes S5 and S6) [7,47,53,54,71–73]; as well as analysing binary traits, this approach can analyse multilocus associations of both continuous normal [53] and non-normal [72] traits. It has identified the combined effect of SNPs in two protein-coding loci upon an addictive phenotype [74]. Also, I can be used to assess whether adaptive changes have resulted in different haplotypes in different locations (Box S5) [47,75,76]. Physical linkage is not the only way genes interact, and information methods are used to integrate the potentially selective effects of expression networks, DNA sequence frequencies, and LD [7,71,77,78], for example,

Box 1. Tutorial on the q-Profile: Effective Numbers qD and Entropies qH

Figure I shows three samples of four haploid individuals, each with SNP allele A or T. The first row of histogram bars shows entropies qH, including heterozygosity 2H, and 1H that can be derived from the number of possible novel arrangements of the four sampled individuals, as if one was trying to spell words with their alleles. In the left sample, with no diversity, all qH measures are zero (NB 0 ln 0 = 0). In the right sample, where alleles are equally frequent, each measure is at its maximum possible value with two alleles. The second row of histograms shows the q profile of effective-number diversities qD derived from qH. Note that in moving from the left sample to the middle sample, we add one copy of a new rare allele T. In this case, 0H and 0D show the greatest response, while heterozygosity 2H and 2D show the smallest response, because in Equation 2 when the proportion of a rare allele is squared, its effect becomes smaller or negligible. In contrast, when both alleles are already present, 0H and 0D show no response to changing the numbers of copies of each allele, while 2H and 2D show the greatest response. In both cases 1H and 1D show an intermediate response. Formulas are in Box S1.

Figure II shows diversity profiles of antimicrobial resistance variants within Salmonella typhimurium in humans (blue) and domestic animals (red) (reanalysed from [98], with correction for sampling bias; Box S6). The horizontal axis is the value of q, and the vertical axis is the corresponding qD value. Shaded areas are 95% confidence limits, which overlap except for 0.3 < q < 1.2, where the plots can be discriminated. This highlights the difficulty of comparing profiles if only q = 0 and q = 2 are used. Also note the effect of evenness of observed allelic distributions: if the resistance types had been equally frequent within each host (p1 = p2 = p3 = ... = pS) then the qD profiles would have been flat (dotted lines). However, in each host (animal or human), there is one highly abundant type and many types only recorded once. Therefore, the profiles drop sharply, becoming flat for order q = 2 (heterozygosity or Gini–Simpson), because 2D is mainly determined by the abundant types.
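The contrast between a flat profile and a sharply dropping one can be reproduced with a short sketch. The general-order formula used here, qD = (Σ p_i^q)^(1/(1 − q)) with the limit exp(1H) at q = 1, is the usual definition of effective numbers and is an assumption in the sense that this section spells out only the q = 0, 1, 2 cases (Box S1 has the full definitions). The proportions are invented to mimic one abundant type plus many rare ones; they are not the Salmonella data, and no confidence limits or sampling-bias corrections are computed.

```python
import math

def effective_number(p, q):
    """Effective number qD of order q for a list of proportions that sum to 1."""
    p = [x for x in p if x > 0]
    if abs(q - 1.0) < 1e-9:                      # q = 1: exponential of Shannon entropy
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))

def q_profile(p, qs):
    """Evaluate qD at each order in qs."""
    return [effective_number(p, q) for q in qs]

qs = [0.0, 0.5, 1.0, 1.5, 2.0]
uneven = [0.70] + [0.03] * 10                    # hypothetical: one abundant type, ten rare ones
even = [1.0 / 11] * 11                           # same richness, equal proportions

for q, d_uneven, d_even in zip(qs, q_profile(uneven, qs), q_profile(even, qs)):
    print(f"q = {q:.1f}   uneven qD = {d_uneven:6.2f}   even qD = {d_even:6.2f}")
```

The even sample returns 11 at every order, while the uneven sample starts at 11 for q = 0 and falls to about 2 at q = 2, because the squared proportions are dominated by the single abundant type.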

As well as these α-measures for alleles in a single population, each q-value has measures of diversification among locations, which are the β-measures (Figure III).

[Figure I. Calculating qH and qD. Three samples of four haploid individuals, in order of increasing diversity:

Sample   Alleles   pA, pT       S   Possible arrangements              0H   1H     2H
Left     A A A A   1, 0         1   no rearrangement can alter info    0    0      0
Middle   A A A T   0.75, 0.25   2   4 (e.g., A T A A)                  1    0.56   0.38
Right    A A T T   0.5, 0.5     2   6 (e.g., A T A T)                  1    0.69   0.5

Entropies qH: 0H = S − 1; 1H = −Σ p_i ln p_i; 2H = 1 − Σ p_i² (sums over i = 1 to S). The entropy histograms show each qH divided by the maximum qH value for that S (using pA = pT = 1/S = 1/2). The q profile of number equivalents qD: 0D = S; 1D = exp(1H); 2D = 1/(1 − 2H).]


[Figure II. q Profiles. Horizontal axis: q (0.0 to 2.0). Vertical axis: qD (0 to 120). Curves: estimated (animal); estimated (human); animal, if abundances were even; human, if abundances were even.]

[Figure III. α and β Measures of Entropy and Diversity. ALPHA measures at location 1: 0Hα1, 1Hα1, 2Hα1 and 0Dα1, 1Dα1, 2Dα1. ALPHA measures at location 2: 0Hα2, 1Hα2, 2Hα2 and 0Dα2, 1Dα2, 2Dα2. β (between-location) measures: (q = 0) based on the number of unshared allele types (see Box S1); (q = 1) mutual information I and Shannon differentiation (Box 2); (q = 2) Jost-D etc. (see main text and Box S1).]


helping us predict phenotypic outcomes of RNA modification (splicing) in blood-clotting factors [79]. A partial profile (q = 1,2) proved best for assessing expression patterns [80]. Programmes for analyses are in Box S6.
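To make the use of mutual information I as a measure of LD concrete, the sketch below computes I between the alleles carried at two positions, from a list of phased haplotypes. It is a bare plug-in estimate under stated assumptions: the function name `ld_mutual_information` and the haplotype data are hypothetical, and it includes none of the significance testing or bias correction that the programs in Boxes S5 and S6 may provide.

```python
import math
from collections import Counter

def ld_mutual_information(haplotypes):
    """Mutual information I (natural-log units) between alleles at two loci.

    haplotypes: list of (allele_at_locus_1, allele_at_locus_2) tuples,
    one tuple per sampled DNA molecule.
    """
    n = len(haplotypes)
    joint = Counter(haplotypes)                       # two-locus haplotype counts
    locus1 = Counter(a for a, _ in haplotypes)        # allele counts at locus 1
    locus2 = Counter(b for _, b in haplotypes)        # allele counts at locus 2
    # I = sum over haplotypes of p_ab * ln( p_ab / (p_a * p_b) )
    return sum((c / n) * math.log(c * n / (locus1[a] * locus2[b]))
               for (a, b), c in joint.items())

# Hypothetical data: random association versus complete association
no_ld = [("A", "G")] * 25 + [("A", "C")] * 25 + [("T", "G")] * 25 + [("T", "C")] * 25
full_ld = [("A", "G")] * 50 + [("T", "C")] * 50
print(ld_mutual_information(no_ld), ld_mutual_information(full_ld))
```

When alleles at the two positions combine at random, I is zero; when each allele at one position always travels with the same allele at the other and both are equally frequent, I reaches ln 2, its maximum for two biallelic loci.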

Choosing Measures - Relative Merits of Information-Based Measures and Other Measures

A full diversity profile exploits the sensitivities and strengths of each element (q = 0, 1, 2). Reassuringly, in 85% of studies q = 1 gives qualitatively similar results to q = 0 and/or q = 2 (Box S3), but to understand cases where they differ, or to choose measures for particular applications, we must assess the sensitivities, strengths, and weaknesses of each profile element (q = 0, 1, 2). For example, sensitivity to rare alleles may be an advantage for conservation or adaptation studies (Box 3), but this sensitivity means that missing rare alleles can seriously degrade q = 0 estimates unless appropriate corrections are made (Box S6). The performance of measures must be evaluated for each evolutionary or ecological question, such as change of molecular diversity with latitude, and estimation of dispersal from genetic markers (where mutual information outperforms other methods [12]). Here we point out some major differences between q = 1 versus q = 0 or q = 2.

Shannon results generally accord with biological predictions, but in some cases the full q profileaids detection (or rejection) of a predicted pattern, because the q = 1 measure and othermeasures diverge. Sometimes q = 1 has greater sensitivity than q = 0 or q = 2 measures have(Box S3), as was seen in two studies of a (within-population) diversity due to adaptation anddrift. In an invasive mosquito Aedes japonicus japonicus [81], and in crop carrots affectingnearby wild carrots Daucus carota [82], it was found that 2H (heterozygosity q = 2) was less

Box 2. Measuring Geographic Differentiation with Mutual Information I

Mutual information I (q = 1) expresses association between two variables such as allelic type and population membership [12], making a measure of differentiation among locations. For example, we ask: "Does knowing the alleles in a sample give information about which location was sampled?" This is NOT true in Figure I, because alleles C and T are at identical proportions in the two locations, so there is zero mutual information between the variables 'location' and 'allele type'. Figure II shows maximal mutual information: the location is identified unambiguously from the alleles, which are not shared between the locations.

Figure III outlines calculation of mutual information I for two locations, estuary 1 (allele proportions pC1 + pT1 = p1) and estuary 2 (pC2 + pT2 = p2). Mutual information is I = χ²/(2n), where n is the total sample size, and χ² is the log-linear contingency χ², with expectations for each cell calculated as shown for one example cell in Figure III [48].
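The link between I and the contingency statistic can be written out explicitly. The sketch below assumes that the "log-linear contingency χ²" is the likelihood-ratio (G) statistic described in [48]; the index symbols a (allele, C or T) and j (location, 1 or 2) are introduced here for illustration only. With joint proportions p_{aj} (for example p_{T1}), pooled allele proportions p_a, and location proportions p_j, the expected proportion in each cell is p_a p_j, as in Figure III:

```latex
\[
I \;=\; \sum_{a \in \{C,T\}} \sum_{j \in \{1,2\}} p_{aj}\,\ln\frac{p_{aj}}{p_{a}\,p_{j}},
\qquad
G \;=\; 2\sum_{a}\sum_{j} n\,p_{aj}\,\ln\frac{n\,p_{aj}}{n\,p_{a}\,p_{j}} \;=\; 2n\,I
\quad\Longrightarrow\quad
I \;=\; \frac{G}{2n}.
\]
```

Under this reading, I is the relative entropy between the observed joint proportions and the proportions expected if allele and location were independent.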

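Alternatively, I is calculated as the part of the total information 1Hγ that is not due to variability within single locations 1Hα (following Figure III):

\[
I = {}^{1}H_{\gamma} - {}^{1}H_{\alpha}
  = \left(-p_{C}\ln p_{C} - p_{T}\ln p_{T}\right)
  - 0.5\left(-\frac{p_{C1}}{p_{1}}\ln\frac{p_{C1}}{p_{1}} - \frac{p_{T1}}{p_{1}}\ln\frac{p_{T1}}{p_{1}}\right)
  - 0.5\left(-\frac{p_{C2}}{p_{2}}\ln\frac{p_{C2}}{p_{2}} - \frac{p_{T2}}{p_{2}}\ln\frac{p_{T2}}{p_{2}}\right)
  \quad \text{(Box S1).}
\]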

Mutual information adjusted to range [0,1] is called Shannon differentiation = I/ln K (where K is the number of equal-sized populations; see Box S1 for other cases). Shannon differentiation has useful properties (Box 4).

Figure IV shows expansion to include two habitats: each estuary has a saline habitat whose allelic data are in the foreground of the cube, and data for a brackish habitat in the background [35]. Further expansion of such tables allows calculation of mutual information measures accommodating more than two alleles per locus, plus additional dimensions to investigate diversity among years, landscapes, etc. This is possible because log-linear χ² (and therefore I) are completely additive [35,48]. The partition strategy will depend upon the hypothesis being tested [45,46,48,49]. Programmes, sampling bias corrections, and example calculations are in Box S6.

With equal population sizes, I is the ecologists' Horn measure [37]. Mutual information is closely related to the relative entropy (Kullback–Leibler) that is used for tests for Hardy–Weinberg equilibrium (Box S4) and for comparing allele frequencies before versus after selection, as well as the rate of change of Fisher information [93].
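The calculation described in this box can be sketched in a few lines of Python. This is a minimal illustration, not the programmes of Box S6: the function names and the example count tables are hypothetical, no sampling-bias correction is applied, and the contingency statistic is assumed to be the likelihood-ratio G statistic. Locations are weighted by sample size, which reduces to the 0.5 weights used above when the two samples are equal.

```python
import math

def shannon(props):
    """Shannon entropy (natural log); zero proportions contribute nothing."""
    return -sum(p * math.log(p) for p in props if p > 0)

def mutual_information(counts):
    """I = 1H_gamma - 1H_alpha for a table counts[location][allele]."""
    n = sum(sum(row) for row in counts)
    n_alleles = len(counts[0])
    # Pooled allele proportions (pC, pT) give the total (gamma) entropy.
    pooled = [sum(row[i] for row in counts) / n for i in range(n_alleles)]
    h_gamma = shannon(pooled)
    # Within-location (alpha) entropy, weighted by each location's share of n.
    h_alpha = sum((sum(row) / n) * shannon([c / sum(row) for c in row])
                  for row in counts)
    return h_gamma - h_alpha

def g_statistic(counts):
    """Likelihood-ratio (G) statistic for the same location-by-allele table."""
    n = sum(sum(row) for row in counts)
    col_totals = [sum(row[i] for row in counts) for i in range(len(counts[0]))]
    g = 0.0
    for row in counts:
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            if observed > 0:
                expected = row_total * col_total / n
                g += 2.0 * observed * math.log(observed / expected)
    return g

def shannon_differentiation(counts):
    """I rescaled to [0, 1] by ln K, assuming K equal-sized samples."""
    return mutual_information(counts) / math.log(len(counts))

identical = [[30, 20], [30, 20]]  # same C:T proportions in both estuaries
disjoint = [[50, 0], [0, 50]]     # no shared alleles
partial = [[40, 10], [10, 40]]    # hypothetical intermediate case

for name, table in [("identical", identical), ("disjoint", disjoint),
                    ("partial", partial)]:
    n = sum(sum(row) for row in table)
    print(name,
          round(mutual_information(table), 4),
          round(g_statistic(table) / (2 * n), 4),
          round(shannon_differentiation(table), 4))
```

The first two tables correspond to the situations in Figures I and II: identical proportions give I = 0, and unshared alleles give I = ln 2, so Shannon differentiation is 1. For every table the two routes to I (entropy difference and G/(2n)) agree.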


Figure I. Zero Mutual Information I. (Bar chart not reproduced; allele proportions pC1, pT1, pC2, and pT2, on a scale of 0 to 0.5, for SNP alleles C and T at Loc 1 and Loc 2.)

Figure II. Maximum Mutual Information I. (Bar chart not reproduced; only pC1 and pT2 are nonzero, on a scale of 0 to 0.5, for SNP alleles C and T at Loc 1 and Loc 2.)

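Figure III. Data for Mutual Information between Alleles and Location. (Table not reproduced; rows are locations, e.g., estuary number 1 and 2, with row totals p1 and p2; columns are SNP alleles C and T, with column totals pC and pT; cells are pC1, pT1, pC2, and pT2. Example expectation: E(pT1) = pT p1.)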


sensitive than 1H (Shannon, q = 1) to the predicted loss of variability resulting from small population size and colonisation. This is presumably because 2H emphasises common alleles that have a relatively low chance of being lost during a bottleneck that is caused by either massive mortality or small numbers of invaders. A hierarchical partition of Shannon information (similar to Box 2) in the three-species hybrid moss Sphagnum × falcatulum revealed differentiation due to rare recent mutations to which q = 2 metrics would not have been sensitive [45]. However, q = 1 does not always show the greatest sensitivity; in temporal studies of zooplankton, q = 0 was more powerful than q = 1, which was more powerful than q = 2 [28]. Examples in this paragraph demonstrate the importance of analysing the entire q profile including q = 1.

While recognising the importance of the entire q profile, Table 1 and Boxes 3 and 4 show that only q = 1 measures combine a number of desirable characteristics for tracking evolutionarily important phenomena. First, Shannon measures are sensitive to alleles according to their frequency, yet have minimal sampling problems (Table 1, Boxes 1 and 3). Second, measures of each q order must be able to distinguish levels of diversity that are individually important for evolution and conservation [19,20]: within-locality (α), among-locality differentiation (β), and total (γ). Eliminating dependencies between these levels for many common q = 2 methods requires more complex equations or algorithms (e.g., [42]), which might complicate predictions. In contrast, the explicit hierarchical nesting of information measures (q = 1, Figure 6 in [10]) yields independent estimates of within-population (α), among-population (β), and total (γ) levels of diversity (Table 1, Box 4). Disagreements between q = 1 and q = 2 β-measures might sometimes derive from these dependencies for q = 2 [82,83]. Third, some q = 1 measures respond in an intuitively appealing fashion to addition of new alleles, behaving in a predictable

Figure IV. Partitioning Molecular Information with Two Variables (Location and Habitat). (Cube diagram not reproduced; axes are location, e.g., estuary number 1 and 2, SNP allele C and T, and habitat, saline in the foreground and brackish in the background.)


Figure 1. Relationship of Mutual Information I to Population Size and Dispersal, from Simulations and Living Populations. (A) Observed mutual information I per locus from simulated microsatellite data, used to estimate dispersal (Nem) via FST or I. Root mean square error (RMSE) of Nem is plotted against dispersal rate, for several different effective population sizes (Ne). RMSE is similar to variance, except it assesses departure of (simulated) observations from the equation's prediction, rather than from the mean of the observations. In every case, Nem calculated from I had the lower RMSE (from [12]). (B) Comparison of analytical predictions of mutual information I with observed single nucleotide polymorphism mutual information in Drosophila fly dispersal experiments with known Nem. For the predictions, a mutation rate of μ = 10⁻⁶ was assumed, but using μ = 10⁻³ to 10⁻⁹ made little difference to the predicted values of I (modified from [36]). (Plots not reproduced. Panel A: RMS error, 0 to 6, against dispersal m, 0 to 0.4, with separate curves for I and FST at Ne = 10, 100, and 1000. Panel B: observed and predicted I, 0 to 0.10, against dispersal m, 0 to 0.04.)

Box 3. Choosing a Measure for Within-Population Diversity (α)

Elements of the diversity profile from q = 0 to q = 2 have various strengths and weaknesses, being sensitive to aspects of diversity relevant to different questions (Boxes 1 and S1). Measures based on counts of different types of alleles (or species, q = 0) are sensitive to rare alleles (Box 1), which is important when focusing on rarity, for conservation reasons, or because novel adaptive mutants are initially rare [4,5,60]. In contrast, q = 2 measures such as heterozygosity (2Hα, Equation 2, Box 1) give little weight to rare alleles, because they are based on the chance of choosing two different types, given the proportions pi available. This involves squaring pi values, so that values close to zero (rare alleles) become vanishingly small. The intermediate value of q (q = 1, 1Hα, Shannon information, Equation 1, Box 1) weights each allele by its frequency, so its response to addition of a single novel allele is intermediate between the responses of q = 2 and q = 0 measures (Box 1).

These sensitivities to rare alleles result in different sampling properties. Counts of different types (q = 0) are changed considerably by either adding a single individual of a different type, or else failing to sample this single individual. As a result, even after sampling corrections, these measures often cannot distinguish diversity levels of assemblages that can be discriminated by either q = 1 or q = 2 scales (Box 1). In contrast, q = 2 methods such as heterozygosity/Simpson's have relatively few sampling problems. Sampling for Shannon information (q = 1) can be well addressed by modern corrections (Box S6), so that it can distinguish between alternative assemblages (Box 1).

Frequency of use is important for comparability between studies. In community diversity, Shannon is the most commonly used abundance-sensitive measure (q = 1, 1Hα [11]), whereas heterozygosity (q = 2) is most commonly used in molecular ecology [4,5]. As the two fields gradually unify [13,21], it will become important to use measures common to both fields, and we suggest that a profile of q = 0, 1, and 2 will achieve this best.

Finally, there is extensive literature on neutral and adaptive processes in molecular ecology for both q = 0 and q = 2 α-measures [4,5]. More recently, it has been proposed that the q = 1 measure Shannon information 1Hα provides a natural measure of evolvability [38,39], and Shannon predictive methods are being developed for both neutral [12,36,37] and adaptive variants [35,62–65,67].
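The contrast in sensitivity to a rare allele can be made concrete with a small calculation. The following sketch assumes the entropy definitions used in this article (0H = S − 1, 1H = −Σ pi ln pi, 2H = 1 − Σ pi²); the function name and the allele frequencies are hypothetical and chosen only for illustration.

```python
import math

def entropy_profile(props):
    """Return (0H, 1H, 2H) for a list of allele proportions."""
    present = [p for p in props if p > 0]
    h0 = len(present) - 1                          # allelic richness minus one
    h1 = -sum(p * math.log(p) for p in present)    # Shannon information
    h2 = 1.0 - sum(p * p for p in present)         # heterozygosity
    return h0, h1, h2

base = [0.5, 0.5]                # two equally common alleles
with_rare = [0.495, 0.495, 0.01] # the same, plus one novel allele at 1%

before = entropy_profile(base)
after = entropy_profile(with_rare)

# Proportional increase in each measure caused by the single rare allele.
relative_change = [(a - b) / b for a, b in zip(after, before)]
for q, change in enumerate(relative_change):
    print(f"q = {q}: relative increase {change:.3f}")
```

In this example the rare allele doubles the q = 0 measure, raises the q = 1 measure by a few percent, and changes the q = 2 measure least, which is the ordering described above.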


Table 1. Relative Sensitivities, Strengths, and Weaknesses of Each Element of the Diversity Profile from q = 0 to q = 2 (Boxes 1 and S1)

Columns: value of q; interpretation of α (within-population) and β (between-population) measures; advantages; disadvantages; possible fixes for disadvantages.

q = 0, within-population measure 0Hα
Interpretation: One less than the number of different types of alleles (or species) (Box 1).
Advantages: More sensitive to rare alleles than q = 1 or q = 2.
Disadvantages: Serious sampling problems, because very sensitive to rare alleles.
Possible fixes: Sampling problems somewhat alleviated by method in Box S6.

q = 0, between-population (β) measures
Interpretation: Several measures (Box S1), all based on the proportion of allelic types that are not shared by two populations.
Advantages: More sensitive to rare alleles than q = 1 or q = 2.
Disadvantages: Serious sampling problems, because very sensitive to rare alleles.
Possible fixes: Sampling problems somewhat alleviated by method in Box S6.

q = 1, within-population measure 1Hα
Interpretation: Number of ways the array of different alleles could be set out, given the relative fractions of different alleles, pi, available (Box 1). More formally, higher 1H means that there is reduced certainty about what type to expect when ONE single allele is randomly sampled.
Advantages: The most commonly used abundance-sensitive measure in community diversity. Sensitivity to each allele according to its frequency, i.e., each copy of each allele is treated equally. Shannon information provides a natural measure of evolvability.
Disadvantages: Some sampling problems.
Possible fixes: Sampling problems fixed by method in Box S6.

q = 1, between-population (β) measures
Interpretation: Several measures (Box S1), all related to whether knowing the allelic type of one single individual will help identify the location from which it was sampled (e.g., if there is no differentiation, then the allelic information is completely uninformative, but if there is complete differentiation, then knowing the allelic type would give certain identification of the location of origin).
Advantages: Shannon differentiation satisfies monotonicity (some other transformations for q = 1 do not). Shannon differentiation (Box 2) satisfies true dissimilarity, which means that the differentiation measure should represent the actual proportion of nonoverlapping alleles, when populations are equally diverse and all alleles have the same frequencies. Some other transformations for q = 1 do not satisfy this.
Disadvantages: Some sampling problems.
Possible fixes: Sampling problems fixed by method in Box S6.

q = 2, within-population measure 2Hα
Interpretation: Chance of choosing TWO different types, given the relative fractions of different types, pi, available (Box 1).
Advantages: Very sensitive to frequent types. Relatively few sampling problems.
Disadvantages: Insensitive to rare types that may be important in conservation and long-term evolution.
Possible fixes: (none listed)

q = 2, between-population (β) measures
Interpretation: Many measures (Box S1), all related to whether two individuals sampled from different localities are likely to have different allelic types (e.g., if there is no differentiation, they must be the same allelic type, but if there is complete differentiation, they must be different types).
Advantages: Very sensitive to frequent types. Relatively few sampling problems. Jost-D (Box S1) satisfies true dissimilarity, which means that the differentiation measure should represent the actual proportion of nonoverlapping alleles, when populations are equally diverse and all alleles have the same frequencies. Some other transformations for q = 2 do not satisfy this.
Disadvantages: Insensitive to rare types that may be important in conservation and long-term evolution. Most q = 2 measures confound among-population (β) diversity with one or both of within-population (α) diversity and total (γ) diversity. These are cases of the dependence-on-α problem and the dependence-on-γ problem. Most q = 2 measures lack monotonicity, i.e., apparent difference between two populations can decrease when a new unshared allele appears in a population, wrongly implying that this reduces the pace of speciation. The q = 2 differentiation measure Jost-D (Box S1) does not possess the expected monotonicity. FST and GST (Box S1), which are also sometimes used as β-differentiation measures, do not satisfy monotonicity.
Possible fixes: Jost-D (Box S1) fixes the dependence-on-α problem. A measure to fix the dependence-on-γ problem is cited in the main text.


Box 4. Choosing an Among-Population Differentiation Measure (β)

Among-population measures (q = 0, 1, and 2) inherit virtues and shortfalls of corresponding α-measures (Box 3), plus properties specific to the β level. For three reasons, the q = 1 measures might be the best tools to track and understand evolutionary processes of divergence.

First, many measures used as β-differentiation measures actually confound this with within-population (α) or total (γ) diversity. For example, FST and GST (q = 2) tend towards zero as α diversity increases [99,100], as does the related measure in Figure I. In contrast, Shannon differentiation (q = 1, Box 2) and Jost-D (q = 2, Box S1) are zero only when allele frequencies are identical across localities, and unity only when there are no shared alleles. A measure has been proposed that avoids dependence on γ [101].

Second, to track evolving divergence among localities, there should be monotonicity: a differentiation measure should never decrease when shared alleles are replaced or augmented by new unshared alleles. Surprisingly, most differentiation measures derived from diversity partitioning do not meet this basic criterion, wrongly implying that new unshared alleles can reduce the pace of differentiation. For example, the additive q = 2 differentiation in Figure I (Panel A, green solid line) increases then decreases as differentiation increases, as do most q = 2 measures (e.g., FST and Jost-D, Box S1), and even some q = 1 measures. The only differentiation measures with strong monotonicity are q = 1 Shannon differentiation (Box 2), and those based on q = 0, though the latter are less desirable because of their sampling problems.

A third important property of differentiation measures is true dissimilarity, which means that the differentiation measure should represent the actual proportion of nonoverlapping alleles, when populations are equally diverse and all alleles have the same frequencies. This is untrue for many q = 1 or q = 2 measures, but is true for the q = 1 measure Shannon differentiation (Box 2) and for q = 2 differentiation Jost-D [29].
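The monotonicity and true-dissimilarity properties can be checked numerically. The sketch below assumes one simple scenario, which is not necessarily the scheme used for the figure in [99]: two equally weighted populations share two alleles, each gains k private alleles, and all alleles within a population are equally frequent. The function names are hypothetical. It compares Shannon differentiation (I/ln 2) with the additive q = 2 measure 2Hγ − 2Hα.

```python
import math

def h1(props):
    """Shannon entropy (q = 1)."""
    return -sum(p * math.log(p) for p in props if p > 0)

def h2(props):
    """Heterozygosity (q = 2)."""
    return 1.0 - sum(p * p for p in props)

def build(k):
    """Two populations sharing alleles s1 and s2, each with k private alleles."""
    f = 1.0 / (2 + k)  # assumption: equal frequencies within each population
    pop1 = {"s1": f, "s2": f, **{f"a{i}": f for i in range(k)}}
    pop2 = {"s1": f, "s2": f, **{f"b{i}": f for i in range(k)}}
    pooled = {}
    for pop in (pop1, pop2):
        for allele, p in pop.items():
            pooled[allele] = pooled.get(allele, 0.0) + 0.5 * p
    return [pop1, pop2], pooled

def shannon_differentiation(k):
    pops, pooled = build(k)
    alpha = sum(0.5 * h1(pop.values()) for pop in pops)
    return (h1(pooled.values()) - alpha) / math.log(2)

def additive_q2(k):
    pops, pooled = build(k)
    alpha = sum(0.5 * h2(pop.values()) for pop in pops)
    return h2(pooled.values()) - alpha

ks = range(7)
shannon_values = [shannon_differentiation(k) for k in ks]
q2_values = [additive_q2(k) for k in ks]
for k in ks:
    print(k, round(shannon_values[k], 4), round(q2_values[k], 4))
```

Under these assumptions Shannon differentiation rises steadily with k and equals k/(2 + k), the proportion of alleles in each population that are unshared, whereas the additive q = 2 measure peaks at k = 2 and then declines even though the populations continue to diverge.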

Figure I. How Differentiation Measures Respond to Increasing Allelic Differentiation. Initially two populations shared the same two alleles. The horizontal axis shows addition of extra unshared alleles to each locality. Equations and symbols are shown in Box S1, except that for q = 0 the entropy-based measure is Sγ − (Sα1 + Sα2)/2, and for q = 2 it is (2Hγ − 2Hα) = GST × 2Hγ. In panel A, the vertical axis shows percent of the maximum observed value for that order of q (modified from [99]). (Plots not reproduced. Panel A, entropy-based: percent of maximum against increasing differentiation. Panel B, numbers-equivalent-based: effective number against increasing differentiation. Each panel has curves for q = 0, q = 1, and q = 2.)


way with the number of unshared alleles, and satisfying the principles of strong monotonicity and replication (Table 1, Boxes 3 and 4). Replication [15,84] means that a measure increases linearly when equally diverse and completely distinct groups are pooled in equal proportions, as seen in effective numbers (qD, Boxes 1 and S1). Information measures also have minimal sampling problems if appropriate estimation methods are used (Box S6). In contrast, a drawback of q = 0 measures is that although they can be forecast under specified conditions [5], the sampling to test these forecasts can be hampered by their extreme sensitivity to rare alleles (Boxes 1, 3, 4, S1, and S2).

Additionally, we have shown above that Shannon measures are catching up in areas where they have lagged behind q = 0 and q = 2: forecasting methods are now available for q = 1 measures of neutral and adaptive variation, and the frequency of use of q = 1 methods in molecular ecology is rising to meet their already heavy use in community ecology [11]. Also, we have shown that Shannon (q = 1) methods outperform others for genetic estimation of dispersal [12], and have great utility in detecting selection and gene expression patterns [76,77]. Thus, we should use information methods as important contributors to the diversity profile.

The Near Future: Challenges, Opportunities and Integration

The major components of molecular information theory are now established, but questions remain (see Outstanding Questions). In the long term, the most useful measure will undoubtedly be the whole qD profile (0D, 1D, 2D) at α, β, and γ levels. This will maximise understanding of patterns, and allow meta-analysis of the q-profile performance under a wide variety of conditions.

To complete an information-based strategy for forecasting and analysis in molecular ecology and evolution, Shannon's strong performance with dispersal, mutation, and drift (earlier) must be extended to include asymmetric dispersal, unequal or changing population sizes, and variants with mutation not described by models such as infinite (IAM), stepwise (SMM), or biallelic (SNP). This could be through new theory, or via Approximate Bayesian Computation using the q profile [12,27]. It would also be ideal to develop an analogue to AMOVA [44], based on information-theoretic methods. This might build upon a molecular phylogenetic differentiation measure for all q [85], based on the neutral coalescent of evolutionary biology [4].

As well as neutral predictions, we are only beginning to explore the capacity of information science to integrate all analyses of adaptation and selection into a common scale, possibly exploiting the connection to logit (log-linear) modelling. Moreover, sensitivity of tests for loci under selection is likely to increase with a q = 1 approach, because q = 1 is more sensitive to novel rare variants that are crucial in adaptation [60], and because many of these selection tests rely upon partition of among-population (β) versus within-population (α) measures of diversity [55,56,86], and so should benefit from the complete independence of α and β in the q = 1 scale.

Understanding adaptation requires integrating all aspects of biology from subcellular biology to habitat tolerance, and information theory is particularly well suited to this challenge [6,8,9], being both explicitly additive and hierarchical, as well as being a general forecasting method [1], already used in community ecology [2]. Community ecologists have borrowed genetic q = 2 theory for both neutral [13] and adaptive genes [21], so community ecology might also benefit from deploying the q = 1 neutral and selective theory outlined in this article. The q = 1 approaches will likely be able to make even more forecasts than with q = 2, by virtue of the fact that Shannon applies to all types of information [87], including physical habitat, behaviour [88], genetics [12,35–37], information within each sequence including codon usage biases [89], and various nongenetic inheritance modalities including epigenetics [57,58]. Moreover,
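The replication principle can be illustrated with a short calculation. This sketch assumes the standard Hill-number form of effective numbers, qD = (Σ pi^q)^(1/(1−q)) with 1D = exp(1H) [18], which is taken here to be what Box S1 defines; the function name and the frequencies are hypothetical.

```python
import math

def hill_number(props, q):
    """Effective number of types of order q (Hill number)."""
    present = [p for p in props if p > 0]
    if q == 1:
        # Limit as q -> 1: exponential of Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in present))
    return sum(p ** q for p in present) ** (1.0 / (1 - q))

# One group with three alleles, and a pool of two completely distinct groups
# that have the same frequency pattern, mixed in equal proportions.
group = [0.6, 0.3, 0.1]
pooled = [0.5 * p for p in group] + [0.5 * p for p in group]

for q in (0, 1, 2):
    print(q, round(hill_number(group, q), 4), round(hill_number(pooled, q), 4))

# Heterozygosity itself (an entropy, not an effective number) does not double.
het_group = 1.0 - sum(p * p for p in group)
het_pooled = 1.0 - sum(p * p for p in pooled)
print(round(het_group, 4), round(het_pooled, 4))
```

For each order q the pooled effective number is exactly twice that of a single group, whereas the q = 2 entropy (heterozygosity) rises by less than a factor of two, which is why replication is stated in terms of effective numbers.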


information has been proposed as the driver of any correlations between species and molecular diversity [90]. Incorporating information on genetic and/or functional similarity or difference of alleles (or species) greatly improves the utility of diversity measures [84,91], and ongoing attempts to incorporate function, without violating fundamental diversity principles, should include Shannon methods [49,84,92].

We have only presented a limited aspect of information theory, but we could paint on a much broader canvas. Other potentially fruitful aspects of information theory include: Kullback–Leibler or relative entropy used in selection analysis; Fisher information used to compare (q = 2) measures [93]; fractals [94]; and compression algorithms proposed for both selection analysis [64] and for joint estimation of sequence alignments and phylogenies [95]. Finally, predictive equations for molecular Shannon information might be used to advance evolutionary computing (machine learning), in which the performance of the code is optimised by a process similar to biological drift and selection [77,88,96,97].

Concluding Remarks

Four common processes (dispersal, random drift, adaptation, and generation of novelty through speciation, mutation, and recombination) unify community and molecular ecology [13,21]. The unification of these fields will be accelerated by cross-fertilisation between their predictive equations and measures, at levels from bioinformatics to physical landscapes and beyond, especially when using a diversity profile of q = 0, 1, 2, and adhering to criteria for robust measures (see 'Choosing Measures ...' section). Shannon information measures (q = 1) are a vital part of this profile, especially given that they now have predictive equations available, and excel at providing intuitive results.

Acknowledgements

We thank reviewers and colleagues who read drafts and made valuable comments: J. Bragg, C. Brandenburger, J. Byrnes, L. Carmel, D. Chabanne, H. Daly, F. Jabot, M. Kidd, A. Kopps, E. Kosman, T. Leinster, O. Manlik, B. Maslen, D. NcNevin, A. Moles, R. Nichols, G. O'Reilly, R. Peakall, K. Raphael, R. Reeve, L. Rollins, R. Shofner, M. Stear, P. Sunnucks, and M. Tanaka. TREE editor P. Craze also made extremely valuable suggestions. W.S., A.C., and L.J. first met in 2012 at a workshop coordinated by Leinster Reeve et al., on 'The Mathematics of Biodiversity' at the Centre de Recerca Matemàtica, Universitat Autònoma de Barcelona, funded by that centre, the Spanish government, and the BBSRC (International Workshop Grant BB/J020567/1).

Supplemental Information

Supplemental information associated with this article can be found, in the online version, at https://doi.org/10.1016/j.tree.2017.09.012.

References

1. Cover, T.M. and Thomas, J.A. (1991) Elements of Information Theory, Wiley

2. Elith, J. et al. (2011) A statistical explanation of MaxEnt for ecologists. Divers. Distrib. 17, 43–57

3. Bessa, R.J. et al. (2009) Entropy and correntropy against minimum square error in offline and online three-day ahead wind power forecasting. IEEE Trans. Power Syst. 24, 1657–1666

4. Nielsen, R. and Slatkin, M. (2013) An Introduction to Population Genetics Theory and Applications, Sinauer

5. Halliburton, R. (2004) Introduction to Population Genetics, Pearson

6. Searls, D.B. (2010) The roots of bioinformatics. PLoS Comput. Biol. 6, e1000809

7. Moore, J.H. and Hu, T. (2015) Epistasis analysis using information theory. In Epistasis: Methods and Protocols, Methods in Molecular Biology (Moore, J.H. and Williams, S.M., eds), Springer

8. Glazebrook, J.F. and Wallace, R. (2012) 'The frozen accident' as an evolutionary adaptation: a rate distortion theory perspective on the dynamics and symmetries of genetic coding mechanisms. Informatica 36, 53–73

9. Skene, K.R. (2015) Life's a gas: a thermodynamic theory of biological evolution. Entropy 17, 5522–5548

10. Shannon, C.E. (1948) A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423, 623–656

11. Buddle, C.M. et al. (2004) The importance and use of taxon sampling curves for comparative biodiversity research with forest arthropod assemblages. Can. Entomol. 137, 120–127

12. Sherwin, W.B. et al. (2006) Measurement of biological information with applications from genes to landscapes. Mol. Ecol. 15, 2857–2869

13. Hubbell, S.P. (2001) The Unified Neutral Theory of Biodiversity and Biogeography, Princeton University Press

14. Jost, L. (2006) Entropy and diversity. Oikos 113, 363–375

15. Chao, A. et al. (2014) Unifying species diversity, phylogenetic diversity, functional diversity, and related similarity and differentiation measures through Hill numbers. Annu. Rev. Ecol. Evol. Syst. 45, 297–324

Outstanding Questions

Regular use of a formal qD profile of at least q = 0, 1, and 2 measures at each level (α, β, and γ) will achieve maximum understanding of patterns, and will later allow meta-analysis of the performance of the molecular q profile under a wide variety of conditions.

Neutral forecasts need to be extended to complex scenarios, such as asymmetric dispersal, unequal population sizes, and bottlenecks including possible separate effects of actual and effective population size.

We need an analogue to AMOVA based on information-theoretic methods.

There is a need for new predictive theory for variants with mutation not described by models such as infinite (IAM), stepwise (SMM), and biallelic SNPs.

Further analyses of adaptation and selection with q = 1 approaches will profit from three attributes of Shannon: similarity to logit (log-linear) modelling, sensitivity to rare novel variants that are crucial in adaptation, and independence of α and β.

Analyses of LD and expression, which already use measures related to mutual information, might benefit from using its transform to Shannon differentiation.

Integrating all biological levels from community ecology and evolution, through to subcellular biology, can capitalise on existing protocols for using information and entropy methods as general forecasting methods. This will include incorporating information on genetic and/or functional similarity or difference of alleles (or species). Appropriate measures are in the software iDIP (information-based diversity partitioning) at https://chao.shinyapps.io/iDIP/ [49].

Information and entropy theory is broad, with an abundance of other possible connections within and outside biology.


16. Gregorius, H.R. and Kosman, E. (2017) On the notion of disper-sion: from dispersion to diversity. Methods Ecol. Evol. 8, 278–287

17. Pielou, E.C. (1966) The measurement of diversity in differenttypes of biological collections. J. Theor. Biol. 13, 131–144

18. Hill, M.O. (1973) Diversity and evenness: a unifying notation andits consequences. Ecology 54, 427–432

19. Anderson, M. et al. (2011) Naviagating the multiple meanings ofb diversity: a roadmap for the practising ecologist. Ecol. Lett. 14,19–28

20. Socolar, J.B. et al. (2016) How should b-diversity inform biodi-versity conservation? Trends Ecol. Evol. 31, 67–80

21. Vellend, M. (2016) The Theory of Ecological Communities (MPB-57), Princeton University Press

22. Luikart, G. et al. (1998) Distortion of allele frequency distributionsprovides a test for recent population bottlenecks. J. Hered. 89,238–247

23. Tajima, F. (1989) Statistical method for testing the neutral muta-tion hypothesis by DNA polymorphism. Genetics 123, 585–595

24. Fu, Y.-X. and Li, W.-H. (1993) Statistical tests of neutrality of mutations. Genetics 133, 693–709

25. Greenbaum, G. et al. (2014) Allelic richness following population founding events – a stochastic modeling framework incorporating gene flow and genetic drift. PLoS One 9, e115203

26. Lloyd, M.W. et al. (2013) The power to detect recent fragmentation events using genetic differentiation methods. PLoS One 8, e63981

27. Huang, W. et al. (2011) Approximate Bayesian comparative phylogeographic inference from multiple taxa and multiple loci with rate heterogeneity. BMC Bioinformatics 12, 1

28. Iacchei, M. et al. (2017) It’s about time: insights into temporal genetic patterns in oceanic zooplankton from biodiversity indices. Limnol. Oceanogr. 62, 1836–1852

29. Jost, L. (2008) Gst and its relatives do not measure differentiation. Mol. Ecol. 17, 4015–4026

30. Crow, J.F. (2001) Shannon’s brief foray into genetics. Genetics 159, 915–917

31. Ewens, W.J. (1972) The sampling theory of selectively neutral alleles. Theor. Popul. Biol. 3, 87–112

32. Lewontin, R.C. (1972) The apportionment of human diversity. Evol. Biol. 6, 381–398

33. Brown, A.H.D. and Weir, B.S. (1983) Measuring genetic variability in plant populations. In Isozymes in Plant Genetics and Breeding (Tanksley, S.D. and Orton, T.J., eds), pp. 219–239, Elsevier Science Publications

34. Meirmans, P.G. and Van Tienderen, P.H. (2004) GENOTYPE and GENODIVE: two programs for the analysis of genetic diversity of asexual organisms. Mol. Ecol. Notes 4, 792–794

35. Sherwin, W.B. (2010) Entropy and information approaches to genetic diversity and its expression: genomic geography. Entropy 12, 1765–1798

36. Dewar, R.C. et al. (2011) Predictions of single-nucleotide polymorphism differentiation between two populations in terms of mutual information. Mol. Ecol. 20, 3156–3166

37. Chao, A. et al. (2015) Expected Shannon entropy and Shannon differentiation between subpopulations for neutral genes under the finite island model. PLoS One 10 (6), e0125471

38. Day, T. (2015) Information entropy as a measure of genetic diversity and evolvability in colonization. Mol. Ecol. 24, 2073–2083

39. Frank, S.A. (2017) Universal expressions of population change by the Price equation: natural selection, information, and maximum entropy production. Ecol. Evol. 7 (10), 3381–3396

40. Chin, S.-C. et al. (2016) Phased diploid genome assembly with single molecule real-time sequencing. bioRxiv http://dx.doi.org/10.1101/056887

41. Rossetto, M. et al. (2008) Dispersal limitations, rather than bottlenecks or habitat specificity, can restrict the distribution of rare and endemic rainforest trees. Am. J. Bot. 95, 321–329

42. Hedrick, P.W. (2005) A standardized genetic differentiation measure. Evolution 59, 1633–1638

43. Pritchard, J.K. et al. (2000) Inference of population structure using multilocus genotype data. Genetics 155, 945–959

44. Excoffier, L. et al. (1992) Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131, 479–491

45. Karlin, E.F. and Smouse, P.E. (2017) Allo-allo-triploid Sphagnum × falcatulum: single individuals contain most of the Holantarctic diversity for ancestrally indicative markers. Ann. Bot. 120, 221–231

46. Smouse, P.E. et al. (2015) An informational diversity analysis framework, illustrated with sexually deceptive orchids in early stages of speciation. Mol. Ecol. Resour. 15, 1375–1384

47. Smouse, P.E. and Ward, R.H. (1978) A comparison of the genetic infrastructure of the Yecuana and the Yanomama: a likelihood analysis of genotypic variation among populations. Genetics 33, 611–631

48. Sokal, R.R. and Rohlf, F.J. (1995) Biometry: The Principles and Practice of Statistics in Biological Research, Freeman

49. Chao, A. and Chiu, C-H. (2017) iDIP (Information-based Diversity Partitioning) Online: Software for partitioning Shannon diversity and phylogenetic diversity under multi-level hierarchical structures. Program and User’s Guide. http://chao.stat.nthu.edu.tw/wordpress/software_download/ https://chao.shinyapps.io/iDIP/

50. Rossetto, M. et al. (2011) The impact of distance and a shifting temperature gradient on genetic connectivity across a heterogeneous landscape. BMC Evol. Biol. 11, 126

51. Karlin, E.F. et al. (2011) One haploid parent contributes 100% of the gene pool for a widespread species in northwest North America. Mol. Ecol. 20, 753–767

52. Houde, A.L.S. (2016) Restoring species through reintroductions: strategies for source population selection. Restor. Ecol. 23, 746–753

53. Chanda, P. et al. (2009) Information-theoretic gene-gene and gene-environment interaction analysis of quantitative traits. BMC Genomics 10, 509

54. Chanda, P. et al. (2008) Ambience: a novel approach and efficient algorithm for identifying informative genetic and environmental associations with complex phenotypes. Genetics 180, 1191–1210

55. Schlötterer, C. et al. (2015) Combining experimental evolution with next-generation sequencing: a powerful tool to study adaptation from standing genetic variation. Heredity 114, 431–440

56. Vandepitte, K. et al. (2014) Rapid genetic adaptation precedes the spread of an exotic plant species. Mol. Ecol. 23, 2157–2164

57. Danchin, E. et al. (2011) Beyond DNA: integrating inclusive inheritance into an extended theory of evolution. Nature Rev. Genet. 12, 475–486

58. Bonduriansky, R. (2012) Rethinking heredity, again. Trends Ecol. Evol. 27, 330–336

59. Jensen, J.D. et al. (2005) Distinguishing between selective sweeps and demography using DNA polymorphism data. Genetics 170, 1401–1410

60. Rollins, L.A. et al. (2016) Selection on mitochondrial variants occurs between and within individuals in an expanding invasion. Mol. Biol. Evol. 33, 995–1007

61. Vuong, H.B. et al. (2017) Influences of host community characteristics on Borrelia burgdorferi infection prevalence in black-legged ticks. PLoS One 12, e0167810

62. Held, T. et al. (2014) Adaptive evolution of molecular phenotypes. J. Stat. Mech. http://dx.doi.org/10.1088/1742-5468/2014/09/P09029 stacks.iop.org/JSTAT/2014/P09029

63. Riedel, N. et al. (2015) Multiple-line inference of selection on quantitative traits. Genetics 201, 305–322

64. Kobayashi, T.J. and Sughiyama, Y. (2015) Fluctuation relations of fitness and information in population dynamics. Phys. Rev. Lett. 115, 238102

65. de Vladar, H.P. and Barton, N.H. (2011) The contribution of statistical physics to evolutionary biology. Trends Ecol. Evol. 26, 424–432

962 Trends in Ecology & Evolution, December 2017, Vol. 32, No. 12

66. Saito, N. et al. (2014) Evolution of genetic redundancy: the relevance of complexity in genotype–phenotype mapping. New J. Phys. 16, 063013

67. Nourmohammad, A. et al. (2013) Evolution of molecular phenotypes under stabilizing selection. J. Stat. Mech. Theor. Exp. 2013, P01012

68. Kosman, E. (2014) Measuring diversity: from individuals to populations. Eur. J. Plant Pathol. 138, 467–486

69. Wegmann, D. et al. (2011) Recombination rates in admixed individuals identified by ancestry-based inference. Nat. Genet. 43, 847–853

70. Holleley, C.E. et al. (2014) Testing single-sample estimators of effective population size in genetically structured populations. Conserv. Genet. 15, 23–35

71. Wu, C. et al. (2012) Genetic association studies: an information content perspective. Curr. Genomics 13, 566–573

72. Yee, J. et al. (2015) Detecting genetic interactions for quantitative traits using spacing entropy measure. Biomed Res. Int. 2015, ID-523641

73. Zhang, L. et al. (2009) A multilocus linkage disequilibrium measure based on mutual information theory and its applications. Genetica 137, 355–364

74. Isir, A.B. et al. (2016) An information theoretical study of the epistasis between the CNR1 1359 G/A polymorphism and the Taq1A and Taq1B DRD2 polymorphisms: assessing the susceptibility to cannabis addiction in a Turkish population. J. Mol. Neurosci. 58, 456–460

75. Smouse, P. (1974) Likelihood analysis of recombinational disequilibrium in multiple locus gametic frequencies. Genetics 76, 557–565

76. Yoder, J.B. et al. (2014) Genomic signature of adaptation to climate in Medicago truncatula. Genetics 196, 1263–1275

77. Moore, J.H. et al. (2017) Grid-based stochastic search for hierarchical gene-gene interactions in population-based genetic studies of common human diseases. BioData Min. 10, 19

78. Wang, S. et al. (2017) Integrative information theoretic network analysis for genome-wide association study of aspirin exacerbated respiratory disease in Korean population. BMC Med. Genomics 10 (Suppl 1), 31

79. von Kodolitsch, Y. et al. (2006) Predicting severity of haemophilia A and B splicing mutations by information analysis. Haemophilia 12, 258–262

80. Wang, K. et al. (2014) Differential Shannon entropy and differential coefficient of variation: alternatives and augmentations to differential expression in the search for disease-related genes. Int. J. Comput. Biol. Drug Des. 7, 183

81. Egizi, A. and Fonseca, D.M. (2015) Ecological limits can obscure expansion history: patterns of genetic diversity in a temperate mosquito in Hawaii. Biol. Invasions 17, 123–132

82. Mandel, J.R. et al. (2016) Patterns of gene flow between crop and wild carrot, Daucus carota (Apiaceae) in the United States. PLoS One 11, e0161971

83. Cooke, G.M. et al. (2016) Understanding the spatial scale of genetic connectivity at sea: unique insights from a land fish and a meta-analysis. PLoS One 11 (5), e0150991

84. Leinster, T. and Cobbold, C. (2012) Measuring diversity: the importance of species similarity. Ecology 93, 477–489

85. Chiu, C.-H. et al. (2014) Phylogenetic beta diversity, similarity, and differentiation measures based on Hill numbers. Ecol. Monogr. 84, 21–44

86. Beaumont, M. and Nichols, R.A. (1996) Evaluating loci for use in the genetic analysis of population structure. Proc. R. Soc. Lond. 263, 1619–1626

87. Bergstrom, C.T. and Lachmann, M. (2005) The fitness value of information. arXiv q-bio/0510007v1 [q-bio.PE]

88. Dodig-Crnkovica, G. (2017) Nature as a network of morphological infocomputational processes for cognitive agents. Eur. Phys. J. Special Topics 226, 181

89. Zeeberg, B. (2002) Shannon information theoretic computation of synonymous codon usage biases in coding regions of human and mouse genomes. Genome Res. 12, 944–955

90. Mashayekhi, M. et al. (2014) A machine learning approach to investigate the reasons behind species extinction. Ecol. Inform. 20, 58–66

91. Chao, A. et al. (2010) Phylogenetic diversity measures based on Hill numbers. Philos. Trans. R. Soc. B 365, 3599–3609

92. Scheiner, S.M. et al. (2016) Decomposing functional diversity. Methods Ecol. Evol. 8, 809–820

93. Verity, R. and Nichols, R.A. (2014) What is genetic differentiation, and how should we measure it—GST, D, neither or both? Mol. Ecol. 23, 4216–4225

94. Zhou, Q. and Yu, Y.-M. (2014) Information dimension analysis of bacterial essential and nonessential genes based on chaos game representation. J. Phys. D Appl. Phys. 47, 465401

95. Ané, C. and Sanderson, M.J. (2005) Missing the forest for the trees: phylogenetic compression and its implications for inferring complex evolutionary histories. Syst. Biol. 54, 146–157

96. Hadka, D. and Reed, P. (2013) Borg: an auto-adaptive many-objective evolutionary computing framework. Evol. Comput. 21, 231–259

97. Sole, R. (2016) Synthetic transitions: towards a new synthesis. Philos. Trans. R. Soc. B 371, 20150438

98. Mather, A.E. et al. (2012) An ecological approach to assessing the epidemiology of antimicrobial resistance in animal and human populations. Proc. R. Soc. B 279, 1630–1639

99. Jost, L. et al. (2010) Partitioning diversity for conservation analyses. Divers. Distrib. 16, 65–76

100. Chiu, C.-H. and Chao, A. (2014) Distance-based functional diversity measures and their decomposition: a framework based on Hill numbers. PLoS One 9, e100014

101. Chao, A. and Chiu, C.-H. (2016) Bridging the variance and diversity decomposition approaches to β diversity via similarity and differentiation measures. Methods Ecol. Evol. 7, 919–928

Supplementary Material For

‘Information theory broadens the spectrum of molecular ecology and evolution.’

Prediction, Sampling, Estimation, and Examples.

W.B. Sherwin 1 *, A. Chao 2, L. Jost 3, P.E. Smouse 4

1 Evolution and Ecology Research Centre, School of Biological Earth and Environmental Science, University of New South Wales, Sydney NSW 2052 AUSTRALIA; and Murdoch University Cetacean Research Unit, Murdoch University, South Road, Murdoch, WA 6150 AUSTRALIA.

2 Institute of Statistics, National Tsing Hua University, Hsin-Chu 30043, Taiwan.

3 EcoMinga Foundation, Via a Runtun, Baños, Tungurahua, Ecuador.

4 Department of Ecology, Evolution and Natural Resources, School of Environmental and Biological Sciences, Rutgers University, New Brunswick, NJ 08901-8551, USA.

Contents - abbreviated titles

Box S1 Measuring and Predicting Molecular Entropies and Diversities for q=1,2,3: Within-population (alpha) and Among-populations (beta)

Box S2 Derivation of the q-profile

Box S3 Studies using Molecular Information Measures plus q-profile

Box S4 Selection

Box S5 Selection and Linkage Disequilibrium (LD)

Box S6.1 Molecular Information Analyses – Example Calculations

Box S6.2 Molecular Information Analyses – Sampling Considerations

Box S6.3 Molecular Information Analyses – Estimation Programs

Box S7 References

Box S1. Measuring and Predicting Molecular Entropies and Diversities for q=1,2,3: Within-population (alpha) and Among-populations (beta)

(Predictions are for Neutral Alleles – Predictions with Selection are in Box S4)

Any system can be summarised by a q-profile of ‘effective-number’ diversity metrics (Table S1 below) (Pielou 1966 [1]; Hill 1973 [2]; Chao et al. 2014 [3]; Chiu and Chao 2014 [4]). Box S2 shows the derivation of this equation for each order of q, written in terms of either diversity, ${}^qD = \left(\sum_{i=1}^{S} p_i^q\right)^{1/(1-q)}$, or entropy, ${}^qH = \left(1 - \sum_{i=1}^{S} p_i^q\right)/(q-1)$, where all symbols are defined in the footnote of Table S1 below (for q = 1 both expressions are taken as limits, giving ${}^1H = -\sum_{i=1}^{S} p_i \ln p_i$ and ${}^1D = e^{{}^1H}$). This table shows some commonly used values of q that can be chosen to emphasize (or de-emphasize) rare or common variants. Table S1 is written for single-locus diploid data, but the measures can be applied to allele proportions in haploid or polyploid data, or other variants such as frequencies of species in an ecological community, individual behaviours in a population, physical characteristics in a landscape, etc. (though predictive equations might need to be written separately for each of these arenas). The ‘effective number’ diversities (${}^qD$) are on a common scale: the numbers of allelic types (or different species) that would give the respective (${}^qH$) entropy values if the population had equally frequent alleles (Chao et al. 2015 [5]). Because of this common scale, these measures are ideal for plotting diversity profiles (e.g., Box1-II in main article). Box S6 provides example calculations, sampling considerations, and programs.
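As an illustration of the q-profile described above, the following minimal Python sketch computes the entropy and the effective number for several orders of q from a vector of allele proportions. The function names (`hill_number`, `entropy`) and the allele proportions are hypothetical choices made for this example; they are not taken from the programs listed in Box S6, and no sample-size correction is applied.

```python
import math

def hill_number(p, q):
    """Effective number of types (qD) for proportions p and order q."""
    p = [x for x in p if x > 0]          # absent types contribute nothing (0 ln 0 = 0)
    if abs(q - 1.0) < 1e-12:             # q = 1 is defined as a limit
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))

def entropy(p, q):
    """Entropy of order q (qH): S - 1 at q=0, Shannon at q=1, heterozygosity at q=2."""
    p = [x for x in p if x > 0]
    if abs(q - 1.0) < 1e-12:
        return -sum(x * math.log(x) for x in p)
    return (1.0 - sum(x ** q for x in p)) / (q - 1.0)

# Hypothetical allele proportions at one locus (illustration only).
freqs = [0.5, 0.3, 0.15, 0.05]
profile = {q: hill_number(freqs, q) for q in (0, 1, 2, 3)}
for q, d in profile.items():
    print(f"q={q}: entropy={entropy(freqs, q):.4f}, effective number={d:.4f}")
```

With equally frequent alleles every order of q returns the same effective number, whereas uneven proportions give a profile that declines as q increases, which is what a plotted diversity profile displays.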

Table S1 ‘Effective numbers’ and Entropies - Estimation and Prediction for Alpha and Beta

(All Symbols that are not defined in the body of the table are defined in the footnote)

Entropy and Diversity Construct and Description

(for alpha level)

Examples of Equations for Estimation (‘Measure’) and Prediction, for Alpha and Beta

Unless otherwise stated, predictions are for neutral alleles with constant population size and at equilibrium between random drift due to finite population size, and mutation (and dispersal, for β).

q = 0

Order Zero

- Entropy

- Number-equivalent

or Diversity

These are based on the number of different types S, such as allele types in a population, or different species in a community.

α (Within-Population)

α MEASURE

- ENTROPY: ${}^0H_\alpha = S_\alpha - 1$, where Sα = number of different types (e.g., allele types in a population, or different species in a community).

- EFFECTIVE NUMBER EQUIVALENT: ${}^0D_\alpha = S_\alpha$ = number of different types of alleles.

- There are several other similar measures, such as the number of variable (segregating) sites in the genome or part thereof.

α PREDICTION A lengthy equation, for IAM mutation (Ewens 1972 [6]; Halliburton 2004 [7] p 401).

β (Differentiation Among Populations)

β MEASURE There are several of these. We show two examples for a pair of populations (K = 2), although any other value of K can be used (Chao et al. 2014 [3]).

- Jaccard dissimilarity = $1 - R/S_\gamma$, where R is the number of shared allelic types between the two populations, and $S_\gamma$ is the total (γ) number of types in both populations.

- Sørensen dissimilarity index = $1 - 2R/(S_1 + S_2)$, where $S_1$ and $S_2$ are the numbers of types in the two populations (Chao et al. 2014 [3]).

β PREDICTION There have been only a few algebraic attempts to predict beta differentiation for genes (Barton & Slatkin 1986; Slatkin & Maddison 1989 [8, 9]).
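The two q=0 beta measures for a pair of populations depend only on which allelic types are present. A minimal sketch, assuming the standard definitions of the Jaccard and Sørensen dissimilarities; the function name and the allele labels are hypothetical, and both populations are assumed to contain at least one allele:

```python
def jaccard_sorensen(alleles_1, alleles_2):
    """Return (Jaccard, Sorensen) dissimilarities for two sets of allelic types."""
    a, b = set(alleles_1), set(alleles_2)
    shared = len(a & b)                  # R: allelic types found in both populations
    total = len(a | b)                   # total (gamma) number of types
    jaccard = 1.0 - shared / total
    sorensen = 1.0 - 2.0 * shared / (len(a) + len(b))
    return jaccard, sorensen

# Hypothetical allele lists for two populations at one locus.
pop1 = {"A1", "A2", "A3"}
pop2 = {"A2", "A3", "A4", "A5"}
jac, sor = jaccard_sorensen(pop1, pop2)
print(f"Jaccard = {jac:.3f}, Sorensen = {sor:.3f}")
```

Because only presence or absence is used, these measures are sensitive to rare alleles and therefore to sample size.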

q=1

Order One

- Entropy

- Number-equivalent or Diversity

These are based on Information content, i.e., ability to spell out different messages by rearranging individual alleles, species, etc. (Box1-I in main article). This is more formally regarded as the uncertainty in the allelic type of a randomly sampled allele.

${}^1H$ is also called Shannon’s Information, and some publications call this ${}^SH$, where ‘S’ denotes ‘Shannon’, NOT number of alleles; we discourage the ${}^SH$ usage.

α (Within-Population)

α MEASURE

ENTROPY: ${}^1H_\alpha = -\sum_{i=1}^{S} p_i \ln p_i$ (eqn 1 in main article), where pi is the proportional abundance of the ith allele in a population at a particular locus. (NB, conventionally, 0 ln 0 = 0; log-2 is sometimes used, but we use ln, log-e).

EFFECTIVE NUMBER EQUIVALENT: ${}^1D_\alpha = e^{{}^1H_\alpha}$, which is the number of equally-frequent types that would be needed to give the same value of Shannon’s Entropy ${}^1H_\alpha$, as calculated from the observed allelic data.

Shannon measures for departure from single locus Hardy-Weinberg equilibrium (HWE) and multilocus linkage disequilibrium (LD) are in Boxes S4 and S5, respectively.

α PREDICTION For neutrality (Chao et al. 2015 [5]):

Predictions for ${}^1D_\alpha$ are obtained by substituting predicted ${}^1H_\alpha$ into the equation shown above: ${}^1D_\alpha = e^{{}^1H_\alpha}$.

Shannon predictions for balancing and directional selection are in Box S4.

β (Differentiation Among Populations)

β MEASURE

There are two entropy measures, each being based on Shannon theory.

(1) Mutual Information ( I ), which can be calculated in either of two ways:

I is linearly related to the traditional log-linear chisquare ($G$) for a contingency table of differentiation of allele frequency between populations in different locations (Box 2 in main article): $I = G/(2n)$, where n is here the total number of alleles counted in the table

OR

$I = {}^1H_\gamma - {}^1\bar{H}_\alpha$, where ${}^1\bar{H}_\alpha$ denotes the average of each within-population entropy ${}^1H_\alpha$, and ${}^1H_\gamma$ is calculated as for ${}^1H_\alpha$, but after the allelic dataset has been pooled over all populations. ${}^1\bar{H}_\alpha$ can be weighted for population size, if one knows it (Sherwin et al. 2006 [10], where I is called ${}^SH_{UA}$).

(2) ‘Shannon Differentiation’ is mutual information normalized to a [0,1] scale: $I/\ln K$, for K populations with equal weights (e.g., equal or unknown sizes, Nj). If population weights are not equal, the normalization constant (ln K) is replaced by the Shannon entropy of relative weights, e.g.: $-\sum_{j=1}^{K} (N_j/N_T) \ln (N_j/N_T)$.

As for the alpha-measures, the entropy measures for differentiation can be converted to effective numbers, e.g. by taking the exponential of mutual information: ${}^1D_\beta = e^{I}$.
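The q=1 beta measures can be illustrated with a short Python sketch that computes mutual information as pooled entropy minus mean within-population entropy, normalises it by the entropy of the population weights, and converts it to an effective number. This is a sketch under stated assumptions: the function names and allele counts are hypothetical, equal weights are used unless weights are supplied, and no sampling correction is applied (the programs in Box S6 should be preferred for real data).

```python
import math

def shannon(p):
    """Shannon entropy (natural log) of a list of proportions; 0 ln 0 = 0."""
    return -sum(x * math.log(x) for x in p if x > 0)

def mutual_information(counts, weights=None):
    """counts: one list of allele counts per population, alleles in the same order.
    Returns (I, Shannon differentiation, effective number exp(I))."""
    k = len(counts)
    props = [[x / sum(c) for x in c] for c in counts]
    if weights is None:
        weights = [1.0 / k] * k          # equal weights (equal or unknown sizes)
    n_alleles = len(counts[0])
    pooled = [sum(w * p[i] for w, p in zip(weights, props)) for i in range(n_alleles)]
    h_alpha = sum(w * shannon(p) for w, p in zip(weights, props))  # mean within
    h_gamma = shannon(pooled)                                      # pooled
    info = h_gamma - h_alpha
    return info, info / shannon(weights), math.exp(info)

# Hypothetical counts of two alleles in two populations.
fixed_differences = mutual_information([[10, 0], [0, 10]])
identical = mutual_information([[5, 5], [5, 5]])
print(fixed_differences, identical)
```

Two populations fixed for different alleles give the maximum differentiation of 1 (an effective number of 2 distinct populations), while identical allele frequencies give 0 (an effective number of 1).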

Information-based measures of linkage disequilibrium (LD) in multiple populations are discussed in the main text section ‘Using Information Methods to Scan the Genome for Adaptive Innovation‘, and in Box S5.

β PREDICTION

For SMM mutation there is an equation derived from simulation and curve-fitting:

For haploids, is changed to (Sherwin et al. 2006 [10]).

There is a SNP equation that is derived from rigorous population genetic theory (Dewar et al. 2011 [11]):

There are also IAM and SMM predictive equations that are derived from rigorous population genetic theory (Chao et al. 2015 [5]). Predicted $I = {}^1H_\gamma - {}^1\bar{H}_\alpha$, where predictions for ${}^1\bar{H}_\alpha$ (for the average population) and ${}^1H_\gamma$ (for the pooled populations) are based on the alpha-level prediction equations above, with the addition of dispersal (m) between populations, giving quite large estimation equations, shown in Chao et al. (2015) [5].

Predictions for the numbers-equivalent scale, ${}^1D_\beta$, are obtained either from predicted values of I, or by substituting predicted values ${}^1H_\gamma$ and ${}^1\bar{H}_\alpha$ into the equation: ${}^1D_\beta = e^{{}^1H_\gamma - {}^1\bar{H}_\alpha}$.

q=2

Order Two

- Entropy

- Number-equivalent or Diversity

These are based on the chance of choosing two different types from the array of types of alleles (or species).

α (Within Population)

α MEASURE

ENTROPY ${}^2H$ has many variants:

- HWE (Hardy-Weinberg) expected heterozygosity, ${}^2H_\alpha = 1 - \sum_{i=1}^{S} p_i^2$ (eqn 2 in main article). ${}^2H$ is also called $H_e$.

- Heterozygosity corrected for sample size (Nei’s gene diversity, Nei 1978 [12])

- Pairwise differences of DNA sequences (Nei & Li 1979, Kosman 2003 [13, 14])

- Other variants involve squared proportions of haplotypes, SNP alleles, etc (Halliburton 2004 [7]; Nielsen and Slatkin 2013 [15])

- Gini-Simpson is the name for ${}^2H$ in ecology.

- Departure from HWE is summarised by Wright’s $F_{IS} = 1 - H_o/H_e$ (Halliburton 2004 [7], Nielsen and Slatkin 2013 [15]).

EFFECTIVE NUMBER EQUIVALENT

- ${}^2D_\alpha = 1/(1 - {}^2H_\alpha)$ is the effective number of alleles (q=2 scale), which is the number of equally-frequent types that would be needed to give the same value of heterozygosity ${}^2H_\alpha$, as calculated from the observed allelic data (Kimura & Crow 1964 [16])

α PREDICTION Various predictions exist, for IAM, SMM, at equilibrium and in a bottlenecked population (reduced population size) see Halliburton (2004, p 382) [7]; Nielsen and Slatkin (2013) [15].

Predictions for effective numbers can be made by substituting predictions of ${}^2H_\alpha$ in the effective-number equivalent equation above, ${}^2D_\alpha = 1/(1 - {}^2H_\alpha)$.

β Differentiation Among Populations

β MEASURE Again, there are many variants for q=2, involving squaring or pairwise comparisons, including:

- The original FST, one of whose estimators is $({}^2H_\gamma - {}^2\bar{H}_\alpha)/{}^2H_\gamma$, where ${}^2\bar{H}_\alpha$ is the average over all populations, and ${}^2H_\gamma$ is calculated as for ${}^2H_\alpha$, but after the dataset is pooled over all populations. This has dependency problems, described in the main text, for which some corrections are available (Hedrick 2005, Meirmans & Hedrick 2011 [17, 18])

- Jost-D $= \frac{K}{K-1} \cdot \frac{{}^2H_\gamma - {}^2\bar{H}_\alpha}{1 - {}^2\bar{H}_\alpha}$, a beta-measure with no dependence on alpha-values. This is the Morisita-Horn Index in Ecology (Jost 2008 [19]). Note that the use of the symbol “D” here does NOT imply that this is a number-equivalent measure; those measures are shown several lines below.

- Nei’s unbiased genetic distance (Nei 1987 [20])

- Partitions of pairwise differences of DNA sequences between populations (Nei & Li 1979 [13])

- Algorithms that rely on pairwise comparisons or squared allele proportions, such as STRUCTURE (Pritchard et al. 2000 [21]), and AMOVA (Excoffier et al. 1992 [22]), which uses pairwise comparisons, including the distance between any two alleles

- Methods that merge aspects of q=0 and q=2 (Slatkin & Maddison 1989 [9]).

- Many other methods (Broquet & Petit 2009 [23])

There are also effective numbers measures for beta-level (q=2) measures, e.g.: ${}^2D_\beta = {}^2D_\gamma/{}^2D_\alpha$ (Jost et al. 2010 [24])
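The contrast between the FST-type estimator and Jost-D can be seen in a small numerical sketch. It assumes the standard heterozygosity-based forms of the two measures with equal population weights; the function names and allele proportions are hypothetical, and the pooled heterozygosity is assumed to be greater than zero.

```python
def heterozygosity(p):
    """Expected heterozygosity (q=2 entropy): 1 minus the sum of squared proportions."""
    return 1.0 - sum(x * x for x in p)

def gst_and_jost_d(props):
    """props: one list of allele proportions per population (same allele order)."""
    k = len(props)
    n_alleles = len(props[0])
    pooled = [sum(p[i] for p in props) / k for i in range(n_alleles)]
    h_alpha = sum(heterozygosity(p) for p in props) / k   # mean within-population
    h_gamma = heterozygosity(pooled)                      # pooled over populations
    gst = (h_gamma - h_alpha) / h_gamma
    jost_d = (k / (k - 1)) * (h_gamma - h_alpha) / (1.0 - h_alpha)
    return gst, jost_d

# Two hypothetical populations that share no alleles, each with four
# equally frequent alleles (high within-population diversity).
pop1 = [0.25, 0.25, 0.25, 0.25, 0.0, 0.0, 0.0, 0.0]
pop2 = [0.0, 0.0, 0.0, 0.0, 0.25, 0.25, 0.25, 0.25]
gst, jost_d = gst_and_jost_d([pop1, pop2])
print(f"GST-type estimator = {gst:.3f}, Jost-D = {jost_d:.3f}")
```

Although the two populations are completely differentiated, the FST-type estimator is only about 0.14 because within-population heterozygosity is high, whereas Jost-D equals 1. This is one way to see the dependency problem mentioned in the list above.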

β PREDICTION

- Most predictions are based on $F_{ST}$ (above) or an analogue. One such prediction is $F_{ST} \approx 1/(4N_em + 1)$, which has been used (and criticised) for conversion of molecular differentiation to values of $N_em$ (Whitlock and MacCauley, 1998 [25]).

- Jost-D

where (Jost 2008 [19]).

Predictions for ${}^2D$ are obtained by substituting predicted ${}^2H$ into the effective numbers equations that convert measured ${}^2H$ to measured ${}^2D$.

q > 2

Order 3 and above Any order of q up to infinity can be used, including fractional values of q.

In practice, ${}^3D$ usually gives similar results to ${}^2D$ (Box 1-II in main text), so a diversity profile for order q between 0 and 3 or 4 generally conveys all information on the allele frequency distribution.

Symbols not defined in the table above:

${}^qD$ is the effective number equivalent of entropy ${}^qH$, so ${}^qD$ is the number of equally-frequent types that would be needed to give the observed value of entropy ${}^qH$ (Note that D is also employed for various other measures in genetics, not necessarily corresponding to number-equivalents);

e is the base of natural logarithms;

${}^qH$ is the entropy of order q (Note that H is also employed for various other measures in genetics, not necessarily corresponding to entropy);

$H_o$ is the proportion of heterozygotes observed in the population;

HWE is Hardy-Weinberg-Equilibrium, the condition where the combinations of alleles in diploid genotypes are as expected from random combination of the population’s pool of alleles;

IAM is the infinite allele model;

K is the number of populations;

ln is the natural logarithm (log-base-e)

m is dispersal (island-model), i.e. exchange of individuals between populations per generation ;

N is the actual population size; Nj is the size of the j th population; NT is the total size of all populations in a metapopulation;

n is the sample size;

Ne is the effective population size;

PHWE is the expected proportion of a genotype in the population under HWE (Hardy-Weinberg Equilibrium);

pi is the proportional abundance of the ith allele in a population at a particular locus ;

POBS is the observed proportion of a genotype in the population;

q is the order of entropy qH or diversity qD (see Box S2).

R is the number of shared allelic types between two populations;

S is the number of different allelic types;

SMM is the stepwise mutation model;

SNP refers to biallelic loci such as single nucleotide polymorphisms;

α denotes a within-population measure;

β denotes an among-population differentiation;

The log-linear chisquare is also sometimes called G (Sokal and Rohlf, 1995 [26]);

γ denotes a total for all populations;

$\psi$ is the digamma function;

λ = μ(1-μ);

μ is mutation rate;

.

Box S2 Derivation of the q-Profile from a General Power-sum of Allele Proportions

The effective numbers are ideal for an intuitive understanding of the common basis of the indices in the q-profile (Chiu and Chao 2014 [4]).

The basic concept of the ‘effective number of types’ is comparison between:

- an actual population with S types at relative abundances $p_1, p_2, \ldots, p_S$ that has diversity ${}^qD$, as calculated from Box S1,

and

- a reference population with ${}^qD$ types and equal abundances $1/{}^qD$. This reference population would have diversity ${}^qD$, for every order of q (see Box 1-II in main article).

To create the basic diversity equation, one takes the power sum of the relative abundances for the actual and reference populations, and requires them to be equal. For the actual population, the q-th power sum is:

$\sum_{i=1}^{S} p_i^q$

For the reference population, the q-th power sum is:

$\sum_{i=1}^{{}^qD} \left(1/{}^qD\right)^q = \left({}^qD\right)^{1-q}$

Equating the sums for the actual and reference populations, we have:

${}^qD = \left(\sum_{i=1}^{S} p_i^q\right)^{1/(1-q)}$

This is called the Hill equation (Hill, 1973 [2]), and it gives rise to the equations in Box S1:

- If q=0, we obtain ${}^0D = S$

- If q=2, we obtain ${}^2D = 1/\sum_{i=1}^{S} p_i^2$

- If q=1, ${}^1D$ is undefined, but if we let q tend towards unity, we then obtain the limit ${}^1D = \exp\left(-\sum_{i=1}^{S} p_i \ln p_i\right) = e^{{}^1H}$.

The low sensitivity of ${}^qD$ and ${}^qH$ to rare alleles when q ≥ 2 can be seen to derive from the fact that p-values close to zero (i.e., rare) become very small when squared (cubed, etc). 1H is a special case of either Tsallis or Renyi entropy (Maszczyk and Duch 2008) [27].
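The limit for q = 1 can be made explicit with a short worked step. This is a sketch of the standard argument (L'Hôpital's rule applied to the logarithm of the Hill equation), not a reproduction of the authors' own derivation:

```latex
\begin{align*}
\ln {}^qD &= \frac{\ln \sum_{i=1}^{S} p_i^q}{1-q}
  && \text{(numerator and denominator both tend to 0 as } q \to 1\text{, since } \textstyle\sum_i p_i = 1\text{)} \\
\lim_{q \to 1} \ln {}^qD
  &= \lim_{q \to 1} \frac{\dfrac{d}{dq} \ln \sum_{i=1}^{S} p_i^q}{\dfrac{d}{dq}(1-q)}
   = \lim_{q \to 1} \frac{\sum_{i=1}^{S} p_i^q \ln p_i \Big/ \sum_{i=1}^{S} p_i^q}{-1}
   = -\sum_{i=1}^{S} p_i \ln p_i = {}^1H \\
{}^1D &= \lim_{q \to 1} {}^qD = e^{{}^1H}
\end{align*}
```

The same limit applied to the entropy of order q recovers Shannon's entropy, which is why q = 1 sits naturally between q = 0 and q = 2 in the profile.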

Box S3 Summary of Alpha and Beta Studies using Molecular Information Measures or a q-profile in Molecular Ecology and Evolution

Table S3 shows examples of the use of Shannon molecular-genetic measures in the literature, for alpha level, for beta level, and for other uses. In most cases there is agreement between biological interpretations provided by the Shannon (q=1) measures and other measures, but the more interesting cases are those for which the measures give divergent interpretations. These divergent cases show the importance of using a carefully chosen profile of measures with different values of q, because one part of the profile is likely to be more sensitive to the predicted pattern. These cases are discussed further in the main article. Note that some confusion in the cited studies could be minimised by the use of consistent symbols in publications and programs (e.g., H, D and I, see Box S1).

Table S3 Examples of the use of Shannon measures in the molecular ecology and evolution literature. Cases with greater contrast between q=1 and other measures are placed first. Microsatellites are nuclear unless otherwise stated. Abbreviations: ct – chloroplast; IAM – infinite alleles model of mutation; indel – insertion or deletion; msat – microsatellite(s); mt – mitochondrial; RMSE – root mean square error; SMM – stepwise mutation model; SNP – single nucleotide polymorphism; ρ – correlation coefficient.

SYSTEM

Species, Context

Loci – mutation mechanism, number, names.

Citation

THE q=1 MEASURE, and its predictive performance

PERFORMANCE COMPARISON with q=0, q=2

Cases with contrast to q =1 are bolded

ALPHA – Variation within Populations

Mosquito Aedes japonicus japonicus; recent invasion of Hawaii.

SMM; 7 msat.

Egizi et al. 2015 [28]

A temporal study showed that variability (the q=1 measure) originally seen at the low-elevation introduction area was maintained at high elevations, but lost at the original introduction site, possibly because of bottlenecks of population size.

Allelic Richness S (q=0) agreed with the q=1 measure, but the q=2 measure did not show any significant pattern, presumably because of its emphasis on common alleles, which are less likely to be lost in a bottleneck than rare alleles (the allele frequencies were not shown, so we cannot check this inference).

Wild vs crop Carrots (Daucus carota), North America.

SMM; 15 msat.

Mandel et al. 2016 [29]

The q=1 measure clearly shows that wild populations near crops have higher variability than wild populations further from the crops.

The q=2 measure showed a similar pattern to the q=1 measure, but was less sensitive, possibly because of its relative insensitivity to rare alleles entering the wild population from the crop.

SMM Theory; and Tree Elaeocarpus grandis, Australia.

SMM; 5 msat.

Sherwin et al. 2006 [10]

${}^1H$ was called ${}^SH$. Predictive equations for SMM provided a close fit to simulation data, and to data from wild populations.

Relative performance was not assessed, although

was used to estimate , for input to the

predictive equations.

IAM Theory, and Bird Acridotheres tristis, Freshwater fish Galaxias truttaceus, Land snail Theba pisana.

IAM; 39, 22,25, allozyme.

Sherwin et al. 2006 [10]

${}^1H$ was called ${}^SH$. Predictive equations for IAM provided a close fit to simulation data, and to data from wild populations.

Relative performance was not assessed, although

was used to estimate , for input to the

predictive equations.

IAM and SMM Theory.

Chao et al 2015 [5]

Predictive equations for IAM and SMM fitted closely to simulation data.

Approximate conversions between q=1 and q=2 scale were derived, for IAM and SMM.

Rainforest tree Elaeocarpus sedentarius; Australia.

SMM; 7 msat.

Rossetto et al. 2008 [30]

The q=1 measure was compatible with a very high mutation rate (10^-2 per generation).

Allelic richness (q=0) and the q=1 and q=2 measures were all lower in the same population (one of two populations).

Harvested Pine species Pinus strobus; North America.

SMM; 3 ct (paternal) msat.

Zinck and Rajora 2016 [31]

The q=1 measure decreases with increasing physical separation from the likely refugium.

The q=1 measure was strongly correlated with ctDNA haplotype diversity (q=2; ρ=0.98) and allelic richness (q=0; ρ=0.86).


New Zealand Marine Evechinus chloroticus.

SMM; 6 msat.

Nagel et al. 2015 [32]

Predictive performance of the q=1 measure was not assessed.

The q=1 measure was strongly correlated with allelic richness S (q=0; ρ=0.98) and with two q=2 measures (ρ=0.97 and ρ=0.96).

Plants in the Delonix complex of species; Madagascar.

Possibly similar to IAM: AFLP (amplified fragment length polymorphism bands); number not stated.

Rivers et al. 2011 [33]

The q=1 measure was larger in non-threatened species, but not significantly associated with population size.

The q=1 measure and the number of polymorphic loci (a type of q=0 measure) were correlated with each other (ρ=0.41).

Puma concolor; North America.

SMM; 46 msat.

Ernest et al. 2014 [34]

Compared to other Californian populations, a small isolate of Puma showed lower values of the q=1 measure.

The same pattern was seen with allelic richness (q=0) and with the q=1 and q=2 measures.

Fungal Pathogen Puccinia psidii; South America.

SMM; 10 msat.

Graça et al. 2013 [35]

The q=1 measure was calculated in two ways: for all isolates, and for isolates grouped by clonal identity. Both measures were lower in the putative source, suggesting that the pathogen did not jump from native to crop tree species (the authors had predicted such a jump).

The q=1 measure agrees with other evidence, including allelic richness S (q=0) and two q=2 measures.


Fungal Pathogen Rhynchosporium commune; Australia.

SMM; 14 msat.

Linde et al. 2016 [36]

The q=1 measure confirmed the predicted increase with pathogen population size, but not the predicted increase with host diversity.

The q=1 measure agrees with the q=2 measure.

Earthworm Lumbricus terrestris, probably introduced from Europe to North America.

SMM; 3 msat.

Gailing et al. 2012 [37]

The q=1 measure was similar in the introduced North American populations and the possibly native European populations.

The same pattern was seen with the q=1 measure and with Nei’s gene diversity (q=2; Table S1).

Bread Wheat land races; Turkey.

IAM; Hexaploid; Glu-A1, Glu-B1, Glu-D1 (Genotyping by Coomassie gels and polymerase chain reaction)

Temizgul et al. 2016 [38]

The q=1 measure was calculated, but not compared with a biological hypothesis.

High correlation between the q=1 and q=0 measures.

Brassica sp. crops; India and China, 2-3 independent introductions apparent.

SMM; 99 msat.

Chen et al. 2013 [39]

The q=1 measure was maximal near the centres of two of the introductions, possibly the introduction localities.

No comparison of (q = 0,1,2).


Invasive Mosquito Aedes japonicus japonicus, Europe.

SMM; 7 msat.

Zielke et al. 2014 [40]

The q=1 measure is similar in two populations, agreeing with other evidence suggesting a common origin and history. Another population has a lower value, suggesting a bottleneck since introduction.

No comparison of (q = 0,1,2).

Fungus Fusarium oxysporum; North America – soil vs. tomato-plant populations.

IAM; 1 TEF1α sequence; two very common variants predominated.

Demers et al. 2015 [41]

The q=1 measure was calculated, but not compared with a biological hypothesis.

No comparison of (q = 0,1,2).

Table S3 continues on next page…


Table S3 continued

SYSTEM

Species, Context

Loci – mutation mechanism, number, names.

Citation

THE q=1 MEASURE, and its predictive performance

PERFORMANCE COMPARISON with (q=0, q=2)

Cases with contrast to (q =1) are bolded

BETA - Differentiation Between Populations

Wild vs crop Carrots (Daucus carota), North America.

SMM; 15 msat.

Mandel et al. 2016 [29]

Mutual Information I showed that variability was lower between groups of wild populations close to crops, compared to groups of wild populations far away from crops.

FST (q=2) showed the same pattern as I (q=1), but authors stated that the pattern was more clearly demonstrated by I. This difference in results may be because of mutual information’s greater sensitivity to rare immigrant alleles, but it also could be unrelated to beta differentiation, because FST is also very sensitive to within population diversity (see main text).

Semi-marine fish Alticus arnoldorum around coastline of Guam (Pacific Ocean).

SMM; 17 msat.

Cooke et al. 2016 [42]

I showed very low differentiation, not significant.

FST (q=2) showed the same pattern of very low differentiation as I (q=1), but it was marginally significant. It is worth noting that the different behaviour could be unrelated to beta differentiation, because FST is also very sensitive to within-population (alpha) diversity (see main text).

Tree Elaeocarpus sedentarius; Australia.

SMM; 7 msat.

Rossetto et al. 2008 [30]

I was used to investigate dispersal level between a pair of populations.

FST (q=2) and I (q=1) were each used to estimate dispersal. The estimate from I was an order of magnitude smaller than from FST. Simulations suggest that this is because of the poor root-mean-square error of dispersal estimates from FST (Sherwin et al. 2006 [10]).


Three-species hybrid Moss Sphagnum × falcatulum; three holantarctic locations.

SMM; 5 msat.

Karlin and Smouse 2017 [43]

Shannon analysis ( I and partition of information) showed (a) > 90% of the allelic information in the system was inherited directly from the three ancestors, via recurrent inter-specific hybridization; (b) the inherited ancestral alleles show almost no allelic frequency profile differences from those of their respective source species; (c) the three regional populations are slightly divergent from each other, due to novel mutations (post-hybrid) that are regionally restricted.

The allelic variants in point (c) are all rare, and their importance would have been missed with classic q=2 metrics, according to the authors.

Zooplankton (Haloptilus longicornis) and (Pleuromamma xiphias); North Pacific.

IAM; mitochondrial genes cytochrome c oxidase subunit II (H. longicornis) and subunit I (P. xiphias).

Iacchei et al. 2017 [44]

q-profile (0,1,2) with sampling correction (Chao and Jost 2015 [45]); q=1 revealed temporal differentiation between populations sampled at different times but in the same location.

Greatest temporal differentiation with q=0, least (none) with q=2.

Theory (SMM); Wild tree Elaeocarpus grandis; captive fly Drosophila melanogaster under controlled conditions; Australia.

SMM; 3, 4 msat.

Sherwin et al. 2006 [10]

A theoretical equation for I (which appears under a different name in [10]) was derived from curve-fitting to simulated ‘training’ data. A simulated ‘test’ dataset showed very good agreement between the known values input to the program and the values derived from the predictive equation using I calculated from the genotypic data output by the test simulation runs. The fly dataset showed a 45-degree slope in the plot of predicted versus measured I.

For the simulations, dispersal estimated from FST (q=2) always had a greater root-mean-square error (RMSE) than dispersal estimated from I (q=1), over a wide range of conditions (see main text). For the tree, dispersal estimated from FST (q=2) often had a much greater coefficient of variation than dispersal estimated from I (q=1). RMSE could not be calculated because the true dispersal was not known.


Theory (SNP); captive fly Drosophila melanogaster under controlled conditions; Australia.

SNP; 73 anonymous DNA regions.

Dewar et al. 2011 [11]

A theoretical equation for mutual information I, derived from population genetic theory, fitted well to a simulated dataset and to fly data.

No comparison of (q = 0,1,2).

Theory (SMM, IAM); Bird Sturnus vulgaris; Australia.

SMM 3 msat; IAM 1 DRD4.

Chao et al. 2015 [5]

A theoretical equation for mutual information I derived from population genetic theory, fitted well to the simulated dataset for IAM and SMM, and to bird data for the SMM loci. The bird data did not fit well for the IAM locus, which could be because of chance with only one locus, or because of other influences such as selection, though the latter is unlikely (Rollins et al. 2015 [46]).

No comparison of (q = 0,1,2).

Waratah (plant) Telopea speciosissima; Australia.

SMM; 7 msat.

Rossetto et al. 2011 [47]

I was used to investigate dispersal level between populations on ecological gradients.

Strong negative correlation between (q=2) measure FST and dispersal estimated from I (q=1). The negative correlation is expected, because dispersal has an inverse relationship to I , see Box S1.

Table S3 continues on next page….


Fungus Fusarium oxysporum; North America – soil vs. tomato-plant populations.

IAM; 1 TEF1α sequence; two very common variants predominated.

Demers et al. 2015 [41].

I (incorrectly presented as an α measure). I showed significant differentiation between soil and plant populations in one of two fields.

I (q=1) agreed with a (q=2) type measure called ‘Snn’, apparently because two variants predominated, so there were very few rare alleles to cause a difference between the two measures.

Moss Sphagnum subnitens; New Zealand, Europe, North America.

IAM; 8 msat; NB clonal species.

Karlin et al. 2011 [48]

For the three regions, I was used to calculate pairwise genetic exchange Nem, showing high dispersal NZ-Europe.

I (q=1) and two (q=2) measures (Nei’s genetic distance; Jost-D) showed the same rankings for genetic differentiation among pairs of localities.

Moss, Sphagnum palustre; Hawaiian Islands vs other populations.

SMM; 16 msat; NB there were difficulties in identifying allelism of variants; also note this is a clonal species.

Karlin et al. 2012 [49]

Results support origin from a single founder event.

I (q=1) and a (q=2) measure (Nei’s genetic distance) showed strong correlation (ρ=0.88).


Wombat Lasiorhinus latifrons; Australia.

SMM; combination of studies with different numbers of msat markers.

Alpers et al. 2016 [50]

I was consistent with subdivision that does not accord exactly with a postulated biogeographic barrier.

I (q=1) was congruent with two (q=2) approaches: STRUCTURE and AMOVA.

Human Homo sapiens; Europe.

IAM; 1 Y-chromosome region with multi-SNP haplotypes.

Niederstatter et al. 2012 [51]

Haplotypes clustered better with language group than geographic region.

Similar results for I (q=1) and the q=2 measure.

Rainforest tree; Podocarpus elatus.

SMM; 6 msat.

Mellick et al. 2011 [52]

I was calculated, but not compared with a biological hypothesis.

I (q=1) was congruent with two (q=2) approaches: STRUCTURE and FST.

Orchids, four Chiloglottis species; Australia.

SMM and other; 13 (nuclear) msat; and 14 (ct) indel.

Smouse et al. 2015 [53]

Partition of beta Shannon diversity. Divergence was more rapid for haploid ctDNA than for nuclear microsatellites, as expected.

No comparison of (q = 0,1,2).


Tree Austrocedrus chilensis; South America.

SMM; 5 msat.

Colabella et al. 2014 [54]

I shows little N:W or S:E differentiation.

No comparison of (q = 0,1,2).

Humpback whales (Megaptera novaeangliae) on Antarctic feeding grounds - Southern, Indian & Pacific Oceans.

SMM; 10 msat.

Schmitt et al. 2014 [55]

Mutual information I used to define geographic stocks.

Mutual information I (q=1) broadly agrees with the q=2 measures AMOVA, FST, and Jost-D.

Koala (Phascolarctos cinereus), Australia.

SMM ; 14 msat.

Dennison et al. 2016 [56]

Mutual Information I showed low differentiation over short distances (100-200km).

When analysis was restricted to localities with ≥ 10 samples, dispersal estimated from I showed the expected negative correlation with FST (ρ=-0.38) or F'ST (Hedrick 2005; Meirmans 2011 [17, 18]; ρ=-0.49), because both I and FST are inversely related to dispersal.


Lizard (Intellagama lesueurii) at four city parks, three connected nonurban and three isolated nonurban populations, Australia.

SMM; 13 msat.

Littleford-Colquhoun et al. 2017 [57]

Mutual information I (q=1) showed that city park populations had much higher divergence than non-urban populations, irrespective of connectedness.

Mutual information I showed high correlations with three (q=2) measures: FST ρ=0.92; GST ρ=0.92; Jost-D ρ=0.95.

American hazelnut (Corylus americana); USA.

SMM ; 10 msat

Demchik et al. 2017 [58]

“Shannon statistics” (probably the q=1 within-population measure and mutual information I, q=1) showed that most genetic diversity was within populations rather than among populations.

AMOVA (q=2) showed similar partitioning of variation to Shannon.

Table S3 continues below…


Table S3 continued

SYSTEM

Species, Context

Loci – mutation mechanism, number, names.

Citation

THE q=1 MEASURE, its predictive performance, and comparison with (q=0, q=2)

Cases with contrast to (q =1) are bolded

Other uses

Escherichia coli

IAM, Whole genome, coding regions.

Arun et al. 2016 [59]

One measure was best for discriminating between conserved and duplicated loci, whereas another was best for discriminating between constitutive and housekeeping loci.

Modelling.

Lloyd et al. 2014 [60]

A modelling comparison of the rates of change of measures of molecular differentiation, during the 24 generations after cessation of dispersal between two halves of a population.

Change was more rapid for two q=2 measures (Jost-D, Box S1; and F'ST [17]) than for another q=2 measure and for I (q=1).

Modelling.

Greenbaum et al. 2014 [61]

Modelling showed that the ‘one-migrant-per-generation’ rule for maintenance of heterozygosity (q=2) levels in a metapopulation of two or more populations is not adequate for maintaining allelic richness (q=0). This finding likely has implications for maintenance of levels of the q=1 measure, whose weighting of rare alleles is intermediate between that of the q=0 and q=2 measures.


Alpaca Vicugna pacos; Domesticated population, Australia.

Jackling 2014 [62]

The q=1 measure aided choice of loci for studying an eye-morph pedigree in alpacas.

Theory.

Sherwin et al. 2006 [10]; Putman and Carbone 2014 [63]

Comparison of two diversity measures (Table S1) has been proposed as a test to identify which mutation mechanism is occurring.

Malaria parasite Plasmodium mexicanum in North American Lizards, a temporal study.

SMM; 4 msat.

Schall and StDenis 2013 [64]

Years with no significant differentiation, measured by I , were pooled for further analysis.

Theory and Human.

Various loci.

Rosenberg et al. 2003 [65]

The authors derive a novel information-theoretic measure (In) based on a Shannon-like equation (q=1). This measure assesses the informativeness of multiallelic genetic markers that might be used for assignment to potential ancestral populations. The applications of such assignments include controlling ancestry effects in case-control genetic association studies. The authors compare In, in varying degrees of detail, with competing measures such as the closely related Kullback-Leibler, and q=2 measures. They analyse various human data to show that In appears to be robust. They do point out that In has bias with small sample sizes; however, this bias could likely be reduced considerably with modern corrections [45, 66].


Box S4 Measuring and Predicting Effects of Selection

Departure from Hardy-Weinberg Equilibrium (due to selection or other evolutionary causes)

For the (q=1) scale, the conventional log-linear chi-square test for goodness-of-fit to HWE is built on a Kullback-Leibler form (Box 2 in main article), $\sum P_{OBS} \ln (P_{OBS}/P_{HWE})$, where summation is over genotypes; $P_{OBS}$ is the observed proportion of a genotype in the population; and $P_{HWE}$ is its expected proportion under HWE (Hardy-Weinberg Equilibrium).

For the q=2 scale, departure from HWE is summarised by Wright’s $F = 1 - H_O/H_e$, where $H_O$ is the proportion of heterozygotes observed in the population and $H_e$ is the proportion expected under HWE (Halliburton 2004 [7], Nielsen and Slatkin 2013 [15]).
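As an illustration of the two scales, the sketch below computes the Kullback-Leibler term and Wright's F for one locus with two alleles. It assumes the standard G-statistic scaling, G = 2n × (Kullback-Leibler term), with n the number of genotyped individuals, and takes the expected heterozygote proportion as 2pq. The function name `hwe_departure` and the genotype counts are hypothetical.

```python
import math


def hwe_departure(n_cc, n_ct, n_tt):
    """Return (KL term, G statistic, Wright's F) for one biallelic locus."""
    n = n_cc + n_ct + n_tt
    p = (2 * n_cc + n_ct) / (2 * n)  # proportion of allele C
    q = 1.0 - p                      # proportion of allele T
    observed = [n_cc / n, n_ct / n, n_tt / n]
    expected = [p * p, 2 * p * q, q * q]  # Hardy-Weinberg proportions
    # Kullback-Leibler term; genotypes with zero observed count contribute 0.
    kl = sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    g = 2 * n * kl  # assumed standard G-statistic scaling
    # Wright's F: 1 - observed / expected heterozygote proportion.
    f = 1.0 - observed[1] / expected[1] if expected[1] > 0 else 0.0
    return kl, g, f


# Hypothetical sample with a deficit of heterozygotes.
kl, g, f = hwe_departure(40, 20, 40)
print(f"KL = {kl:.4f}, G = {g:.2f}, F = {f:.2f}")
```

Both summaries are zero when the observed genotype proportions match Hardy-Weinberg expectations exactly, and both grow as heterozygotes become rarer than expected.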

Single-locus and Multi-locus balancing selection

If the heterozygote (e.g., C/T) has highest fitness (unity), and homozygote fitness values are $(1-s_1)$ for C/C and $(1-s_2)$ for T/T (where $0 < s_1 \le 1$ and $0 < s_2 \le 1$), then both alleles will be retained at equilibrium proportions that depend upon the fitness values (Halliburton 2004 [7]). Substituting these equilibrium proportions into the equation for $^1H$ (Box S1 or equation 1 in main text) one finds that the equilibrium allelic entropy will be:

$^1H = \ln s - \frac{s_1 \ln s_1 + s_2 \ln s_2}{s}$, with $s = s_1 + s_2$ (Sherwin 2010 [67]).

Note that the ln s term expresses overall selection in favour of the heterozygote, and the second term depends upon the balance of selection against each of the two homozygotes individually. Multilocus balancing selection also has Shannon-based equations (Nourmohammad 2013 [68]).
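The equilibrium entropy can be checked numerically. The sketch below assumes the standard overdominance equilibrium, with allele C at proportion s2/(s1 + s2), and compares the entropy computed directly from equation 1 with the closed form ln s − (s1 ln s1 + s2 ln s2)/s, where s = s1 + s2. The function name and the selection coefficients are hypothetical.

```python
import math


def equilibrium_entropy(s1, s2):
    """Equilibrium allelic entropy under heterozygote advantage.

    s1 and s2 are the selection coefficients against the C/C and T/T
    homozygotes (both assumed to lie in (0, 1]).
    """
    s = s1 + s2
    p_c = s2 / s  # assumed standard equilibrium proportion of allele C
    p_t = s1 / s  # equilibrium proportion of allele T
    # Direct substitution into the Shannon equation.
    direct = -(p_c * math.log(p_c) + p_t * math.log(p_t))
    # Closed form obtained by simplifying the substitution.
    closed = math.log(s) - (s1 * math.log(s1) + s2 * math.log(s2)) / s
    return p_c, direct, closed


p_c, direct, closed = equilibrium_entropy(0.2, 0.05)
print(f"p_C = {p_c:.2f}, direct = {direct:.6f}, closed form = {closed:.6f}")
```

With equal selection against both homozygotes the two alleles settle at equal proportions and the entropy reaches its two-allele maximum, ln 2.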

Single-locus directional selection

For single-locus directional selection, replacing one allele with another, there are equations for the decline of the proportion of the allele that is being replaced, for a number of different inheritance schemes, such as recessive or dominant, X-linked or autosomal (Halliburton 2004 [7]). After each generation of selection, the equations for the new values of the allele proportions can be substituted into the Shannon equation (Box S1 or equation 1 in main text) to show how directional selection changes entropy, ultimately to zero at fixation of the favoured allele.

The traditional logit treatment that linearizes the response of the allele proportion to directional selection (Campagne et al. 2016 [69]) is similar to relative entropy or Kullback-Leibler (above, and Box 2 in main article) between the allele distributions before and after selection (Frank 2012, 2017 [70, 71]), and logits have been used to analyse genetic diversity of mammals involved in transmission of the Lyme disease bacterium (Vuong et al. 2017 [72]).
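The substitution of successive allele proportions into the Shannon equation can be sketched as follows. This is a minimal illustration under an assumed textbook autosomal scheme (fitness 1 for the favoured homozygote, 1 − hs for the heterozygote, 1 − s for the homozygote being replaced); the function names and parameter values are hypothetical and are not taken from the cited sources.

```python
import math


def shannon_entropy(props):
    """Shannon entropy (natural log) of a list of proportions."""
    return -sum(p * math.log(p) for p in props if p > 0)


def directional_selection(p0, s, h, generations):
    """Track the favoured allele's proportion and the allelic entropy."""
    p = p0
    trajectory = [(p, shannon_entropy([p, 1.0 - p]))]
    for _ in range(generations):
        q = 1.0 - p
        w_het = 1.0 - h * s  # heterozygote fitness
        mean_w = p * p + 2 * p * q * w_het + q * q * (1.0 - s)
        p = (p * p + p * q * w_het) / mean_w  # proportion after selection
        trajectory.append((p, shannon_entropy([p, 1.0 - p])))
    return trajectory


trajectory = directional_selection(p0=0.01, s=0.1, h=0.5, generations=500)
for generation in (0, 50, 100, 200, 500):
    p, entropy = trajectory[generation]
    print(f"generation {generation}: p = {p:.4f}, entropy = {entropy:.4f}")
```

When the favoured allele starts rare, entropy first rises towards ln 2 as the two alleles approach equal proportions, and only then declines towards zero at fixation.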


Multi-locus directional selection

For continuous (‘quantitative’) traits, information-based methods for multilocus additive directional selection are numerous. A useful general fitness construct is the difference between the change of Shannon information due to selection and the previous fitness, whose details are elaborate and case-dependent (de Vladar and Barton 2011; Held et al. 2014; Kobayashi and Sughiyama 2015; Nourmohammad et al. 2014; Riedel et al. 2015) [73-77].


Box S5 Selection, Linkage Disequilibrium (LD) and Mutual Information.

Mutual Information, used to assess geographic molecular variation (Box 2 in main text), is also appropriate for dealing with linkage disequilibrium or LD: non-random association of SNP (single-nucleotide polymorphism) alleles at different SNP loci to create multi-SNP haplotypes. Table S5.1 shows two linked SNP loci, and the occurrence of haplotypes bearing each possible combination of the alleles at each SNP. If there is random association between the alleles at the two SNPs, then there is linkage equilibrium (LE). On the other hand, if there is departure from multi-locus equilibrium (in other words, if there is LD), this can be expressed with mutual information. For the two SNP loci in Table S5.1, an intuitive example of mutual information is based on considering the frequency $p_{CT}$ of molecules that carry the allele C at SNP locus 1 and allele T at SNP locus 2. If $p_{CT} = p_C p_T$, then we can say two things. First, there is linkage equilibrium, and second, having information about whether there is a C at locus 1 provides no additional information about the chance that there is a T or a G at locus 2. Thus the mutual information is zero: the variables ‘allelic type at locus 1’ and ‘allelic type at locus 2’ provide no information about each other. On the other hand, if there is complete LD, say for example (pCT = 0.5 = pAG in Table S5.1) but (pAT = 0 = pCG), then knowing that locus 1 had a C would give complete certainty that the other locus had T, so mutual information is maximal.


Mutual information is calculated in a number of ways, but we calculate it by applying equation 1 in the main text (or from Box S1) to each row of Table S5.1. The first row shows the variation of locus 1 within the haplotypes whose locus 2 has T:

$^1H_T = -\left[\frac{p_{CT}}{p_T}\ln\frac{p_{CT}}{p_T} + \frac{p_{AT}}{p_T}\ln\frac{p_{AT}}{p_T}\right]$

The second row shows the variation of locus 1 within the haplotypes whose locus 2 has G:

$^1H_G = -\left[\frac{p_{CG}}{p_G}\ln\frac{p_{CG}}{p_G} + \frac{p_{AG}}{p_G}\ln\frac{p_{AG}}{p_G}\right]$

The final row shows the variation of locus 1 for all haplotypes:

$^1H_{all} = -\left[p_C \ln p_C + p_A \ln p_A\right]$

The mutual information between the two SNP loci is then:

$I = {}^1H_{all} - \left(p_T\,{}^1H_T + p_G\,{}^1H_G\right)$

Note that although this calculation was based on the rows, the same result could be obtained by calculating values for locus 2 (the columns).

Larger values of I indicate higher LD (Smouse & Ward 1978 [78]; Zhang et al. 2009 [79]). Also, mutual information has a simple linear relationship to the familiar log-linear chi-square contingency test for association in a table such as Table S5.1: $G = 2nI$, where I is the mutual information calculated above, and n is the total number of individuals sampled for the table. An LD table such as S5.1 can also be constructed for diploid genotypes (e.g., column headings C/C, C/A, A/A, row headings T/T, T/G, G/G). Moreover, interaction between multilocus LD and geographic differentiation can be analysed by an extension of Table S5.1 (see Table S5.2, and Zhang et al. 2009 [79]), exploiting the additivity of log-linear chi-square over hierarchical levels (Sokal and Rohlf 1995 [26]; Box 2 in main article).
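The row-by-row calculation for a two-SNP haplotype table can be sketched as below. The sketch weights each row's entropy by that row's haplotype proportion, which is the standard definition of mutual information, and assumes the usual natural-log relationship G = 2nI. The function names, argument order, and example proportions are hypothetical.

```python
import math


def shannon_entropy(props):
    """Shannon entropy (natural log) of a list of proportions."""
    return -sum(p * math.log(p) for p in props if p > 0)


def ld_mutual_information(p_ct, p_at, p_cg, p_ag):
    """Mutual information between two SNP loci from haplotype proportions.

    Row 1 holds haplotypes with T at locus 2 (p_ct, p_at); row 2 holds
    haplotypes with G at locus 2 (p_cg, p_ag).
    """
    p_t, p_g = p_ct + p_at, p_cg + p_ag  # locus 2 allele proportions
    p_c, p_a = p_ct + p_cg, p_at + p_ag  # locus 1 allele proportions
    h_all = shannon_entropy([p_c, p_a])  # locus 1, all haplotypes
    h_t = shannon_entropy([p_ct / p_t, p_at / p_t]) if p_t > 0 else 0.0
    h_g = shannon_entropy([p_cg / p_g, p_ag / p_g]) if p_g > 0 else 0.0
    return h_all - (p_t * h_t + p_g * h_g)


def g_statistic(mutual_info, n):
    """Log-linear chi-square for the table, assuming G = 2 n I."""
    return 2 * n * mutual_info


complete_ld = ld_mutual_information(0.5, 0.0, 0.0, 0.5)
equilibrium = ld_mutual_information(0.18, 0.12, 0.42, 0.28)
print(f"complete LD: I = {complete_ld:.4f}; equilibrium: I = {equilibrium:.4f}")
```

Swapping the roles of rows and columns (calling the function with the table transposed) returns the same value, which mirrors the symmetry noted in the text.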


Table S5.1. Mutual Information to Investigate Linkage Disequilibrium for a two-SNP Haplotype.

SNP locus 1 has two possible bases, C or A, whose respective proportions in the population of alleles are pC and pA (pC + pA = 1). SNP locus 2 has either T or G, at proportions pT and pG. The body of the table shows the proportional occurrence of haplotype DNA molecules bearing each possible combination of the alleles at each SNP.


Table S5.2. Mutual Information to Investigate Possible Interaction Between Geographic Variation and Linkage Disequilibrium for a two-SNP Haplotype.

This shows a repeat of Table S5.1, but with the two SNP loci being typed in samples from two locations (e.g., summit of a mountain, and surrounding lowland). The table for location 1 (identical to Table S5.1) is in the background at the rear of the cube, while the similar table for location 2 is in the foreground at the front of the cube.


Box S6 Carrying out Shannon-based Analyses in Molecular Ecology and Evolution

Having predicted diversity, how can one measure it to check the predictions, or estimate other parameters such as dispersal (Sherwin et al. 2006 [10])?

Table S6.1 Examples of Calculations of Molecular Information Statistics, and their Conversion from the Information (or Entropy) Scale to the Equivalent-Numbers (or Diversity) Scale.

For a pair of localities, 1 and 2, we show an example of how to follow Box S1 to calculate measures: the within-population entropy; its conversion to the effective numbers scale; the between-population measure I; the conversion of I to Shannon differentiation ([0,1] scale or Horn index, Chao et al. 2015 [5]); and selected q=2 measures. Note that in scenarios (a), (b), and (c), there are no shared alleles between the localities, so the (q=1) beta measures are identical, but the (q=2) beta measures in some cases give different estimates of differentiation, which limits their utility as measures of beta differentiation. Similar values are obtained from GenAlex (Peakall and Smouse 2006, 2012 [80, 81]) and SpadeR (Chao et al. 2015 [82]). The genetic exchange rate Nem was calculated from mutual information I following Sherwin et al. 2006 [10] (Box S1). Also note that this presentation is for a single locus, but (q=1) values can be summed or averaged over loci (Ryman et al. 2006 [83]).


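Allele columns give allele frequencies (& proportions). The q=1 and q=2 columns give the within-locality entropy (& effective number). I [& Horn], GST, Jost-D and Nem from I are between-locality measures, given once per scenario.

| Scenario | Locality | Allele 152 | Allele 154 | Allele 156 | q=1 entropy (& effective number) | q=2 entropy (& effective number) | I [& Horn] | GST | Jost-D | Nem from I |
| a | 1 | 24 (1) | 0 | 0 | 0 (1) | 0 (1) | 0.69 [1] | 1 | 1 | 0.025 |
| a | 2 | 0 | 0 | 24 (1) | 0 (1) | 0 (1) | | | | |
| b | 1 | 20 (0.83) | 4 (0.17) | 0 | 0.45 (1.6) | 0.28 (1.38) | 0.69 [1] | 0.756 | 1 | 0.025 |
| b | 2 | 0 | 0 | 24 (1) | 0 (1) | 0 (1) | | | | |
| c | 1 | 12 (0.50) | 12 (0.50) | 0 | 0.69 (2) | 0.50 (2) | 0.69 [1] | 0.6 | 1 | 0.025 |
| c | 2 | 0 | 0 | 24 (1) | 0 (1) | 0 (1) | | | | |
| d | 1 | 8 (0.33) | 8 (0.33) | 8 (0.33) | 1.10 (3) | 0.67 (3) | 0.32 [0.46] | 0.333 | 0.5 | 0.120 |
| d | 2 | 0 | 0 | 24 (1) | 0 (1) | 0 (1) | | | | |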
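The calculations behind Table S6.1 can be sketched in a few lines. The sketch below assumes two equally weighted localities, no sampling correction, the Horn scaling I / ln 2 for two localities, and the usual two-population forms GST = 1 − Hs/Ht and Jost-D = 2(Ht − Hs)/(1 − Hs). The function names are hypothetical, and the conversion from I to Nem (which follows [10]) is not reproduced.

```python
import math


def proportions(counts):
    n = sum(counts)
    return [c / n for c in counts]


def shannon_entropy(props):
    """q = 1 entropy (natural log)."""
    return -sum(p * math.log(p) for p in props if p > 0)


def heterozygosity(props):
    """q = 2 entropy: chance that two sampled alleles differ."""
    return 1.0 - sum(p * p for p in props)


def within_measures(counts):
    """Return (q=1 entropy, its effective number, q=2 entropy, its effective number)."""
    props = proportions(counts)
    h1, h2 = shannon_entropy(props), heterozygosity(props)
    return h1, math.exp(h1), h2, 1.0 / (1.0 - h2)


def pairwise_measures(counts_1, counts_2):
    """Between-locality measures for two equally weighted localities."""
    p1, p2 = proportions(counts_1), proportions(counts_2)
    pooled = [(a + b) / 2 for a, b in zip(p1, p2)]
    mutual_info = shannon_entropy(pooled) - (shannon_entropy(p1) + shannon_entropy(p2)) / 2
    hs = (heterozygosity(p1) + heterozygosity(p2)) / 2  # mean within-locality
    ht = heterozygosity(pooled)                         # pooled (must be > 0)
    return {
        "I": mutual_info,
        "Horn": mutual_info / math.log(2),
        "GST": 1.0 - hs / ht,
        "Jost_D": 2.0 * (ht - hs) / (1.0 - hs),
    }


# Scenario (d): three equally common alleles versus one fixed allele.
for name, value in pairwise_measures([8, 8, 8], [0, 0, 24]).items():
    print(f"{name} = {value:.3f}")
```

Running the same function on scenarios (a) to (c) shows the contrast described above: I and the Horn index stay at their maxima while GST falls as within-locality diversity rises.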


Box S6.2 Sampling for Estimation of Molecular Information Statistics

As with any measures, unless sampling is complete, there are possible biases, so we provide methods for correcting for sampling limitations when estimating diversity measures for any given order of q. The heavy dependence of the q=0 measures on very rare alleles (or species) (Table 1 and Box 1 in main article, and Boxes S1 and S2) means that it is very difficult to obtain accurate estimates of these statistics. On the other hand, when q ≥ 2, missing some rare allelic types will affect the estimate very little. Because q=1 is an intermediate value of q, the q=1 measures have intermediate sensitivity to missing rare types. Luckily, there are corrections that allow us to minimise these sampling problems for the q=1 measures, and reduce them for the q=0 measures. The empirical (or observed) entropy is negatively biased, and the magnitude of its bias is approximately (S-1)/(2n), where S is the number of different types, and n denotes the sample size (Basharin 1959 [84]; Lande 1995 [85]). This bias cannot be corrected if the true S is unknown, but there is now a nearly unbiased estimation method for q=1 measures based on Good-Turing’s theory (Chao et al. 2013 [66]; Chao & Jost 2015 [45]). These corrections are in the program SpadeR (Box S6.3) and can also be used on q=0, q=2 and any non-negative q-values (including non-integers), and supersede earlier correction methods (Meirmans and Van Tienderen 2004 [86]).

We illustrate this correction with data from Mather et al. (2012) [87] on variant types within Salmonella typhimurium from domestic animals (red) and humans (blue) in Figure S6.2. Panel (A) shows the empirical diversity profile, and panel (B) shows the corrected profiles. In these data, the under-sampling bias is substantial: in each group (Domestic animal, Human), there is a single highly abundant type and a large number of singletons (types identified by only a single record). Therefore, the profiles drop very sharply and become flat when q > 1.5, because diversity is dominated by the single abundant type for (q = 2) and above. Note the broad overlap of the corrected profiles, except in the region from q = 0.3 to q = 1.2, highlighting the difficulty of comparing profiles if only q = 0 and q = 2 are used.
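The negative bias of the empirical entropy, and the (S-1)/(2n) approximation to its magnitude, can be illustrated by simulation. The sketch below draws repeated samples from a hypothetical five-allele population and compares the mean empirical entropy with the true value. The allele proportions, sample size, replicate count, and seed are illustrative assumptions, and the sketch does not implement the Chao et al. correction.

```python
import math
import random


def shannon_entropy(props):
    """Shannon entropy (natural log) of a list of proportions."""
    return -sum(p * math.log(p) for p in props if p > 0)


def empirical_entropy(sample):
    """Plug-in entropy computed from observed allele proportions."""
    n = len(sample)
    counts = {}
    for allele in sample:
        counts[allele] = counts.get(allele, 0) + 1
    return shannon_entropy([c / n for c in counts.values()])


def mean_empirical_entropy(props, n, replicates, seed=1):
    """Average plug-in entropy over repeated samples of n alleles."""
    rng = random.Random(seed)
    alleles = list(range(len(props)))
    total = 0.0
    for _ in range(replicates):
        total += empirical_entropy(rng.choices(alleles, weights=props, k=n))
    return total / replicates


TRUE_PROPS = [0.5, 0.2, 0.15, 0.1, 0.05]  # hypothetical population
N = 50                                    # alleles per sample

true_h = shannon_entropy(TRUE_PROPS)
mean_h = mean_empirical_entropy(TRUE_PROPS, N, replicates=4000)
observed_bias = true_h - mean_h                # positive: underestimation
approx_bias = (len(TRUE_PROPS) - 1) / (2 * N)  # (S - 1) / (2n)
print(f"true = {true_h:.4f}, mean empirical = {mean_h:.4f}")
print(f"observed bias = {observed_bias:.4f}, approximation = {approx_bias:.4f}")
```

The approximation needs the true S, which is exactly what is unknown in practice when rare alleles are missed; that is the motivation for the Good-Turing-based estimators cited above.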


Figure S6.2 Sampling Effects, and their Correction, for Diversity Profiles. Both panels show data from Mather et al. (2012) [87] for microorganisms in humans (blue) and domestic animals (red). The horizontal axis is the value of q, and the vertical axis is the diversity for each value of q.

(A) Original empirical data; shaded areas are 95% confidence limits.

(B) Data with correction of Chao & Jost (2015 [45]), also showing in dotted lines the values that would be obtained if the variants within each group (animals or humans) were equally abundant (see main text and Box S1 for further explanation). Note that the confidence limits overlap extensively after correction in plot B, except in the region between q = 0.3 and q = 1.2.
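An empirical diversity profile of the kind plotted in panel (A) can be sketched as follows. The sketch assumes the standard Hill-number (effective number) formula, with the q = 1 value taken as the exponential of Shannon entropy, and applies no sampling correction. The counts are hypothetical (one abundant type plus singletons) and are not the Mather et al. data.

```python
import math


def hill_number(props, q):
    """Effective number of types of order q (assumed Hill-number formula)."""
    props = [p for p in props if p > 0]
    if abs(q - 1.0) < 1e-9:
        # q = 1 is defined as a limit: the exponential of Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


def diversity_profile(counts, q_values):
    """Empirical (uncorrected) diversity for each order q."""
    n = sum(counts)
    props = [c / n for c in counts]
    return [hill_number(props, q) for q in q_values]


# Hypothetical sample: one highly abundant type and 20 singletons.
counts = [80] + [1] * 20
q_values = [0, 0.5, 1, 1.5, 2, 3]
profile = diversity_profile(counts, q_values)
for q, diversity in zip(q_values, profile):
    print(f"q = {q}: diversity = {diversity:.2f}")
```

With this kind of abundance pattern the profile falls steeply between q = 0 and q = 1 and is nearly flat beyond q = 2, which is the shape described for Figure S6.2.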


Box S6.3 Estimation Programs

A number of off-the-shelf genetic analysis programs already include information-based measures. Many of these calculations can apply equally to estimation

of entropy or diversity from all types of genetic data, including SNPs, haplotypes, allozymes, microsatellites, structural DNA variants and expression variants.

However, the programs are often specific to a certain type of data input. Of course, if the variants do not follow the standard evolutionary models (neutral

IAM, SMM, SNP, etc) shown in Box S1, then any predictions that are to be compared with the results will likely require specific predictive equations for each

type of variant.

Allelic alpha and beta (q=1)

These are essentially single-locus, but note that q=1 values can be summed or averaged over loci (Ryman et al. 2006) [83].

- GenAlEx (Peakall and Smouse 2006, 2012 [80, 81])

- SpadeR (Chao et al. 2015 [82]).

- iDIP Online is a new program designed to give robust partitioning of diversity under multi-level hierarchical structure, based on Shannon entropy, and

with phylogenetic information (Chao and Chiu 2017) [88].

- For complex information-based partitions of beta geographic variation, to test specific hypotheses about hierarchical arrangement of molecular variation, the log-linear analysis (Box 2 in main article) can be carried out via the log-linear chi-square that is widely available in numerous well-known packages such as SAS, R, SPSS, or vassarstats.net. Also see Smouse et al. (2015) [53] and Karlin and Smouse (2017) [43], which satisfy some of the requirements for diversity measures.

- MTML-msBayes uses Shannon as a criterion in Approximate Bayesian Computation, to fit ecological and evolutionary models of any arbitrary

complexity (Huang et al. 2011 [89]).

- entropart is an R package for partitioning species diversity, including functional differences, which could possibly be adapted to genetic diversity

(Marcon & Herault 2015 [90]).
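The link between the log-linear chi-square and Shannon-based beta measures mentioned in the list above can be checked numerically. The sketch below uses a hypothetical table of allele counts and illustrative function names; it is not taken from any of the listed programs. It relies on the standard identity that the likelihood-ratio (log-linear) statistic G equals 2N times the mutual information, in nats, between population and allele type.

```python
import math

def shannon(counts):
    """Shannon entropy (nats) of a list of counts."""
    n = sum(counts)
    return -sum(c / n * math.log(c / n) for c in counts if c > 0)

def mutual_information(table):
    """I (nats) between row (population) and column (allele) of a count table."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    joint = [c for r in table for c in r]
    return shannon(row_tot) + shannon(col_tot) - shannon(joint)

def g_statistic(table):
    """Log-linear (likelihood-ratio) chi-square: G = 2 * sum O * ln(O / E)."""
    n = sum(sum(r) for r in table)
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    g = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            if obs > 0:
                expected = row_tot[i] * col_tot[j] / n
                g += 2.0 * obs * math.log(obs / expected)
    return g

# Hypothetical allele counts: rows are two populations, columns are three alleles.
table = [[30, 15, 5],
         [10, 20, 20]]
n = sum(sum(r) for r in table)
mi = mutual_information(table)
g = g_statistic(table)
print(f"I = {mi:.4f} nats, 2*N*I = {2 * n * mi:.3f}, G = {g:.3f}")
```

For a table of r populations by c alleles, G would usually be compared with a chi-square distribution on (r - 1)(c - 1) degrees of freedom, so any package that reports the log-linear chi-square can be used to obtain I by dividing by 2N.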

Clonal

Several programs deal with clonal organisms such as plants, or clonal variation in immune cells in humans.

- ‘Poppr’ deals with asexual and sexual organisms (Kamvar et al. 2014 [91]).

- GENODIVE can calculate information statistics by individual, as for sexual organisms, or by clonal grouping (Meirmans and Van Tienderen 2004 [86]).

Note that GENODIVE contains a correction for sampling problems, but this has been superseded by the method of Chao et al. (2013) [66] and Chao

and Jost (2015) [45].

- ‘tcR-vignette’ [92] uses information statistics to deal with immune genes such as T-cell receptors, V genes, and J genes, that not only have germline

variants but also somatic amplification of clonal cell lines within each individual, followed by development of variation between those lines.

- VAT software includes tools for analysing populations of organisms with asexual or mixed modes of reproduction (Schachtel et al. 2012 [93]).

Multilocus, Selection, and Expression

Linkage disequilibrium, comparisons of allelic arrays before and after selection, and analysis of expression networks, are all part of attempts to identify the

loci whose interactions result in traits that affect fitness in the wild, or cause clinical disorders that are essentially lowered fitness from an evolutionary point

of view. Sometimes these interactions are assessed implicitly as linkage disequilibrium, more directly as epistatic interactions between mapped loci, or

sometimes explicitly as networks of expression. Wang et al. (2011) [94] review a number of these programs. Explicitly or implicitly, most of the programs use

some multi-dimensional variant of mutual information-based measures, so various alternative approaches could be developed from the log-linear chi-square

that is widely available in well-known packages such as SPSS or vassarstats.net.

- Zhang et al. (2009) [79] showed a way to use information theory to assess multilocus linkage disequilibrium.

- MISS is a program that implements a method very similar to Zhang’s (Brunel et al. 2010 [95]).

- AMBIENCE, CSEBiORG and CHORUS calculate KWII, a multidimensional version of mutual information, to assess epistatic interactions; this suite of

programs can accept data on multiple SNP loci or on normally-distributed ‘quantitative’ traits (Chanda et al. 2007, 2008, 2009 [96-98]).

- Hu et al. (2013) [99] used a method similar to Chanda’s to assess three-way epistatic interactions between SNPs, but did not make a program

available.

- Yee et al. (2015) [100] use a method similar to Chanda’s approach for continuous traits, but without requiring normality, by using multidimensionality

reduction methods, without explicitly providing a program.

- ‘Superlink-Online SNP’ streamlines calculations similar to Chanda’s SNP methods (Silberstein et al. 2012 [101]).

- ‘Entropy’ has been used for information-based analysis of expression (Hausser and Strimmer 2009 [102]).

- ‘EntropyExplorer’ has been used for information-based analysis of expression (Wang et al. 2015 [103]).

- ‘CoGA’ has been used for information-based analysis of expression (Santos et al. 2015 [104]).

- ‘cummeRbund’ has been used for information-based analysis of expression (Goff 2015 [105]).

- A suite of MDR, ViSEN and ORANGE has been used for information-based analysis of expression (Moore and Hu 2015 [106]).
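To illustrate why multi-dimensional extensions of mutual information are used for epistasis, the following sketch computes pairwise mutual information and a three-way interaction information from joint entropies. It is not the implementation used by any program listed above; the data are hypothetical, the function names are illustrative, and the sign convention (positive values indicating synergy) is an assumption of this sketch.

```python
import math
from collections import Counter
from itertools import product

def entropy(samples, idx):
    """Joint Shannon entropy (nats) of the variables at positions idx."""
    counts = Counter(tuple(s[i] for i in idx) for s in samples)
    n = len(samples)
    return -sum(c / n * math.log(c / n) for c in counts.values())

def mutual_info(samples, i, j):
    """Pairwise mutual information between variables i and j."""
    return entropy(samples, [i]) + entropy(samples, [j]) - entropy(samples, [i, j])

def interaction_info(samples):
    """Three-way interaction information for variables 0, 1, 2.

    Sign convention assumed here: positive means the variables are jointly
    more informative than their pairwise associations suggest (synergy).
    """
    pairs = entropy(samples, [0, 1]) + entropy(samples, [0, 2]) + entropy(samples, [1, 2])
    singles = entropy(samples, [0]) + entropy(samples, [1]) + entropy(samples, [2])
    return pairs - singles - entropy(samples, [0, 1, 2])

# Hypothetical data: two biallelic loci A and B, and a trait T = A XOR B,
# i.e. a purely epistatic trait with no marginal effect of either locus.
samples = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2)] * 25

print(f"I(A;T) = {mutual_info(samples, 0, 2):.4f}")
print(f"I(B;T) = {mutual_info(samples, 1, 2):.4f}")
print(f"three-way interaction = {interaction_info(samples):.4f}")
```

In this constructed case each locus alone carries no information about the trait, yet the three-way term equals ln 2, so a purely pairwise screen would miss the interaction entirely.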

Supplementary References

1. Pielou, E.C. (1966) The measurement of diversity in different types of biological collections. Journal of Theoretical Biology 13, 131-144.
2. Hill, M.O. (1973) Diversity and evenness: a unifying notation and its consequences. Ecology 54, 427-432.
3. Chao, A. et al. (2014) Unifying species diversity, phylogenetic diversity, functional diversity, and related similarity and differentiation measures through Hill numbers. Annual Review of Ecology, Evolution, and Systematics 45, 297-324.
4. Chiu, C.-H. and Chao, A. (2014) Distance-based functional diversity measures and their decomposition: A framework based on Hill numbers. PLoS ONE 9 (7), e100014.
5. Chao, A. et al. (2015) Expected Shannon entropy and Shannon differentiation between subpopulations for neutral genes under the finite island model. PLoS ONE 10 (6), e0125471.
6. Ewens, W.J. (1972) The sampling theory of selectively neutral alleles. Theoretical Population Biology 3, 87-112.
7. Halliburton, R. (2004) Introduction to population genetics, Pearson.
8. Barton, N.H. and Slatkin, M. (1986) A quasi-equilibrium theory of the distribution of rare alleles in a subdivided population. Heredity 56, 409–415.
9. Slatkin, M. and Maddison, W.P. (1989) A cladistic measure of gene flow inferred from the phylogenies of alleles. Genetics 123, 603-613.
10. Sherwin, W.B. et al. (2006) Measurement of biological information with applications from genes to landscapes. Molecular Ecology 15, 2857-2869.
11. Dewar, R.C. et al. (2011) Predictions of single-nucleotide polymorphism differentiation between two populations in terms of mutual information. Molecular Ecology 20, 3156–3166.
12. Nei, M. (1978) Estimation of average heterozygosity and genetic distance from a small number of individuals. Genetics 83, 583–590.
13. Nei, M. and Li, W.H. (1979) Mathematical model for studying genetic variation in terms of restriction endonucleases. Proceedings of the National Academy of Science USA 76, 5269-5273.
14. Kosman, E. (2003) Nei’s gene diversity and the index of average differences are identical measures of diversity within populations. Plant Pathology 52, 533–535.
15. Nielsen, R. and Slatkin, M. (2013) An introduction to population genetics theory and applications, Sinauer.
16. Kimura, M. and Crow, J.F. (1964) The number of alleles that can be maintained in a finite population. Genetics 49, 725-738.
17. Hedrick, P.W. (2005) A standardized genetic differentiation measure. Evolution 59 (8), 1633-1638.
18. Meirmans, P.G. and Hedrick, P.W. (2011) Assessing population structure: Fst and related measures. Molecular Ecology Resources 11, 5–18.
19. Jost, L. (2008) Gst and its relatives do not measure differentiation. Mol. Ecol. 17, 4015-4026.
20. Nei, M. (1987) Molecular Evolutionary Genetics, Columbia University Press.
21. Pritchard, J.K. et al. (2000) Inference of population structure using multilocus genotype data. Genetics 155, 945–959.
22. Excoffier, L. et al. (1992) Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131 (2), 479-491.
23. Broquet, T. and Petit, E.J. (2009) Molecular Estimation of Dispersal for Ecology and Population Genetics. Annu. Rev. Ecol. Evol. Syst. 40, 193-216.
24. Jost, L. et al. (2010) Partitioning diversity for conservation analyses. Diversity and Distributions 16, 65–76.
25. Whitlock, M.C. and McCauley, D.E. (1999) Indirect measures of gene flow and migration: F-ST not equal 1/(4Nm+1). Heredity 82 (PT2), 117-125.
26. Sokal, R.R. and Rohlf, F.J. (1995) Biometry: the principles and practice of statistics in biological research, Freeman.
27. Maszczyk, T. and Duch, W. (2008) Comparison of Shannon, Renyi and Tsallis Entropy used in Decision Trees. Lecture Notes in Computer Science 5097, 643-651.
28. Egizi, A. and Fonseca, D.M. (2015) Ecological limits can obscure expansion history: patterns of genetic diversity in a temperate mosquito in Hawaii. Biological Invasions 17, 123-132.
29. Mandel, J.R. et al. (2016) Patterns of gene flow between crop and wild carrot, Daucus carota (Apiaceae) in the United States. PLoS ONE 11 (9), e0161971.

30. Rossetto, M. et al. (2008) Dispersal limitations, rather than bottlenecks or habitat specificity, can restrict the distribution of rare and endemic rainforest trees. American Journal of Botany 95, 321–329.
31. Zinck, J.W. and Rajora, O.P. (2016) Post-glacial phylogeography and evolution of a wide-ranging highly-exploited keystone forest tree, eastern white pine (Pinus strobus) in North America: single refugium, multiple routes. BMC Evolutionary Biology 16, 56.
32. Nagel, M.M. et al. (2016) Differences in population connectivity of a benthic marine invertebrate Evechinus chloroticus (Echinodermata: Echinoidea) across large and small spatial scales. Conservation Genetics 16, 965-978.
33. Rivers, M.C. et al. (2011) Genetic variation in Delonix s.l. (Leguminosae) in Madagascar revealed by AFLPs: fragmentation, conservation status and taxonomy. Conservation Genetics 12, 1333-1344.
34. Ernest, H.B. et al. (2014) Fractured genetic connectivity threatens a southern California puma (Puma concolor) population. PLoS ONE 9 (10), e107985.
35. Graça, R.N. et al. (2013) Rust disease of eucalypts, caused by Puccinia psidii, did not originate via host jump from guava in Brazil. Molecular Ecology 22, 6033–6047.
36. Linde, C.C. et al. (2016) Weeds, as ancillary hosts, pose disproportionate risk for virulent pathogen transfer to crops. BMC Evolutionary Biology 16, 101.
37. Gailing, O. et al. (2012) Genetic comparisons between North American and European populations of Lumbricus terrestris L. Biochemical Systematics and Ecology 45, 23–30.
38. Temizgul, R. et al. (2016) Genetic diversity of high-molecular weight glutenin subunit compositions in bread wheat landraces originated from Turkey. Plant Genetic Resources 2016, 1-11.
39. Chen, S. et al. (2013) Evidence from genome-wide simple sequence repeat markers for a polyphyletic origin and secondary centers of genetic diversity of Brassica juncea in China and India. Journal of Heredity 104, 416-427.
40. Zielke, D.E. et al. (2014) Unexpected patterns of admixture in German populations of Aedes japonicus japonicus (Diptera: Culicidae) underscore the importance of human intervention. PLoS ONE 9 (7), e99093.
41. Demers, J.E. et al. (2015) Highly diverse endophytic and soil Fusarium oxysporum populations associated with field-grown tomato plants. Applied and Environmental Microbiology 81, 81-90.
42. Cooke, G.M. et al. (2016) Understanding the spatial scale of genetic connectivity at sea: unique insights from a land fish and a meta-analysis. PLoS ONE 11 (5), e0150991.
43. Karlin, E.F. and Smouse, P.E. (2017) Allo-triploid Sphagnum x falcatulum: single individuals contain most of the Holantarctic diversity for ancestrally indicative markers. Ann. Botany In Press dx.doi.org/doi:10.7282/T3FX7CRH.
44. Iacchei, M. et al. (2017) It’s about time: Insights into temporal genetic patterns in oceanic zooplankton from biodiversity indices. Limnol. Oceanogr. 62, 1836–1852.
45. Chao, A. and Jost, L. (2015) Estimating diversity and entropy profiles via discovery rates of new species. Methods in Ecology and Evolution 6, 873–882.
46. Rollins, L.A. et al. (2015) Is there evidence of selection in the dopamine receptor D4 gene in Australian invasive starling populations. Current Zoology 61, 505-519.
47. Rossetto, M. et al. (2011) The impact of distance and a shifting temperature gradient on genetic connectivity across a heterogeneous landscape. BMC Evolutionary Biology 11, 126.
48. Karlin, E.F. et al. (2011) One haploid parent contributes 100% of the gene pool for a widespread species in northwest North America. Molecular Ecology 20, 753-767.
49. Karlin, E.F. et al. (2012) High genetic diversity in a remote island population system: sans sex. New Phytologist 193, 1088-1097.
50. Alpers, D.L. et al. (2016) Evidence of subdivisions on evolutionary timescales in a large, declining marsupial distributed across a phylogeographic barrier. PLoS ONE 11 (10), e0162789.
51. Niederstatter, H. et al. (2012) Pasture Names with Romance and Slavic Roots Facilitate Dissection of Y Chromosome Variation in an Exclusively German-Speaking Alpine Region. PLoS ONE 7 (7), e41885.
52. Mellick, R. et al. (2011) Consequences of long- and short-term fragmentation on the genetic diversity and differentiation of a late successional rainforest conifer. Australian Journal of Botany 59, 351-362.

53. Smouse, P.E. et al. (2015) An informational diversity analysis framework, illustrated with sexually deceptive orchids in early stages of speciation. Molecular Ecology Resources 15, 1375-1384.
54. Colabella, F. et al. (2014) Extensive pollen flow in a natural fragmented population of Patagonian cypress Austrocedrus chilensis. Tree Genetics and Genomes 10, 1519-1529.
55. Schmitt, N.T. et al. (2014) Mixed-stock analysis of humpback whales (Megaptera novaeangliae) on Antarctic feeding grounds. J Cetacean Res Manage. 14, 141–157.
56. Dennison, S. et al. (2016) Population genetics of the koala (Phascolarctos cinereus) in north-eastern New South Wales and south-eastern Queensland. Australian Journal of Zoology 64, 402–412.
57. Littleford-Colquhoun, B.L. et al. (2017) Archipelagos of the Anthropocene: rapid and extensive differentiation of native terrestrial vertebrates in a single metropolis. Molecular Ecology 26, 2466–248.
58. Demchik, M. et al. (2017) Genetic diversity of American hazelnut in the Upper Midwest, USA. Agroforest Syst DOI 10.1007/s10457-017-0097-2.
59. Arun, P.V.P.S. et al. (2016) Identification and functional analysis of essential, conserved, housekeeping and duplicated genes. FEBS Letters 590, 1428–1437.
60. Lloyd, M.W. et al. (2013) The power to detect recent fragmentation events using genetic differentiation methods. PLoS ONE 8 (4), e63981.
61. Greenbaum, G. et al. (2014) Allelic richness following population founding events – a stochastic modeling framework incorporating gene flow and genetic drift. PLoS ONE 9 (12), e115203.
62. Jackling, F.C. et al. (2014) The genetic inheritance of the blue-eyed white phenotype in alpacas (Vicugna pacos). Journal of Heredity 105, 847-857.
63. Putman, A. and Carbone, I. (2014) Challenges in analysis and interpretation of microsatellite data for population genetic studies. Ecology and Evolution 4, 4399–4428.
64. Schall, J.J. and StDenis, K.M. (2013) Microsatellite loci over a thirty-three year period for a malaria parasite (Plasmodium mexicanum): bottleneck in effective population size and effect on allele frequencies. Parasitology 140, 21-28.
65. Rosenberg, N.A. et al. (2003) Informativeness of Genetic Markers for Inference of Ancestry. Am. J. Hum. Genet. 73, 1402–1422.
66. Chao, A. et al. (2013) Entropy and the species accumulation curve: a novel entropy estimator via discovery rates of new species. Methods in Ecology and Evolution 4, 1091-1100.
67. Sherwin, W.B. (2010) Review: entropy and information approaches to genetic diversity and its expression: genomic geography. Entropy 12, 1765-1798.
68. Nourmohammad, A. et al. (2013) Evolution of molecular phenotypes under stabilizing selection. Journal of Statistical Mechanics: Theory and Experiment stacks.iop.org/JSTAT/2013/P01012 doi:10.1088/1742-5468/2013/01/P01012 arXiv:1301.3981v1 [q-bio.PE].
69. Campagne, P. et al. (2016) Impact of violated HDR assumptions on evolution of Bt-resistance. Evol. Appl. 9, 596-607.
70. Frank, S.A. (2012) Natural selection. V. How to read the fundamental equations of evolutionary change in terms of information theory. J. Evol. Biol. 25, 2377–2396.
71. Frank, S.A. (2017) Universal expressions of population change by the Price equation: natural selection, information, and maximum entropy production. Ecology and Evolution 7 (10), 3381–3396.
72. Vuong, H.B. et al. (2017) Influences of host community characteristics on Borrelia burgdorferi infection prevalence in blacklegged ticks. PLoS ONE 12 (1), e0167810.
73. Nourmohammad, A. et al. (2014) Adaptive evolution of molecular Phenotypes. Curr. Opin. Genet. Dev. 23.
74. de Vladar, H.P. and Barton, N.H. (2011) The contribution of statistical physics to evolutionary biology. Trends in Ecology and Evolution 26, 424-432.
75. Riedel, N. et al. (2015) Multiple-line inference of selection on quantitative traits. Genetics 201, 305-322.
76. Kobayashi, T.J. and Sughiyama, Y. (2015) Fluctuation relations of fitness and information in population dynamics. Physical Review Letters 115, 238102.
77. Held, T. et al. (2014) Adaptive evolution of molecular phenotypes. Journal of Statistical Mechanics: Theory and Experiment P09029.
78. Smouse, P.E. and Ward, R.H. (1978) A comparison of the genetic infrastructure of the Yecuana and the Yanomama: a likelihood analysis of genotypic variation among populations. Genetics 33, 611-631.
79. Zhang, L. et al. (2009) A multilocus linkage disequilibrium measure based on mutual information theory and its applications. Genetica 137, 355–364.

80. Peakall, R. and Smouse, P.E. (2006) GENALEX 6: genetic analysis in Excel. Population genetic software for teaching and research. Molecular Ecology Notes 6, 288-295.
81. Peakall, R. and Smouse, P.E. (2012) GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research-an update. Bioinformatics 28, 2537-2539.
82. Chao, A. et al. (2015) Online program SpadeR (Species-richness Prediction And Diversity Estimation in R). Program and user’s guide. http://chao.stat.nthu.edu.tw/wordpress/software_download/, (accessed 16 Feb 2017).
83. Ryman, N. et al. (2006) Power for detecting genetic divergence: differences between statistical methods and marker loci. Molecular Ecology 15, 2031–2045.
84. Basharin, G.P. (1959) On a statistical estimate for the entropy of a sequence of independent random variables. Theory of Probability & Its Applications 4, 333–336.
85. Lande, R. (1995) Mutation and conservation. Conserv. Biol. 9, 782-791.
86. Meirmans, P.G. and Van Tienderen, P.H. (2004) GENOTYPE and GENODIVE: two programs for the analysis of genetic diversity of asexual organisms. Molecular Ecology Notes 4, 792–794.
87. Mather, A.E. et al. (2012) An ecological approach to assessing the epidemiology of antimicrobial resistance in animal and human populations. Proc. R. Soc. B. 279, 1630-1639.
88. Chao, A. and Chiu, C.-H. (2017) iDIP (Information-based Diversity Partitioning) Online: Software for partitioning Shannon diversity and phylogenetic diversity under multi-level hierarchical structures. Program and User’s Guide. http://chao.stat.nthu.edu.tw/wordpress/software_download/ https://chao.shinyapps.io/iDIP/, (accessed).
89. Huang, W. et al. (2011) MTML-msBayes: Approximate Bayesian comparative phylogeographic inference from multiple taxa and multiple loci with rate heterogeneity. BMC Bioinformatics 12, Article Number: 1.
90. Marcon, E. and Hérault, B. (2015) entropart: An R Package to Measure and Partition Diversity. Journal of Statistical Software 67 (8), doi: 10.18637/jss.v067.i08.
91. Kamvar, Z.N. et al. (2014) Poppr: an R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction. PeerJ 2, e281.
92. tcR-vignette (accessed 12 Feb 2017) tcR vignette - https://cran.r-project.org/web/packages/tcR/vignettes/tcrvignette.html, (accessed).
93. Schachtel, G.A. et al. (2012) Comprehensive evaluation of virulence and resistance data: A new analysis tool. Plant Diseases 96, 1060-1063.
94. Wang, Y. et al. (2011) An empirical comparison of several recent epistatic interaction detection methods. Bioinformatics 27, 2936-2943.
95. Brunel, H. et al. (2010) MISS: a non-linear methodology based on mutual information for genetic association studies in both population and sib-pairs analysis. Bioinformatics 26, 1811-1818.
96. Chanda, P. et al. (2007) Information-theoretic metrics for visualizing gene-environment interactions. American Journal of Human Genetics 81, 939-963.
97. Chanda, P. et al. (2009) Information-theoretic gene-gene and gene-environment interaction analysis of quantitative traits. BMC Genomics 10, 509.
98. Chanda, P. et al. (2008) Ambience: A novel approach and efficient algorithm for identifying informative genetic and environmental associations with complex phenotypes. Genetics 180, 1191-1210.
99. Hu, T. et al. (2013) An information-gain approach to detecting three-way epistatic interactions in genetic association studies. J. Am. Med. Inform. Assoc. 20 (4), 630-636.
100. Yee, J. et al. (2015) Detecting genetic interactions for quantitative traits using spacing entropy measure. BioMed. Research International 2015, ID-523641.
101. Silberstein, M. et al. (2012) A system for exact and approximate genetic linkage analysis of SNP data in large pedigrees. Bioinformatics 29 (2), 197-205.
102. Hausser, J. and Strimmer, K. (2009) Entropy inference and the James-Stein estimator, with application to nonlinear gene association networks. J. Mach. Learn. Res. 10, 1469-1484.
103. Wang, K. et al. (2015) EntropyExplorer: an R package for computing and comparing differential Shannon entropy, differential coefficient of variation and differential expression. BMC Res. Notes 8, 832.
104. Santos, S.D. et al. (2015) CoGA: An R package to identify differentially co-expressed gene sets by analyzing the graph spectra. PLoS ONE 10 (8), e0135831.
105. Goff, L.A. et al. (2015) cummeRbund. http://compbio.mit.edu/cummeRbund/.

106. Moore, J.H. and Hu, T. (2015) Epistasis analysis using information theory. In Epistasis: methods and protocols, methods in molecular biology (Moore, J.H. and Williams, S.M. eds), pp. 257-268, Springer Science+Business Media.

Interactive Questions

Question 1:

The q=0 diversity measures are more sensitive to rare types than q=1 or q=2 measures;

q=2 measures are very insensitive to rare types, and q=1 measures have intermediate

sensitivity to rare types. This sensitivity means that:

❍(a) q=0 diversity measures are useful if we are concerned with conservation of rare

species

❍(b) q=0 diversity measures are useful if we are concerned with rare alleles that may be

vital for adaptation to changing conditions

❍(c) q=0 diversity measures can be unreliable if sampling is inadequate to reliably detect

all rare types

❍(d) it is best to use the entire profile of diversity measures, rather than just q=0 or q=1 or

q=2

● (e) all of the above are true

Explanation:

Answer (e) is correct (so all the others are too).
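Option (c) can be made concrete with a short calculation. The sketch below assumes independent draws from hypothetical allele frequencies (five common alleles and twenty rare ones); the function names are illustrative. It compares the expected fraction of the true richness (q = 0) that a sample of size n recovers with the corresponding fraction for the uncorrected sample heterozygosity (q = 2), whose expectation under this sampling model is the true value multiplied by (n - 1)/n.

```python
def expected_observed_richness(freqs, n):
    """Expected number of distinct types seen in n independent draws."""
    return sum(1.0 - (1.0 - p) ** n for p in freqs)

def expected_sample_heterozygosity(freqs, n):
    """Expectation of the uncorrected sample value of 1 - sum(p_hat^2)."""
    true_het = 1.0 - sum(p * p for p in freqs)
    return true_het * (n - 1) / n

# Hypothetical population: 5 common alleles (18% each), 20 rare alleles (0.5% each).
freqs = [0.18] * 5 + [0.005] * 20
true_richness = len(freqs)
true_het = 1.0 - sum(p * p for p in freqs)

for n in (20, 100, 1000):
    rich = expected_observed_richness(freqs, n) / true_richness
    het = expected_sample_heterozygosity(freqs, n) / true_het
    print(f"n = {n:<5} richness recovered: {rich:.0%}   heterozygosity recovered: {het:.1%}")
```

With these assumed frequencies a sample of 20 is expected to detect well under half of the alleles, while the q = 2 measure is already within about 5% of its true value, which is why q = 0 estimates need either large samples or a correction such as those of Chao et al. (2013) [66] and Chao and Jost (2015) [45].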

Question 2:

For alpha-level (within-locality) diversity, Shannon information measures (q=1) reflect:

❍(a) the uncertainty of the type of a single sampled individual

❍(b) the number of ways that one could rearrange individuals of different types in a sample

● (c) (a) AND (b)

❍(d) the chance of sampling two individuals that are of different types

❍(e) the chance of sampling two individuals that are of the same type

Explanation:

Answer (c) is correct (so a and b are too). Answer (a) is the formal definition of Shannon information; answer (b) also happens to be true. The chance of sampling TWO individuals of different (or the same) types relates to q=2 diversity (e.g. Simpson or heterozygosity), so is not relevant to q=1 information measures.
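The connection between answers (a) and (b) can be verified numerically. In the sketch below, which uses hypothetical allele counts and illustrative function names, the number of distinct orderings of N individuals with counts n_i is the multinomial coefficient N!/(n_1! n_2! ...); its logarithm divided by N approaches the Shannon entropy as N grows with the proportions held fixed.

```python
import math

def shannon_entropy(counts):
    """Shannon entropy (nats) of the type proportions."""
    n = sum(counts)
    return -sum(c / n * math.log(c / n) for c in counts if c > 0)

def log_arrangements_per_individual(counts):
    """(1/N) * ln(N! / prod(n_i!)): log number of distinct orderings per individual."""
    n = sum(counts)
    log_w = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return log_w / n

base = [5, 3, 2]  # hypothetical allele counts at one locus (proportions 0.5, 0.3, 0.2)
h = shannon_entropy(base)
for scale in (1, 10, 1000, 100000):
    counts = [c * scale for c in base]
    per_ind = log_arrangements_per_individual(counts)
    print(f"N = {sum(counts):<8} ln(W)/N = {per_ind:.5f}   H = {h:.5f}")
```

For small samples the rearrangement count falls short of H, and the two agree closely once N is large, so both readings of the q=1 measure describe the same quantity in the limit.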

Question 3:

For beta-level (between-locality) diversity, information-measures are based on:

❍(a) the chance of sampling two individuals that are of different types, one from each location

❍(b) the chance of sampling two individuals that are of the same type, one from each location

● (c) whether knowing an individual's type will provide information about which locality it was sampled from

❍(d) how many types are NOT shared between the two locations

❍(e) how many types are shared between the two locations

Explanation:

Answer (c) is correct. That is what mutual information (I) is based on: it considers sampling an individual, and asks whether knowing that individual's type will provide information about which locality it was sampled from. This information will be more reliable (so I will be higher) if the mix of types is more different at the two localities (i.e. more diversity between localities, β). The chance of sampling TWO individuals of the same or different types relates to the q=2 measures of beta diversity, and the sharing of types (without considering their frequencies) is the basis of q=0 beta diversity measures.

Question 4:

In assessing whether a measure gives a reliable assessment of biodiversity, a measure

should satisfy various conditions including:

❍(a) when all populations have identical allele sets and identical allele abundance

distributions, an among-population (beta) measure should attain a fixed minimum value

that is not constrained by the value of within-population diversity

❍(b) when there are no shared alleles among populations, an among-population (beta)

measure should attain a fixed maximum value that is not constrained by the value of

within-population diversity

❍(c) 'replication', which means that a within-population (alpha) diversity

measure increases linearly when equally diverse and completely distinct groups are

pooled in equal proportions

❍(d) 'strong monotonicity', which means that increases of allelic

differentiation are always associated with an increase of a beta diversity measure

● (e) each of (a), (b), (c) and (d) is true

Explanation:

Answer (e) is correct (so all the others are too). Unfortunately, a number of measures

do not satisfy these conditions – see main text.
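Condition (c), replication, can be checked with a brief sketch. The allele frequencies and function names below are hypothetical. The code pools k completely distinct groups, each with the same internal frequencies, in equal proportions, and compares the diversities of order q = 1 and q = 2 (Hill numbers) with heterozygosity, which is bounded above by 1 and therefore cannot grow linearly.

```python
import math

def hill_number(freqs, q):
    """Effective number of types (Hill number) of order q."""
    p = [x for x in freqs if x > 0]
    if abs(q - 1.0) < 1e-12:
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))

def heterozygosity(freqs):
    """Chance that two independently drawn alleles differ."""
    return 1.0 - sum(x * x for x in freqs)

group = [0.5, 0.3, 0.2]  # hypothetical allele frequencies within one group
results = []
for k in (1, 2, 4, 8):
    # Pool k groups with no shared alleles, each contributing 1/k of the total.
    pooled = [p / k for _ in range(k) for p in group]
    d1, d2, het = hill_number(pooled, 1), hill_number(pooled, 2), heterozygosity(pooled)
    results.append((k, d1, d2, het))
    print(f"k = {k}  D(q=1) = {d1:.3f}  D(q=2) = {d2:.3f}  heterozygosity = {het:.3f}")
```

In this sketch both Hill numbers are exactly k times their single-group values, whereas heterozygosity rises from 0.62 toward its ceiling of 1, illustrating one way in which a measure can fail the replication condition.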
