

SELECTION IN FINITE POPULATIONS WITH MULTIPLE ALLELES I. LIMITS TO DIRECTIONAL SELECTION

B. D. H. LATTER AND C. E. NOVITSKI

Division of Plant Industry, CSIRO, Canberra, Australia

Received September 6, 1968

The interpretation of patterns of response to artificial selection in populations of an outbreeding species is essentially a two-step process. There must obviously be an exploratory phase of experimentation in which the important factors influencing the behaviour of a particular quantitative character in the chosen population are identified. Pertinent evidence sought in this initial phase should include: (i) a characterization of the phenotypic variance displayed in the base population; (ii) documentation of departures from expectation based on additive genetic effects, due for example to threshold phenomena, directional dominance, or changes in the nature of the scale of measurement; (iii) detection of changes in reproductive fitness with progress under selection, and associated effects of natural selection in selected populations; (iv) indications of the segregation of genes of relatively large effect, or of those identifiable by specific pleiotropic effects; and (v) an assessment of the relative importance of sex-linked and autosomal genes in their contribution to the total response to selection. Large breeding populations under reasonably intense selection pressure are normally involved in exploratory studies of this kind (LATTER 1964; 1966~).

Having identified the important static and dynamic properties of the quantitative genetic system, further progress towards a partially quantitative genetic model must depend largely on a knowledge of expectations based on simple models involving the relevant factors. Theoretical studies must therefore provide the basis for both the design and interpretation of more penetrating laboratory experiments, unless individual genes can be identified by their specific pleiotropic effects, or by other techniques. Particular attention should be given to effective population size, selection intensity, and the overall design of the selection procedure in the determination of patterns of response, since these are factors under direct experimental control.

In the past decade, algebraic solutions have been provided to a number of the basic problems of selection in finite populations, and computer techniques have been developed for the study of more complex phenomena. For independently segregating loci with only two alleles, having small additive effects on a quantitative character, KIMURA (1957) has given an explicit formula for the probability of ultimate fixation of an allele as a function of its initial frequency, its relative selective advantage under selection, and the effective size of the breeding population. The pattern of change in gene frequency under directional selection has been shown to depend in particular on the value of $Nia/\sigma$, where $N$ denotes the effective population size, $i$ the standardized selection differential, and $a/\sigma$ the proportionate effect of the gene (FALCONER 1960). The time scale of the process is directly proportional to $N$, so that time is most appropriately measured in units of $t/N$, where $t$ is the number of generations of selection. It is, of course, understood here that the regime of selection is one in which the values of $N$ and $i$ remain constant from generation to generation.

Genetics 62: 859-876 August 1969

ROBERTSON (1960) has explored the implications of KIMURA'S result in the context of artificial selection based on individual performance, and has in addition introduced the concept of the half-life of the selection process. For individual genes, the half-life is defined as the number of generations of selection required to drive the mean gene frequency half way from its initial value to the expected limiting value. For a quantitative character, the half-life of the overall selection process can be defined as the expected number of generations of selection necessary to yield one-half of the total expected response. DEMPSTER (1955) has shown that for small values of $Ni$, the total advance under selection is expected to be $2N$ times the mean response observed in the first generation, and the corresponding half-life is expected to be $1.4N$ generations (ROBERTSON 1960).

For a quantitative character influenced by many independently segregating genes of small effect, these predictions will also be expected to hold for values of $Ni$ common in laboratory selection experiments, provided that $Nia/\sigma$ is small (< 0.4) for each locus concerned, and the heritability of the character is low. If genes of larger effect are involved, predictions of total response and the time scale of the process can be made only if the joint distribution of gene frequencies and gene effects is specified. In general, solutions of this type can be provided only by use of computer simulation techniques.

The effects of linkage on selection response in finite populations have to date been studied in detail only for pairs of loci of additive effect, segregating initially in a population in equilibrium under random mating (LATTER 1965b, 1966a, b; HILL and ROBERTSON 1966). In both these investigations, changes in gametic or zygotic frequencies under selection were treated deterministically, a process of random sampling being introduced each generation to simulate the effects of finite population size. The models were again restricted to the case of two alleles per locus.

In the present series of papers, an extension of these genetic models is to be made to include the segregation of a large number of alleles at each locus contributing to genetic variation in the character under selection. The first paper considers the contribution to selection response of an individual locus segregating independently of the other loci concerned, assuming additive gene action, and a base population in equilibrium under random mating.

THE GENETIC MODEL AND COMPUTER PROGRAMS

The assumption that each segregating locus is represented by only two alleles is an appropriate restriction for populations generated by crossing two homozygous parents, but in most practical situations the assumption is likely to be quite unjustified. The possible number of alleles at any locus can conceivably be extremely large: a locus consisting of only 1,000 nucleotide pairs, each with four alternatives, gives rise to 3,000 alleles differing from a parent gene by a single substitution.

It is therefore of great interest to examine the implications of an extreme genetic model in which a potentially infinite series of alleles may exist at each locus, the effect of any new allele produced by mutation being only slightly different from the allele giving rise to it (KIMURA 1965). For genes with additive effects on a quantitative character with an intermediate optimal phenotype, KIMURA has shown the genotypic effects at such a locus to be normally distributed at equilibrium in a large population. A series of such alleles contributing to quantitative genetic variation can be thought of as representing a set of isoalleles: grossly aberrant forms of the gene produced by a single mutational event (e.g. base deletions, chain-terminating mutants, and alleles of low activity) can be presumed to have been eliminated from the base population by natural selection. The present study is concerned with the effects of directional artificial selection in finite breeding lines drawn from such a base population, ignoring the possibility of subsequent mutational changes.

The genetic model: The model to be investigated involves the following assumptions: (i) inheritance is disomic; (ii) the base population is in equilibrium under random mating; (iii) a potentially infinite number of alleles $A_i$, $i = 1, 2, \ldots, \infty$, are carried at the particular autosomal locus under scrutiny; (iv) the effects of the $A_i$ on the quantitative character under selection, denoted by $a_i$, are additive, independent of the genetic background, and initially normally distributed with zero mean and variance $\sigma_a^2$; (v) the allelic frequencies are such that a random sample of $T$ individuals from the base population can be assumed to carry $2T$ distinct alleles; (vi) artificial selection is based on the phenotypic value of the individual; (vii) the residual genetic and environmental variance is assumed to be constant throughout the period of selection response, being identical for all genotypes $A_iA_j$; and (viii) the mating of selected breeding individuals is at random, and all have equal fertility.

Note that the model is one of symmetrical genetic variation, and selection in either the “plus” or “minus” directions will therefore have the same expected consequences. We will deal exclusively with selection for maximal expression of the selected trait. The “locus” specified by the model should more generally be referred to as a given unit of segregation in the base population. This may in fact be a single cistron affecting the expression of the quantitative trait. But it may equally refer to a particular chromosome or chromosomal region within which a number of genes influencing the character is located, recombination between the genes being effectively suppressed, e.g. by the use of marked balancer chromosomes (ROBERTSON 1966).

Computer program A: The technique of computer simulation adopted in program A depends on the theory of selective value outlined by LATTER (1965a). If the effect of an independently segregating locus is small by comparison with the prevailing phenotypic standard deviation, the relative selective values of genotypes $A_iA_j$ at the locus can be taken to be linear functions of their mean values $d_{ij}$, expressed as deviations from the population mean (KIMURA 1958; GRIFFING 1960). The genotypic value $d_{ij}$ is to be interpreted simply as the mean of the sub-population of individuals with the configuration $A_iA_j$ at the A locus, averaging over all other genetic and environmental effects. The second-order approximation to selective value suggested by LATTER (1965a) considerably extends the range of gene effects which may be considered in a numerical study, while still allowing a simple expression to be given for the expected change in allelic frequencies under selection.

If we consider a set of alleles $A_i$ with frequencies $p_i$ ($i = 1, 2, \ldots, m$) and $\sum_{i,j} p_i p_j d_{ij} = 0$, the selective value of the genotype $A_iA_j$ under individual selection can be taken as

$$w_{ij} = 1 + i\,(d_{ij}/\sigma) + \tfrac{1}{2}\, i\,(z_0/\sigma)(d_{ij}^2/\sigma^2) \qquad (1)$$

where $z_0$ is the cut-off point corresponding to selection of a proportion $P$ of the population for breeding, and $i$ is the standardized selection differential (Table 1). The mean selective value $\bar{w}$ is then equal to

$$\bar{w} = \sum_{i,j} p_i p_j w_{ij} = 1 + \tfrac{1}{2}\, i\,(z_0/\sigma)(\sigma_g^2/\sigma^2) \qquad (2)$$

where

$$\sigma_g^2 = \sum_{i,j} p_i p_j d_{ij}^2$$

is the genotypic variance contributed by the A locus. It can then readily be deduced that the frequency of the allele $A_i$ in the progeny of selected parents is expected to be

$$p_i' = p_i\,[1 + i\,(\alpha_i/\sigma) + \tfrac{1}{2}\, i\,(z_0/\sigma)(\gamma_i/\sigma^2)]\,/\,\bar{w} \qquad (3)$$

where $\alpha_i = \sum_j p_j d_{ij}$, and $\gamma_i = \sum_j p_j d_{ij}^2$.

For the particular case where all allelic effects are additive, $d_{ij} = a_i + a_j$ and $\alpha_i = a_i$. We have the additional simplification that

$$\gamma_i = a_i^2 + \tfrac{1}{2}\sigma_g^2 \qquad (4)$$

TABLE 1

Definition of symbols and relationships

Symbol                  Definition
$T$                     Number of individuals sampled from the base population.
$N$                     Effective breeding population size.
$P$                     Proportion of scored individuals selected.
$i$                     Standardized selection differential.
$z_0$                   Cut-off point expressed as a deviation from the mean.
$d_{ij}$                Genotypic value of $A_iA_j$ as a deviation from the mean.
$p_i$                   The frequency of allele $A_i$.
$\sigma_g^2$            Variance contributed by the multiallelic locus.
$\sigma^2$              Phenotypic variance in the base population.
$t$                     Number of generations of artificial selection.
$z(t)$                  The accumulated response to selection.
$r$                     Total advance in the mean, i.e. $\lim_{t \to \infty} z(t)$.
$g$                     The value of $\sigma_g/\sigma$ in the base population.
$z$                     The value of $Nig$ for a given locus.
$\sigma_r^2$            Variance among replicates at the selection limit.
$L_{50}$, $L_{95}$      The half-life and 0.95-life of the selection process.
$R$                     Expected total advance, measured on a scale with an effective range from 0 to 1. $R = k\,E(r)/g\sigma$, where $k$ is an empirical constant.
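The way equations (2) to (4) combine for an additive locus can be sketched in Python. This is a minimal illustration, not program A itself. It assumes the second-order selective value $w_{ij} = 1 + i(d_{ij}/\sigma) + \tfrac{1}{2} i (z_0/\sigma)(d_{ij}/\sigma)^2$, which is the form implied by the mean selective value in equation (2). The function name, argument names and example numbers are hypothetical.

```python
def expected_frequencies(p, a, i_sel, x0, sigma=1.0):
    """Expected allele frequencies after one generation of selection.

    p     : allele frequencies (summing to 1)
    a     : additive allelic effects, so that d_ij = a_i + a_j
    i_sel : standardized selection differential
    x0    : cut-off point in phenotypic standard deviations (z0 / sigma)
    sigma : phenotypic standard deviation
    """
    # Code the effects so that sum(p_i * a_i) = 0.
    mean_a = sum(pi * ai for pi, ai in zip(p, a))
    a = [ai - mean_a for ai in a]

    # Genotypic variance of an additive locus: 2 * sum(p_i * a_i^2).
    var_g = 2.0 * sum(pi * ai * ai for pi, ai in zip(p, a))

    # Mean selective value, equation (2).
    w_bar = 1.0 + 0.5 * i_sel * x0 * var_g / sigma ** 2

    new_p = []
    for pi, ai in zip(p, a):
        alpha = ai                      # alpha_i = a_i for additive effects
        gamma = ai * ai + 0.5 * var_g   # equation (4)
        w_i = 1.0 + i_sel * alpha / sigma + 0.5 * i_sel * x0 * gamma / sigma ** 2
        new_p.append(pi * w_i / w_bar)
    return new_p


# Hypothetical example: four equally frequent alleles, 20% selected.
freqs = expected_frequencies([0.25] * 4, [-0.3, -0.1, 0.1, 0.3], 1.4, 0.8416)
```

The marginal selective values average to the mean selective value, so the returned frequencies sum to one without any further normalization.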

The first step in program A involves the choice of a random sample of $T$ "individuals" from the base population. A total of $2T$ random normal values are chosen initially in pairs $(a_i, a_j)$ from a population with zero mean and variance $\sigma_a^2$, to represent the sample of allelic effects. The value of $\sigma_a^2$ is given by

$$\sigma_a^2 = \tfrac{1}{2}\, g^2 \sigma^2 \qquad (5)$$

where $g^2$ denotes the value of $\sigma_g^2/\sigma^2$ in the base population. The sum of 12 values from a rectangular distribution has been used as a random normal deviate (TOCHER 1963; MULLER 1959). Corresponding to each of the $T$ values of $d_{ij} = a_i + a_j$, relative selective values are calculated from equation (1) to give the relative probability that the genotype concerned should be included in the selected sample of $N$ parents. Together with a pseudo-random number generator, these probabilities are used to take a sample of the $T$ individuals without replacement as the parents of the following generation. A total of $2N$ distinct alleles are then known to be segregating at the locus after this initial generation of selection.
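The sampling of allelic effects described above can be sketched as follows. This is an illustrative reading of the text, not the original program: the sum of 12 uniform values on (0, 1) has mean 6 and variance 1, so subtracting 6 gives an approximate standard normal deviate. The function names and the seed are hypothetical.

```python
import random


def normal_deviate(rng):
    """Approximate standard normal deviate: sum of 12 uniforms on (0, 1), minus 6."""
    return sum(rng.random() for _ in range(12)) - 6.0


def sample_allelic_effects(T, g2, sigma=1.0, seed=1):
    """Draw T pairs (a_i, a_j) with variance sigma_a^2 = g2 * sigma^2 / 2."""
    rng = random.Random(seed)
    sd_a = (0.5 * g2) ** 0.5 * sigma
    return [(sd_a * normal_deviate(rng), sd_a * normal_deviate(rng))
            for _ in range(T)]


# Hypothetical example: T = 50 individuals, g^2 = 0.10.
pairs = sample_allelic_effects(50, 0.10)
genotypic_values = [ai + aj for ai, aj in pairs]   # d_ij = a_i + a_j
```

A deviate built this way can never exceed 6 standard deviations in absolute value, and its tails are lighter than those of a true normal distribution.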

In all subsequent generations, changes in allelic frequencies under selection are calculated from equations (3) and (4), followed by a random sampling of $2N$ gametes from the gene pool. The alleles are progressively renumbered each generation as genetic sampling proceeds, to avoid extensive computations involving zero frequencies.
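The sampling step in each later generation can be sketched as below, assuming the $2N$ gametes are drawn independently with probabilities equal to the allele frequencies after selection. The function name and the returned index list are hypothetical; dropping lost alleles stands in for the renumbering mentioned in the text.

```python
import random


def sample_gametes(p, N, rng):
    """Draw 2N gametes from allele frequencies p.

    Returns the frequencies of the alleles that survive sampling, together
    with their original indices, so that lost alleles are dropped.
    """
    counts = [0] * len(p)
    for k in rng.choices(range(len(p)), weights=p, k=2 * N):
        counts[k] += 1
    kept = [k for k, c in enumerate(counts) if c > 0]
    return [counts[k] / (2 * N) for k in kept], kept


# Hypothetical example: ten equally frequent alleles, N = 5 parents.
rng = random.Random(0)
freqs, kept = sample_gametes([0.1] * 10, 5, rng)
```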

Computer program B: Use has also been made in this series of studies of a more general program in which binary digits 0 or 1 represent alleles at each of a number of loci, and the operations of logical algebra are combined to simulate genetic processes (FRASER 1957). A total of 46 loci each with two alleles, 23 loci each with 4 alleles, or 15 loci each with 8 alleles, etc., can be accommodated by the program, with arbitrary recombination values between each pair of adjacent loci. Interference effects have been ignored.

The additive and dominance parameters may be specified for each individual locus, and epistatic gene effects can be introduced by a specification of the nature of interactions between components of the quantitative character under selection. Sex linkage, sex-influenced phenotypic expression, and differential environmental sensitivity, may all be included in the genetic model if desired. Selection is by truncation in each generation, with either extreme phenotypes, or those approximating the mean or a fixed intermediate value, being chosen as parents. Offspring are generated by random mating with replacement, gamete formation being simulated by a random walk procedure.
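The idea of storing alleles as binary digits and forming gametes by logical operations can be illustrated with a short Python sketch. It is not FRASER's algorithm or program B: it assumes one bit per locus (two alleles), walks along the chromosome switching parental strand with the given recombination fractions, and ignores interference. All names are hypothetical.

```python
import random


def make_gamete(hap1, hap2, recomb, rng):
    """Form a gamete from two parental haplotypes stored as integers.

    Bit k of each integer is the allele at locus k; recomb[k] is the
    recombination fraction between locus k and locus k + 1.
    """
    n_loci = len(recomb) + 1
    mask = 0                              # bit k set: locus k copied from hap1
    from_first = rng.random() < 0.5       # random starting strand
    for k in range(n_loci):
        if from_first:
            mask |= 1 << k
        if k < n_loci - 1 and rng.random() < recomb[k]:
            from_first = not from_first   # crossover between loci k and k + 1
    # Logical AND / OR combine the two strands according to the mask.
    return (hap1 & mask) | (hap2 & ~mask)


# Hypothetical example: four loci, free recombination between neighbours.
rng = random.Random(3)
gamete = make_gamete(0b1100, 0b0011, [0.5, 0.5, 0.5], rng)
```

Loci with more than two alleles could be handled in the same way by assigning several adjacent bits to each locus and setting the recombination fraction between bits of the same locus to zero.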


PREDICTIONS FOR LOCI OF SMALL EFFECT

For an independently segregating locus with multiple alleles of additive effect, the pattern of response to selection in finite populations can readily be predicted for the limiting case in which differences among genotypes in selective value are small. The expected total response to selection, the 0.50 and 0.95 life of the process, and the expected variance among means of replicate populations at the limit, can be derived algebraically. Computer simulation studies using program A have in addition provided some useful generalizations with regard to the rate of fixation of alleles for loci of small effect.

The following derivation is an extension of that given by DEMPSTER (1955) and ROBERTSON (1960) for the two-allele model. The argument is based on the assumption that for loci of small effect, changes in genetic variance due to the effects of selection can be ignored by comparison with those due to genetic sampling alone. KIMURA (1955) has shown that the rate of decline in heterozygosity due to random drift in gene frequencies is given by

$$E(p_i' p_j') = p_i p_j\,[1 - (1/2N)], \qquad i \neq j$$

where primes indicate frequencies in the succeeding generation. The genetic variance contributed by an additive multi-allelic locus under random mating can be written in the form

$$\sigma_g^2 = \sum_i \sum_j (a_i - a_j)^2\, p_i p_j$$

using the notation of Table 1, so that the expected rate of decline in genetic variance due to genetic sampling is given by

$$\sigma_g^{2\prime} = \sigma_g^2\,[1 - (1/2N)] \qquad (6)$$

The changes in allelic frequencies under directional selection for a locus of small effect are

$$\Delta p_i = (i/\sigma)\, p_i a_i$$

where the $a_i$ are coded so that $\sum_i p_i a_i = 0$. The additive genetic variance in the following generation is then equal to

$$2 \sum_i (p_i + \Delta p_i)\, a_i^2 - 2 \Big[ \sum_i (p_i + \Delta p_i)\, a_i \Big]^2$$

so that

$$\Delta \sigma_g^2 = 2\,(i/\sigma) \sum_i p_i a_i^3 - \tfrac{1}{2}\,(i/\sigma)^2 (\sigma_g^2)^2 \qquad (7)$$

The genetic model under consideration in this paper specifies that the $a_i$ are normally distributed in the base population, so that the first term on the right hand side of equation (7) is initially zero. The requirement that changes in $\sigma_g^2$ given by equation (7) should be small by comparison with those predicted by equation (6), implies that the ratio

$$\tfrac{1}{2}\,(i/\sigma)^2 (\sigma_g^2)^2 \Big/ (\sigma_g^2/2N) = N i^2 (\sigma_g/\sigma)^2$$

should be negligible. Since $N \geq 1$ and $\sigma_g/\sigma < 1$, this condition will be satisfied if $Nig$ is sufficiently small provided $i$ is not too large.

The total response to selection at the limit will then be

$$r = (i/\sigma) \sum_{t=0}^{\infty} \sigma_g^2(t) = (i/\sigma)(g\sigma)^2 \sum_{t=0}^{\infty} [1 - (1/2N)]^t = 2Nig^2\sigma \qquad (8)$$

i.e. $2N$ times the response to selection in the first generation. A proportion $\alpha$ of this total response is expected to be realized after $n$ generations, where

$$\alpha = (1/2N) \sum_{t=0}^{n-1} [1 - (1/2N)]^t = 1 - [1 - (1/2N)]^n \qquad (9)$$

Provided $N > 5$, an approximate solution to (9) is given by

$$n = -2N \log_e (1 - \alpha) \qquad (10)$$

so that the half-life of the selection process is roughly equal to $1.4N$ generations, and the 0.95-life is expected to be $6.0N$ generations. For $Nig$ small, the contribution of this particular locus to the variance among replicate selection lines at the limit is expected to be

$$\sigma_r^2 = \mathrm{var}(2a_i) = 2g^2\sigma^2 \qquad (11)$$

i.e. twice the genetic variance in the base population.
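The time scale implied by the geometric decline of the genetic variance can be checked numerically. The sketch below assumes the proportion realized after $n$ generations is $1 - [1 - (1/2N)]^n$ and compares the exact number of generations with the logarithmic approximation $n = -2N \log_e(1 - \alpha)$. The function names are hypothetical.

```python
import math


def exact_life(N, alpha):
    """Smallest n for which 1 - (1 - 1/2N)^n reaches the proportion alpha."""
    x = 1.0 - 1.0 / (2 * N)
    n = 0
    realized = 0.0
    while realized < alpha:
        n += 1
        realized = 1.0 - x ** n
    return n


def approx_life(N, alpha):
    """Logarithmic approximation: n = -2N * log_e(1 - alpha)."""
    return -2.0 * N * math.log(1.0 - alpha)


# Hypothetical example with N = 20 parents.
half_life = approx_life(20, 0.50) / 20     # in units of N, about 1.39
life_95 = approx_life(20, 0.95) / 20       # in units of N, about 5.99
```

Because $-2\log_e 0.5 \approx 1.386$ and $-2\log_e 0.05 \approx 5.991$, the approximation gives the quoted values of roughly $1.4N$ and $6.0N$ for any $N$.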

Rate of fixation of alleles: In addition to these "observational" parameters which can be directly estimated in selection experiments, viz. total response, $\alpha$-life and variance among lines at the limit, it is desirable to have some information as to the rate of fixation of alleles under selection. In the present section we shall therefore examine the behaviour of multiallelic loci under random sampling without selection, for comparison with results to follow.

TABLE 2

Observed mean number of alleles segregating under random sampling without selection

Generation      Breeding population size (N)        Approximate
(t/N)           5        10       20       40       SE
1               3.43     3.80     4.00     4.13     ± .03
2               2.07     2.19     2.26     2.34     ± .03
3               1.55     1.64     1.66     1.63     ± .03
4               1.32     1.37     1.38     1.35     ± .03
5               1.18     1.20     1.22     1.21     ± .02
6               1.11     1.12     1.13     1.14     ± .02
7               1.07     1.07     1.06     1.09     ± .01
8               1.09     1.04     1.03     1.05     ± .01

FIGURE 1.-The expected number of alleles segregating at a multiallelic locus under random sampling without selection. The N parents chosen from the base population are assumed to carry 2N different alleles. (Plot not reproduced. Horizontal axis: generations, from 0 to 7N. Vertical axis: tick marks from 1 to 80. One curve is labelled N = 40.)

It is clear from Figure 1 and Table 2 that the rate of loss of alleles is a function both of time measured in terms of $t/N$, where $t$ is the number of generations of sampling, and also of the initial number of alleles ($2N$) present in parents selected from the base population sample. Of particular interest for purposes of comparison with populations under directional selection are the following observations:

(i) The number of alleles segregating is reduced from 2N to a mean of roughly 2.8 per locus after 1.4N generations, for values of N in the range 5 to 40: the probability of fixation at a locus prior to this time is approximately 0.03.

(ii) After a period of $6.0N$ generations, the mean number of alleles per locus is of the order of 1.12, and the corresponding probability of fixation is approximately 0.88.

In the section to follow, these generalizations for populations of breeding size in the range 5 to 40 will provide estimates of the limiting rates of fixation of alleles at the 0.50 and 0.95 life of the selection process, under conditions of weak selection.
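The loss of alleles under random sampling alone can be explored with a small simulation. The following sketch assumes simple sampling of $2N$ gametes with replacement in every generation, starting from $2N$ distinct alleles; it is not the authors' program, and the function name, replicate number and seed are hypothetical.

```python
import random


def mean_alleles_segregating(N, generations, replicates=200, seed=1):
    """Mean number of distinct alleles left after pure random sampling.

    Each replicate starts with 2N distinct alleles and draws 2N gametes
    with replacement from the previous generation, without selection.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(replicates):
        pool = list(range(2 * N))            # 2N distinct alleles initially
        for _ in range(generations):
            pool = [rng.choice(pool) for _ in range(2 * N)]
        total += len(set(pool))
    return total / replicates


# Hypothetical example: N = 5 after t = N and t = 2N generations.
after_one_n = mean_alleles_segregating(5, 5)
after_two_n = mean_alleles_segregating(5, 10)
```

With more replicates the means can be compared with the $N = 5$ column of Table 2, bearing in mind that the sampling scheme used here is an assumption.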

SIMULATION OF SELECTION RESPONSE

Extensive use has been made of program A in the study of a variety of combinations of values of the parameters $N$, $P$ and $g^2$. The values of $g^2$ ranged from 0.002 to 0.10, this parameter measuring the initial contribution of the locus to the total heritability of the quantitative character: for a character of heritability equal to 0.5, for example, a value of $g^2 = 0.10$ corresponds to a locus contributing one fifth of the total genetic variance in the base population.

The values of $N$ and $P$ used in the study correspond to one or other of the selection regimes shown in Table 3, chosen so that the number of individuals represented in the initial sample from the base population was between 50 and 200, and the selection intensity between 10% and 50%. Note that selection regimes involving large values of $N$ and low intensities of selection (e.g. 100/200) have not been tested because of computing time limitations.
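For any regime, the compound parameter $z = Nig$ can be computed from $N$, $P$ and $g^2$ once the standardized selection differential is known. The sketch below assumes truncation selection on a normally distributed character in an effectively infinite scored population, so that $i$ is the normal ordinate at the truncation point divided by $P$; the paper's own tabulated values of $i$ may differ slightly. The function names are hypothetical.

```python
from statistics import NormalDist


def selection_differential(P):
    """Standardized selection differential for a selected proportion P.

    Assumes truncation selection on a normal distribution.
    """
    nd = NormalDist()
    x0 = nd.inv_cdf(1.0 - P)      # truncation point in standard deviations
    return nd.pdf(x0) / P


def z_value(N, P, g2):
    """z = N * i * g, where g is the square root of g2."""
    return N * selection_differential(P) * g2 ** 0.5


# Hypothetical examples using regimes of the kind described in the text.
z_small = z_value(5, 0.10, 0.002)     # about 0.39
z_large = z_value(20, 0.40, 0.100)    # about 6.11
```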

Total advance under selection: The maximum possible response from a given sample of $T$ individuals from the base population, i.e. the response achieved by the fixation of the allele $A_i$ with the most extreme positive value of $a_i$, is given in Table 3 for comparison with the theoretical values predicted by normal distribution theory. As expected, the maximum possible advance increases appreciably as the size of the initial sample is increased from 50 to 200 individuals. However, the extremes given by the simulation procedure based on the sum of 12 uniform variates can be seen to fall somewhat short of the theoretical values at higher sample sizes.

The observed accumulated change in the mean of the selected character due to changes in allelic frequencies at the specified locus approaches a value $r = \lim_{t \to \infty} z(t)$, where $z(t)$ denotes the observed mean at generation $t$ (Table 1). In an attempt to summarize observations of mean values of $r$ as a function of $N$, $i$ and $g$, a transformed measure of total response has been defined as

$$R = k\,E(r)/g\sigma \qquad (12)$$

where $E(r)$ is the expected mean value of $r$, and $k$ is an arbitrary constant which may be chosen to give an effective upper limit of unity to the range of values of $R$.

TABLE 3

Selection regimes involved in studies using simulation program A

Size of base              Breeding                  Maximum possible response
population sample (T)     population size (N)       to selection (gσ)
                                                    Observed     Theoretical*
50                        5, 10, 15, 20, 25         3.52         3.55
100                       10, 20, 30, 40, 50        3.82         3.88
200                       20, 30, 40                4.05         4.20

* Based on expectations of the mean range in samples of size 2T from normally distributed populations.
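The theoretical column of Table 3 can be approximated numerically. The sketch below assumes that the best of $2T$ normally distributed allelic effects is fixed, that each effect has standard deviation $g\sigma/\sqrt{2}$, and that half the expected range of a normal sample equals the expected maximum. The integration limits, step count and function names are hypothetical choices.

```python
import math


def expected_max_std_normal(n, lo=-8.0, hi=8.0, steps=16000):
    """E[max of n independent standard normal deviates].

    Trapezoidal integration of x * n * phi(x) * Phi(x)**(n - 1).
    """
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        Phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
        f = x * n * phi * Phi ** (n - 1)
        total += f if 0 < k < steps else 0.5 * f   # half weight at the ends
    return total * h


def max_response(T):
    """Maximum possible response in units of g * sigma.

    The homozygote for the best allele has value 2 * a_max, and a has
    standard deviation g * sigma / sqrt(2), giving a factor of sqrt(2).
    """
    return math.sqrt(2.0) * expected_max_std_normal(2 * T)


theoretical = {T: round(max_response(T), 2) for T in (50, 100, 200)}
```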

FIGURE 2.-The relationship between the total response to selection at the limit, R, defined by equation (12), and the value of z = Nig for all tested regimes. A single curve of best fit has been drawn by eye, and the open circles are predictions given by equation (14). (Plot not reproduced. Horizontal axis: z = Nig, from 0 to 16.)

It must of course be emphasized that the genetic model specified in this paper sets no upper limit to the possible value of $R$. However, it will be seen that the concept of an effective upper limit is acceptable for the range of parameter values explored, these having been chosen to cover the variety of selection regimes relevant to most laboratory experimentation.

By analogy with the two-allele case, it has been found that the magnitude of the total response to selection, measured in units equal to $g\sigma$, is almost completely determined by the value of the parameter $z = Nig$ (Figure 2). For the purposes of prediction, the relationship is adequately described by a linear regression equation of the form

$$\log\left[\frac{1+R}{1-R}\right] = a + b \log(1 + z) \qquad (13)$$

A satisfactory fit to the data, which takes into account the theory outlined in the previous section for small values of $z$, requires that a value of $k$ be chosen so that: (i) the least squares estimate of $a$ in (13) is not significantly different from zero; and (ii) the corresponding estimate of $b$ does not differ significantly from the value of $4k$.

With $k = 0.272$, the least squares estimates are $a = 0.01 \pm 0.05$ and $b = 1.09 \pm 0.03$, the linear regression accounting for 97% of the variation shown by the dependent variable. The empirically determined relationship can then be written

$$\log\left[\frac{1+R}{1-R}\right] = 1.09 \log(1 + z) \qquad (14)$$

where $R = 0.272\,E(r)/g\sigma$ and $z = Nig$.

As the value of $z \to 0$, the predicted magnitude of $R$ given by equation (14) also tends to zero, and the differential coefficient $d(R/k)/dz$ tends to a limiting value of 2, satisfying the requirements of equation (8). Equation (14) in addition implies that $0 < R < 1$, i.e. that there is an upper limit to the total advance under selection given by $\max[E(r)] = 3.68\,g\sigma$. This corresponds to the ultimate fixation under selection of the most desired allele in a base population sample of approximately $T = 65$ individuals, i.e. of 130 different alleles.

In Figure 2 the values of $R$ presented in Table 4 are plotted against the corresponding values of $z$. The curve drawn in the figure is the curve of best fit to the points drawn by eye, while open circles are the predictions given by equation (14). It can be seen that the imposition of an effective upper limit of unity to the value of $R$ is not likely to be misleading for the range of parameter values involved in this study. The particular advantage of the functional relationship given by (14) is that useful comparisons are possible with the corresponding symmetrical genetic model involving two alleles per locus at an initial gene frequency of 0.5.
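The fitted relationship can be turned into a small prediction function. The sketch below assumes the empirical form $\log[(1+R)/(1-R)] = 1.09 \log(1+z)$ with $k = 0.272$; solving for $R$ gives $R = [(1+z)^b - 1]/[(1+z)^b + 1]$, which does not depend on the base of the logarithm. The function and constant names are hypothetical.

```python
K = 0.272   # empirical scaling constant k quoted in the text
B = 1.09    # fitted regression slope b quoted in the text


def predicted_R(z):
    """Solve log[(1 + R)/(1 - R)] = B * log(1 + z) for R."""
    y = (1.0 + z) ** B
    return (y - 1.0) / (y + 1.0)


def predicted_total_advance(z):
    """Expected total advance E(r) in units of g * sigma, from R = K * E(r)/(g * sigma)."""
    return predicted_R(z) / K


# Hypothetical checks of the limiting behaviour described in the text.
slope_near_zero = predicted_total_advance(1e-6) / 1e-6   # close to 2
upper_limit = 1.0 / K                                    # about 3.68
```

Near $z = 0$ the prediction is approximately $2z$, in line with the small-effect theory, and for large $z$ it approaches $1/k$ in units of $g\sigma$.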

The appropriate measure of total response for the two-allele model with alleles $A_1$, $A_2$ at initial frequencies $p_1$, $p_2$, assuming additive gene action, is

$$R^* = [u(p_2) - p_2]/(1 - p_2) \qquad (15)$$

where $u(p_2)$ is the probability of ultimate fixation of the homozygote favoured by selection ($A_2A_2$). This latter probability has been shown by KIMURA (1957) to be

$$u(p_2) = [1 - \exp(-2Up_2)]/[1 - \exp(-2U)]$$

where $U = Nia/\sigma$ and $a/\sigma$ denotes the proportionate effect of the locus, i.e. the difference in effect between the two homozygotes measured as a fraction of the total phenotypic standard deviation. $R^*$ then measures the expected total response to selection as a fraction of the total possible response (LATTER 1966b). For a symmetrical model with $p_2 = 1/2$ we have

$$R^* = [1 - \exp(-U)]/[1 + \exp(-U)] \qquad (16)$$

and a comparison of the effects of experimental manipulation of the value of $Ni$ in the multiallelic and two-allele models is given directly by equations (14) and (16).
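The two-allele quantities used for comparison can be evaluated directly. The sketch below assumes the fixation probability $u(p) = [1 - \exp(-2Up)]/[1 - \exp(-2U)]$ quoted from KIMURA (1957) and the measure $R^* = [u(p) - p]/(1 - p)$; the function names and the neutral-limit shortcut are hypothetical additions.

```python
import math


def fixation_probability(p, U):
    """u(p) = [1 - exp(-2Up)] / [1 - exp(-2U)], with U = N i a / sigma."""
    if abs(U) < 1e-12:
        return p                  # neutral limit: fixation probability equals p
    return (1.0 - math.exp(-2.0 * U * p)) / (1.0 - math.exp(-2.0 * U))


def relative_response(p, U):
    """R* = [u(p) - p] / (1 - p): response as a fraction of the maximum possible."""
    return (fixation_probability(p, U) - p) / (1.0 - p)


# Hypothetical example: symmetrical model with initial frequency 1/2.
r_star = relative_response(0.5, 2.0)
```

For an initial frequency of one half, the expression simplifies algebraically to $\tanh(U/2)$, which gives a quick check on the code.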

Time scale of the selection process: For each combination of parameter values tested, the 0.50 life (L50) and 0.95 life (L95) have been recorded in Table 4, expressed as a multiple of the effective population size N. These parameters measure the time taken for the mean contribution of the locus to the measured trait to cover 50% and 95%, respectively, of the change from its initial value to that at the selection limit. The observed values of these measures of the time scale for the smallest value of z (i.e. 0.39) can be seen to approximate to the theoretical expectations of 1.4N and 6.0N, respectively, given by equation (10). The plot in Figure 3 shows the half-life to be essentially determined by the magnitude of Nig, and the same is true of the L95.

TABLE 4

Observed values of the 0.50 life, 0.95 life and total selection response

| z | N | P | g² | L50 (×N) | L95 (×N) | Total response |
|---|---|---|---|---|---|---|
| 0.39 | 5 | 0.10 | 0.002 | 1.30 | 5.83 | 0.1?? ± 0.014 |
| 0.88 | 5 | 0.10 | 0.010 | 0.?? | 4.07 | 0.370 ± 0.011 |
| 1.24 | 5 | 0.10 | 0.020 | 0.82 | 3.80 | 0.440 ± 0.0?? |
| 1.24 | 10 | 0.10 | 0.005 | 0.?? | 3.?? | 0.433 ± 0.013 |
| 1.39 | 5 | 0.10 | 0.025 | 0.77 | 3.04 | 0.401 ± 0.0?? |
| 1.75 | 10 | 0.10 | 0.010 | 0.62 | 2.80 | 0.543 ± 0.014 |
| 1.96 | 5 | 0.10 | 0.050 | 0.56 | 2.28 | 0.528 ± 0.025 |
| 1.98 | 20 | 0.20 | 0.005 | 0.66 | 2.72 | 0.550 ± 0.025 |
| 2.21 | 10 | 0.20 | 0.025 | 0.62 | 2.66 | 0.601 ± 0.023 |
| 2.48 | 10 | 0.10 | 0.020 | 0.52 | 2.16 | 0.?98 ± 0.014 |
| 2.75 | 15 | 0.30 | 0.025 | 0.45 | 1.92 | 0.60? ± 0.019 |
| 2.77 | 5 | 0.10 | 0.100 | 0.55 | 2.09 | 0.591 ± 0.016 |
| 2.77 | 10 | 0.10 | 0.025 | 0.48 | 1.99 | 0.618 ± 0.021 |
| 2.80 | 20 | 0.20 | 0.010 | 0.50 | 2.02 | 0.620 ± 0.021 |
| 3.05 | 20 | 0.40 | 0.025 | 0.46 | 2.00 | 0.622 ± 0.018 |
| 3.13 | 10 | 0.20 | 0.050 | 0.45 | 2.11 | 0.662 ± 0.014 |
| 3.15 | 25 | 0.50 | 0.025 | 0.?5 | 2.12 | 0.651 ± 0.021 |
| 3.51 | 20 | 0.10 | 0.010 | 0.42 | 2.01 | 0.687 ± 0.024 |
| 3.89 | 15 | 0.30 | 0.050 | 0.37 | 1.63 | 0.688 ± 0.010 |
| 3.92 | 10 | 0.10 | 0.050 | 0.39 | 1.45 | 0.701 ± 0.020 |
| 3.96 | 20 | 0.20 | 0.020 | 0.39 | 1.59 | 0.722 ± 0.021 |
| 4.32 | 20 | 0.40 | 0.050 | 0.36 | 1.52 | 0.713 ± 0.019 |
| 4.43 | 10 | 0.20 | 0.100 | 0.34 | 1.43 | 0.733 ± 0.016 |
| 4.43 | 20 | 0.20 | 0.025 | 0.35 | 1.26 | 0.695 ± 0.027 |
| 4.46 | 25 | 0.50 | 0.050 | 0.35 | 1.47 | 0.702 ± 0.016 |
| 5.50 | 15 | 0.30 | 0.100 | 0.31 | 1.20 | 0.7?? ± 0.014 |
| 5.50 | 30 | 0.30 | 0.025 | 0.28 | 1.17 | 0.755 ± 0.029 |
| 5.55 | 10 | 0.10 | 0.100 | 0.33 | 1.30 | 0.7?? ± 0.026 |
| 5.53 | 20 | 0.10 | 0.025 | 0.29 | 1.28 | 0.7?? ± 0.024 |
| 5.60 | 40 | 0.20 | 0.010 | 0.?? | 1.35 | 0.7?2 ± 0.024 |
| 6.11 | 20 | 0.40 | 0.100 | 0.29 | 1.23 | 0.?35 ± 0.013 |
| 6.11 | 40 | 0.40 | 0.025 | 0.28 | 1.13 | 0.7?? ± 0.020 |
| 6.26 | 20 | 0.20 | 0.050 | 0.29 | 1.12 | 0.?05 ± 0.02? |
| 6.31 | 25 | 0.50 | 0.100 | 0.28 | 1.20 | 0.??? ± 0.014 |
| 6.31 | 50 | 0.50 | 0.025 | 0.25 | 1.00 | 0.828 ± 0.021 |
| 7.37 | 30 | 0.15 | 0.025 | 0.21 | 0.94 | 0.?17 ± 0.027 |
| 7.77 | 30 | 0.30 | 0.050 | 0.23 | 0.90 | 0.820 ± 0.019 |
| 7.85 | 20 | 0.10 | 0.050 | 0.23 | 0.90 | 0.850 ± 0.019 |
| 8.64 | 40 | 0.40 | 0.050 | 0.11 | 0.?? | 0.855 ± 0.023 |
| 8.85 | 20 | 0.20 | 0.100 | 0.21 | 0.80 | 0.824 ± 0.026 |
| 8.85 | 40 | 0.20 | 0.025 | 0.20 | 0.87 | 0.??2 ± 0.016 |
| 8.92 | 50 | 0.50 | 0.050 | 0.20 | 0.87 | 0.?59 ± 0.020 |
| 10.43 | 30 | 0.15 | 0.050 | 0.18 | 0.75 | 0.864 ± 0.024 |
| 11.00 | 30 | 0.30 | 0.100 | 0.18 | 0.65 | 0.904 ± 0.019 |
| 11.10 | 20 | 0.10 | 0.100 | 0.18 | 0.68 | 0.855 ± 0.024 |
| 12.22 | 40 | 0.40 | 0.100 | 0.16 | 0.68 | 0.989 ± 0.013 |
| 12.52 | 40 | 0.20 | 0.050 | 0.15 | 0.57 | 0.905 ± 0.019 |
| 12.62 | 50 | 0.50 | 0.100 | 0.16 | 0.67 | 0.889 ± 0.013 |
| 13.99 | 30 | 0.15 | 0.090 | 0.15 | 0.55 | 0.?07 ± 0.024 |
| 16.15 | 30 | 0.15 | 0.1?0 | 0.15 | 0.51 | 0.938 ± 0.017 |

A useful prediction formula for the L50, which is accurate in the range 1 < z ≤ 12, can be derived from the following empirical relationship. The initial phase of response to selection as a function of time, for a given value of z in this range, has been found to approximate to the following equation:

$$\frac{1 + x/R}{1 - x/R} = \left[1 + \frac{t}{2N}\right]^{\beta}; \qquad \beta = 4kz/R$$

where x = x(t) as defined in Table 1. This relationship holds over a period of generations exceeding the half-life of the process, so that the L50 can be predicted approximately by setting x = R/2, for which the left-hand side equals 3 and ln 3 ≈ 1.1:

$$L_{50}/N = 2\,[\exp(1.1/\beta) - 1] \tag{17}$$

These values are indicated in Figure 3 by open circles for comparison with the curve of best fit drawn by eye to pass through the point (0, 1.4).

FIGURE 3. Observed values of the half-life of the selection process, L50, as a function of the value of z = Nig. The curve of best fit has been drawn by eye, and open circles are the predictions given by equation (17).

The approach to the selection limit: Having established that the time scale is a function of N and z over a wide range of parameter combinations, interest centres on the fate of the many alleles initially segregating in the population under selection, their contribution to genetic variation in the quantitative character in the approach to the selection limit, and the expected variation between replicate populations at the limit. Figure 4 summarizes the relevant data for the regimes of Table 4.

Of particular interest from an experimental point of view is the expected variance between replicate populations due to the cumulative effects of genetic sampling throughout the history of the populations, including the effects of the initial sampling from the base population. Figure 4A shows this parameter to be closely related to the value of z = Nig, the expected variance initially falling very rapidly with increasing z from the theoretical expectation of 2g²σ² at z = 0 (equation 11). For values of z in the range 5-15, however, the rate of reduction in between-replicate variance with increasing z can be seen to be much less marked.

FIGURE 4. Measures of population structure in the approach to the selection limit, plotted as a function of the value of z = Nig: A. The observed variance between replicates at the limit, expressed as a multiple of g²σ²; B. The proportion of the genetic variance retained at the L95; C. The proportion of loci fixed at the L95; D. The mean number of alleles segregating at the L95.
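The half-life prediction of equation (17) is easily evaluated for any regime of Table 4. The sketch below is illustrative only. It writes the exponent of the empirical relationship as beta = 4kz/R, takes R to be the observed total response, and assumes k ≈ 0.272, a value back-calculated here from the maximum response listed in Table 5 for two alleles (0.385 = k√2); the constant is defined elsewhere in the paper and should be substituted if known.

```python
import math

# Assumed value of k, inferred from Table 5 (0.385 / sqrt(2)); not stated in this section.
K_ASSUMED = 0.272


def beta(z: float, R: float, k: float = K_ASSUMED) -> float:
    """Exponent of the empirical response curve, beta = 4kz/R."""
    return 4.0 * k * z / R


def predicted_half_life(z: float, R: float, k: float = K_ASSUMED) -> float:
    """Equation (17): L50/N = 2[exp(1.1/beta) - 1]."""
    return 2.0 * (math.exp(1.1 / beta(z, R, k)) - 1.0)


def response_fraction(t_over_N: float, z: float, R: float, k: float = K_ASSUMED) -> float:
    """Solve (1 + x/R)/(1 - x/R) = (1 + t/2N)^beta for x/R."""
    y = (1.0 + t_over_N / 2.0) ** beta(z, R, k)
    return (y - 1.0) / (y + 1.0)


if __name__ == "__main__":
    # Regime z = 5.50, N = 30 of Table 4, with observed total response 0.755.
    print(f"predicted L50/N = {predicted_half_life(5.50, 0.755):.2f}")
```

For this regime the prediction is close to 0.30, against an observed L50 of 0.28N.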

The L95 has been taken as the most useful single reference point in the history of populations under selection, since decisions concerning future breeding strategy are likely to be made when it is clear that the limit to response is being approached. The extent to which the value of z is important in determining population structure at the L95 can be gauged from Figure 4.


With very low values of z, the mean number of alleles segregating at the L95 is approximately 1.12 and the probability of fixation at the locus is roughly 0.88; with a value of z of the order of 15, on the other hand, the expected number of alleles segregating is very close to 3.0 with a probability of fixation of only 0.13 (Figure 4C, D). Both parameters can be seen to be very largely determined by the value of z over the range of parameter combinations involved in this study. A corresponding general relationship between the mean number of alleles segregating at the L50 and the value of z has also been observed, but it is obvious that the number of alleles at this earlier stage is also markedly dependent on the value of N. The same must apply in lesser degree at the L95. For z > 2.0 the probability of fixation at a locus at the L50 is effectively zero for all tested regimes. The corresponding probability for z → 0 is approximately 0.03, as discussed in a previous section.

It can be seen from Figure 4B that the genetic variance retained at the L95, expressed as a proportion of that contributed by the locus in the base population, is almost independent of z. The least squares regression line y = a + bz has coefficients a = 0.052 ± 0.002, b = -0.000655 ± 0.000310. The value of a does not differ significantly from that expected for z → 0, viz. 0.0498. It is therefore apparent that as the value of z increases, the mean number of alleles segregating at the L95 is increased and the probability of fixation at the locus is considerably reduced, but the genetic variance contributed by the segregating alleles remains almost unchanged. The additional alleles segregating at loci with higher values of z appear, therefore, to be almost identical as regards their allelic effects, though they are of course independent in origin. It must be stressed that these conclusions will not necessarily be valid when high values of z are combined with low values of N.
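The expectation of 0.0498 quoted for z → 0 is consistent with the usual approximation for the decay of additive genetic variance under random drift alone, exp(-t/2N), evaluated at the L95 of 6.0N given earlier for the neutral case. The check below is a sketch under that assumption; the function name is illustrative.

```python
import math


def variance_retained(t_over_N: float) -> float:
    """Approximate fraction of the initial additive variance remaining after
    t generations of pure drift, exp(-t / 2N), with t given in units of N."""
    return math.exp(-t_over_N / 2.0)


if __name__ == "__main__":
    # At the neutral L95 of about 6.0N the retained fraction is exp(-3).
    print(round(variance_retained(6.0), 4))  # 0.0498
```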

Total response with program B: Computer program B has been used in this study to check on the numerical predictions of equation (14), and to test the importance of some of the restrictions specified in the genetic model. The following are the principal differences between the simulation regimes of programs A and B: the latter specifies (i) a finite number of alleles per locus, with an initially symmetrical binomial distribution of allelic effects; (ii) normally distributed environmental effects of constant variance; and (iii) selection based on ranking and truncation separately within each sex.
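The three features listed for program B can be illustrated by a small simulation. The following sketch is not the original program: the function names, default parameter values and mating scheme (parents drawn at random, with replacement, from the selected group of each sex) are assumptions made for illustration. Each allele is represented by the number of 1-bits among `n_bits` random bits, which gives a symmetrical binomial distribution of allelic values in the founders, and alleles are transmitted intact.

```python
import random


def draw_allele(n_bits, rng):
    """Allelic value = number of 1-bits among n_bits fair bits (binomial)."""
    return sum(rng.getrandbits(1) for _ in range(n_bits))


def phenotype(genotype, effect, env_sd, rng):
    """Additive genotypic value plus a normal environmental deviate."""
    return effect * sum(genotype) + rng.gauss(0.0, env_sd)


def truncate(individuals, n_keep):
    """Rank on phenotype and keep the best n_keep individuals."""
    return sorted(individuals, key=lambda ind: ind[1], reverse=True)[:n_keep]


def simulate(n_bits=10, effect=0.1, env_sd=1.0, parents_per_sex=5,
             scored_per_sex=25, generations=40, seed=1):
    """Return the mean genotypic value of the population in each generation."""
    rng = random.Random(seed)

    def founder():
        g = (draw_allele(n_bits, rng), draw_allele(n_bits, rng))
        return (g, phenotype(g, effect, env_sd, rng))

    def offspring(sire, dam):
        # One allele is taken at random from each parent.
        g = (rng.choice(sire[0]), rng.choice(dam[0]))
        return (g, phenotype(g, effect, env_sd, rng))

    males = [founder() for _ in range(scored_per_sex)]
    females = [founder() for _ in range(scored_per_sex)]
    means = []
    for _ in range(generations):
        # Truncation selection is applied separately within each sex.
        sires = truncate(males, parents_per_sex)
        dams = truncate(females, parents_per_sex)
        males = [offspring(rng.choice(sires), rng.choice(dams))
                 for _ in range(scored_per_sex)]
        females = [offspring(rng.choice(sires), rng.choice(dams))
                   for _ in range(scored_per_sex)]
        population = males + females
        means.append(sum(effect * sum(g) for g, _ in population) / len(population))
    return means


if __name__ == "__main__":
    trajectory = simulate()
    print(f"first generation {trajectory[0]:.3f}, last generation {trajectory[-1]:.3f}")
```

Averaging the final value over many replicate runs (different seeds) would give an estimate of the total response for the chosen regime.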

Table 5 sets out a series of observations of the mean total response given by program B for three different values of z, as a function of the number of alleles per locus specified to be segregating in the base population. The total heritability in each case was no greater than 0.10. With a large number of alleles (> 10) the observed responses are in good agreement with the predictions of equation (14), indicating that differences between the two techniques of computer simulation are of minor importance. The results of these runs also suggest that for low values of z, the predictions of equation (14) can be accurate for as few as 3 or 4 alleles per locus.

TABLE 5

Total selection response per locus, as a function of the number of alleles segregating in the base population (program B)

| Number of alleles (U + 1)* | Maximum response (k√(2U)) | Mean observed total response (R): N = 10, z = 1.90, g² = 0.02 | N = 10, z = 3.81, g² = 0.05 | N = 20, z = 8.67, g² = 0.10 |
|---|---|---|---|---|
| 2 | 0.385 | 0.382 ± 0.004 | 0.384 ± 0.001 | 0.382 ± 0.003 |
| 3 | 0.544 | 0.513 ± 0.014 | 0.539 ± 0.009 | 0.546 ± 0.003 |
| 4 | 0.666 | 0.536 ± 0.017 | 0.608 ± 0.014 | 0.662 ± 0.005 |
| 5 | 0.769 | 0.515 ± 0.021 | 0.610 ± 0.012 | 0.756 ± 0.008 |
| 6 | 0.860 | 0.538 ± 0.026 | 0.638 ± 0.033 | 0.790 ± 0.014 |
| 11 | 1.216 | 0.510 ± 0.028 | 0.671 ± 0.026 | 0.818 ± 0.017 |
| 21 | 1.720 | 0.537 ± 0.027 | 0.702 ± 0.026 | 0.839 ± 0.019 |
| 41 | 2.433 | 0.554 ± 0.029 | 0.710 ± 0.021 | 0.896 ± 0.020 |
| Prediction† | | 0.523 | 0.694 | 0.84? |

* U denotes the number of bits of the computer word per locus.
† Prediction based on equation (14).

DISCUSSION

It will already be clear that this study has been designed primarily to aid in the interpretation of laboratory experimentation, rather than in the elaboration of precise mathematical theory. This is reflected particularly in the choice of combinations of the basic parameters N, i and g, each of which has been varied only over the range of values usual in selection experiments with mice, Drosophila or Tribolium (Tables 3, 4). The aim has been to pinpoint the important relationships which may be useful in practice, rather than to dwell too much on details. It is entirely possible that the relationship between total response and z in Figure 2, for example, corresponds to a family of curves depending on the value of N involved, but the empirical relationship given by equation (14) is nevertheless likely to be adequate for most purposes. Note, however, that intense selection pressures have not been simulated, all tested regimes involving selection intensities between 10% and 50%.

A feature of the study has been the observation that many important facets of the response pattern are essentially a function of the parameter combination z = Nig. The total response realized under a given selection regime, the 0.50 life and 0.95 life of the selection process, and a number of aspects of the behaviour of populations approaching the selection limit (Figures 2, 3, 4), all fall in this category. Because of this essential simplicity of the genetic model and pattern of response, it is our view that comparisons with the results of laboratory experiments are likely to prove particularly fruitful.
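For comparison with an experiment, z can be computed directly from the breeding plan. The sketch below assumes that the column P of Table 4 is the proportion of individuals selected and that i is the standardized selection differential for truncation of a normal distribution in a large population, i = φ(x)/P, where x is the truncation point; both are assumptions made here, although they reproduce the tabulated values of z for the regimes checked. The function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def selection_intensity(p_selected: float) -> float:
    """Standardized selection differential i = phi(x) / P for truncation
    selection of the upper fraction P of a normal distribution."""
    nd = NormalDist()
    x = nd.inv_cdf(1.0 - p_selected)  # truncation point in standard units
    return nd.pdf(x) / p_selected


def z_parameter(N: int, p_selected: float, g2: float) -> float:
    """z = N i g, where g2 is the proportion of the total variance due to the locus."""
    return N * selection_intensity(p_selected) * sqrt(g2)


if __name__ == "__main__":
    # Two regimes of Table 4: (N, P, g2) = (20, 0.20, 0.005) and (5, 0.10, 0.050).
    print(f"{z_parameter(20, 0.20, 0.005):.2f}")
    print(f"{z_parameter(5, 0.10, 0.050):.2f}")
```

Doubling N at a fixed proportion selected doubles z, whereas the same increase in z through g alone requires a fourfold increase in g².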

A large number of possible alleles is supposed to be present in the base population at the "locus" whose response to selection has been simulated, allelic effects being additive and normally distributed. However, the locus may equally well be taken to be a specified region of a chromosome, carrying a number of separate loci affecting the quantitative character under selection, within which effective recombination is suppressed. One of the most direct applications of the numerical results of this study may in fact be the determination of the number of effective units of segregation involved in response to selection in very small populations.

In selection experiments with Drosophila, for example, it may be known that many loci are contributing to variation in the selected trait, and assays may have been carried out to indicate the relative importance of each of the three major chromosomes. The pattern of response observed in replicate populations of small breeding size can then readily be compared with expectation based on the numerical results of this and similar studies. The extent to which selection has involved the manipulation of whole chromosomes, without appreciable recombination of alleles at the individual loci concerned, can then be deduced from the measure of agreement between experimentation and simulation. An understanding of the fate of such major units of segregation, particularly in the approach to the selection limit, is of course essential for the design of breeding plans concerned with the release of potential genetic variation via recombination.

Quite apart from the possibilities of direct application, the results of this study provide a basis for the future elaboration of realistic models of quantitative genetic variation. Such models will almost certainly have to be described in terms of sampling distributions of gene effects and gene frequencies, and must allow for the possibility of multiallelic variation, dominance, epistasis and linkage. Models assuming only two alleles per locus obviously cannot at this stage be considered suitable for populations other than those derived from a cross between two inbred lines. Some aspects of this problem of the construction of mathematically simple yet realistic genetic models have been discussed elsewhere (LATTER 1969).

SUMMARY

A study has been made of the effects of directional artificial selection on independently segregating multiallelic loci with normally distributed allelic effects, assuming additive gene action, and a base population in equilibrium under random mating. Initial allelic frequencies have been taken to be such that a random sample of T individuals from the base population could be considered to carry 2T distinct alleles at the locus concerned. Computer methods have been used to simulate truncation selection based on individual performance, N parents being chosen each generation. Over a wide range of parameter combinations chosen for their relevance to laboratory selection experiments, the total advance under selection and the time scale of the selection process have been shown to be essentially a function of z = Nig, where i is the standardized selection differential, and g² measures the relative contribution of the locus to the total variance in the base population. A number of aspects of the behaviour of populations approaching the selection limit have also been shown to be related to the value of z, within the range of parameter combinations tested. The implications of the study for the interpretation of selection experiments with laboratory species are briefly discussed.


LITERATURE CITED

DEMPSTER, E. R., 1955 Genetic models in relation to animal breeding problems. Biometrics 11: 535-536.

FALCONER, D. S., 1960 Introduction to Quantitative Genetics. Ronald, New York.

FRASER, A. S., 1957 Simulation of genetic systems by automatic digital computers. I. Introduction. Australian J. Biol. Sci. 10: 484-491.

GRIFFING, B., 1960 Theoretical consequences of truncation selection based on the individual phenotype. Australian J. Biol. Sci. 13: 307-343.

HILL, W. G., and A. ROBERTSON, 1966 The effect of linkage on limits to artificial selection. Genet. Res. 8: 269-294.

KIMURA, M., 1955 Random genetic drift in multi-allelic locus. Evolution 9: 419-435.

KIMURA, M., 1957 Some problems of stochastic processes in genetics. Ann. Math. Statist. 28: 882-901.

KIMURA, M., 1958 On the change of population fitness by natural selection. Heredity 12: 145-167.

KIMURA, M., 1965 A stochastic model concerning the maintenance of genetic variability in quantitative characters. Proc. Natl. Acad. Sci. U.S. 54: 731-736.

LATTER, B. D. H., 1964 Selection for a threshold character in Drosophila. I. An analysis of the phenotypic variance on the underlying scale. Genet. Res. 5: 198-210.

LATTER, B. D. H., 1965a The response to artificial selection due to autosomal genes of large effect. I. Changes in gene frequency at an additive locus. Australian J. Biol. Sci. 18: 585-598.

LATTER, B. D. H., 1965b The response to artificial selection due to autosomal genes of large effect. II. The effects of linkage on limits to selection in finite populations. Australian J. Biol. Sci. 18: 1009-1023.

LATTER, B. D. H., 1966a The response to artificial selection due to autosomal genes of large effect. III. The effects of linkage on the rate of advance and approach to fixation in finite populations. Australian J. Biol. Sci. 19: 131-146.

LATTER, B. D. H., 1966b The interaction between effective population size and linkage intensity under artificial selection. Genet. Res. 7: 313-323.

LATTER, B. D. H., 1966c Selection for a threshold character in Drosophila. II. Homeostatic behaviour on relaxation of selection. Genet. Res. 8: 205-218.

LATTER, B. D. H., 1969 Models of quantitative genetic variation and computer simulation of selection response. Proc. Intern. Confer. Computer Application Genetics. Univ. of Hawaii Press. (in press).

MULLER, M. E., 1959 A comparison of methods for generating normal deviates on digital computers. J. Assoc. Computing Machinery 6: 376-383.

ROBERTSON, A., 1960 A theory of limits in artificial selection. Proc. Roy. Soc. London B 153: 234-249.

ROBERTSON, A., 1966 Artificial selection in plants and animals. Proc. Roy. Soc. London B 164: 341-349.

TOCHER, K. D., 1963 The Art of Simulation. English University Press, London.