Population Genetics
Joe Felsenstein
GENOME 453, Autumn 2015
Population Genetics – p.1/73
Godfrey Harold Hardy (1877-1947) Wilhelm Weinberg (1862-1937)
Population Genetics – p.2/73
A Hardy-Weinberg calculation
5 AA 2 Aa 3 aa
0.50 0.20 0.30
Population Genetics – p.3/73
A Hardy-Weinberg calculation
5 AA 2 Aa 3 aa
0.50 0.20 0.30
0.50 + (1/2) 0.20 (1/2) 0.20 + 0.30
Population Genetics – p.4/73
A Hardy-Weinberg calculation
0.6 A
5 AA 2 Aa 3 aa
0.50 0.20 0.30
0.50 + (1/2) 0.20 (1/2) 0.20 + 0.30
0.6 A 0.4 a
0.4 a
Population Genetics – p.5/73
A Hardy-Weinberg calculation
0.36 AA 0.24 Aa
0.24 Aa 0.16 aa
0.6 A
5 AA 2 Aa 3 aa
0.50 0.20 0.30
0.50 + (1/2) 0.20 (1/2) 0.20 + 0.30
0.6 A 0.4 a
0.4 a
Population Genetics – p.6/73
A Hardy-Weinberg calculation
0.6 A
5 AA 2 Aa 3 aa
0.50 0.20 0.30
0.50 + (1/2) 0.20 (1/2) 0.20 + 0.30
0.6 A 0.4 a
0.4 a
Result:
0.36 AA
0.48 Aa0.16 aa
0.6 A
0.4 a
1/2
1/2
0.36 AA 0.24 Aa
0.24 Aa 0.16 aa
Population Genetics – p.7/73
Hardy-Weinberg mathematics
aaq 2Aapq
A
A a
a
Result:
AA Aa aa
P Q R
p
q
p qP + (1/2) Q (1/2) Q + R
p AA
q aa
p A
q a
2
22pq Aa
AAp2 Aapq
1/2
1/2
Population Genetics – p.8/73
Calculating the gene frequency (two ways)
Suppose that we have 200 individuals: 83 AA, 62 Aa, 55 aa
Method 1. Calculate what fraction of gametes bear A:
AA
Aa
aa
Genotype Number
83
62
55
Genotype frequency
0.415
0.31
0.275
Fraction of gametes
0.57
0.431/2
A
a
all
1/2
all
Population Genetics – p.9/73
Calculating the gene frequency (two ways)
AA
Aa
aa
Genotype Number
83
62
55
0.57
0.43
A
a
A’s a’s
0
1100
166
62 62
228 172 = 400
228
400=
400
172=
Method 2. Calculate what fraction of genes in the parents are A:
Suppose that we have 200 individuals: 83 AA, 62 Aa, 55 aa
+
Population Genetics – p.10/73
A numerical example of natural selection
Genotypes: AA Aa aaRelative fitnesses 1 1 0.7 (viabilities)
Initial gene frequency of A = 0.2(newborns) 0.04 0.32 0.64
×1 ×1 ×0.7Survivors: 0.04 + 0.32 + 0.448 = 0.808
Genotype frequencies among the survivors (divide by the total)0.0495 0.396 0.554
gene frequencyA : 0.0495 + 0.5 × 0.396 = 0.2475a : 0.554 + 0.5 × 0.396 = 0.7525
genotype frequencies (among newborns)0.0613 0.3725 0.5663
Population Genetics – p.11/73
The algebra of natural selection
Genotype AA Aa aa
Frequency p2 2pq q2
Relative fitnesses wAA wAa waa
After selection p2 wAA 2pq wAa q2 waa
Sum = p2 wAA + 2pq wAa + q2 waa
Survivors: p2 wAA
Sum2pq wAa
Sumq2 waa
Sum
p′ = p2 wAA + pq wAa
p2 wAA + 2pq wAa + q2 waa
p′ = p(p wAA + q wAa)p2 wAA + 2pq wAa + q2 waa
p′ = p w̄A
w̄
Population Genetics – p.12/73
Is weak selection effective?
Suppose (relative) fitnesses are:
AA Aa aa
(1+s)2 1+s 1 So in this example each change of
a to A multiplies the fitnessby (1+s), so that it increases itby a fraction s.
0.01 − 0.1 0.1 − 0.5 0.5 − 0.9 0.9 − 0.99s
The time for gene frequency change, in generations, turns out to be:
change of gene frequencies
1
0.1
0.01
0.001
3.46 3.17 3.17 3.46
25.16 23.05 23.05 25.16
240.99 220.82 220.82 240.99
2399.09 2198.02 2198.02 2399.09
Age
ne fr
eque
ncy
of
generations
0
1
0.5
(1+s) (1+s)xx
Population Genetics – p.13/73
An experimental selection curve
Population Genetics – p.14/73
Rare alleles occur mostly in heterozygotes
Genotype frequencies:
0.81 AA : 0.18 Aa : 0.01 aa
Note that of the 20 copies of a,
18 of them, or 18 / 20 = 0.9 of them are in Aa genotypes
This shows a population in Hardy−Weinberg equilibrium
at gene frequencies of 0.9 A : 0.1 a
Population Genetics – p.15/73
Overdominance and polymorphism
AA Aa aa1 − s 1 1 − t
when A is rare, most A’s are in Aa, and most a’s are in aa
The average fitness of A−bearing genotypes is then nearly 1
The average fitness of a−bearing genotypes is then nearly 1−t
So A will increase in frequency when rare
when a is rare, most a’s are in Aa, and most A’s are in AA
The average fitness of a−bearing genotypes is then nearly 1
The average fitness of A−bearing genotypes is then nearly 1−s
So a will increase in frequency when rare
0 1gene frequency of A
Population Genetics – p.16/73
Underdominance and unstable equilibrium
when A is rare, most A’s are in Aa, and most a’s are in aa
The average fitness of A−bearing genotypes is then nearly 1
when a is rare, most a’s are in Aa, and most A’s are in AA
The average fitness of a−bearing genotypes is then nearly 1
1gene frequency of A
AA Aa aa11+s 1+t
The average fitness of a−bearing genotypes is then nearly 1+t
So A will decrease in frequency when rare
The average fitness of A−bearing genotypes is then nearly 1+s
So a will decrease in frequency when rare
0 Population Genetics – p.17/73
Fitness surfaces (adaptive landscapes)
Overdominance
0 1p0 1p
stable equilibrium
(gene frequency changes)
Underdominance
w__
0 1p
(gene frequency changes)
unstable equilibrium
Is all for the best in this best of all possible worlds?
w__
Population Genetics – p.18/73
A case to consider: two interacting adaptations
Bb BB
aaAaAA
bbBB Bb bb
AA 0.8 0.9 1.0Aa 0.9 1.0 0.9aa 1.0 0.9 0.8
Population Genetics – p.19/73
A fitness surface (in a haploid case)
AB 1.1
Ab 0.9
aB 0.9
ab 1
1.08
1.06
1.04
0.00
0.20
0.40
0.60
0.80
1.00
0.92
0.94
0.96
0.98
0.98
1.00
1.02
0.96
0.94
0.92
0.00 0.20 0.40 0.60 0.80 1.00
Gene Frequency of A
Gen
e F
requ
ency
of B
Population Genetics – p.20/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.21/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.22/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.23/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.24/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.25/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.26/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.27/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.28/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.29/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.30/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.31/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.32/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.33/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.34/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.35/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.36/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.37/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.38/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.39/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.40/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.41/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.42/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.43/73
Genetic drift
Time (generations)
0 1 2 3 4 5 6 7 8 9 10 11
Gen
e fr
eque
ncy
0
1
Population Genetics – p.44/73
Distribution of gene frequencies with drift
0 1
0 1
0 1
0 1
0 1
time
Note that although the individual populations
when we have infinitely many populations)
wander, their average hardly moves (not at all
Population Genetics – p.45/73
Genetic drift in some small populations
from Hartl and Clark,Principles of Population Genetics
Population Genetics – p.46/73
They’re real (lab) populations of Drosophila
from Barton et al., EvolutionPeter Buri. 1956. Gene frequency in small populations of mutantDrosophila. Evolution 10 (4): 367-402.
Population Genetics – p.47/73
Some copies happen to have more descendants
Time
Population Genetics – p.48/73
Some copies happen to have more descendants
Time
Population Genetics – p.49/73
Some copies happen to have more descendants
Time
Population Genetics – p.50/73
Some copies happen to have more descendants
Time
Population Genetics – p.51/73
Some copies happen to have more descendants
Time
Population Genetics – p.52/73
Some copies happen to have more descendants
Time
Population Genetics – p.53/73
Some copies happen to have more descendants
Time
Population Genetics – p.54/73
Some copies happen to have more descendants
Time
Population Genetics – p.55/73
Some copies happen to have more descendants
Time
Population Genetics – p.56/73
Some copies happen to have more descendants
Time
Population Genetics – p.57/73
Some copies happen to have more descendants
Time
Population Genetics – p.58/73
Averaging of gene frequencies when populations admix
Population 1 Population 2
0.2
0.8
0.7
0.3
0.60 from Pop. 1 0.40 from Pop. 2
makes
0.60 x 0.8+ 0.40 x 0.3
which is
0.60
Population Genetics – p.59/73
A cline (name by Julian Huxley)
no migrationsome
more
geographic position
gene
freq
uenc
y
0
1
S N
Population Genetics – p.60/73
Achillea
Achillea lanulosa, the common yarrow or Queen-Anne’s Lace.
Population Genetics – p.61/73
A famous common-garden experiment
Clausen, Keck and Hiesey’s (1949) common-garden experiment inAchillea lanulosa
Population Genetics – p.62/73
Zinc tolerance in a grass species
Sweet vernal grass Anthoxanthum odoratum.
Population Genetics – p.63/73
Heavy metal
Jain and Harper’s 1966 study on zinc tolerance in the grass Anthoxanthumodoratum at an abandoned zinc mine in Wales.
Population Genetics – p.64/73
Johnson and Selander’s studies on house sparrows
male female
Incredibly common, introduced to all of North America by lunatics startingin 1852 in New York City.
Population Genetics – p.65/73
Geographic differentiation of house sparrows
Body weight as a function of temperature
Population Genetics – p.66/73
House sparrows
Color of breast feathers in four locations (reflectance at different wavelengths
Population Genetics – p.67/73
Mutation RatesCoat color mutants in mice. From
Schlager G. and M. M. Dickie. 1967. Spontaneous mutations andmutation rates in the house mouse. Genetics 57: 319-330
Locus Gametes tested No. of Mutations Rate
Nonagouti 67,395 3 4.4 × 10−6
Brown 919,619 3 3.3 × 10−6
Albino 150,391 5 33.2 × 10−6
Dilute 839,447 10 11.9 × 10−6
Leaden 243,444 4 16.4 × 10−6
——- — ————-Total 2,220,376 25 11.2 × 10−6
Population Genetics – p.68/73
Mutation rates in humans
Population Genetics – p.69/73
Forward vs. back mutationsWhy mutants inactivating a functional gene will be morefrequent than back mutations
The gene
12 places can mutate to nonfunctionality
only one place can mutate back to function
function can sometimes be restored by a "second site" mutation, too
Population Genetics – p.70/73
A sequence space
For sequences of length 1000, there are 3 × 1000 = 3000 “neighbors” onestep away in sequence space.
But there are 41000 sequences, which is about 10602 in all !
No two of them are more than 1000 steps apart. Hard to draw such aspace! How do we ever evolve? Wouldn’t it be impossible to find one ofthe tiny fraction of possible sequences that would be even marginallyfunctional?
The answer seems to be that the sequences are clustered. An example ofsuch clustering is the English language, as illustrated by a word game:
W O R DW O R EG O R EG O N EG E N E
But the word BCGHcannot be made into anEnglish word
There are also only a tiny fraction of all 456,976 four-letter words that areEnglish words But they are clustered, so that it is possible to “evolve” fromone to another through intermediates.
Population Genetics – p.71/73
Mutation as an evolutionary force
10−6
and mutation rate back is the same,
0
1
0
0.5
Mutation is critical in introducing new alleles but is very slow
in changing their frequencies
If we have two alleles A and a, and mutation rate from A to a is
500,000 generations
Population Genetics – p.72/73
Estimation of a human mutation rateBy an equilibrium calculation. Huntington’s disease. Dominant. Does notexpress itself until after age 40. 1/100, 000 of people of European ancestryhave the gene. Reduction in fitness maybe 2%.
If allele frequency is q, then 2q(1 − q) of everyone areheterozygotes. So q ≃ 0.000005.
0.02 of these die. So a fraction 0.02 of all copies of the mutant allelein the population are eliminated by natural selection eachgeneration.
So the fraction of all gene copies at that locus that are mutant allelesthat are eliminated is 0.000005 × 0.02 ≃ 10−7
If we are at equilibrium between mutation and selection, this is alsothe fraction of all gene copies at that locus that have a new mutation.
So the mutation rate is in that case 10−7
Similar calculations can be done with recessive alleles, but we mustremember that in their case each death (or reduction in fitness) kills twocopies of the mutant. Population Genetics – p.73/73