the distribution of fitness effects of mutations in humans and flies
The Distribution of Fitness Effects of Mutations in Humans and Flies
Adam Eyre-Walker
(University of Sussex)
Types of Mutation
• Deleterious
• Neutral
• Advantageous
[Figure: axis of fitness effect running from -ve through 0 to +ve]
Deleterious Mutations
[Figure: scale marked at 1/100 and 1/10,000, annotated with <10% and <30%]
Mutation accumulation and mutagenesis expts
dn/ds in primates
Distribution of Effects
[Figure: distribution of effects; axis labels: neutral, deleterious, low, high]
Theory
Neutral sites (e.g. introns / synonymous)
$P_s = L_s \theta \sum_{i=1}^{n-1} \frac{1}{i}$
Selected sites (e.g. non-synonymous), assuming all mutations are neutral or deleterious
$P_n = L_n \theta \int\!\!\int D(S)\, \frac{1 - e^{-S(1-x)}}{x(1-x)(1-e^{-S})}\, \left(1 - x^n - (1-x)^n\right)\, dS\, dx$
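The neutral-site expectation, Ps = Ls θ Σ 1/i, is simple enough to evaluate directly. The following is a minimal Python sketch, assuming the symbol lost from the slide is the per-site population mutation rate θ; the function names and the numbers in the usage line are hypothetical and are not taken from the talk.

```python
def harmonic_sum(n_sequences):
    """Sum of 1/i for i = 1 .. n-1, the factor that multiplies L * theta."""
    return sum(1.0 / i for i in range(1, n_sequences))


def expected_neutral_polymorphisms(n_sequences, n_sites, theta):
    """Expected number of segregating neutral sites: L * theta * sum(1/i)."""
    return n_sites * theta * harmonic_sum(n_sequences)


def theta_from_count(n_sequences, n_sites, n_polymorphisms):
    """Invert the same relation to estimate theta per site from an observed count."""
    return n_polymorphisms / (n_sites * harmonic_sum(n_sequences))


# Hypothetical values, for illustration only.
expected = expected_neutral_polymorphisms(n_sequences=10, n_sites=1000, theta=0.001)
print(expected)
```

The same sum appears in the selected-site expression only implicitly, through the sampling term 1 - x^n - (1 - x)^n, which is why the selected case needs numerical integration rather than a closed form.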
Simplification
$f \propto \frac{1}{N_e^{\beta}}$
Theory
Neutral sites
$P_s = L_s \theta \sum_{i=1}^{n-1} \frac{1}{i}$
Selected sites
$P_n \approx L_n \theta \sum_{i=1}^{n-1} \frac{1}{i} \cdot \frac{a}{i^{\beta}}$
Parameters
n - known
Ln - each gene
Ls - each gene
θ - each gene
a - shared
β - shared
Estimation
assume free recombination
Bayesian estimation using MCMC
Dataset - humans
• Environmental genome project
• 275 human genes
• 90 individuals resequenced
• 549 non-synonymous polymorphisms
• 15746 intron polymorphisms
Pn/Pi versus i
[Figure: human data, Pn/Pi (y-axis) plotted against i (x-axis)]
Results - human
Nes    0-1    1-10   10-100   100-1000   1000-10000
%      23     22     37       19         0.1
Shape = 0.28
Nes = 240
Results - human
Nes    0-1    1-10   10-100   100-1000   1000-10000
       0.38   0      0        0          0.62
       0.23   0.22   0.37     0.19       0.001
       0.17   0.33   0.47     0.03       0.000
Shape = 0.28 (0.03, 0.48)
Nes = 240 (90, )
Low Frequency Polymorphisms
[Figure: proportion of polymorphisms in the low, medium and high frequency classes, syn versus non-syn]
Dataset - D. melanogaster
• 44 genes
• 5-55 alleles sequenced
• 141 non-synonymous polymorphisms
• 346 synonymous polymorphisms
Pn/Ps versus s
[Figure: D. melanogaster data, Pn/Ps (y-axis) plotted against s (x-axis)]
Shape = 0.46 (0.15, 0.65)
Adaptive Mutations
Human1  CCC GCA GAG TTA CTA ATC GAA
Human2  CCG GCA GAG TTA CTA ATC GAA
Human3  CCC GCA AAG TTA CTA ATC GAA
Human4  CCC GCA AAG TTA CTA ATC GAA
Chimp   CCC GCC GAG TTA GTA ATT GAA
        Poly   Sub
Syn     1      2
Non     1      1
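The counts in this small table can be reproduced from the alignment itself. The sketch below is a simplified illustration in Python, not the method used in the talk: it scores each codon once, as a polymorphism if the human sequences vary and otherwise as a substitution if humans differ from the chimpanzee, and calls the change synonymous when every variant encodes the same amino acid. The function name `mk_table` is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

HUMANS = [
    "CCC GCA GAG TTA CTA ATC GAA",
    "CCG GCA GAG TTA CTA ATC GAA",
    "CCC GCA AAG TTA CTA ATC GAA",
    "CCC GCA AAG TTA CTA ATC GAA",
]
CHIMP = "CCC GCC GAG TTA GTA ATT GAA"


def mk_table(ingroup, outgroup):
    """Count polymorphisms (Poly) and substitutions (Sub), split into Syn and Non.

    Simplifying assumption of this sketch: each codon is scored at most once.
    """
    counts = {(row, col): 0 for row in ("Syn", "Non") for col in ("Poly", "Sub")}
    ingroup_codons = [seq.split() for seq in ingroup]
    for pos, out_codon in enumerate(outgroup.split()):
        in_codons = {seq[pos] for seq in ingroup_codons}
        if len(in_codons) > 1:
            col, variants = "Poly", in_codons                # varies within the ingroup
        elif out_codon not in in_codons:
            col, variants = "Sub", in_codons | {out_codon}   # fixed difference
        else:
            continue                                         # identical everywhere
        amino_acids = {CODON_TABLE[codon] for codon in variants}
        row = "Syn" if len(amino_acids) == 1 else "Non"
        counts[(row, col)] += 1
    return counts


table = mk_table(HUMANS, CHIMP)
for row in ("Syn", "Non"):
    print(row, table[(row, "Poly")], table[(row, "Sub")])
```

On this alignment the sketch gives one synonymous and one non-synonymous polymorphism, two synonymous substitutions and one non-synonymous substitution, which agrees with the table on the slide.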
Model
        Poly          Sub
Syn     Ps ≈ 4Neu     Ds ≈ 2ut
Non     Pn ≈ 4Neuf    Dn ≈ 2utf + a
Assume
- synonymous mutations are neutral
- amino acid mutations are deleterious, neutral or advantageous
$a = D_n - D_s \frac{P_n}{P_s}$
$\alpha = 1 - \frac{D_s P_n}{D_n P_s}$
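The step from the model table to these two estimators can be written out. This is a sketch under the model's own assumptions, with f read as the fraction of amino acid mutations that are effectively neutral and a as the number of adaptive substitutions; treating the second quantity as the proportion of adaptive substitutions, a/Dn, is an interpretation of the slide rather than something it states.

$$
\frac{P_n}{P_s} \approx \frac{4N_e u f}{4N_e u} = f,
\qquad
a \approx D_n - 2utf \approx D_n - D_s \frac{P_n}{P_s},
\qquad
\alpha = \frac{a}{D_n} \approx 1 - \frac{D_s P_n}{D_n P_s}.
$$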
Estimation
Parameters
n, Ln, Ls - known without error
other model parameters - each gene
α - shared, beta distributed or one per gene
Estimation by ML
Drosophila
          Poly   Sub
Syn       707    2489
Non       153    1054
Non/Syn   0.22   0.42
35 genes with multiple alleles in D. simulans and one allele in D. yakuba
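As a quick check on the table, the two ratios and a pooled point estimate of the adaptive proportion can be computed from the four counts. This Python sketch sums counts across genes, which is an assumption made for illustration; the talk estimates the proportion by maximum likelihood across genes, so the pooled value is not expected to match the reported result. The function name `mk_summary` is hypothetical.

```python
def mk_summary(ps, ds, pn, dn):
    """Return (Pn/Ps, Dn/Ds, a, alpha) from pooled counts."""
    poly_ratio = pn / ps
    sub_ratio = dn / ds
    adaptive = dn - ds * pn / ps           # a = Dn - Ds * Pn / Ps
    alpha = 1.0 - (ds * pn) / (dn * ps)    # alpha = a / Dn
    return poly_ratio, sub_ratio, adaptive, alpha


# Counts from the Drosophila table.
poly_ratio, sub_ratio, adaptive, alpha = mk_summary(ps=707, ds=2489, pn=153, dn=1054)
print(f"Pn/Ps = {poly_ratio:.2f}, Dn/Ds = {sub_ratio:.2f}, pooled alpha = {alpha:.2f}")
# prints: Pn/Ps = 0.22, Dn/Ds = 0.42, pooled alpha = 0.49
```

The excess of Dn/Ds over Pn/Ps is the signal of adaptive substitution; how large the estimated proportion comes out depends on how genes are weighted and on how low-frequency polymorphisms are treated.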
Result
α = 0.26 (0.08, 0.41)
Proportion Constant
Gene     Amino Acid Div
Hsc70    0.0023
Adh      0.036
Est-6    0.20
Model              n     Log(L)
One                106   -327.5
Beta distributed   107   -327.5
One per gene       140   -302.9
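One way to weigh the three models against each other is to penalise log-likelihood by the number of parameters. The slide does not say which criterion was used, so the choice of AIC in this Python sketch is an assumption, and reading the column n as the number of fitted parameters is likewise an interpretation.

```python
def aic(n_parameters, log_likelihood):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_parameters - 2 * log_likelihood


# (n, Log(L)) for each model, taken from the table.
MODELS = {
    "One": (106, -327.5),
    "Beta distributed": (107, -327.5),
    "One per gene": (140, -302.9),
}

scores = {name: aic(k, log_l) for name, (k, log_l) in MODELS.items()}
best = min(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: AIC = {score:.1f}")
print("lowest AIC:", best)
```

Under this criterion the single shared proportion scores lowest, because the gain in log-likelihood from fitting one proportion per gene does not offset the 34 extra parameters.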
D. simulans & D. yakuba
600,000 aa differences
26% adaptive
160,000 adaptive
1 every 75 years
Human-Chimp
• Environmental Genome Project
• 232 human genes
• 90 individuals resequenced
• Non-synonymous versus intron
• Human sequence aligned against chimpanzee genome
Human Nuclear Genes
             Poly    Sub
Intron       17631   33223
Non          681     765
Non/Intron   0.039   0.023
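Applying the same estimator to these pooled counts, with introns standing in for the neutral class, gives a negative value. This is a worked calculation on summed counts, which is an assumption made for illustration; it is close to, but not the same as, the ML estimate the talk reports when no polymorphisms are excluded.

$$
\alpha \approx 1 - \frac{D_i P_n}{D_n P_i}
= 1 - \frac{33223 \times 681}{765 \times 17631}
\approx 1 - 1.68 = -0.68
$$

A negative estimate means there is proportionally more non-synonymous polymorphism than non-synonymous divergence, which is the pattern expected when slightly deleterious mutations segregate but rarely fix.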
Low Frequency Polymorphisms
[Figure: proportion of polymorphisms in the low, medium and high frequency classes, syn versus non-syn]
Dealing With Deleterious Mutations
• Use estimate of distribution of fitness effects from SNP data
• Assume adaptive and slightly deleterious mutations governed by one distribution
• Ignore low frequency variants
Excluding SNPs
Cutoff ML 95% CI
0% -0.62
5% 0.09 (-0.11, 0.26)
10% 0.26 (0.08, 0.41)
20% 0.31 (0.11, 0.52)
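The effect of a frequency cutoff can be illustrated with a small Python sketch. Everything here is hypothetical: the frequency lists and substitution counts are toy values with no biological meaning, the estimator is the pooled one rather than the ML method behind the table, and the function names are invented for the example.

```python
def pooled_alpha(ps, pn, ds, dn):
    """Pooled estimate: alpha = 1 - (Ds * Pn) / (Dn * Ps)."""
    if ps == 0 or dn == 0:
        raise ValueError("need at least one neutral polymorphism and one substitution")
    return 1.0 - (ds * pn) / (dn * ps)


def alpha_excluding_low_frequency(neutral_freqs, nonsyn_freqs, ds, dn, cutoff):
    """Drop polymorphisms whose sample frequency is below cutoff, then estimate alpha."""
    ps = sum(1 for freq in neutral_freqs if freq >= cutoff)
    pn = sum(1 for freq in nonsyn_freqs if freq >= cutoff)
    return pooled_alpha(ps, pn, ds, dn)


# Hypothetical toy data: non-synonymous variants are skewed towards low frequency.
NEUTRAL_FREQS = [0.02, 0.04, 0.08, 0.15, 0.30, 0.45, 0.60, 0.75]
NONSYN_FREQS = [0.01, 0.02, 0.03, 0.07, 0.25]
DS, DN = 16, 6

for cutoff in (0.0, 0.05, 0.10, 0.20):
    estimate = alpha_excluding_low_frequency(NEUTRAL_FREQS, NONSYN_FREQS, DS, DN, cutoff)
    print(f"cutoff {cutoff:.0%}: alpha = {estimate:.2f}")
```

Because slightly deleterious non-synonymous variants are concentrated at low frequency, removing that class lowers Pn more than it lowers the neutral count, and the estimate rises.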
Humans & Chimpanzees
1%
290,000 amino acid differences
25% adaptive
72,500 adaptive differences
1 every 165 years
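The arithmetic behind the headline rates can be checked in a few lines of Python, here for both species pairs in the talk. The total time is back-calculated from the quoted figures rather than stated on the slides, so it should be read as implied, not as a divergence date given by the speaker; the function name is hypothetical.

```python
def adaptive_summary(aa_differences, proportion_adaptive, years_per_substitution):
    """Return (number of adaptive differences, total years implied by the quoted rate)."""
    adaptive = aa_differences * proportion_adaptive
    implied_years = adaptive * years_per_substitution
    return adaptive, implied_years


# Figures quoted in the talk: amino acid differences, proportion adaptive, years per substitution.
CASES = [
    ("D. simulans / D. yakuba", 600_000, 0.26, 75),
    ("Human / chimpanzee", 290_000, 0.25, 165),
]

for label, differences, proportion, years in CASES:
    adaptive, implied_years = adaptive_summary(differences, proportion, years)
    print(f"{label}: {adaptive:,.0f} adaptive, about {implied_years / 1e6:.1f} million years in total")
```

For the flies the product is 156,000, which the slide rounds to 160,000. Both cases imply roughly 12 million years summed over the two lineages, so the lower per-year rate in humans reflects the smaller number of amino acid differences rather than a lower adaptive proportion.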
Conclusions
• Distribution of fitness effects of slightly/moderately deleterious mutations is highly leptokurtic in humans and Drosophila
• ~25% of amino acid substitutions are driven by positive selection in humans and Drosophila
• Proportion does not vary between genes
Thanks
Nick Smith
Nicolas Bierne
Gwenael Piganeau
Meg Woolfit