Discussions (optional)
i. Wednesday 3:30-4:20 p.m., Noland 342
ii. Friday 1:20-2:10 p.m., Noland 539
Examples: Adaptation or not?
• After high altitude training athletes have increased number of red blood cells (RBC)
• Tibetans and Sherpas have higher RBC than lowland (<2000 m) people (Yi et al. 2010, Science 329:75-78)
Examples: Adaptation or not?
• Weeds from a cornfield have been found to grow taller than those from soybean fields when both populations are reared in common-garden conditions
• Taller weeds from the cornfields survive at a greater rate and leave more offspring
WHAT IS EPIGENETICS?
• Epigenetics – changes in gene regulation that do not involve a change in DNA sequence
• Epigenetic changes can be INHERITED!!
Common mechanisms include, but are not limited to:
-DNA methylation
-Histone modifications: (De)Acetylation, (De)Methylation, Ubiquitination, Phosphorylation
-Regulatory non-coding RNAs
EPIGENETICS
Need Genetic variation upon which selection could act
This variation could occur at many hierarchical levels: At different structural levels
And at different steps leading to protein expression
Mechanisms of Adaptation
OUTLINE
• The origin of genetic variation
• Examples of structural and regulatory change by mutations
• Detection of selection (adaptation)
Sources of Variation
• Point Mutations: nucleotide substitutions
• Insertions or Deletions: insertions or deletions of nucleotides; gene duplications (insertions) or loss
• Chromosomal Duplications
• Whole Genome Duplications
Where does the polymorphism (genetic variation) come from?
• Mutations: change in genetic code
• Recombination (sex):
  -Intragenic recombination
  -Gene conversion
  -Unequal crossing over – gene duplication
• Changes by Transposable Elements
Mutations
Any change in the genetic code, including errors in DNA replication or errors in DNA repair
Mutations
Mutations that matter, in an evolutionary sense, are those that get passed on to the next generation, i.e., those that occur in the cells that produce gametes (the “germ line”)
RATE OF MUTATIONS
In most species, mutation rate is LOW
Mutation rate
Columns: (1) base pairs in haploid genome; (2) base pairs in effective genome; (3) mutation rate per base pair per replication; (4) mutation rate per haploid genome per replication; (5) mutation rate per effective genome per replication; (6) mutation rate per effective genome per sexual generation

Organism          (1)        (2)        (3)          (4)      (5)      (6)
T2, T4 phage      1.7×10^5   -          2.4×10^-8    0.0041
E. coli           4.6×10^6   -          5.4×10^-10   0.0025
S. cerevisiae     1.2×10^7   -          2.2×10^-10   0.0026
C. elegans        8.0×10^7   1.8×10^7   2.3×10^-10   0.0184   0.0041   0.036
D. melanogaster   1.7×10^8   1.6×10^7   3.4×10^-10   0.0578   0.0054   0.140
Mouse             2.7×10^9   8.0×10^7   1.8×10^-10   0.4860   0.0144   0.900
Human             3.2×10^9   8.0×10^7   5.0×10^-11   0.1600   0.0040   1.600
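The per-genome columns of the mutation rate table appear to be the per-base-pair rate multiplied by genome size. The short sketch below recomputes them from the genome sizes and per-base-pair rates listed in the table; the dictionary and function names are illustrative, not from the lecture.

```python
# Genome sizes (bp) and per-base-pair rates are copied from the mutation rate table.
# Each entry: (haploid genome size, effective genome size or None, rate per bp per replication)
ORGANISMS = {
    "T2, T4 phage": (1.7e5, None, 2.4e-8),
    "E. coli": (4.6e6, None, 5.4e-10),
    "S. cerevisiae": (1.2e7, None, 2.2e-10),
    "C. elegans": (8.0e7, 1.8e7, 2.3e-10),
    "D. melanogaster": (1.7e8, 1.6e7, 3.4e-10),
    "Mouse": (2.7e9, 8.0e7, 1.8e-10),
    "Human": (3.2e9, 8.0e7, 5.0e-11),
}


def per_genome_rate(genome_size_bp, rate_per_bp):
    """Expected mutations per genome per replication = genome size x per-bp rate."""
    return genome_size_bp * rate_per_bp


for name, (haploid_bp, effective_bp, rate) in ORGANISMS.items():
    line = f"{name}: {per_genome_rate(haploid_bp, rate):.4f} per haploid genome"
    if effective_bp is not None:
        line += f", {per_genome_rate(effective_bp, rate):.4f} per effective genome"
    print(line)
```

The last column (per sexual generation) is not recomputed here, because it also depends on the number of germ-line replications per generation, which the table does not list.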
Mutations: Double-Edged Sword
Most mutations are ‘neutral’ with no effect on fitness
Most mutations that arise within functional genes are harmful
Mildly deleterious mutations persist longer in a population because it takes longer to select them out
Recessive mutations remain longer because they are eliminated when homozygous, not when heterozygous
Selection for favorable mutations leads to adaptation.
Gene Duplications
Duplication of genes due to DNA replication error or recombination error (unequal crossing over)
Lynch and Conery 2000
• 0.01 duplications per gene per million years
• Half-life of a duplicated gene is 3-8 million years
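To get a feel for what a half-life of 3-8 million years means, here is a minimal sketch that assumes duplicates are lost at a constant rate (simple exponential decay). That decay model and the function name are assumptions for illustration; the slide gives only the half-life range.

```python
def fraction_remaining(age_my, half_life_my):
    """Fraction of duplicates still present after `age_my` million years,
    assuming exponential loss with the given half-life (in million years)."""
    return 0.5 ** (age_my / half_life_my)


# Compare the two ends of the half-life range quoted on the slide.
for half_life in (3.0, 8.0):
    print(f"half-life {half_life} My: "
          f"{fraction_remaining(10.0, half_life):.3f} of duplicates remain after 10 My")
```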
Gene Duplications
• Duplicate genes in Eukaryotes are continuously created, tested, and discarded
• Duplicated genes either degenerate into pseudogenes (no function), become new genes, or subfunctionalize with an existing gene
Examples: Gene Families resulting from gene duplications
• Olfactory receptors
• Steroid hormone receptors
• Heat shock proteins
• Ion uptake enzymes
• Hemoglobins
• Opsins
• Melanins
• Detoxification enzymes (cytochrome P450s)
• Hox genes
STRUCTURAL
• Primary: Amino Acid composition (Amino Acid substitutions)
• Secondary, Tertiary, Quaternary structure
REGULATORY
• Protein expression (transcription, RNA processing, translation, etc)
• Protein activity (allosteric control, conformational changes)
Hierarchical processes that are affected by Mutations
REGULATORY
Protein expression
• Transcription: Mutations at promoters, enhancers (CIS), transcription factors (TRANS), etc
• RNA Processing: Mutations at splice sites, sites of polyadenylation, sites controlling RNA export
• Translation: Mutations in ribosomes, regulatory regions, etc
Protein activity (allosteric control, conformational changes)
Hierarchical processes that are affected by Mutations
Once these mutations have occurred creating genetic variation, selection could act on genes, gene expression, and on genetic architecture (allelic and gene interactions)
OUTLINE
• The origin of genetic variation
• Examples of structural and regulatory changes by mutations
• Detection of selection (adaptation)
Protein function
STRUCTURE
• Amino acid composition (AA substitutions)
• Secondary, Tertiary, Quaternary structure
REGULATORY
• Protein expression (transcription, translation, etc)
• Protein activity (allosteric control, conformational changes, receptors)
Fundulus heteroclitus
Populations in Maine and Georgia have different proportions of alleles (isozymes) at LDH-B
Difference in alleles (isozymes) in North vs South
North: LDH-B b allele (cold-adapted)
South: LDH-B a allele (warm-adapted)
The two alleles have a difference of 2 amino acids
Place and Powers, PNAS 1979
1° latitude change = 1°C change in mean water temperature
[Figure legend: b allele homozygote, a allele homozygote (Place and Powers, 1979)]
Catalytic efficiency (kcat/Km) is higher for the b allele at low temperature, and higher for the a allele at higher temperature
Place and Powers, 1979
• The two allele products (the enzymes) show genetic differences in catalytic efficiency (adaptive differences)
• They also show genotype by environment interaction: they differ in their optimal environments (differences in plasticity)
Protein function
STRUCTURAL
• Amino acid composition (AA substitutions)
• Secondary, Tertiary, Quaternary structure
REGULATORY
• Protein expression (transcription, translation, etc)
• Protein activity (allosteric control, conformational changes, receptors)
Common Garden Experiment: The Northern isozyme has BOTH higher activity and a higher level of expression in fish held at constant lab conditions (20°C)
[Figure: LDH-B activity, protein, and mRNA (Crawford and Powers, 1989)]
Higher Gene Expression of LDH-B in the Northern Maine population
[Figure: populations from Maine, Florida, Georgia, and New Jersey (Schulte et al. 2000)]
Transcriptional control
• What controls differences in gene expression of LDH in F. heteroclitus?
• Mutations within Promoter or Enhancer?
Doug Crawford: Promoter
Patricia Schulte: Enhancer
Gene expression
• Transcription
Cis-regulation (at or near the gene)
Examples:
– RNA polymerase and promoter
– Enhancers
Trans-regulation (somewhere else in the genome)
Examples:
– Gene regulatory proteins (transcription factors)
TEMPERATURE ADAPTATION in F. heteroclitus
• Cis-acting sequence ~ 500 bp upstream of the start site of transcription of LDH-B
• S-population - a 7-bp site identical to a mouse mammary tumor virus glucocorticoid responsive element (MTV-GRE) repressor
• N-population - this site differs from S population sequence by 1 bp and does not repress expression of LDH gene
• MTV-GRE repressor inhibits transcription in the absence of stress hormones.
• When stress hormone levels are high, the repression is removed and transcription increases
• The putative element within the F. heteroclitus LDH-B gene might behave in a similar way.
Transgenic Fish
Regulatory sequence (an enhancer) was injected into Northern and Southern Fish
An enhancer, located within a 500 base pair sequence, significantly increased gene expression of LDH
[Figure: LDH expression, control (GRE absent) vs. GRE present, shown in two panels (Schulte et al. 2000)]
Protein function
STRUCTURE
• Amino acid composition (AA substitutions)
• Secondary, Tertiary, Quaternary structure
REGULATORY
• Protein expression (transcription, translation, etc)
• Protein activity (allosteric control, conformational changes)
Gillichthys seta
High rocky intertidal
Gulf of California
5-41 °C
Gillichthys mirabilis
Sloughs and estuaries
Gulf of California to Tomales Bay (38.16°N)
9-30 °C
Fields and Somero, 1997, Fields et al. 2002
• A4-LDHs from Gillichthys seta and G. mirabilis have identical amino acid sequences (no structural differences)
• But show potentially adaptive differences in substrate affinity for Pyruvate (Km) and thermal stability
[Figure: pyruvate Km (mmol/l) vs. temperature (°C)]
G. seta more tolerant of a broad temperature range; LDH less sensitive to temperature
OUTLINE
• The origin of genetic variation
• Examples of structural and regulatory change by mutations
• Detection of selection (adaptation)
Neutral Theory
Kimura (1968, 1983)
Kimura argued that the great majority of evolutionary changes at the molecular level are not caused by selection but by random genetic drift.
Motoo Kimura (1924-1994)
Ph.D. University of Wisconsin in 1956
Under James Crow
Neutral Theory: Evidence
Molecular evolution takes place at a relatively constant rate, simply through random genetic drift, enough to provide a “molecular clock” of evolution.
Selection-Neutral Debate
• Kimura’s work spawned a heated debate on the relative importance of neutral evolution (genetic drift) versus genetic variation that is a result of natural selection.
• Probability of fixation of neutral mutation: $\frac{1}{2N_e}$
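The fixation probability connects to the molecular clock idea through a standard textbook argument: in a diploid population, about 2N new neutral mutations times μ arise per generation, and each fixes with probability 1/(2N), so the population size cancels and the substitution rate equals the mutation rate. The sketch below illustrates that cancellation; it assumes the same N in both terms, and the function names are illustrative.

```python
import math


def neutral_fixation_probability(n_e):
    """Probability that a single new neutral mutation eventually fixes (diploid)."""
    return 1.0 / (2.0 * n_e)


def neutral_substitution_rate(mu, n_e):
    """Substitutions per site per generation under neutrality.

    New mutations per generation (2 * N * mu) times the fixation
    probability of each (1 / (2 * N)); N cancels, leaving mu.
    """
    new_mutations_per_generation = 2.0 * n_e * mu
    return new_mutations_per_generation * neutral_fixation_probability(n_e)


# The rate is the same for a small and a large population.
for n_e in (1_000, 1_000_000):
    print(n_e, neutral_substitution_rate(1e-8, n_e))
```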
Neutral Theory
• Now considered the “null model” against which evidence for selection should be tested
Detecting Natural Selection
There are many statistical tests for detecting Natural selection (Selective Sweeps)
The approach is to test for deviations from a null neutral model (where genetic variation arises only from genetic drift)
Null hypothesis: Neutral, no selection
Deviation from Neutral: selection
Inferences regarding selection provide a powerful tool for the prediction of possible disease-related genomic regions
Methods for Detecting Selection:
A. McDonald-Kreitman Type Tests
B. Site Frequency Spectrum Approaches
C. Linkage Disequilibrium (LD) and Haplotype Structure
D. Population Differentiation: Lewontin-Krakauer Methods
These tests could be applied to single genes, or across the whole genome.
Codon Bias in Amino Acid Substitutions
• Synonymous substitutions: Mutations that do not cause amino acid change (usually 3rd position); “silent substitutions”
• Nonsynonymous substitutions: Mutations that cause amino acid change (1st, 2nd position); “replacement substitutions”
(1) Ka/Ks Test
Ka/Ks = nonsynonymous substitutions (Ka) / synonymous substitutions (Ks)
• Need coding sequence (sequence that codes proteins)
• Ks is used here as the “control”, a proxy for neutral evolution, so Ka/Ks = 1 indicates neutral evolution
• A larger nonsynonymous substitution rate (Ka) than synonymous (Ks) is used as an indication of selection (Ka/Ks >1)
• Ka/Ks < 1 ?
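The Ka/Ks reading described in these bullets can be sketched as a small classifier. The `tolerance` band around 1 is a hypothetical convenience for this example; a real analysis would use a statistical test rather than a fixed cutoff.

```python
def ka_ks_ratio(ka, ks):
    """Ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    return ka / ks


def interpret_ka_ks(ratio, tolerance=0.05):
    """Rough reading of a Ka/Ks ratio; `tolerance` is an arbitrary illustrative band."""
    if ratio > 1 + tolerance:
        return "Ka/Ks > 1: candidate for positive selection"
    if ratio < 1 - tolerance:
        return "Ka/Ks < 1: fewer amino acid changes than the neutral expectation"
    return "Ka/Ks ~ 1: consistent with neutral evolution"


print(interpret_ka_ks(ka_ks_ratio(0.30, 0.10)))
print(interpret_ka_ks(ka_ks_ratio(0.02, 0.10)))
print(interpret_ka_ks(ka_ks_ratio(0.10, 0.10)))
```

A ratio below 1 is commonly read as purifying selection removing amino acid changes.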
A. McDonald-Kreitman Type Tests
(2) McDonald-Kreitman Test
Need coding sequence
Need two species to determine divergence (D)
Under neutral scenario we would expect:
$\frac{D_n \text{ (nonsynonymous substitutions)}}{D_s \text{ (synonymous substitutions)}} = \frac{P_n \text{ (nonsynonymous polymorphism)}}{P_s \text{ (synonymous polymorphism)}}$
• Dn/Ds > Pn/Ps indicates adaptive substitutions
MK test at the ADH locus in 3 Drosophila species
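                 Fixed difference   Polymorphic
Nonsynonymous    7                  2
Synonymous       17                 42

$\frac{F_n}{F_s} = \frac{7}{17} = 0.41 \qquad \frac{P_n}{P_s} = \frac{2}{42} = 0.05 \qquad \frac{F_n}{F_s} > \frac{P_n}{P_s}$
(F = fixed differences between species, P = polymorphisms within species)

68 sites of ADH locus in total compared
p<0.01
McDonald and Kreitman, 1991. Nature, 351:652-654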
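The ADH comparison can be checked numerically. The sketch below recomputes the two ratios from the counts on the slide (7, 17, 2, 42) and tests the 2x2 table with a hand-written two-sided Fisher's exact test. The choice of Fisher's exact test is an assumption made for this example; the slide reports only p<0.01 and does not say which test produced it.

```python
from math import comb


def mk_ratios(fn, fs, pn, ps):
    """Return (Fn/Fs, Pn/Ps) for a McDonald-Kreitman table."""
    return fn / fs, pn / ps


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = prob(a)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= observed * (1 + 1e-9))


# Rows: nonsynonymous, synonymous. Columns: fixed difference, polymorphic.
fixed_ratio, poly_ratio = mk_ratios(fn=7, fs=17, pn=2, ps=42)
p_value = fisher_exact_two_sided(7, 2, 17, 42)
print(round(fixed_ratio, 2), round(poly_ratio, 2), p_value < 0.01)
```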
B. Site Frequency Spectrum
• Selection affects the distribution of alleles within populations, typically reducing allele frequency
• Method examines site frequency spectrum and compares to neutral expectations
• Could be applied to a single locus. Now used often for genomic scans for selective sweeps
• Lactase gene (LCT) in humans, disease alleles
• Domestication alleles (corn, rice)
The frequency spectrum: an example
Count of the number of mutations in each frequency class

Site:             163  975  1972  2188  3529  4424  4961  5286  7019
Sequence 1        A    G    G     C     T     T     A     A     A
Sequence 2        A    T    G     C     T     C     G     A     A
Sequence 3        G    T    G     T     T     C     A     C     G
Sequence 4        A    G    G     C     T     C     A     A     G
Sequence 5        A    G    A     C     C     C     G     A     A
Frequency class:  1    2    1     1     1     4     2     1     3

[Figure: the frequency spectrum; count of sites (1-5) in each frequency class (1-4), with ancestral and derived alleles distinguished; site labels include singleton, doubleton, tripleton]
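The frequency-class counts in this example can be reproduced in a few lines. The five sequences and nine site positions come from the slide. The ancestral sequence is an assumption: it is not recoverable from the slide text and is chosen here so that the derived-allele counts match the slide's frequency classes.

```python
from collections import Counter

SITES = [163, 975, 1972, 2188, 3529, 4424, 4961, 5286, 7019]
SEQUENCES = [
    "AGGCTTAAA",
    "ATGCTCGAA",
    "GTGTTCACG",
    "AGGCTCAAG",
    "AGACCCGAA",
]
# Assumed ancestral alleles (not given on the slide), one per site.
ANCESTRAL = "AGGCTTAAG"


def derived_counts(sequences, ancestral):
    """Number of sequences carrying a non-ancestral allele at each site."""
    return [sum(seq[i] != anc for seq in sequences)
            for i, anc in enumerate(ancestral)]


def site_frequency_spectrum(counts):
    """Map each frequency class to the number of sites in that class."""
    return dict(sorted(Counter(counts).items()))


counts = derived_counts(SEQUENCES, ANCESTRAL)
sfs = site_frequency_spectrum(counts)
for site, count in zip(SITES, counts):
    print(site, count)
print(sfs)
```

Under this assumption there are five singletons, two doubletons, one tripleton, and one site where the derived allele is carried by four of the five sequences.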
Site Frequency Spectrum
Count of number of mutations
Tests:
Tajima’s D
Fu’s Fs
Fay and Wu’s H
[Figure: site frequency spectra (frequency, 0-0.6, vs. number of copies of derived allele, 1-19) for a selective sweep, positive selection (2Ns = 5), negative selection (2Ns = -5), and neutral (no selection, constant population size, no subdivision); annotations mark an excess of rare alleles and an excess of common alleles]
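As one concrete instance of these tests, here is a minimal sketch of Tajima's D using the standard formula from Tajima (1989). It compares the mean number of pairwise differences with the number of segregating sites: an excess of rare alleles gives a negative D, an excess of intermediate-frequency alleles a positive D. The input reuses the five example sequences from the frequency spectrum slide purely for illustration; five sequences are far too few for real inference.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for a list of equal-length aligned sequences."""
    n = len(sequences)
    length = len(sequences[0])
    # S: number of segregating (variable) sites.
    s = sum(len({seq[i] for seq in sequences}) > 1 for i in range(length))
    if s == 0:
        raise ValueError("Tajima's D is undefined without segregating sites")
    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


EXAMPLE = ["AGGCTTAAA", "ATGCTCGAA", "GTGTTCACG", "AGGCTCAAG", "AGACCCGAA"]
d_value = tajimas_d(EXAMPLE)
print(f"Tajima's D = {d_value:.3f}")
```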
C. Linkage Disequilibrium (LD)
• The nonrandom association of alleles from different loci, where they are found together more or less frequently than expected
• Selection increases levels of linkage disequilibrium around the selected locus while the favored allele is rising in frequency
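"More or less frequently than expected" can be made concrete with the usual LD coefficient D = p(AB) - p(A)p(B) and its normalized form r². The sketch below computes both from haplotype counts at two biallelic loci; the counts are hypothetical and only illustrate the calculation.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r_squared) from haplotype counts at two biallelic loci."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    # D: observed AB haplotype frequency minus the frequency expected
    # if alleles at the two loci were associated at random.
    d = p_AB - p_A * p_B
    r_squared = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, r_squared


# Hypothetical counts: alleles A and B co-occur more often than expected.
d_linked, r2_linked = linkage_disequilibrium(40, 10, 10, 40)
# Hypothetical counts: random association (linkage equilibrium).
d_random, r2_random = linkage_disequilibrium(25, 25, 25, 25)
print(d_linked, r2_linked)
print(d_random, r2_random)
```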
D. Population Differentiation: Lewontin-Krakauer Methods
• Selection would often increase the degree of genetic distance between populations
• Compute pairwise genetic distances (FST) for many loci between populations
• When a locus shows extraordinary levels of genetic distance relative to other loci, this locus is a candidate for positive selection
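The outlier logic in these bullets can be sketched as follows. FST is computed per locus from allele frequencies in two populations as (HT - HS)/HT, and loci far above the typical value are flagged. The allele frequencies are hypothetical, and the "five times the median" cutoff is an arbitrary illustrative rule; Lewontin-Krakauer style methods instead compare each locus against a null distribution.

```python
from statistics import median


def fst_two_populations(p1, p2):
    """FST at one biallelic locus for two equally weighted populations."""
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0:
        return 0.0  # locus is monomorphic in both populations
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies (population 1, population 2) per locus.
LOCI = {
    "locus_1": (0.50, 0.55),
    "locus_2": (0.30, 0.35),
    "locus_3": (0.60, 0.62),
    "locus_4": (0.10, 0.90),
    "locus_5": (0.45, 0.40),
}


def flag_outliers(fst_by_locus, multiple=5.0):
    """Flag loci whose FST exceeds `multiple` times the median FST."""
    cutoff = multiple * median(fst_by_locus.values())
    return [name for name, value in fst_by_locus.items() if value > cutoff]


fst_by_locus = {name: fst_two_populations(p1, p2) for name, (p1, p2) in LOCI.items()}
candidates = flag_outliers(fst_by_locus)
print(candidates)
```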
Estimates of adaptive evolution
% substitutions fixed by selection:
• ~50% in Drosophila
• ~56% E. coli, Salmonella
• ~75% env gene in HIV in a patient
• ~85% hemagglutinin gene in human influenza virus
• But only 0.08-6% in Humans
• Species with large effective population size (smaller organisms) evolve faster
-More mutations arise in the population because there are more individuals generating more mutations… more opportunity on which selection could act
-Faster generation time
Evolution of the gene encoding lactase (LCT) in humans (Tishkoff et al. 2007)
• Mutations in LCT are associated with the ability to digest milk in adults
• This ability is prevalent in North Africa and Europe, and is largely absent throughout the rest of the world
• The mutant C/T-13910 confers lactase persistence in Europeans. A study of 470 Tanzanians, Kenyans, and Sudanese found three mutants associated with lactase persistence (G/C-14010, T/G-13915, C/G-13907)
Evidence for Selection
• Evidence of a selective sweep; high frequency of C-14010 allele
• Extensive LD on chromosomes with the C-14010 allele, with haplotype homozygosity extending > 2 megabases (Mb)
Yum Kaax: Mayan god of corn
John Doebley http://www.wisc.edu/teosinte/index.htm
Evolution of Corn from Teosinte
Domesticated about 7000 yrs ago in Southern Mexico
Selection for changes in a few developmental genes
Morphological differences between teosinte and maize
• Maize with tb1 knocked out has branching patterns like teosinte
[Figure: maize and teosinte]
Major morphological differences are due to directional selection on 5 genes
Genes:
• Teosinte branched1 (tb1): single mutation affects branching and inflorescence
• Regulator of tb1
• tga: glume (outer coating) reduction, on chromosome X
• teosinte: ~8-12 kernels; F1 hybrid: 8 rows; corn: 20+ rows
Evidence for selection in 2-4% of genes, ~1200 genes
[Figure: Teosinte, Corn, and F1 Hybrid]
Genes selected for in Corn