The Human Genome and Human Evolution: Y Chromosome
Dr Derakhshandeh, PhD
2
Outline
• Information from fossils and archaeology
• Neutral (or assumed-to-be-neutral) genetic markers
– Classical markers
– Y chromosome
• Genes under selection
– Balancing selection:
• Balancing selection can arise by the heterozygotes having a selective advantage, as in the case of sickle cell anemia
• It can also arise in cases where rare alleles have a selective advantage
– Positive selection
3
Why Y?
• "Adam" passed a copy of his Y chromosome to his sons
• The Y chromosome is paternally inherited
• The Y chromosome a father passes to his son is, in large measure, an unchanged copy of his own
4
5
• But small changes (called polymorphisms) do occur
• passed down from generation to generation
6
CHROMOSOME CHANGES
• indels
– insertions into or deletions of the DNA at particular locations on the chromosome
• YAP
– which stands for "Y chromosome Alu Polymorphism"
– Alu is a sequence of approximately 300 letters (base pairs) which has inserted itself into a particular region of the DNA
7
• Snips
– "single nucleotide polymorphisms" (SNPs)
– Stable indels and snips are relatively rare
– so infrequent that each is assumed to have occurred at any particular position in the genome only once in the course of human evolution
– Snips and stable Alus have been termed "unique event polymorphisms" (UEPs)
8
• microsatellites
– short sequences of nucleotides (such as GATA)
– repeated over and over again a variable number of times in tandem
– The specific number of repeats in a particular variant (or allele) usually remains unchanged from generation to generation
– but changes do sometimes occur and the number of repeats may increase or decrease
9
• increases or decreases in the number of repeats take place in single steps
• for instance from nine repeats to ten
• whether decreases in number are as common as increases has not been established
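The single-step behaviour described on this slide can be sketched as a small simulation. This is only an illustration under stated assumptions: the function name, the per-generation mutation rate of 0.01, and the equal chance of a gain or a loss are all hypothetical choices, not values from the lecture (which notes that the balance of gains and losses has not been established).

```python
import random


def simulate_repeat_counts(start_repeats, generations, mutation_rate, rng):
    """Follow one paternal lineage and record the repeat count each generation."""
    counts = [start_repeats]
    current = start_repeats
    for _ in range(generations):
        if rng.random() < mutation_rate:
            # Assumption: a gain and a loss of one repeat are equally likely.
            step = rng.choice([-1, 1])
            # A microsatellite allele keeps at least one repeat unit.
            current = max(1, current + step)
        counts.append(current)
    return counts


rng = random.Random(42)  # fixed seed so the run is repeatable
lineage = simulate_repeat_counts(
    start_repeats=9, generations=200, mutation_rate=0.01, rng=rng
)
print("start:", lineage[0], "end:", lineage[-1])
```

Every change in the recorded lineage is a single step (for instance from nine repeats to ten), and most generations show no change at all.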
10
• Changes in microsatellite length occur much more frequently than new UEPs arise (UEPs: snips and stable Alus, the "unique event polymorphisms")
• while we can reasonably assume that a UEP has arisen only once
• the number of repeat units in a microsatellite may have changed many times along a paternal lineage
11
The microsatellite data
• can facilitate the estimation of population divergence times
• which can then be compared (and contrasted) with estimated mutational ages of the polymorphic markers
• the combination of these two kinds of data:
– offers a powerful tool with which to assess patterns of migration, admixture, and ancestry
12
• minisatellites
– 10-60 base pairs long
– the number of repeats often extends to several dozen
– Changes during the copying process take place more frequently in minisatellites than in microsatellites
13
the evolutionary clock
– the UEPs as the hour hand
– the microsatellite polymorphisms as the minute hand
– the minisatellites as a sweep second hand
14
a further benefit of using “Y chromosome” to study evolution
• most of the Y chromosome does not exchange DNA with a partner
• all the markers are joined one to another along its entire length
• linkage of markers
15
The human Y chromosome
• can also be used to draw evolutionary trees
• the relationships of the Y chromosomes of other primates
• The different polymorphic loci are distinguished from each other by their chain lengths
• chain length can be measured using an automatic DNA sequencer
16
Gene scan output of microsatellite DNA analysis from a single individual
The microsatellite peaks are sorted by size, the different colors representing different microsatellites. The small red peaks are size markers.
17
A new UEP arises in a certain man
• The new UEP is copied from generation to generation
• The UEP itself does not change, but the microsatellites linked to it do, albeit not very often:
– increasing
– or decreasing in length
• The longer the time since the UEP arose
– the greater will be the number of different microsatellite alleles found alongside it
18
• Such a process:
– differentiates one population from another
– the more closely two populations display common haplotype frequencies
– the more closely related is their biological history likely to be
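One way to make "display common haplotype frequencies" concrete is to measure how much haplotype frequency two populations share. The sketch below is an assumed, simplified measure (the sum of the smaller frequency for each haplotype), and the haplotype labels and samples are hypothetical, not data from the lecture.

```python
from collections import Counter


def haplotype_frequencies(haplotypes):
    """Turn a list of observed haplotype labels into relative frequencies."""
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()}


def shared_frequency(pop_a, pop_b):
    """Sum, over all haplotypes, of the smaller of the two population frequencies.

    1.0 means identical frequency profiles, 0.0 means no haplotype in common.
    """
    freq_a = haplotype_frequencies(pop_a)
    freq_b = haplotype_frequencies(pop_b)
    all_haps = set(freq_a) | set(freq_b)
    return sum(min(freq_a.get(h, 0.0), freq_b.get(h, 0.0)) for h in all_haps)


# Hypothetical samples of four Y chromosomes per population.
pop_x = ["H1", "H1", "H2", "H3"]
pop_y = ["H1", "H2", "H2", "H3"]
pop_z = ["H4", "H4", "H1", "H5"]

print("x vs y:", shared_frequency(pop_x, pop_y))
print("x vs z:", shared_frequency(pop_x, pop_z))
```

Under this measure populations x and y share more of their haplotype frequencies than x and z do, which on the slide's argument would point to a closer biological history.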
19
IN ANCIENT TIMES
• only the analysis of DNA obtained from our contemporaries
• suggested ways in which we might deduce past history from an interpretation of those data:
– DNA can be extracted from ancient remains
20
Amelogenin gene
• exists in two forms:
– the one on the X chromosome being different in length from the one on Y
• Small portions of:
– cranial bones
– and teeth
• were crushed to powder and decalcified
21
The amelogenin gene
• is a single copy gene
• homologues of which are located on:
– Xp22.1-Xp22.3
– and Yp11.2
22
Yp11.2
23
• DNA was purified
• copied by PCR using primers flanking the region
• the size of the products was measured by agarose gel electrophoresis
• Since Y chromosomes yield fragments 218 base pairs long
• while X chromosome products contain 330 base pairs
• they should be clearly distinguishable:
– if the specimen yields the shorter fragment, it must come from a Y chromosome and thus from a male.
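The decision rule on this slide can be written out as a short sketch. The 218 bp and 330 bp sizes come from the slide; the function name, the 10 bp sizing tolerance and the "undetermined" outcome are assumptions added for illustration.

```python
X_FRAGMENT_BP = 330  # X-chromosome amelogenin product (from the slide)
Y_FRAGMENT_BP = 218  # Y-chromosome amelogenin product (from the slide)


def infer_sex(band_sizes_bp, tolerance_bp=10):
    """Classify a specimen from the PCR product sizes seen on the gel.

    tolerance_bp is an assumed allowance for sizing error on an agarose gel.
    """
    def has_band(expected_bp):
        return any(abs(size - expected_bp) <= tolerance_bp for size in band_sizes_bp)

    if has_band(Y_FRAGMENT_BP):
        # The shorter fragment can only come from a Y chromosome.
        return "male"
    if has_band(X_FRAGMENT_BP):
        # Only the X product is visible. With degraded DNA the Y product
        # could have failed to amplify, so treat this call with caution.
        return "female"
    return "undetermined"


print(infer_sex([330, 218]))
print(infer_sex([331]))
print(infer_sex([]))
```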
24
Disadvantages
• DNA is often degraded
• so that continuous fragments are no longer present
• and cannot be copied
• substances may be present:
– that inhibit both purification and amplification
25
The first human Y chromosome marker studies
• The first two studies appeared in 1985 (Casanova et al. 1985; Lucotte and Ngo 1985)
• It was not until almost a decade later that Torroni and co-workers (1994a) published the first Y chromosome data on Native Americans
• Numerous surveys of variation on the non-recombining portion of the Y chromosome (NRY)
26
Who are our closest living relatives?
Chen FC & Li WH (2001) Am. J. Hum. Genet. 68 444-456
27
• selected 53 autosomal / Y chromosome intergenic nonrepetitive DNA segments from the human genome
• and sequenced them in a human, a chimpanzee, a gorilla, and an orangutan.
28
The average sequence divergence
• was only 1.24% ± 0.07% for the human-chimpanzee pair
• 1.62% ± 0.08% for the human-gorilla pair
• and 1.63% ± 0.08% for the chimpanzee-gorilla pair
29
• Taking the orangutan speciation date as 12 to 16 million years ago
• an estimate of 4.6 to 6.2 million years for the Homo-Pan divergence
• an estimate of 6.2 to 8.4 million years for the gorilla speciation date
• gorilla lineage branched off 1.6 to 2.2 million years earlier than did the human-chimpanzee divergence
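The last bullet follows from the two date ranges by simple subtraction of matching endpoints. A minimal arithmetic check, with variable names chosen here only for illustration:

```python
# Date ranges from the slide, in millions of years (lower bound, upper bound).
homo_pan_divergence = (4.6, 6.2)
gorilla_speciation = (6.2, 8.4)

# Subtract matching endpoints: lower from lower, upper from upper.
gap = tuple(
    round(gorilla - homo_pan, 1)
    for gorilla, homo_pan in zip(gorilla_speciation, homo_pan_divergence)
)
print(gap)  # (1.6, 2.2)
```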
30
Phenotypic differences between humans and other apes
*Carroll (2003) Nature 422, 849-857
31
Chimpanzee-human divergence
Chimpanzees Humans
6-8 million years
Hominids or hominins
32
Origins of hominids
• Sahelanthropus tchadensis
• Chad (Central Africa)
• Dated to 6-7 million years ago
• Posture uncertain, but slightly later hominids were bipedal
‘Toumai’, Chad, 6-7 MYA
Brunet et al. (2002) Nature 418, 145-151
33
Hominid fossil summary
Found only in Africa Found both in Africa and outside, or only outside Africa
34
Origins of the genus Homo
• Homo erectus/ergaster ~1.9 million years ago in Africa
• Use of stone tools
• H. erectus in Java ~1.8 million years ago
Nariokotome boy, Kenya, ~1.6 MYA
35
Additional migrations out of Africa
• First known Europeans date to ~800 KYA
• Ascribed to H. heidelbergensis
36
Origins of modern humans (1)
• Anatomically modern humans in Africa ~130 KYA
• In Israel by ~90 KYA
Omo I, Ethiopia, ~130 KYA
37
Origins of modern humans (2)
• Modern human behaviour starts to develop in Africa after ~80 KYA
• By ~50 KYA, features such as complex tools and long-distance trading are established in Africa
The first art? Inscribed ochre, South Africa, ~77 KYA
38
Expansions of fully modern humans
• Two expansions:
• Middle Stone Age technology in Australia ~50 KYA
• Upper Palaeolithic technology in Israel ~47 KYA
Lake Mungo 3, Australia, ~40 KYA
39
the Upper Paleolithic period
• In the Upper Paleolithic period:
– Neanderthal man disappears
– and is replaced by a variety of Homo sapiens
40
Routes of migration? Archaeological evidence
[Map labels: ~130 KYA, 50 KYA, 47 KYA, 40 KYA, 39 KYA; Middle Stone Age; Upper Paleolithic]
41
Strengths and weaknesses of the fossil/archaeological records
• Major source of information for most of the time period
• Only source for extinct species
• Dates can be reliable and precise
– need suitable material, 14C calibration required
42
Mixing or replacement?
43
Human genetic diversity is low
44
Modern human mtDNA is distinct from Neanderthal mtDNA
Krings et al. (1997) Cell 90, 19-30
45
The application of molecular genetic approaches to the study of human evolution
L. Luca Cavalli-Sforza & Marcus W. Feldman
Nature Genetics 33, 266-275 (2003)
46
• Haploid markers from mitochondrial DNA and the Y chromosome have proven invaluable for generating a standard model for evolution of modern humans
• earlier research on protein polymorphisms
• Co-evolution of genes with language and some slowly evolving cultural traits, together with the genetic evolution
47
Evolutionary events affecting genomic variation (1)
• All genetic variation is caused by mutations
• The most common and most useful for many purposes are SNPs
• which can be detected by DNA sequencing
48
Evolutionary events affecting genomic variation (2)
• Allelic frequencies change in populations owing to two factors:
– natural selection: variation among individual genotypes in their probabilities of survival and/or reproduction
– random genetic drift: chance sampling of alleles into the next generation in a finite population
– Both natural selection and genetic drift can ultimately lead to the elimination or fixation of a particular allele
• In the presence of mutation and in the absence of selection (neutral conditions):
• the rate of neutral evolution of a finite population is equal to the mutation rate!
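The reason population size drops out can be checked numerically. The sketch below uses the standard textbook argument, assumed here for a diploid population: 2N gene copies produce 2N × mu new neutral mutations per generation, and each one fixes with probability 1/(2N). The function name and the mutation rate are hypothetical.

```python
from fractions import Fraction


def neutral_substitution_rate(population_size, mutation_rate):
    """Rate at which new neutral mutations arise and eventually reach fixation."""
    gene_copies = 2 * population_size  # assumption: diploid population
    new_mutations_per_generation = gene_copies * mutation_rate
    # A new neutral mutation starts as one copy among all gene copies,
    # so its chance of eventual fixation is 1 / (number of copies).
    fixation_probability = Fraction(1, gene_copies)
    return new_mutations_per_generation * fixation_probability


mu = Fraction(1, 10**8)  # hypothetical per-site, per-generation mutation rate
for n in (100, 10_000, 1_000_000):
    print(n, neutral_substitution_rate(n, mu) == mu)
```

Whatever the population size, the two factors of 2N cancel and the result equals the mutation rate.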
49
Evolutionary events affecting genomic variation (3)
• The earliest evidence of selection:
– heterozygotes of the hemoglobin A/S polymorphism have greater resistance to malaria than do AA or SS homozygotes
– G6PD locus:
• resistance to malaria
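Heterozygote advantage of the hemoglobin A/S kind keeps both alleles in the population. A minimal sketch of the standard one-locus model, assuming relative fitnesses 1 - s for AA, 1 for AS and 1 - t for SS; the values of s and t below are hypothetical, chosen only to show the behaviour.

```python
def next_frequency(p, s, t):
    """One generation of selection on the frequency p of allele A."""
    q = 1.0 - p
    w_aa, w_as, w_ss = 1.0 - s, 1.0, 1.0 - t  # assumed relative fitnesses
    mean_fitness = p * p * w_aa + 2 * p * q * w_as + q * q * w_ss
    return (p * p * w_aa + p * q * w_as) / mean_fitness


def iterate(p0, s, t, generations):
    """Apply selection repeatedly, starting from frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, s, t)
    return p


s, t = 0.1, 0.8  # hypothetical selection coefficients against AA and SS
predicted = t / (s + t)  # equilibrium frequency of A in this model
print(iterate(0.99, s, t, 2000), iterate(0.50, s, t, 2000), predicted)
```

Starting from very different frequencies, the allele settles at the same intermediate value, so neither allele is lost. That is the signature of balancing selection.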
50
Evolutionary events affecting genomic variation (4)
• Strong directional selection: for FOXP2
– a two amino-acid difference between the human protein and that of other primates
– selectively important for the evolution of speech and language in modern humans
51
Evolutionary events affecting genomic variation (5)
• the agent of selection is not at all obvious:
– the CCR5 gene seems :
• related to HIV resistance
– mutations in the BRCA1 gene:
• produce an increased risk of female breast cancer
52
Migration is another important factor in human evolution that can
profoundly affect genomic variation within a population
53
Summary tree of world populations
Phylogenetic tree based on polymorphisms of 120 protein genes in 1,915 populations
Cavalli-Sforza & Feldman (2003) Nature Genet. 33, 266-275
54
For populations that are geographically close, genetic and geographic distances are
often highly correlated
Cavalli-Sforza & Feldman (2003) Nature Genet. 33, 266-275
55
Dating the origin of our species using genetic data (1)
• The mutation rate of the NRY is comparable to that of nuclear DNA, and lower than that of mtDNA
• polymorphisms are more difficult to find but genealogies are easier to reconstruct
• The greater length of DNA on the NRY (perhaps 30 million bases of euchromatic DNA) compensates for the lower mutation rate
• Even though the NRY behaves effectively as a single locus
• which is usually insufficient for evolutionary analyses
• it has provided results that are consistent across many studies and in agreement with many archeological findings
56
High resolution history using haploid markers
• SNPs on the NRY and mtDNA:
– higher resolution of population history through the reconstruction of the phylogenetic relationships of extant Y chromosomes and mtDNA
• the Y Chromosome Consortium:
– the first two haplogroups (A and B) are almost completely African and even today represent mostly hunter-gatherers or their descendants
57
NRY
[Figure labels: Siberia, India, Eskimo]
58
The migration of modern Homo sapiens: the scheme begins with a radiation from East Africa to the rest of Africa about 100 kya, followed by an expansion from the same area to Asia, by southern and northern routes, between 60 and 40 kya. Oceania, Europe and America were settled from Asia in that order.
Cavalli-Sforza & Feldman (2003) Nature Genet. 33, 266-275
59
NRY
• Slow growth is indicated by the accumulation of many mutations within a branch, as in most descendants of haplogroup A and B
• and in those of the earliest branches of haplogroups C, D, E and F
60
NRY
• By contrast, when there are many branches (called a starburst) after a specific mutation or group of mutations, we can infer rapid growth
• The major expansions are those of haplogroup F (seven branches) after an initial lag in population growth, and even more remarkable is the later expansion of haplogroup K (nine branches).
61
haplogroup K (nine branches)
• These began in the last 40 kya and led to the major settlement of all continents from Africa, first to Asia, and from Asia to the other three continents.
62
mtDNA
• The tree of mtDNA is more bushy, but there are more haplogroups because of the higher mutation rate!
63
mtDNA
64
mtDNA
• The earliest branches all remain in Africa
• in both trees they clearly refer to the slowly growing hunter-gatherers
• In both trees the major growth in Africa is due to a late branch, taking place in the second part of the last 100,000 years and clearly connected with the expansion to Asia
65
Language families of the world
66
Phylogeographic studies
• Analysis of the geographical distributions of lineages within a phylogeny
• Nodes or mutations within the phylogeny may be dated
• Extensive studies of mtDNA and the Y chromosome
67
Phylogenetic trees commonly indicate a recent origin in Africa
[Figure: Y chromosome phylogenetic tree of haplogroups A B C D E F* G H I J K* L M N O P* Q R, drawn against a time axis from 0 to 90 KYA]
Date estimates shown on the figure:
• 90 (50-130) KYA, Hammer and Zegura; 59 (40-140) KYA, Thomson et al.
• 69 (56-81) KYA, Hammer and Zegura; 40 (35-89) KYA, Thomson et al.
68
Y haplogroup distribution
Jobling & Tyler-Smith (2003) Nature Rev. Genet. 4, 598-612
69
70
71
72
An African origin
73
SE Y haplogroups
74
NW Y haplogroups
75
Did both migrations leave descendants?
• General SE/NW genetic distinction fits two-migration model
– Basic genetic pattern established by initial colonisation
• All humans outside Africa share same subset of African diversity (e.g. Y: M168, mtDNA: L3)
– Large-scale replacement, or migrations were dependent
• How much subsequent change?
76
Fluctuations in climate
Ice ages
Antarctic ice core data
Greenland ice core data
77
Possible reasons for genetic change
• Adaptation to new environments
• Food production – new diets
• Population increase – new diseases
78
Debate about the Paleolithic-Neolithic transition
• Major changes in food production, lifestyle, technology, population density
• Were these mainly due to movement of people or movement of ideas?
• Strong focus on Europe
79
Estimates of the Neolithic Y contribution in Europe
• ~22% (=Eu4, 9, 10, 11); Semino et al. (2000) Science 290, 1155-1159
• >70% (assuming Basques = Paleolithic and Turks/Lebanese/ Syrians = Neolithic populations); Chikhi et al. (2002) Proc. Natl. Acad. Sci. USA 99, 11008-11013
80
The genetic legacy of Paleolithic Homo sapiens sapiens in extant Europeans: a Y chromosome perspective (1)
• It was derived from 22 markers of the nonrecombining Y chromosome (NRY)
• Ten lineages account for >95% of the 1007 European Y chromosomes
• Geographic distribution and age estimates of alleles are compatible with two Paleolithic and one Neolithic migratory episode (Semino et al. 2000)
81
The genetic legacy of Paleolithic Homo sapiens sapiens in extant Europeans: a Y chromosome perspective (2)
• that have contributed to the modern European gene pool
• A significant correlation between the NRY haplotype data and principal components based on 95 protein markers was observed
• indicating the effectiveness of NRY polymorphisms in the characterization of human population composition and history (Semino et al. 2000)
82
More recent reshaping of diversity
• ‘Star cluster’ Y haplotype originated in/near Mongolia ~1,000 (700-1,300) years ago
• Now carried by ~8% of men in Central/East Asia, ~0.5% of men worldwide
• Suggested association with Genghis Khan
Zerjal et al. (2003) Am. J. Hum. Genet. 72, 717-721
83
Mongolia (1) (Zerjal et al. (2003) Am. J. Hum. Genet. 72, 717-721)
• It was found in 16 populations
• throughout a large region of Asia
• stretching from the Pacific to the Caspian Sea
• present at high frequency:
– ∼8% of the men in this region carry it
– ∼0.5% of the world total
• behavior
84
Mongolia (2) (Zerjal et al. (2003) Am. J. Hum. Genet. 72, 717-721)
• The pattern of variation within the lineage:
– it originated in Mongolia ∼1,000 years ago
• Such a rapid spread cannot have occurred by chance
• it must have been a result of selection
• The lineage is carried by likely male-line descendants of Genghis Khan
• the authors propose that it has spread by a novel form of social selection
85
Is the Y a neutral marker?
• Recurrent partial deletions of a region required for spermatogenesis
• Possible negative selection on multiple (14/43) lineages
Repping et al. (2003) Nature Genet. 35, 247-251
86
1.6-Mb deletion (1)
• Polymorphism for a 1.6-Mb deletion of the human Y chromosome
• persists through balance between:
– recurrent mutation
– and haploid selection
Repping et al. (2003) Nature Genet. 35, 247-251
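The balance between recurrent mutation and haploid selection can be illustrated with a simple recursion. This is a textbook-style sketch, not the analysis in Repping et al.: it assumes the deletion arises anew at rate mu per generation, that carriers have relative fitness 1 - s, and that selection acts before mutation. Both parameter values are hypothetical.

```python
def next_deletion_frequency(q, mu, s):
    """One generation: haploid selection against the deletion, then new deletions."""
    # Carriers of the deletion have relative fitness 1 - s (assumption).
    after_selection = q * (1.0 - s) / (1.0 - s * q)
    # Intact chromosomes acquire the deletion at rate mu (assumption: no back mutation).
    return after_selection + mu * (1.0 - after_selection)


def equilibrium_frequency(mu, s, generations=5000, q0=0.0):
    """Iterate the recursion from q0 until the frequency has settled."""
    q = q0
    for _ in range(generations):
        q = next_deletion_frequency(q, mu, s)
    return q


mu, s = 1e-4, 0.1  # hypothetical mutation rate and selection coefficient
print(equilibrium_frequency(mu, s), mu / s)
```

In this model the deletion settles at a frequency of mu / s. It is never eliminated, because mutation keeps recreating it, and never becomes common, because selection keeps removing it.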
87
AZF
88
1.6-Mb deletion (2)
• Many human Y-chromosomal deletions:
– severely impair reproductive fitness
– which precludes their transmission to the next generation
– and ensures their rarity in the population
Repping et al. (2003) Nature Genet. 35, 247-251
89
1.6-Mb deletion (3)
• 1.6-Mb deletion that persists over generations
• It is sufficiently common to be considered a polymorphism
• They hypothesized that this deletion might affect spermatogenesis
• because it removes almost half of the Y chromosome's AZFc region (1.6 Mb)
• a gene-rich segment that is critical for sperm production
90
gr/gr deletion Y chromosomes
• lower penetrance with respect to spermatogenic failure than previously characterized Y-chromosomal deletions
• it is often transmitted from father to son
• the existence of this deletion:
– as a polymorphism
– reflects a balance between haploid selection
– and homologous recombination
– which continues to generate new gr/gr deletions
Repping et al. (2003) Nature Genet. 35, 247-251
91
Selection in the human genome
[Figure panels, shown against time: Neutral; Negative (purifying, background); Balancing; Positive (directional)]
Bamshad & Wooding (2003) Nature Rev. Genet. 4, 99-111
92
Selection in the human genome (1)
• Natural selection leaves signatures in our genome that can be used to identify the genes that might underlie variation in disease resistance or drug metabolism
• Evidence of positive selection acting on genes is beginning to accumulate
93
Selection in the human genome (2)
• Demographic processes should affect all loci in a similar way, whereas the effects of selection should be restricted to specific loci
94
Demographic changes
Population has expanded in range and numbers
95
The Prion protein gene and human disease
• Prion protein gene PRNP linked to ‘protein-only’ diseases e.g. CJD, kuru
• A common polymorphism, M129V, influences the course of these diseases
• the MV heterozygous genotype is protective
• Kuru acquired from ritual cannibalism was reported (1950s) in the Fore people of Papua New Guinea, where it caused up to 1% annual mortality
96
Creutzfeldt-Jakob Disease (CJD)
• a neurodegenerative prion disease, related to kuru
• kuru was found in cannibalistic Pacific Islanders
• CJD is a disorder diagnosed in one person per million
• common symptoms:
– gait disorders
– jerky movements
– dementia, leading to death months after the first appearance of symptoms
97
Balancing selection at PRNP
• Deep division between the M and V lineages, estimated at 500,000 years
• Kuru imposed strong balancing selection on the Fore
• essentially eliminating PRNP 129 homozygotes
• Worldwide PRNP haplotype diversity and coding allele frequencies:
– suggest strong balancing selection at this locus
– during the evolution of modern humans
98
Effect of positive selection
[Figure panels: Neutral; Selection. Label: derived allele of SNP]
99
What changes do we expect?
• New genes
• Changes in amino-acid sequence
• Changes in gene expression (e.g. level, timing or location)
• Changes in copy number
100
How do we find such changes?
• Chance
– φhHaA type I hair keratin gene inactivation in humans
• Identify phenotypic changes, investigate genetic basis
• Identify genetic changes, investigate functional consequences
101
Human type I hair keratin pseudogene φhHaA
• This mutant protein is unable to activate hair keratin gene expression
• the nude phenotype
• has functional orthologs in the chimpanzee and gorilla:
– evidence for recent inactivation of the human gene after the Pan-Homo divergence
– 5.5 million years ago
102
Inheritance of a language/speech defect in the KE family
Lai et al. (2000) Am. J. Hum. Genet. 67, 357-367
Autosomal dominant inheritance pattern
103
A forkhead-domain gene is mutated in a severe speech and language
disorder
• the gene FOXP2
• encodes a putative transcription factor
• Containing:
– a polyglutamine tract
– a forkhead DNA-binding domain
• disrupted by a translocation, or by a point mutation
• in the KE family the point mutation alters an invariant amino-acid residue in the forkhead domain
104
Mutation and evolution of the FOXP2 gene
Chr 7, 7q31
FOXP2 gene
Nucleotide substitutions
silent replacement
Enard et al. (2002) Nature 418, 869-872
105
Positive selection at the FOXP2 gene
• Resequence ~14 kb of DNA adjacent to the amino-acid changes in 20 diverse humans, two chimpanzees and one orang-utan
[Figure: FOXP2 tree for Orang, Gorilla, Chimp and Human, marking silent (synonymous, dS) and replacement (non-synonymous, dN) substitutions]
Human-specific increase in dN/dS ratio (P<0.001)
Constant rate of amino-acid replacements? Positive selection in humans?
Enard et al. (2002) Nature 418, 869-872
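The dN/dS ratio used on this slide compares the rate of replacement changes with the rate of silent changes, each scaled by the number of sites where such a change could occur. The sketch below is a deliberately simplified version using raw proportions with no correction for multiple substitutions at one site; the function name and the counts are hypothetical, not values from Enard et al.

```python
def dn_ds_ratio(nonsyn_changes, nonsyn_sites, syn_changes, syn_sites):
    """Ratio of replacement changes per replacement site to silent changes per silent site."""
    dn = nonsyn_changes / nonsyn_sites  # replacement (non-synonymous) rate
    ds = syn_changes / syn_sites        # silent (synonymous) rate
    if ds == 0:
        raise ValueError("dS is zero, so the ratio is undefined")
    return dn / ds


# Hypothetical counts for one branch of a gene tree.
ratio = dn_ds_ratio(nonsyn_changes=2, nonsyn_sites=1500, syn_changes=4, syn_sites=500)
print(round(ratio, 3))
```

A ratio well below 1 points to purifying selection on the protein. A branch-specific rise in the ratio, as reported here for the human lineage, is the pattern that raises the question of positive selection.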
106
A gene affecting brain size
Microcephaly (MCPH)
• Small (~430 cc v ~1,400 cc) but otherwise ~normal brain, only mild mental retardation
• MCPH5 shows Mendelian autosomal recessive inheritance
• Due to loss of activity of the ASPM gene
ASPM-/ASPM- control
Bond et al. (2002) Nature Genet. 32, 316-320
107
Evolution of the ASPM gene (1)
Summary dN/dS values
[Figure: tree for Orang, Gorilla, Chimp and Human with branch dN/dS values 1.44, 0.56, 0.56, 0.53, 0.52, 0.62]
Human-specific increase in dN/dS ratio (P<0.03)
Sliding-window dN/dS analysis
Evans et al. (2004) Hum. Mol. Genet. 13, 489-494
108
What changes?
• The Drosophila homolog of ASPM codes for a microtubule-binding protein that influences spindle orientation and the number of neurons
do Carmo Avides and Glover (1999) Science 283, 1733-1735
[Figure labels: DNA, Microtubules, asp]
• Subtle changes to the function of well-conserved genes
109
Genome-wide search for protein sequence evolution
• 7645 human-chimp-mouse gene trios compared
• Most significant categories showing positive selection include:
– Olfaction: sense of smell
– Development: e.g. skeletal
– Hearing: for speech perception
– brain size: IQ
Clark et al. (2003) Science 302, 1960-1963
110
Gene expression differences in human and chimpanzee cerebral cortex
Increased expression Decreased expression
Caceres et al. (2003) Proc. Natl. Acad. Sci. USA 100, 13030-13035
• Affymetrix oligonucleotide array (~10,000 genes)
• 91 show human-specific changes, ~90% increases
111
Copy number differences between human and chimpanzee genomic DNA
Human male reference genomic DNA hybridised with female chimpanzee genomic DNA
Locke et al. (2003) Genome Res. 13, 347-357
112
Selection at the CCR5 locus
• CCR5Δ32/CCR5Δ32 homozygotes are resistant to HIV and AIDS
• The high frequency and wide distribution of the Δ32 allele suggest past selection by an unknown agent
113
The Role of the Chemokine Receptor Gene CCR5 and Its Allele (del32 CCR5)
• Since the late 1970s
• 8.4 million people worldwide
• including 1.7 million children, have died of AIDS
• an estimated 22 million people are infected with human immunodeficiency virus (HIV)
114
CCR5 and Its Allele (del32 CCR5)
T-cell line (Tl)
monocyte/macrophage (M),
a circulating T-cell (T)
115
Lactase persistence
• All infants have high lactase enzyme activity to digest the sugar lactose in milk
• In most humans, activity declines after weaning, but in some it persists:
LCT*P
116
Molecular basis of lactase persistence
• Lactase level is controlled by a cis-acting element
• Linkage studies show association of lactase persistence with the T allele of a T/C polymorphism 14 kb upstream of the lactase gene
Enattah et al. (2002) Nature Genet. 30, 233-237
117
The lactase-persistence haplotype
• The persistence-associated T allele occurs on a haplotype (‘A’) showing linkage disequilibrium over > 1 Mb
• Association of lactase persistence and the A haplotype is less clear outside Europe
118
Selection at the G6PD gene by malaria
• Reduced G6PD enzyme activity (e.g. A- allele) confers some resistance to falciparum malaria
Extended haplotype homozygosity at the A- allele
Sabeti et al. (2002) Nature 419, 832-837
119
Final words
Is there a genetic continuum between us and our ancestors and the great apes?
If there is, then we can say that:
these [i.e. microevolutionary] processes are genetically sufficient to fully account for human uniqueness