Human Evolutionary Genomics: Lessons from DUF1220 Protein Domains, Cognitive
Disease and Human Brain Evolution
James M. Sikela, Ph.D.
Department of Biochemistry & Molecular Genetics
Human Medical Genetics and Neuroscience Programs
University of Colorado School of Medicine
Advanced Genome Analysis Course
University of Colorado School of Medicine
March 5, 2015
Primate Evolution
New World Monkeys (e.g. squirrel monkey, spider monkey)
Old World Monkeys (e.g. baboon, rhesus, etc.)
Gibbons
Orangutan
Gorilla
Human
Chimp
Bonobo
Approximate divergence times (millions of years ago, MYA):
B/C = ~2
C/H = ~5
HC/G = ~8
HCG/O = ~13
HCG/O/Gib = ~20
Hom/OWM = ~25
HomOWM/NW = ~40
Chimpanzee
Gorilla
Bonobo
Orangutan
More Primates!
---- some things have changed!
Human Characteristics
• Body shape and thorax
• Cranial properties (brain case and face)
• Small canine teeth
• Skull balanced upright on vertebral column
• Reduced hair cover
• Enhanced sweating
• Dimensions of the pelvis
• Elongated thumb and shortened fingers
• Relative limb length
• Neocortex expansion
• Enhanced language & cognition
• Advanced tool making
modified from S. Carroll, Nature, 2005
Reports of “human-specific” genes
• FOXP2
– Mutated in family with language disability
• ASPM/MCPH
– Mutated in individuals with microcephaly
• HAR1F
– Gene sequence highly changed in humans
• SRGAP2 (neuronal migration?)
– Partial human-specific gene duplication
• DUF1220 protein domains
– Highly increased in copy number in humans; expressed in important brain regions
HAR1F Gene
Marques-Bonet, et al Ann Rev Genomics 2009
Molecular mechanisms driving genome evolution
• Single nucleotide substitutions
- change gene expression
- change gene structure
• Genome rearrangement
• Gene/segmental duplication
- copy number change
- value of redundancy
Gene Duplication & Evolutionary Change
• “There is now ample evidence that gene duplication is the most important mechanism for generating new genes and new biochemical processes that have facilitated the evolution of complex organisms from primitive ones.”
- W. H. Li in Molecular Evolution, 1997
• “Exceptional duplicated regions underlie exceptional biology”
- Evan Eichler, Genome Research, 2001
Fig 1. Measuring genomic DNA copy number alteration using cDNA microarrays (array CGH). Fluorescence ratios are depicted in a pseudocolor scale, such that red indicates increased, and green decreased, gene copy number in the test (right) compared to reference sample (left).
Interhominoid cDNA Array-Based Comparative Genomic Hybridization (arrayCGH)
Fortna, et al, PLoS Biol. 2004
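The array CGH readout described in the Fig 1 caption can be sketched as a log2 ratio of test to reference fluorescence. This is only an illustration: the function names, the example signal values and the two-fold cutoff are assumptions, not values taken from Fortna et al.

```python
import math


def log2_ratio(test_signal, reference_signal):
    """Log2 of the test/reference fluorescence ratio for one array spot."""
    return math.log2(test_signal / reference_signal)


def classify(ratio_log2, threshold=1.0):
    """Label a spot as gain, loss or no change.

    The threshold of 1.0 (a two-fold change) is a hypothetical cutoff
    chosen for illustration only.
    """
    if ratio_log2 >= threshold:
        return "gain"   # shown toward red in the pseudocolor scale
    if ratio_log2 <= -threshold:
        return "loss"   # shown toward green in the pseudocolor scale
    return "no change"


# Made-up signal intensities for three spots: (test, reference)
for test, ref in [(200.0, 100.0), (100.0, 100.0), (50.0, 100.0)]:
    value = log2_ratio(test, ref)
    print(test, ref, value, classify(value))
```

Working on the log2 scale makes gains and losses symmetric around zero, so a doubling and a halving have the same magnitude.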
Human & Great Ape Genes Showing Lineage-Specific Copy Number Gain/Loss
[Figure: array CGH panels for Human (H), Bonobo (B), Chimp (C), Gorilla (G) and Orang (O); clones IMAGE:814107, IMAGE:261219 and IMAGE:665496]
BAC-FISH with clone containing SLC35F5 gene
PLA2G4B/SPTBN5 gene copy number increases in African great apes
[Figure: genome-wide map of copy number differences along human chromosomes 1-22, X and Y (cytogenetic bands, positions in Mb), with a test/reference ratio color scale from ≤ 0.5 through 1 to ≥ 2. Species: Human (Homo sapiens), Bonobo (Pan paniscus), Chimpanzee (Pan troglodytes), Gorilla (Gorilla gorilla), Orangutan (Pongo pygmaeus). Inset panels labeled 3, 6, 9 and 13, each with lanes H, B, C, G, O]
Human lineage-specific amplification of AQP7
[Figure: AQP7 on human chromosome 9 (9p22, 9q22). Array CGH across Human, Bonobo, Chimpanzee, Gorilla, Orangutan, Gibbon, Macaque, Baboon, Marmoset and Lemur, with a test/reference ratio color scale from < 0.4 through 1 to > 2.5; aCGH log2 fluorescent ratio plotted alongside quantitative real-time PCR copy number for each species, r2 = 0.9532]
SMA: Chr5q13
Williams-Beuren: Chr7q11.2
Prader-Willi: Chr15q11.1
DiGeorge: Chr22q11
BLAT-Predicted Intronless vs. Intron-Containing HLS Gene Copies in Human, Chimp, and Macaque Genomes
[Figure: bar chart of number of BLAT hits (0 to 50) per IMAGE clone; series: Human intron-containing, Chimp intron-containing, Macaque intron-containing, Intronless]
DUF1220 Repeat Unit
Popesco, et al, Science 2006
Synonymous and Nonsynonymous Differences Between Aligned Sequences
Ks = Average number of synonymous changes per synonymous site
Ka = Average number of nonsynonymous changes per nonsynonymous site
Sequence 1: ACT TTT (Thr Phe)
Sequence 2: ACC GTT (Thr Val)
Nonsynonymous and Synonymous Sites in Codons
Codon ACT (Thr): positions 1 and 2 are N; position 3 is S
Codon TTT (Phe): positions 1 and 2 are N; position 3 is 1/3 S and 2/3 N
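The site counts for ACT and TTT, and the two differences between ACT TTT and ACC GTT, can be reproduced with a short sketch. It assumes the standard genetic code and counts a change to a stop codon as nonsynonymous; the function names are illustrative, and the difference count assumes at most one changed base per codon. It is not a full Ka/Ks estimator.

```python
from fractions import Fraction

# Standard genetic code, built from the compact TCAG-ordered string.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def synonymous_fractions(codon):
    """For each codon position, the fraction of the three possible
    single-base changes that leave the amino acid unchanged."""
    result = []
    for pos in range(3):
        unchanged = 0
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                unchanged += 1
        result.append(Fraction(unchanged, 3))
    return result


def count_differences(seq1, seq2):
    """Count synonymous and nonsynonymous codon differences between two
    aligned coding sequences (assumes at most one changed base per codon)."""
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq1), 3):
        codon1, codon2 = seq1[i:i + 3], seq2[i:i + 3]
        if codon1 == codon2:
            continue
        if CODON_TABLE[codon1] == CODON_TABLE[codon2]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


print(synonymous_fractions("ACT"))
print(synonymous_fractions("TTT"))
print(count_differences("ACTTTT", "ACCGTT"))
```

For ACT the third position is fully synonymous, while for TTT only one of the three possible third-position changes (TTC) keeps Phe. In the aligned pair, ACT to ACC is a synonymous difference and TTT to GTT is a nonsynonymous one.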
What will be the Ka/Ks values for most proteins?
Ka/Ks Distribution
[Figure: histogram of number of genes per bin (0 to 1600) against Ka/Ks value (0.00 to 2.00 in bins of 0.08)]
Intra-primate comparison mean: 0.91
Rodent-primate comparison mean: 0.61
• DUF1220 shows the greatest human-specific copy number expansion of any protein-coding sequence in the human genome
• Shows signs of positive selection
• Human increase is primarily due to domain amplification (rather than gene duplication)
Genome        PDE4DIP   Total DUF1220   NBPF Genes
Human         2         272             23
Chimp         3         125             19
Gorilla       3         99              15
Orangutan     4         92              11
Gibbon        3         53              10
Macaque       1         35              10
Marmoset      1         31              11
Mouse Lemur   1         2               1
Bushbaby      1         3               2
Tarsier       1         1               0
Rabbit        1         8               3
Pika          1         1               0
Mouse         1         1               0
Rat           1         1               0
Guinea Pig    1         1               1
Squirrel      1         1               1
Tree Shrew    1         4               3
Cow           1         7               3
Dolphin       1         4               1
Pig           1         3               1
Horse         1         8               3
Dog           1         3               1
Panda         1         2               1
Cat           1         3               2
Megabat       1         1               0
Microbat      1         1               0
Hedgehog      1         1               0
Shrew         1         1               0
O’Bleness et al. Evolutionary History and Genome Organization of DUF1220 Protein Domains. G3 (Bethesda). Sept (2012).
* Branch points in millions of years.
A Chronology of DUF1220 Domain Evolution
O’Bleness, et al, G3: Genes, Genomes, Genetics, 2012
Consensus Tree of Evolutionary Relationships of 429 Primate DUF1220 Sequences
Ancestral DUF1220 found in human PDE4DIP
NBPF-type DUF1220 Domains
Clades CON1-3 are conserved DUF1220 sequences among primates
Clades HLS1-3 refer to a three-DUF1220 domain unit that has expanded only in the human lineage
Domain order: CON1 CON2 HLS1 HLS2 HLS3 CON3
[Figure: NBPF12 domain layout: CON1, CON2, DUF1220 triplet (HLS1 HLS2 HLS3), DUF1220 triplet (HLS1 HLS2 HLS3), CON3]
DUF1220 Duplication and Protein Domain Classifications
Chimpanzee Human
DUF1220/NBPF Genome Organization in Chimp & Human
O’Bleness, et al, G3: Genes, Genomes, Genetics, 2012
Lanes: Prestained Marker, Frontal Lobe, Temporal Lobe, Parietal Lobe, Occipital Lobe, Cerebellum, Placenta
Western analysis of normal adult human brain regions with DUF1220 antibody: Total protein lysates (50 µg) from normal adult human brain regions (male and female; ages ranging from 22 to 82 yrs) were electrophoresed on 4-20% denaturing SDS-PAGE gels and blotted with: A) DUF1220 affinity purified antibody; B) GAPDH.
[Blot labels: size markers at 50, 37.5 and 25 kDa; GAPDH at 36 kDa; panels A and B]
Popesco, et al Science 2006
DUF1220 antibody staining in the human cerebellum (77yr old white female). A) DUF1220 affinity purified antibody; B) Double labeling with DUF1220 affinity purified antibody and Neurofilament 160kDa; C) same as B-higher magnification; D) Double labeling with DUF1220 affinity purified antibody and GFAP; E) DUF1220 preimmune and GFAP; F) DUF1220 Adsorption control. Blue labeling represents DAPI for nuclear staining.
DUF1220 Protein Expression in Adult Human Brain
Popesco et al Science 2006
Hippocampus, CA regions (30 yr old female): DUF1220 affinity purified antibody + GFAP + DAPI
Cortical regions, hippocampus (30 yr old female): DUF1220 affinity purified antibody + GFAP + DAPI
Noteworthy DUF1220 Copy Number Totals
DUF1220 Copies
Total in Human Genome 272
Total in Chimp Genome (CLS) 125 (23)
Total in Last Common Ancestor of Homo/Pan 102
Total of Newly Added Copies in Human Lineage 167
Total Human-Specific Copies Added via Domain Amplification 146
Total Human-Specific Copies Added via Gene Duplication 21
Avg. Number Added to Human Lineage Every Million Years 28
O’Bleness, et al, G3: Genes, Genomes, Genetics, 2012
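The totals in this table can be cross-checked with a few lines of arithmetic. The variable names are illustrative. The implied time span of roughly 6 million years is derived here from the slide's own numbers and is not stated on the slide.

```python
# Values taken from the "Noteworthy DUF1220 Copy Number Totals" table.
added_by_domain_amplification = 146
added_by_gene_duplication = 21
added_per_million_years = 28

# The two human-specific sources should sum to the newly added total (167).
newly_added = added_by_domain_amplification + added_by_gene_duplication

# Share of the human-lineage gain that came from domain amplification.
share_domain_amplification = added_by_domain_amplification / newly_added

# Time span implied by the stated average rate (derived, not from the slide).
implied_span_million_years = newly_added / added_per_million_years

print(newly_added)
print(round(share_domain_amplification, 2))
print(round(implied_span_million_years, 1))
```

On these figures, domain amplification accounts for most of the human-lineage gain, which is consistent with the slide's point that the increase is primarily due to domain amplification rather than gene duplication.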
Sequences Encoding DUF1220 Domains
• Show the largest human lineage-specific increase in copy number of any protein coding region in the genome (160 HLS; >270 total in haploid genome)
• Show signs of positive selection, especially in primates
• In brain, are expressed only in neurons
• Are highly amplified in human, reduced in great apes, further reduced in monkeys, single-or-low copy in prosimians and non-primate mammals, and absent in non-mammals
• Have increased in human primarily by domain hyper-amplification involving DUF1220 triplet
Key Human-Specific Evolutionary Features of 1q21.1 Region
O’Bleness, et al, Nat Rev Genet, 2012
1q21.1 Deletions linked to Microcephaly*
1q21.1 Duplications linked to Macrocephaly*
• Recurrent Reciprocal 1q21.1 Deletions and Duplications Associated with Microcephaly or Macrocephaly and Developmental and Behavioral Disorders
Brunetti-Pierri, et al, Nature Genetics 2008
• Recurrent Rearrangements of Chromosome 1q21.1 and Variable Pediatric Phenotypes
Mefford, et al, N. Engl. J. Med. 2008
• *Implies the copy number (dosage) of one or more genes in this region is influencing brain size in a dose-dependent manner
• These CNVs encompass or are immediately flanked by DUF1220 sequences (Dumas & Sikela, Cold Spring Harbor Symposium Quant. Biol., 2009)
DUF1220/NBPF Sequences & Recurrent Disease-associated 1q21.1 CNVs
Human Evolutionary Genomics: Relevant Reviews
Sikela, J.M. (2006). The Jewels of Our Genome: The Search for the Genomic Changes Underlying the Evolutionarily Unique Capacities of the Human Brain. PLoS Genet. 2, e80.
O’Bleness, M.S., Searles, V., Varki, A., Gagneux, P., and Sikela, J.M. (2012). Evolution of genetic and genomic features unique to the human lineage. Nat. Rev. Genet., 13, 853-866.