The tangled genome. Gil McVean. The real heroes
TRANSCRIPT
![Page 1: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/1.jpg)
The tangled genome
Gil McVean
![Page 2: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/2.jpg)
![Page 3: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/3.jpg)
The real heroes
![Page 4: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/4.jpg)
PanMap – Genome sequencing of 10 Western Chimpanzees
• Patterns of small insertion and deletion are quite different and reveal details of DNA repair pathways
• Patterns of recombination in humans and chimpanzees are highly diverged at the fine-scale, but largely conserved at broad scales
• There are a surprising number (6+ now ‘confirmed’) of trans-specific polymorphisms, probably maintained through host-pathogen interactions
![Page 5: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/5.jpg)
A tangle of sequence
![Page 6: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/6.jpg)
![Page 7: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/7.jpg)
Difficulties of working with an incomplete reference
![Page 8: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/8.jpg)
Using de novo assembly to find variants
![Page 9: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/9.jpg)
Entire population
![Page 10: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/10.jpg)
Sample 1
![Page 11: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/11.jpg)
Sample 2
![Page 12: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/12.jpg)
Chromosome 1
![Page 13: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/13.jpg)
Using Cortex leads to a high quality set of variants
![Page 14: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/14.jpg)
Diversity in Western Chimpanzees
• Similar diversity to humans of European origin (0.06%-0.08%)
• Excess of common variants
• 1% of variants shared with humans
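The slide does not name the diversity statistic; assuming it is nucleotide diversity (the average per-site proportion of differences between two sequences), a minimal sketch looks like this. The function name and the toy sequences are hypothetical and are not PanMap data.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average per-site pairwise difference across aligned, equal-length sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching sites for every pair of sequences.
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diffs / (len(pairs) * length)


# Hypothetical toy alignment: four sequences of ten sites.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGAACGTAC"]
pi = nucleotide_diversity(toy)
print(f"pi = {pi:.3f}")
```

On real data the same quantity would be computed over callable sites only, which this sketch does not model.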
![Page 15: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/15.jpg)
Non-slippage indels are strongly biased to deletions
13:1 bias toward deletions.
Unexpected peak at 4 bp
![Page 16: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/16.jpg)
Indels as indicators of DNA repair processes
[Figure: two panels, insertions and deletions. Axes: indel size and longest word agreement, each with ticks from 5 to 25.]
![Page 17: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/17.jpg)
TGACGAACTTATACTGCTTGAATA
TGACGAAC
ATTGAATA
TGAC--ATACTGAATATGACTTAT
Losing GAAC
![Page 18: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/18.jpg)
A tangle of trees
![Page 19: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/19.jpg)
Myers et al. 2005
![Page 20: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/20.jpg)
The zinc-finger protein PRDM9 determines hotspot location
Myers et al. 2010
![Page 21: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/21.jpg)
PRDM9 Zinc fingers are radically different between humans and chimps
Perhaps the most diverged gene between humans and chimpanzees
Repeatedly hit by adaptive evolution across mammals
Only known ‘speciation gene’ in mammals
Polymorphic in humans – leads to variation in hotspots and genome instability
![Page 22: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/22.jpg)
Questions
• We know from previous work in a few regions that hotspot locations tend not to be shared between humans and chimpanzees
• Calculations suggested that only 40% of human hotspots were driven by PRDM9 binding
• But...
– Is there any hotspot sharing?
– Do we see conservation of recombination rates at any scale?
– What features determine hotspot location in chimpanzees?
![Page 23: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/23.jpg)
The first genome-wide fine-scale map of recombination for a non-reference organism
Auton et al. 2012
![Page 24: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/24.jpg)
![Page 25: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/25.jpg)
Chimpanzee recombination is dominated by hotspots in a manner similar to humans
![Page 26: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/26.jpg)
But the hotspots are not in the same locations
![Page 27: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/27.jpg)
Fine-scale profiles around genes are similar
![Page 28: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/28.jpg)
As is rate variation around CpG islands
![Page 29: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/29.jpg)
Substantial PRDM9 diversity, but overlap in predicted binding sequences
![Page 30: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/30.jpg)
No signal for predicted binding sequences
![Page 31: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/31.jpg)
Similarities at 1Mb scale
![Page 32: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/32.jpg)
Human and chimp recombination rates are correlated at the chromosomal scale
![Page 33: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/33.jpg)
Human and chimp recombination rates are only correlated at broad scales
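One way to see how two rate profiles can agree at broad scales yet disagree at fine scales is to bin them at different window sizes and correlate the binned values. The sketch below uses hypothetical toy profiles (not the PanMap or human maps) in which each block has the same total rate in both species but the hotspot sits in a different bin; `bin_means` and `pearson` are illustrative helper names.

```python
from math import sqrt


def bin_means(rates, window):
    """Average consecutive, non-overlapping windows of `window` bins."""
    return [sum(rates[i:i + window]) / window
            for i in range(0, len(rates) - window + 1, window)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


# Hypothetical broad-scale rate level for eight blocks of four bins each.
levels = [1, 3, 2, 5, 4, 1, 3, 2]
human, chimp = [], []
for level in levels:
    human += [4 * level, 0, 0, 0]   # hotspot in the first bin of the block
    chimp += [0, 0, 4 * level, 0]   # hotspot in the third bin of the block

fine = pearson(human, chimp)                                # bin-by-bin
broad = pearson(bin_means(human, 4), bin_means(chimp, 4))   # block-by-block
print(f"fine-scale r = {fine:.2f}, broad-scale r = {broad:.2f}")
```

In this toy case the block-level correlation is perfect by construction while the bin-level correlation is negative, which exaggerates the contrast the slide describes.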
![Page 34: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/34.jpg)
Lower correlation in structural rearrangements
• All but one of the inverted regions are pericentric, so a change in position with respect to the centromere does not contribute
• Change in proximity to telomere is important
![Page 35: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/35.jpg)
[Figure: tree with time axis t. The common ancestor (C.A.) and chimp carry chromosomes 2a and 2b; human carries chromosome 2.]
A natural experiment: chromosomal fusion
![Page 36: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/36.jpg)
Fusion region shows 3-fold decrease in recombination rate
![Page 37: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/37.jpg)
Fusion region shows 3-fold decrease in recombination rate
![Page 38: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/38.jpg)
A tangle of histories
![Page 39: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/39.jpg)
Distribution of sickle allele
Distribution of malaria
![Page 40: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/40.jpg)
![Page 41: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/41.jpg)
How many variants are shared through descent?
![Page 42: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/42.jpg)
SNPs shared by humans and chimpanzees (33,906 autosomal and 527 X chromosome)
• Human polymorphism: 9.4 million autosomal and 261,000 X chromosome SNPs from 1000 Genomes Pilot 1 YRI (59 individuals)
• Chimpanzee polymorphism: 3.8 million autosomal and 102,000 X chromosome SNPs from PanMap Pan troglodytes verus (10 individuals)
• Human-chimpanzee shared haplotypes: at least two shared SNPs within 4 kb with the same LD (to reduce recurrent mutation)
• Human-chimpanzee shared coding SNPs (to identify potentially functional coding variants)
• Reduce artifactual sharing due to known or cryptic paralogs by filtering out SNPs with low 50 bp mappability, with high read depth, or not found in 1000 Genomes Phase 1
• 130 regions with shared haplotypes outside the MHC
• 135 shared non-synonymous SNPs, 1 shared premature stop SNP and 200 shared synonymous SNPs outside the MHC
• 7 resequenced using Sanger sequencing
• 8 with more than two pairs in LD
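The distance part of the shared-haplotype criterion (at least two shared SNPs within 4 kb) can be sketched as a scan over sorted positions. This is only an illustration: it ignores the requirement that the SNPs show the same LD in both species, it chains neighbours so a run can extend beyond 4 kb overall, and the function name and toy positions are hypothetical.

```python
def candidate_regions(positions, max_gap=4000):
    """Group shared-SNP positions into runs whose neighbours lie within max_gap bp.

    Returns (start, end, n_snps) for every run containing at least two SNPs.
    """
    regions = []
    run = []
    for pos in sorted(positions):
        # A gap larger than max_gap closes the current run.
        if run and pos - run[-1] > max_gap:
            if len(run) >= 2:
                regions.append((run[0], run[-1], len(run)))
            run = []
        run.append(pos)
    if len(run) >= 2:
        regions.append((run[0], run[-1], len(run)))
    return regions


# Hypothetical shared-SNP coordinates on one chromosome.
toy_positions = [1_000, 2_500, 50_000, 120_000, 121_000, 123_500, 300_000]
regions = candidate_regions(toy_positions)
print(regions)
```

Isolated shared SNPs (here at 50,000 and 300,000) are dropped, which is the point of the filter: a single shared site is more easily explained by recurrent mutation than a cluster is.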
![Page 43: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/43.jpg)
Outside of the MHC, six clear-cut cases of trans-species polymorphisms
All non-coding and putatively regulatory
FREM3/GYPE MTRR IGFBP7
![Page 44: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/44.jpg)
In intron of IGFBP7
[Figure: genome browser view of the IGFBP7 intron at 4 kb and 20 kb scales. Tracks: human-chimpanzee shared SNPs; average pairwise differences; IGFBP7 gene structure; primate phastCons score; TFBS conserved in human/mouse/rat; TFBS identified by ChIP-seq; chromatin state segmentation by HMM; DNaseI hypersensitive sites; open chromatin by FAIRE. Transcription factor labels: RelA, CUTL1, SRF, Bach1, STAT3, GATA-2, ISGF-3. Annotations: regulatory region in HUVEC; regulatory region in NHEK and HMEC; weak and strong enhancers.]
![Page 45: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/45.jpg)
• In total, 130 regions with shared human-chimpanzee haplotypes. Six clear-cut cases of ancient balanced polymorphisms.
• None are protein-coding. Eleven occur in non-coding genes (e.g., 7 in lincRNAs). Eleven compelling cases of regulatory regions.
• What do these regions have in common?
![Page 46: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/46.jpg)
SNPs shared by humans and chimpanzees
• Shared haplotypes: closest gene within 20 kb of a human-chimp shared haplotype (n=26, p=2×10⁻⁵, FDR=0.03)
• Shared coding SNPs: genes with a human-chimp shared coding SNP (n=99, p=0.017, FDR=0.20)
Enrichment of membrane glycoproteins -> host-pathogen interactions
[Figure: both panels highlight glycoproteins.]
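The slide reports an FDR next to each enrichment p-value but does not say how it was computed. Assuming a Benjamini-Hochberg style adjustment across the tested categories, a minimal sketch is below; the function name and the toy p-values are hypothetical and are not the values behind the slide.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four annotation categories.
toy_pvalues = [0.01, 0.04, 0.03, 0.20]
fdr = benjamini_hochberg(toy_pvalues)
print([round(q, 4) for q in fdr])
```

The adjusted value for a category depends on how many categories were tested, which is why a small p-value can still carry a modest FDR.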
![Page 47: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/47.jpg)
Project Participants
• University of Oxford: Adam Auton, Rory Bowden, Peter Humburg, Zam Iqbal, Gerton Lunter, Julian Maller, Simon Myers, Susanne Pfeifer, Isaac Turner, Oliver Venn, Peter Donnelly (PI), Gil McVean (PI)
• Biomedical Primate Research Centre: Ronald Bontrop
• University of Chicago: Adi Fledel-Alon, Ryan Hernandez (UCSF), Ellen Leffler, Cord Melton, Laure Segurel, Molly Przeworski (PI)
• Funders: Howard Hughes Medical Institute, National Institutes of Health, Royal Society, Wellcome Trust
![Page 48: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/48.jpg)
Where next?
![Page 49: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/49.jpg)
Remarkable structural and sequence diversity in chimp PRDM9
![Page 50: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/50.jpg)
Variation greater than in human populations
![Page 51: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/51.jpg)
Little correlation in fine-scale structure around DNA repeat elements
![Page 52: The tangled genome Gil McVean. The real heroes](https://reader036.vdocument.in/reader036/viewer/2022062314/56649dd15503460f94ac7796/html5/thumbnails/52.jpg)
No activating motif discovered in chimp
CCTCCCT
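Testing whether a motif such as the 7-mer on this slide is associated with hotspots starts with locating its occurrences on both strands. A minimal sketch, assuming plain string matching with no degenerate bases; the helper names and the toy sequence are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def motif_positions(seq, motif="CCTCCCT"):
    """Return (0-based start, strand) for each exact motif match on either strand."""
    seq = seq.upper()
    # A minus-strand match appears as the reverse complement on the plus strand.
    targets = {motif: "+", reverse_complement(motif): "-"}
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window in targets:
            hits.append((i, targets[window]))
    return hits


# Hypothetical toy sequence with one match on each strand.
toy_seq = "AACCTCCCTGGAGGGAGGTT"
hits = motif_positions(toy_seq)
print(hits)
```

Comparing the density of such hits inside and outside inferred hotspots would be the next step, which this sketch leaves out.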