Next-Generation Sequencing
Eric Jorgenson
Epidemiology 217
2/28/12
Outline
• Overview of Sequencing
• Example Next Generation Sequencing Study: Whole Genome, Exome, Families (IBD), Cancer
• PTC Taste Sensitivity
Links to videos and articles
• http://www.bloomberg.com/video/84364498/
• http://www.bloomberg.com/video/84364540/
• http://www.bloomberg.com/video/86406762/
• http://www.bloomberg.com/news/2012-01-17/search-genome-as-tennis-thrice-weekly-no-barrier-to-decoded-dna.html
• http://www.bloomberg.com/news/2012-02-15/harvard-mapping-my-dna-turns-scary-as-threatening-gene-emerges.html
Huntington’s Disease Testing
Almqvist AJHG 1999
Number of Genetic Markers for Genetic Studies
• Genome-wide Linkage Studies: 300-400 Microsatellite Markers
• Genome-wide Association Studies: 100,000-2,500,000 SNPs
• Exome Sequencing Studies: 30,000,000 Basepairs
• Gene-based Studies: 22,000 Genes
• Whole Genome Sequencing Studies: 3,200,000,000 Basepairs
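To put these scales side by side, here is a quick back-of-the-envelope calculation using only the figures listed above. The even-spacing figure is a simplifying assumption for illustration; real SNP panels are not evenly spaced.

```python
# Figures taken from the slide above.
EXOME_BP = 30_000_000
GENOME_BP = 3_200_000_000
GWAS_SNPS_MAX = 2_500_000

# Share of the genome targeted by an exome study.
exome_fraction = EXOME_BP / GENOME_BP

# Average spacing if the densest GWAS panel were spread evenly
# across the genome (illustrative assumption only).
bp_per_snp = GENOME_BP / GWAS_SNPS_MAX

print(f"Exome as share of genome: {exome_fraction:.2%}")
print(f"Average spacing of {GWAS_SNPS_MAX:,} SNPs: {bp_per_snp:,.0f} bp")
```

An exome study therefore reads roughly 1% of the basepairs of a whole genome study.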
Variant detection through next generation sequencing
Meyerson et al. NRG 2010
ANNOVAR: Using Annotation to Narrow the Search Space
openbioinformatics.org/annovar
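The idea of narrowing the search space by annotation can be sketched as a simple filter. This is not ANNOVAR's interface: the `Variant` record, its field names, and the gene names below are hypothetical and only illustrate the kind of filtering that annotation makes possible.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    gene: str       # gene the variant falls in (hypothetical field)
    effect: str     # e.g. "synonymous", "missense", "nonsense", "frameshift"
    known: bool     # True if already catalogued in a variant database

# Effects assumed here to alter the protein.
PROTEIN_ALTERING = {"missense", "nonsense", "frameshift"}

def narrow(variants, candidate_genes):
    """Keep novel, protein-altering variants in the candidate genes."""
    return [
        v for v in variants
        if v.effect in PROTEIN_ALTERING
        and not v.known
        and v.gene in candidate_genes
    ]

# Toy data with made-up gene names.
calls = [
    Variant("GENE_A", "synonymous", False),
    Variant("GENE_A", "missense", False),
    Variant("GENE_B", "missense", True),
    Variant("GENE_C", "nonsense", False),
]
shortlist = narrow(calls, candidate_genes={"GENE_A", "GENE_B"})
print(shortlist)
```

Each filter step discards variants unlikely to be causal, so the order and strictness of the steps are study-specific choices.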
Outline
• Overview of Sequencing
• Example Next Generation Sequencing Study: Whole Genome, Exome, Families (IBD), Cancer
• PTC Taste Sensitivity
Sequencing of a Single Individual with Family Data
Lupski et al. NEJM 2010
CMT Subtypes: Many Genes
Phenotypes in Unsequenced Family Members
SNP Distribution in Proband
The First 8 Human Genomes
Nonsynonymous SNPs in Known Disease Genes
Family Pedigree
Putative Causal Variant at a Conserved Amino Acid
Exome Sequencing Identifies a Tibetan Adaptation
Yi et al. Science 2010
Family Sequencing for Rare Diseases
Roach et al. Science 2010
Sequence Data Improves Identity By Descent Resolution
Su and Jorgenson under review
Cancer: Tumor vs. Normal
Lee et al. Nature 2010
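The tumor vs. normal comparison can be sketched as a set difference: variants called in the tumor but absent from the matched normal sample are candidate somatic mutations. This is a minimal illustration with made-up calls, not the method of the cited studies; real pipelines also have to account for coverage, tumor purity, and sequencing error.

```python
def somatic_candidates(tumor_calls, normal_calls):
    """Return variants seen in the tumor but not in the matched normal.

    Each call is a (chromosome, position, ref, alt) tuple.
    """
    return set(tumor_calls) - set(normal_calls)

# Toy calls; positions and alleles are invented for illustration.
normal = [("1", 1000, "A", "G"), ("2", 5000, "C", "T")]
tumor = [("1", 1000, "A", "G"), ("2", 5000, "C", "T"), ("7", 42000, "G", "A")]

candidates = somatic_candidates(tumor, normal)
print(sorted(candidates))
```

Variants shared by both samples are treated as germline and drop out of the comparison.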
Molenaar et al. Nature 2012
Nonsynonymous Somatic Mutations in Neuroblastoma
Molenaar et al. Nature 2012
Mutation count associated with age, stage, and survival
Outline
• Overview of Sequencing
• Example Next Generation Sequencing Study: Whole Genome, Exome, Families (IBD), Cancer
• PTC Taste Sensitivity
Distribution of PTC Phenotype
[Figure axes: PTC Score; Number of Subjects]
TAS2R38 Receptor Structure
Kim et al. J Dent Res 2004
3 SNPs Form 3 Haplotypes
• PAV: Taster
• AVI: Non-taster
• AAV: Rare
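The haplotype labels above can be combined into diplotypes, which later slides use to group subjects. A minimal sketch follows; the function names are hypothetical, and the code only labels diplotypes without predicting a PTC score.

```python
# Haplotype classes as listed on the slide.
HAPLOTYPE_CLASS = {"PAV": "Taster", "AVI": "Non-taster", "AAV": "Rare"}

def diplotype(h1, h2):
    """Return an order-independent diplotype label such as 'AVI/PAV'."""
    for h in (h1, h2):
        if h not in HAPLOTYPE_CLASS:
            raise ValueError(f"unknown haplotype: {h}")
    return "/".join(sorted((h1, h2)))

def is_avi_homozygote(h1, h2):
    """True when both haplotypes are the non-taster AVI form."""
    return h1 == h2 == "AVI"

print(diplotype("PAV", "AVI"))
print(is_avi_homozygote("AVI", "AVI"))
```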
PTC Phenotype by TAS2R38 Diplotype
[Figure axes: PTC Score; Number of Subjects]
Outliers After Adjusting for TAS2R38 Diplotype
[Figure axes: PTC Score; Number of Subjects]
Unusual PTC Phenotypes (AVI Homozygotes in Green)
Unusual PTC Phenotypes (AVI Homozygotes in Green)
10 Genomes, 5 Hard Drives
Summary of Variation: Utah

                            Sample 1    Sample 2    Sample 3    Sample 4    Sample 5
Gender                      Female      Female      Male        Female      Female
Total Sequence (Gb)         214         220         218         243         219
Fraction fully called       0.95        0.96        0.96        0.97        0.96
Coverage (X fold)           53          55          53          63          54

SNPs                        3,270,920   3,269,487   3,278,557   3,355,266   3,341,154
Insertions                  184,763     190,633     197,830     210,805     206,120
Deletions                   195,419     200,495     208,031     221,532     216,578

Synonymous SNPs             9,666       9,547       9,808       10,004      9,981
Missense SNPs               9,253       9,135       9,350       9,486       9,581
Nonsense SNPs               90          97          82          88          92
Frameshift Insertions       103         102         97          112         127
Frameshift Deletions        99          101         91          108         116

Novel SNPs (fraction)       0.04        0.04        0.04        0.04        0.04
Novel Insertions (fraction) 0.18        0.18        0.19        0.20        0.19
Novel Deletions (fraction)  0.22        0.22        0.23        0.23        0.23
Quality Control: 99.8% Concordance
Sample 1 (rows: Sequencing; columns: Genotyping)

                        Homozygous Reference   Heterozygous   Homozygous Variant
Homozygous Reference    479,773                429            422
Heterozygous            426                    234,156        293
Homozygous Variant      65                     168            172,479
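The 99.8% figure can be checked from the Sample 1 counts above: concordance is the share of sites where sequencing and genotyping give the same call, i.e. the diagonal of the 3x3 table divided by the total. The variable names are illustrative.

```python
# Counts from the Sample 1 table; the order of genotype classes is
# homozygous reference, heterozygous, homozygous variant on both axes.
counts = [
    [479_773, 429, 422],
    [426, 234_156, 293],
    [65, 168, 172_479],
]

concordant = sum(counts[i][i] for i in range(3))   # matching calls
total = sum(sum(row) for row in counts)            # all compared sites
concordance = concordant / total

print(f"{concordant:,} of {total:,} sites agree ({concordance:.1%})")
```

The result does not depend on which method is assigned to rows and which to columns, since only the diagonal and the total are used.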
Variant Distribution in Utah
Using Relatedness
Identity By Descent
Nonsynonymous Variants

Chromosome   Gene      Start       Stop
1            ANXA9     150958836   150958836
1            S100A7A   153391729   153391729
5            PCDHB5    140517174   140517174
6            HLA-A     29910604    29910604
6            HLA-A     29912856    29912856
7            ZAN       100371474   100371474
9            BAG1      33264540    33264540
10           WDR96     105957714   105957714
10           GSTO1     106027059   106027059
11           MUC6      1018116     1018116
11           OR4C45    48367311    48367311
11           TRIM64    89701844    89701844
12           NANOGNB   7917936     7917936
14           ZBTB1     64988830    64988830
19           PSG3      43243238    43243238
X            HDHD1     6975782     6975782
X            ZCCHC16   111698036   111698036
X            SAGE1     134994005   134994005
X            GPR101    136112707   136112707
How can whole genome sequence influence treatment?
• Identify Genes with Protein Altering Mutations
• Determine Variation in Specific Genes
Genes with Protein Altering Variants
[Figure: bar chart for Samples 1 to 10; axis from 0 to 9000; legend: Known, Novel]
ABO Blood Group
Determination of ABO Type

Population   Sample      Position 261   Position 796   Position 803   Genotype   Blood Type
Utah         Sample 1    -/-            G/G            C/C            OO         O
Utah         Sample 2    -/-            G/G            C/C            OO         O
Utah         Sample 3    -/-            G/G            C/C            OO         O
Utah         Sample 4    C/-            G/G            C/C            AO         A
Utah         Sample 5    C/-            G/G            C/C            AO         A
Costa Rica   Sample 6    C/-            G/G            C/C            AO         A
Costa Rica   Sample 7    C/-            G/G            C/C            AO         A
Costa Rica   Sample 8    -/-            G/G            C/C            OO         O
Costa Rica   Sample 9    C/?            G/G            C/C            AO or AA   A
Costa Rica   Sample 10   -/-            G/G            C/C            OO         O
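The mapping from the position 261 call to ABO genotype in the ten samples above can be sketched as a small lookup. This sketch assumes positions 796 and 803 are G/G and C/C, as they are in every sample shown, so it does not handle the B allele; the function name and the "undetermined" label are illustrative choices.

```python
# Allele at position 261 as reported on the slide:
# "-" (deletion) marks the O allele, "C" the A allele.
ALLELE_AT_261 = {"-": "O", "C": "A"}

def abo_from_261(call):
    """Return (ABO genotype, blood type) from a diploid call like 'C/-'.

    Assumes 796 = G/G and 803 = C/C (no B allele present).
    A '?' marks an allele that could not be called.
    """
    known = [ALLELE_AT_261[a] for a in call.split("/") if a in ALLELE_AT_261]
    n_unknown = 2 - len(known)
    if n_unknown == 0:
        genotype = "".join(sorted(known))       # "AA", "AO" or "OO"
        blood_type = "A" if "A" in known else "O"
        return genotype, blood_type
    if n_unknown == 1 and known == ["A"]:
        # One A allele is enough for blood type A, whatever the other is.
        return "AO or AA", "A"
    return "undetermined", "undetermined"

for call in ("-/-", "C/-", "C/?"):
    print(call, abo_from_261(call))
```

Sample 9 shows why a partial call can still be useful: the blood type is resolved even though the genotype is not.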
Appendix: Study Design Considerations in Sequencing
• Extreme Sampling
• Families
Sampling from the Extremes of a Quantitative Distribution
Su and Jorgenson under revision
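Extreme sampling can be sketched as picking subjects from the bottom and top tails of the phenotype distribution. The example below uses deciles and made-up subject IDs and scores; it illustrates the selection step only and is not the analysis from the cited manuscript.

```python
import statistics

def extreme_samples(phenotypes, n_tiles=10):
    """Split (subject_id, score) pairs into low-tail and high-tail subjects.

    With n_tiles=10 the tails are the bottom and top deciles.
    """
    scores = [score for _, score in phenotypes]
    cuts = statistics.quantiles(scores, n=n_tiles)  # n_tiles - 1 cut points
    low_cut, high_cut = cuts[0], cuts[-1]
    low = [sid for sid, score in phenotypes if score <= low_cut]
    high = [sid for sid, score in phenotypes if score >= high_cut]
    return low, high

# Toy cohort: 100 subjects with scores 1..100.
subjects = [(f"S{i}", float(i)) for i in range(1, 101)]
low, high = extreme_samples(subjects)
print(len(low), len(high))
```

Only the selected subjects would then be sequenced, which concentrates the sequencing budget on the most informative individuals.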
Relative Power of Sampling from Various Phenotype Deciles
Su and Jorgenson under revision
Roach et al. Science 2010
Families can reduce error rates
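One way family data helps with errors is that a child's genotype must be obtainable from one allele of each parent, so calls that break this rule can be flagged. The check below is a minimal sketch for a single autosomal site in a trio; it ignores de novo mutations and is not the procedure from the cited paper.

```python
from itertools import product

def mendelian_consistent(father, mother, child):
    """True if the child's genotype can arise from the parents' genotypes.

    Each genotype is a pair of alleles, e.g. ("A", "G").
    """
    # Every combination of one paternal and one maternal allele.
    possible = {tuple(sorted(pair)) for pair in product(father, mother)}
    return tuple(sorted(child)) in possible

# Toy calls at one site.
print(mendelian_consistent(("A", "A"), ("A", "G"), ("A", "G")))  # consistent
print(mendelian_consistent(("A", "A"), ("A", "G"), ("G", "G")))  # flagged
```

A flagged site points to a likely genotyping error in at least one family member, although the check alone does not say which one.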