the $0 genome & personalgenomes - harvard...
TRANSCRIPT
![Page 1: The $0 Genome & PersonalGenomes - Harvard Universityarep.med.harvard.edu/gmc/ppt/09Jun9_CGC_Hynes.pdf · Helicos SbP High parallelism & quantitation 6. CGI SbL Rolony grid & 100Kb](https://reader033.vdocument.in/reader033/viewer/2022041502/5e22678d4381c65e0073c4c2/html5/thumbnails/1.jpg)
1
3:50-4:20 PM GC at CGC 9-Jun-2009
Thanks to:
The $0 Genome & PersonalGenomes.org
Azco
RBH
2
What does $0 to the consumer mean?
1991 Linux
1993 WWW
2001 Wikipedia
1998 Google Search, Maps, Translate, Health..
3
Specifications other than cost
1. Speed (really real-time)
2. No reagents or stable in harsh conditions
3. Portability (Instrument size)
4. Read length (Mbp)
5. Keep DNA parts together in mixtures
6. Subsequence targeting (e.g. drug resistance)
Figure labels: emulsion (oil, H2O); microbe chromosomes; barcode
4
DNA Explorer, $80 (Ages 10 and up) www.discovery.com
Genographic Project $99
DIY Bio
23andme $399
Time Magazine Nov 2008 invention of the year
5
DTC SNP chips : Breast Cancer
deCODEme: “does not include the high-risk but rare BRCA1 and BRCA2 breast cancer risk variants”.
Navigenics: “Mutations in BRCA1 or BRCA2 are less common in the population and are only present in approximately 5 – 10% of families with breast and ovarian cancer.”
23andme: “Hundreds of cancer-associated BRCA1 and BRCA2 mutations have been documented, but three specific BRCA mutations are worthy of note because they are responsible for a substantial fraction of hereditary breast cancers and ovarian cancers among women with Ashkenazi Jewish ancestry”.
1M vs 3G
6
“Genes Show Limited Value in Predicting Diseases”
Nicholas Wade April 15, 2009
David B. Goldstein, Ph.D.: “We must therefore turn more sharply toward the study of rare variants.”
(Common SNP backlash)
7
Valuable Personal Genome Sequences
1464 genes are highly predictive & medically actionable (inherited & cancer) at ~$2K per gene.
**Very few of these are on SNP chips.** Why?
PKU, Tay Sachs, Cystic Fibrosis, BRCA1/2, etc.
Pharmacogenomic drug/allele combinations: Herceptin, Iressa, ..
Also: Ancestry, Forensics, Social Networking, Education, Research
8
Multigenic rare causative alleles can yield strong or weak GWA with a common allele
Strong GWA: Cases vs Controls
Weak GWA: Cases vs Controls
Red = haplotype block
9
[Plot: sequencing bp/$ (log scale, 0.01 to 10,000,000) vs year (1980 to 2010)]
(Moore’s law) 1.5x/yr for electronics vs 10x/yr for DNA sequencing
4 logs in 4 years
1995: gel: $3G
2005: capil: $50M
Pol: $50K
2009: Lig: $5K
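The "4 logs in 4 years" claim can be checked with a little arithmetic. The sketch below reads the slide's dollar annotations as cost per genome (an assumption), and the function name is illustrative, not something from the talk.

```python
# Annualized fold-improvement implied by two (year, cost) points.
# Assumption: the slide's dollar figures are costs per genome.

def annual_fold_change(year_start, cost_start, year_end, cost_end):
    """Return the per-year factor by which cost fell between two dates."""
    years = year_end - year_start
    return (cost_start / cost_end) ** (1.0 / years)

# 2005 capillary ($50M) to 2009 ligation ($5K): a 10^4 drop over 4 years.
seq_rate = annual_fold_change(2005, 50e6, 2009, 5e3)

# Electronics at 1.5x/yr compounds to roughly 5x over the same 4 years.
moore_4yr = 1.5 ** 4

print(f"sequencing: ~{seq_rate:.1f}x per year")
print(f"electronics over 4 years: ~{moore_4yr:.2f}x")
```

Under these assumptions the sequencing curve improves about tenfold per year, while 1.5x/yr compounds to only about fivefold across the whole four-year span.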
10
Ultra-low-cost sequencing
1. Polonator SbL/P Open-source $170K device, haplotypes
2. Roche-454 SbP Long reads (>0.4 kb)
3. Illumina-GA SbP Fluorescent read-length 2*110 bp
4. AB-SOLiD SbL Longest ligation reads
5. Helicos SbP High parallelism & quantitation
6. CGI SbL Rolony grid & 100Kb haplotypes $5K genome
7. Ion Torrent SbP Potentially small device
8. Genizon BioSci SbH In situ sequencing
9. LightSpeed SbL 16X higher density, >10X speed
10. Intelligent Bio SbP Hexagonal grid
11. Pacific Bio SbP Long reads (>2.0 kb)
12. Bionanomatrix SbP Fluorescent mapping (>300kb)
13. Visigen SbP
14. OxfordNanopore Pore Potentially small device
15. Nabsys Pore Potentially small device
16. Halcyon EM Long reads (>300kb)
17. ZS Genetics EM Long reads (>300kb)
Polonator Polonator
11
Sequencing mC
G T A C
Clarke, Bayley, et al. Nature Nanotech 2009
12
Electron Microscopy
YOYO labeled stretched ss-M13-DNA on PDMS
15 μm = 30 kb
Pt-G-ssDNA 0.5 nm = 1 base
William Andregg, et al. unpublished, 2009
.gg...gg....g..g.....gg.n....g...g...g..ggg.....gg.gg....n..
13
Electron Microscopy
Osmium T
William Andregg, et al. unpublished, 2009
10 vs 10K fps
14
Why open-architecture hardware, software, wetware?
Polonator
1999-2009
$170K
2 billion reads per run
Precedents:
1981 IBM PC
1991 Linux
1993 WWW
2001 Wikipedia
Rich Terry
Figure 4.6.1 Polonator instrument
A shared resource: Pol & Ligase chemistries
15
Anonymity vs Open-access? Are we in denial?
Trends in laws to make data public (not just at elite institutions): e.g. H.R. 2764, SEC. 218. 26-Dec-07 open-access publishing for all NIH-funded research.
(12) Identify individual case/control status from pooled SNP data (Homer et al. PLoS Genetics 2008). As this became known, NCBI pulled dbGaP data.
(11) Re-identification after “de-identification” using public data. Group Insurance list of birth date, gender, zip code sufficient to re-identify medical records of Governor Weld & family via voter-registration records (1998)
Self identification trend
(10) Unapproved self-identification. e.g. Celera IRB. (Kennedy Science. 2002)
(9) Obtaining data about oneself via FOIA or sympathetic researchers.
(8) DNA data CODIS data in the public domain.
even if acquitted
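Item (11) describes a linkage attack: records stripped of names can still be matched to a named public list when both share birth date, gender and zip code. A minimal sketch of that join follows. Every record, field name and function name here is invented for illustration and is not drawn from the actual case.

```python
# Toy linkage sketch: all records below are hypothetical.
from collections import defaultdict

QUASI_IDENTIFIERS = ("birth_date", "gender", "zip_code")

def link_records(deidentified, public):
    """Match records that share every quasi-identifier value.

    Returns (diagnosis, name) pairs only when exactly one public record
    matches, i.e. when the combination of values is unique.
    """
    index = defaultdict(list)
    for person in public:
        key = tuple(person[field] for field in QUASI_IDENTIFIERS)
        index[key].append(person["name"])

    matches = []
    for record in deidentified:
        key = tuple(record[field] for field in QUASI_IDENTIFIERS)
        names = index.get(key, [])
        if len(names) == 1:
            matches.append((record["diagnosis"], names[0]))
    return matches

medical = [
    {"birth_date": "1950-01-01", "gender": "M", "zip_code": "00001", "diagnosis": "condition A"},
    {"birth_date": "1972-03-04", "gender": "F", "zip_code": "00002", "diagnosis": "condition B"},
]
voters = [
    {"name": "Person One", "birth_date": "1950-01-01", "gender": "M", "zip_code": "00001"},
    {"name": "Person Two", "birth_date": "1972-03-04", "gender": "F", "zip_code": "00002"},
    {"name": "Person Three", "birth_date": "1972-03-04", "gender": "F", "zip_code": "00002"},
]

reidentified = link_records(medical, voters)
print(reidentified)
```

In this toy data only the first record is re-identified, because two voters share the second combination. The attack works whenever a combination of ordinary attributes happens to be unique.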
16
Anonymity vs Open Access? Are we in denial?
Accessing “Secure data”
(7) Laptop loss. 26 million Veterans' medical records, SSN & disabilities stolen Jun 2006.
(6) Hacking. A hacker gained access to confidential medical info at the U. Washington Medical Center -- 4000 files (names, conditions, etc, 2000)
(5) Combination of surnames from genotype with geographical info. An anonymous sperm donor traced on the internet 2005 by his 15 year old son who used his own Y chromosome data.
(4) Identification by phenotype. If CT or MR imaging data is part of a study, one could reconstruct a person’s appearance. Even blood chemistry can be identifying in some cases.
(3) Inferring phenotype from genotype. Markers for eye, skin, and hair color, height, weight, geographical features, dysmorphologies, etc. are known & the list is growing.
(2) “Abandoned DNA” bearing samples (e.g. hair, dandruff, hand-prints, etc.)
(1) Government subpoena. False positive IDs and/or family coercion
index
17
Who can contribute to cures?
Huntington's Nancy Wexler (psychologist)
Adrenoleukodystrophy
Odone (World Bank)
Parkinson’s Brin family
Hugh Rienhoff (MD)
MyDaughtersDNA.org
ALS Jamie Heywood (engineer)
PatientsLikeMe.com
Motivating, donating data ... access to data?
LRRK2 G2019S
HFE Aull (engineer)
18
Genes, environments, traits, cells
1) First/only open access data
2) Avoid over-promising on de-identification
3) 100% on Exam to assure informed consent (*Educate pre-consent rather than post-discovery*)
4) Low cost coding sequence + regulatory data
5) Multi-traits: images, iPS-etc. RNA, microbe/VDJ
6) Cells available for personal functional genomics
7) IRB approval for 100,000 diverse volunteers
501(c)(3)
0431
1070
1660
1677
1687
1833
1846
1731
1730
1781
19
Traitomatic: 7 diploid +10 PGP sequences: hypertrophic cardiomyopathy allele
20
Diagnostics Systems Biology Challenge
TRAITS (Phenome)
Genome
6 Gbp
3M Alleles
NOT going from ONLY Genome Sequence to Prediction
21
PersonalGenomes.org
Inherited, Somatic, Environmental Genomics
VDJ-ome
TRAITS (Phenome)
Personal stem-cells
epigenome (RNA, mC)
PERSONAL GENOME
6 Gbp
3M alleles
One in a life-time genome + yearly (to daily) tests
Public Health Bio-weathermap.org : Allergens, Microbes, Viruses
Microbiome
~5 new non-synonymous alleles per generation
22
Microbiome vs VDJ-ome
Microbe tests: Detect Drug resistance spectrum
Earlier warning (e.g. meningitis)
Immune tests: Focus on response to exposure
Longer times to detect exposure (e.g. HIV, TB)
23
Multiple Phyla Subsisting on 18 Antibiotics
Dantas, Sommer, Church, Science 2008
(& lignin)
24
PersonalGenomes.org
Inherited, Somatic, Environmental Genomics
VDJ-ome
TRAITS (Phenome)
Personal stem-cells
epigenome (RNA, mC)
PERSONAL GENOME
3M alleles
One in a life-time genome + yearly (to daily) tests
Public Health Bio-weather map : Allergens, Microbes, Viruses
Microbiome
25
Epigenome: DNA - RNA - Protein
Regulatory RNA & Proteins
26
Selective genome sequencing
Shendure, et al. Science 309:1728
Porreca et al 2007 Nat Methods 4:931
Nilsson et al. (2006) Trends Biotechnol 24:83.
Red=Synthetic; Yellow=genome/cDNA
Optimize 258K oligos: 148,949 exons, 20,065 CCDS genes.
3 ways to capture alleles from genomic or c-DNA
In vitro Paired-end-tags (PET)
Science 2005 Science 2005
Hybridiz. selection
Zhang, Chou, Shendure, Li, Leproust, Dahl, Davis, Nilsson, Church
For rearrangements
2. 3.1.
GapFill
Nat Methods 2007
3.
27
Array Synthesis of Padlock Probes
barcodes
28
Barcoding RNAs
Efficient microRNA capture and barcoding via enzymatic oligonucleotide adenylation.
Vigneault et al. Nature Methods 2009
Diagram labels: PO4, App, 3’-OH, 5’, X3’, T4 RNA ligase, ATP
29
RNA editing: A to I(G)
# of known cases increased from 10 to 569
Erez Levanon
Genomic DNA
RNA - intestine
RNA - kidney
RNA - diencephalon
RNA - frontal lobe
RNA - corpus callosum
RNA - cerebellum
Li, Levanon,Yoon,Aach, Xie, LeProust, Zhang, Gao, Church (Science 2009)
e.g. VEZF1
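Because inosine is read as G during sequencing, a candidate A-to-I site is a position where the genomic DNA shows A but RNA-derived reads show G. The sketch below illustrates only that comparison. The sequences are invented, the function name is hypothetical, and this is not the authors' pipeline, which would also have to rule out SNPs and sequencing errors.

```python
# Toy comparison of a genomic sequence with pre-aligned cDNA sequences.
# All sequences are made up for illustration.

def candidate_a_to_g_sites(genomic, cdna):
    """Return 0-based positions where the genome has A and the cDNA has G."""
    if len(genomic) != len(cdna):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        i
        for i, (g, r) in enumerate(zip(genomic.upper(), cdna.upper()))
        if g == "A" and r == "G"
    ]

genomic_dna = "CTAGGATCAATG"
cdna_by_tissue = {
    "intestine": "CTAGGATCAATG",   # identical to the genome: no candidates
    "cerebellum": "CTGGGATCAGTG",  # A>G at two positions
}

sites = {
    tissue: candidate_a_to_g_sites(genomic_dna, seq)
    for tissue, seq in cdna_by_tissue.items()
}
print(sites)
```

Comparing the same genomic position across tissues, as the slide's tracks do, shows where editing is tissue-specific.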
30
Regulation & Methylation
High Expression = High Gene-Body to Promoter Ratio
Ball, Li, Gao, Lee, LeProust, Park, Xie, Daley, Church. (Nature Biotech 2009)
Genome wide bisulfite & enzyme assays unrestricted by CpG Island bias
31
Allele-specific expression (ASE)
N=1: Combine all cis element variants & eliminate environmental & trans-acting variation among individuals.
Cis: Copy number, enhancer, promoter, splicing, polyA, termination, transport, decay.
Allele-specific transcription factor (TF) binding
Causality: Synthetic homologous allele-replacement
Zhang, Li, Church unpublished
Forton et al. Genome Res. 2007
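Allele-specific expression is measured within one individual: at a heterozygous site, reads from each allele are counted and compared with the 50:50 split expected if both copies are expressed equally. One common way to score this, sketched below, is an exact two-sided binomial test. The read counts and function name are hypothetical, and the slide does not say which statistic the authors used.

```python
from math import comb

def allelic_imbalance(ref_reads, alt_reads):
    """Return (ref_fraction, two-sided exact binomial p-value against 50:50)."""
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads cover this site")
    # Under the null each read comes from either allele with p = 0.5,
    # so an outcome with k reference reads has probability C(n, k) / 2^n.
    observed = comb(n, ref_reads)
    # Two-sided p-value: sum outcomes no more likely than the observed one.
    tail = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return ref_reads / n, tail / 2 ** n

balanced = allelic_imbalance(10, 10)  # hypothetical counts
skewed = allelic_imbalance(18, 2)     # hypothetical counts

print(balanced)
print(skewed)
```

Because both alleles sit in the same cell, environmental and trans-acting effects cancel out, which is the "N=1" point on the slide.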
32
PersonalGenomes.org: skin to stem cells to many types
Park & Lee
Hair or skin sample
33
Clustering stat-significant allele-specific expression in reprogrammed cells, ~50% of ASE invariant among cell types
Lee, Zhang, Park, Daley, Church
34
PersonalGenomes.org
Inherited, Somatic, Environmental Genomics
VDJ-ome
TRAITS (Phenome)
Personal stem-cells
epigenome (RNA, mC)
PERSONAL GENOME
6 Gbp
3M alleles
One in a life-time genome + yearly (to daily) tests
Public Health Bio-weather map : Allergens, Microbes, Viruses
Microbiome