![Page 1: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/1.jpg)
Thanks to:
George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM
Personal Genomics meets Quantitative Proteomics
NHGRI Seq Tech 2004: Agencourt, 454, Microchip, 2005: Nanofluidics, Network, VisiGen Affymetrix, Helicos, Solexa-Lynx
![Page 2: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/2.jpg)
"Open-source" Personal Genome Project (PGP)
• Harvard Medical School IRB Human Subjects protocol submitted 16-Sep-2004, approved 31-Aug-2005.
• Gradual plan. Start with "highly-informed" individuals consenting to non-anonymous genomes & extensive phenotypes (medical records, imaging, omics).
• Cell lines in Coriell NIGMS Repository
• Diploid genome subsets at $0.1/kb, <3E-7 FP errors
How? Polony bead Sequencing-by-Ligation (SbL)
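To get a feel for the cost and error targets quoted above, the arithmetic can be sketched as follows. This is only an illustration: it assumes the "<3E-7 FP" figure is a per-base false-positive rate, and the 60 Mb subset size is a hypothetical value that does not come from the slide.

```python
def sequencing_estimate(subset_bases, cost_per_kb=0.1, fp_rate=3e-7):
    """Return (cost in dollars, upper bound on expected false-positive calls).

    cost_per_kb and fp_rate default to the targets quoted on the slide;
    treating fp_rate as a per-base rate is an assumption.
    """
    cost = subset_bases / 1000 * cost_per_kb
    expected_fp = subset_bases * fp_rate
    return cost, expected_fp


# Hypothetical subset: 60 Mb of diploid sequence (not a number from the slide).
cost, expected_fp = sequencing_estimate(60_000_000)
print(f"cost about ${cost:,.0f}, expected false positives below {expected_fp:.0f}")
```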
![Page 3: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/3.jpg)
Analyses of single chromosomes (single cells, RNAs, particles)
(1) When we only have one cell as in Preimplantation Genetic Diagnosis (PGD) or environmental samples
(2) Candidate chromosome region sequencing
(3) Prioritizing or pooling (rare) species based on an initial DNA screen.
(4) Multiple chromosomes in a cell or virus
(5) RNA splicing
(6) Cell-cell interactions (predator-prey, symbionts, commensals, parasites)
![Page 4: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/4.jpg)
CD44 Exon Combinatorics (Zhu & Shendure)
• Alternatively spliced cell adhesion molecule
• Specific variable exons are up- or down-regulated in various cancers (>2000 papers)
• v6 & v7 enable direct binding to chondroitin sulfate, heparin…
Zhu J, et al. Science 301:836-8.
![Page 5: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/5.jpg)
Zhu J, Shendure J, Mitra RD, Church GM. Science 301:836-8. Single molecule profiling of alternative pre-mRNA splicing.
| EXON PATTERN | Eph4 | Eph4bDD | TOTAL | RATIO | P-VALUE |
|---|---|---|---|---|---|
| `------------7-8-9-10` | 609 | 764 | 1373 | 1.17 | 1E-4 |
| `--------------8-9-10` | 320 | 390 | 710 | 1.13 | 3E-2 |
| `----------6-7-8-9-10` | 431 | 251 | 682 | -1.85 | 4E-18 |
| `------4-5-6-7-8-9-10` | 218 | 216 | 434 | -1.08 | 2E-1 |
| `----------------9-10` | 68 | 143 | 211 | 1.96 | 7E-7 |
| `--------5-6-7-8-9-10` | 86 | 39 | 125 | -2.37 | 2E-6 |
| `----3-4-5-6-7-8-9-10` | 40 | 56 | 96 | 1.30 | 9E-2 |
| `------4-5---7-8-9-10` | 16 | 74 | 90 | 4.30 | 2E-9 |
| `--2-3-4-5-6-7-8-9-10` | 44 | 28 | 72 | -1.69 | 1E-2 |
| `1-2-3-4-5-6-7-8-9-10` | 22 | 5 | 27 | -4.73 | 3E-4 |
| `--------5---7-8-9-10` | 5 | 19 | 24 | 3.53 | 3E-3 |
| `----3-4-5---7-8-9-10` | 1 | 15 | 16 | 13.95 | 4E-4 |
| `--2-3-4-5---7-8-9-10` | 1 | 10 | 11 | 9.30 | 5E-3 |
Eph4 = murine mammary epithelial cell line
Eph4bDD = stable transfection of Eph4 with MEK-1 (tumorigenic)
CD44 RNA isoforms
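The RATIO values in the isoform counts look consistent with a fold change between the two cell lines after scaling each count by its library total, with decreases written as negative reciprocals. That reading is inferred from the numbers, not stated on the slide. A minimal sketch under that assumption follows; it normalizes by the totals of the listed rows only, so its results differ slightly from the slide (about 1.16 instead of 1.17 for the first row).

```python
def signed_fold_ratio(a, b, total_a, total_b):
    """Fold change of b over a after scaling each count by its library total.

    Values >= 1 mean the isoform is enriched in b. When the isoform is
    depleted in b, the reciprocal is reported with a negative sign.
    """
    ratio = (b / total_b) / (a / total_a)
    return ratio if ratio >= 1 else -1 / ratio


# (exon pattern, Eph4 count, Eph4bDD count), copied from the slide.
counts = [
    ("------------7-8-9-10", 609, 764),
    ("--------------8-9-10", 320, 390),
    ("----------6-7-8-9-10", 431, 251),
    ("------4-5-6-7-8-9-10", 218, 216),
    ("----------------9-10", 68, 143),
    ("--------5-6-7-8-9-10", 86, 39),
    ("----3-4-5-6-7-8-9-10", 40, 56),
    ("------4-5---7-8-9-10", 16, 74),
    ("--2-3-4-5-6-7-8-9-10", 44, 28),
    ("1-2-3-4-5-6-7-8-9-10", 22, 5),
    ("--------5---7-8-9-10", 5, 19),
    ("----3-4-5---7-8-9-10", 1, 15),
    ("--2-3-4-5---7-8-9-10", 1, 10),
]

total_a = sum(a for _, a, _ in counts)
total_b = sum(b for _, _, b in counts)
ratios = {p: signed_fold_ratio(a, b, total_a, total_b) for p, a, b in counts}

for pattern, value in ratios.items():
    print(f"{pattern}  {value:7.2f}")
```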
![Page 6: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/6.jpg)
Molecular Weight Assessment of Proteins in Total Proteome Profiles Using 1D-PAGE and LC/MS/MS.
Proteome Sci. 3:6 (2005) Ahmad R, Nguyen DH, Wingerd MA, Church GM, Steffen MA.
Candidates for alternative splicing (AS), endoproteolytic processing (EPP), & post-translational modifications (PTMs) in Lymphoblastoid cells
| Protein Name | Predicted MW | Observed MW | Difference before & after leader cleavage |
|---|---|---|---|
| Cytochrome c oxidase subunit IV isoform 1 | 19577 | 2582 | 205 |
| NADH dehydrogenase | 21750 | 5084 | 334 |
| Coproporphyrinogen oxidase | 50175 | 13632 | 357 |
| MHC II, DQ 1 | 29733 | 25896 | 404 |
| NADH (ubiquinone) Fe-S protein 2 | 52545 | 48185 | 815 |
| Mito short-chain enoyl-coA hydratase 1 | 31371 | 27499 | 901 |
| Peptidylprolyl isomerase B (cyclophilin) | 23742 | 19360 | 940 |
![Page 7: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/7.jpg)
Glycogen metabolism
[Figure: pathway linking central carbon metabolism, α-Glc-1P, ADP-Glc, α-1,4-glucosyl-glucan and glycogen, with the genes glgC, glgA, glgB, glgX and glgP; plot of normalized expression (log scale, 0.1 to 10) against time (0 to 48 hours) for glgA, glgB, glgC, glgX and glgP.]
Zinser et al. unpubl.
Light regulated Circadian metabolism
![Page 8: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/8.jpg)
Viral Photosynthetic Proteins
[Figure: genome maps of three phages, Podovirus P-SSP7 (46 kb), Myovirus P-SSM4 (181 kb) and Myovirus P-SSM2 (255 kb), marking the photosynthesis-related genes PC, HLIPs, Fd, D1 and D2; scale labels 12 kb, 24 kb, 6.4 kb, 2.8 kb and ~500 bp.]
Lindell, Sullivan, Chisholm et al. 2004
![Page 9: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/9.jpg)
Photosynthesis genes in marine viruses yield proteins during host infection.
Nature 2005 438:86-9. Lindell D, Jaffe JD, Johnson ZI, Church GM, Chisholm SW.
![Page 10: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/10.jpg)
Photosynthesis genes in marine viruses yield proteins during host infection.
Nature 2005 438:86-9. Lindell D, Jaffe JD, Johnson ZI, Church GM, Chisholm SW.
[Figure: panels labeled host and phage; 15N 13C synthetic standards.]
![Page 11: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/11.jpg)
Improving MS Peptide Coverage
? Ionization efficiency
X Ions outside the mass range of the analyzer
? Chromatographic behavior
? Sample preparation bias
X Instrument duty cycle
• Improve spectra interpretation over current algorithms
– Details of fragmentation patterns
– Dipeptide P, DE/KR, V.G intensity effects
– B & Y ions unequal & co-dependent
– More intense ions in middle of peptides
MDQuest: Mike Chou, Dan Schwartz, Steve Gygi, Josh Elias http://gygi.med.harvard.edu/dpsp/
![Page 12: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/12.jpg)
SEQUEST vs MDQUEST Performance: ROC Curves
[Figure: ROC curves for sequest and mdquest; x-axis 1 - Specificity (FP rate), y-axis Sensitivity (TP rate), both from 0 to 1.]
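For readers unfamiliar with how an ROC comparison like this is built, the sketch below sweeps a score threshold over a set of peptide-spectrum matches and records the FP rate and TP rate at each step. The scores and labels are hypothetical and are not data from the slide; the function names are illustrative only and do not belong to SEQUEST or MDQuest.

```python
def roc_points(scores, labels):
    """Return (FP rate, TP rate) pairs obtained by sweeping a score threshold.

    scores: higher means more confident; labels: True for a correct match.
    Tied scores are not grouped in this sketch.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, is_correct in ranked:
        if is_correct:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scores for six peptide-spectrum matches.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [True, True, False, True, False, False]
curve = roc_points(scores, labels)
area = auc(curve)
print(curve)
print(f"AUC = {area:.3f}")
```

A scoring method whose curve lies closer to the top-left corner (larger area) separates correct from incorrect matches better.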
![Page 13: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/13.jpg)
MapQuant is a program designed to isolate unique organic species and quantify their relative abundances from an LC/MS experiment.
Scheme: Data from an LC/MS experiment are analyzed after being formatted into a data structure called a 2-D map, analogous to a gray-scale image.
[Figure: 2-D peptide map assembled from consecutive scans N, N+1, N+2, N+3; axes are time or scans and m/z units.]
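The 2-D map idea can be illustrated with a small sketch: each scan becomes one row, m/z values are binned into columns, and the cell value is the summed intensity, much like pixel brightness. This is not MapQuant's actual data structure or code; the function name, bin width and scan values are assumptions chosen for illustration.

```python
def build_2d_map(scans, mz_min, mz_max, bin_width):
    """Bin each scan's (m/z, intensity) pairs into one row of a 2-D grid.

    Rows are scans (time), columns are m/z bins, and cell values are
    summed intensities.
    """
    n_bins = round((mz_max - mz_min) / bin_width)
    grid = [[0.0] * n_bins for _ in scans]
    for row, scan in zip(grid, scans):
        for mz, intensity in scan:
            if mz_min <= mz < mz_max:
                col = min(int((mz - mz_min) / bin_width), n_bins - 1)
                row[col] += intensity
    return grid


# Hypothetical scans N, N+1, N+2, each a list of (m/z, intensity) pairs.
scans = [
    [(800.1, 10.0), (800.6, 5.0)],
    [(800.2, 20.0), (801.7, 1.0)],
    [(800.3, 8.0)],
]
grid = build_2d_map(scans, mz_min=800.0, mz_max=802.0, bin_width=0.5)
for row in grid:
    print(row)
```

A species that elutes over several scans shows up as a bright spot spanning adjacent rows in the same column, which is what makes image-style peak finding applicable.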
![Page 14: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/14.jpg)
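MapQuant Gives a List of All Organic Species In the Sample
[Figure: 2-D map (retention time versus m/z units) passed through MapQuant to produce the list below.]

| Abundance (Volume) | Retention Time | σRT | m/z | σMZ | Charge | Carbons |
|---|---|---|---|---|---|---|
| 60123 | 27.30 | 0.118 | 828.938 | 0.0117 | 2 | 75 |
| 30227 | 42.67 | 0.162 | 772.432 | 0.0102 | 2 | 76 |
| 19363 | 48.01 | 0.150 | 913.449 | 0.0143 | 3 | 135 |
| 13838 | 34.52 | 0.131 | 736.060 | 0.0092 | 3 | 108 |
| 9726 | 28.17 | 0.129 | 797.385 | 0.0108 | 2 | 74 |
| 5370 | 34.19 | 0.131 | 762.360 | 0.0099 | 2 | 74 |
| 4729 | 52.25 | 0.153 | 906.988 | 0.0141 | 2 | 87 |
| 1612 | 47.22 | 0.136 | 786.402 | 0.0105 | 4 | 165 |
| 151 | 24.65 | 0.116 | 883.525 | 0.0132 | 1 | 33 |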
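The m/z and charge columns together determine the mass of each species. As a rough sketch, assuming each listed m/z is the monoisotopic peak of a protonated ion (an assumption; the slide does not say), the neutral mass is the charge times the m/z minus one proton mass per charge.

```python
PROTON_MASS = 1.007276  # Da


def neutral_mass(mz, charge):
    """Neutral mass of an [M + zH]z+ ion observed at the given m/z."""
    return charge * (mz - PROTON_MASS)


# (m/z, charge) pairs copied from the first three rows of the list.
rows = [(828.938, 2), (772.432, 2), (913.449, 3)]
masses = [neutral_mass(mz, z) for mz, z in rows]
for (mz, z), mass in zip(rows, masses):
    print(f"m/z {mz} at charge {z} gives about {mass:.3f} Da")
```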
![Page 17: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/17.jpg)
Leptos et al. Proteomics 2006
![Page 18: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/18.jpg)
MapQuant is publicly available at http://arep.med.harvard.edu/mapquant.html
![Page 19: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/19.jpg)
Leptos et al. Proteomics 2006
![Page 20: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/20.jpg)
Leptos et al. Proteomics 2006
![Page 21: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/21.jpg)
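[Figure: 2-D map, retention time (in min) versus m/z units; some species are labeled with peptide sequences (EKLAVSAR, QEPERSEK, DAFLSGER) and others with question marks.]
MapQuant gives me a list of all organic species in the sample, BUT WHAT ARE THEIR IDENTITIES?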
![Page 22: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/22.jpg)
Dealing With Many Peptides (Organic Species)
MapQuant identifies approx. 2×10^4 organic species per LC/MS experiment.
ONLY ~500 (3%) organic species have fragmentation (CID) spectra and hence sequence IDs.
[Figure: 2-D map, retention time (in min) versus m/z units, with species labeled EKLAVSAR, QEPERSEK, DAFLSGER or with question marks; a marker denotes a CID spectrum or MS/MS event.]
![Page 23: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/23.jpg)
Dealing With Many Peptides (Organic Species)
[Figure: 2-D map, retention time (in min) versus m/z units, with species labeled EKLAVSAR, QEPERSEK, DAFLSGER or with question marks.]
Database of 11845 peptides from ALL LC/MS experiments carried out on Prochlorococcus samples
(rt, m/z) coordinates
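One way to picture the database lookup is as a nearest-neighbor search in (rt, m/z) space: an unidentified species inherits the sequence of a database peptide whose coordinates fall within a tolerance window. The sketch below is a simplified illustration, not the method used in the talk; the tolerances and all coordinates are hypothetical, and only the three peptide sequences come from the slide.

```python
def match_species(species, database, rt_tol=1.0, mz_tol=0.02):
    """Assign each (rt, m/z) species the nearest database peptide within tolerance.

    Species with no database entry inside the tolerance window map to None.
    """
    assignments = {}
    for rt, mz in species:
        best = None
        for peptide, (db_rt, db_mz) in database.items():
            d_rt, d_mz = abs(rt - db_rt), abs(mz - db_mz)
            if d_rt <= rt_tol and d_mz <= mz_tol:
                # Scale each difference by its tolerance so the axes are comparable.
                score = (d_rt / rt_tol) ** 2 + (d_mz / mz_tol) ** 2
                if best is None or score < best[0]:
                    best = (score, peptide)
        assignments[(rt, mz)] = best[1] if best else None
    return assignments


# Hypothetical (rt, m/z) coordinates for three peptides named on the slide.
database = {
    "EKLAVSAR": (27.3, 430.26),
    "QEPERSEK": (24.7, 494.74),
    "DAFLSGER": (34.5, 433.71),
}
species = [(27.1, 430.27), (34.9, 433.70), (40.0, 600.00)]
result = match_species(species, database)
for coords, peptide in result.items():
    print(coords, peptide or "unidentified")
```

In practice retention times drift between runs, so a real pipeline would likely align the time axes before matching; that step is omitted here.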
![Page 24: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/24.jpg)
Protein Distribution Among Experiments
TOTAL NUMBER OF ORFS: 1742
[Figure: Venn diagram of proteins observed in the diel experiment and proteins observed in experiments prior to diel; set sizes 1314 and 539, with regions of 792, 522 and 17.]
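The diagram's numbers can be checked with simple set arithmetic. The sketch assumes 522 is the overlap between the two sets, which fits 792 + 522 = 1314 and 522 + 17 = 539; the transcript does not make clear which set belongs to the diel experiment, so the sets are labeled A and B here.

```python
total_orfs = 1742
only_a, shared, only_b = 792, 522, 17  # region counts read from the diagram

set_a = only_a + shared               # size of the larger set
set_b = shared + only_b               # size of the smaller set
observed = only_a + shared + only_b   # proteins seen in at least one set
coverage = observed / total_orfs      # fraction of ORFs with an observed protein

print(set_a, set_b, observed, f"{coverage:.1%}")
```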
![Page 25: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/25.jpg)
Sequence Coverage of the Protein groES
![Page 26: Thanks to: George Church (Harvard GTL & CEGS Centers) 5-Jan-2006 HPCGG Landsdowne 2 PM Personal Genomics meets Quantitative Proteomics NHGRI Seq Tech 2004:](https://reader033.vdocument.in/reader033/viewer/2022051516/56649d6d5503460f94a4d24a/html5/thumbnails/26.jpg)
Summary
Proteome Sci. 3:6 (2005) Ahmad R, Nguyen DH, Wingerd MA, Church GM, Steffen MA.
• Open Personal Genome Project (PGP) including Proteomics
• Single molecule RNAs for alternative splicing (AS)
• Gel-MS methods for endoproteolytic processing
• MapQuant for MS quantitation without isotopic labeling
http://arep.med.harvard.edu