Applications of Mass Spectrometry to Proteomics
Akhilesh Pandey, M.D., Ph.D.
McKusick-Nathans Institute of Genetic Medicine and the Department of Biological Chemistry
Why Proteomics?
One Gene, Many Proteins
[Figure: Gene → mRNA → Protein. Through alternative splicing, one gene yields several mRNAs (mRNA 1 to mRNA 5); through co-/post-translational processing, these yield many proteins (Protein 1 to Protein i).]
[Figure: biological changes (morphogenesis) are followed by profiling of proteins across proteomes (Proteome 1, Proteome 2, Proteome 3, Proteome 4, ... Proteome ‘n’).]
Genomics and Proteomics
mRNA and Protein Correlation
Peptide Sequencing by MS/MS
[Figure: one peptide is selected and fragmented by collision; the fragments containing the carboxyl-terminus form the series Y7, Y6, Y5, Y4, Y3.]
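As a rough illustration of how the carboxyl-terminal fragment series maps to masses, the sketch below computes singly charged y-ion m/z values from standard monoisotopic residue masses. The function name `y_ions` and the example peptide `SAMPLER` are illustrative choices, not taken from the slides.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O carried by every y ion (Da)
PROTON = 1.007276   # charge carrier for a 1+ ion (Da)


def y_ions(peptide):
    """Return singly charged y-ion m/z values, y1 first."""
    ions = []
    running = WATER + PROTON
    # Walk from the C-terminus toward the N-terminus, adding one residue per ion.
    for residue in reversed(peptide):
        running += RESIDUE_MASS[residue]
        ions.append(running)
    return ions


# Hypothetical tryptic peptide used only for illustration.
for n, mz in enumerate(y_ions("SAMPLER"), start=1):
    print(f"y{n}: {mz:.3f}")
```

Consecutive y ions differ by exactly one residue mass, which is what allows the sequence to be read from the spacing between peaks.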
MS/MS spectrum (sequencing)
Instruments: Micromass QTOF, Finnigan LCQ Deca, LC/MSD Trap XCT
1D Liquid Chromatography Setup
[Figure labels: peptide mixture from pump; manual 3-port valve for peak "parking"; diverter valve; split line; analytical column (C18); emitter with 8 µm tip; 80 nL/min with valve open.]
A matchstick-like needle with a very small opening, containing a reverse-phase column for separation
Multidimensional Liquid Chromatography (MudPIT) Setup
Liu, H. et al. (2002)
MS
Offline Fractionation in the First Dimension
[Figure labels: pump, auto sampler, manual injector, UV detector, reverse-phase LC-MS, fraction collector, SCX column.]
Automated nanoLC-MS/MS
• Purification of sample
• Very complex samples can be analyzed
• Identical peptides will elute in one peak (enhanced signal)
• Automated
• Many proteins can be identified in one run (100-2,000 proteins)
Quantitative Proteomics
• Based on difference in intensity
• By relative quantitation using MS
Quantitative Proteomics
• Based on difference in intensity
– 1D gels
– 2D gels
– Use of fluorescence-based methods
1D Gel-based Comparison
[Figure: 1D gel; IP: anti-pTyr; lanes labeled by EGF treatment; molecular weight markers (kDa): 200, 110, 73, 47, 28.]
2D Gel-based Comparison
[Figure: 2D gels of normal and cancer samples.]
2D Gels - Limitations
• Sample preparation: a lot of optimization required
• Loading capacity limited
• Do not resolve very small (<10 kDa) or large (>100 kDa) proteins
• Do not resolve hydrophobic (e.g. membrane) proteins
• Issues with reproducibility
Quantitative Proteomics
• Fluorescence-based quantitation
– DIGE (Difference in-gel electrophoresis)
DIGE
• Samples to be compared are labeled with Cy3 (green) and Cy5 (red)
• Samples are ‘mixed’ and resolved by 2D gels
• Fluorescence measured and quantitated
2D-DIGE of Pancreatic Juice
[Figure: 2D-DIGE gel images of cancer and normal samples: Cy5 channel, Cy3 channel, and Cy3-Cy5 overlay.]
Quantitative Proteomics
• By relative quantitation using MS
– in vitro labeling
  • 18O-labeling
  • Peptide mass tagging (ICAT)
– in vivo labeling
  • Labeling with stable isotope containing amino acids (SILAC)
18O-labeling
[Figure: a gel band is excised from each of the normal and cancer samples; one is digested in-gel with H2 16O water and the other with H2 18O water; the digests are mixed 1:1 and analyzed by MS. Peptide termini are marked N and C.]
18O-labeling
Isotope-Coded Affinity Tag (ICAT)
[Figure: panels 1 to 4]
Structure of ICAT reagent
Stable Isotope Labeling in Cell Culture (SILAC) for Protein Quantitation
• Mammalian cell culture models are used for studying a number of biological processes
• In the SILAC approach, cells are grown continuously in media containing one or more stable isotopes (e.g. 13C). All the proteins in the cells are heavier and can be used to ‘mark’ a given state in mass spectrometric analysis
Time Course of Heavy Amino Acid Incorporation
Advantages of the SILAC Method
• Simple
• In vivo
• Does not require any extra processing steps
• All proteins are uniformly labeled
• Complete and predictable incorporation
• Choice of labeled amino acids
• De novo sequencing can be performed
General strategy for stable isotope labeling by amino acids
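[Figure: one cell population is grown with 12C6-labeled lysine and the other with 13C6-labeled lysine; the samples are mixed and run on SDS-PAGE, then eluted, digested with trypsin and analyzed by MS.]
SILAC for Quantitation of Secreted Proteins
[Figure: mass spectrum, m/z vs. relative intensity.]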
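To make the 12C6/13C6 lysine pairing concrete, the sketch below estimates how far apart the light and heavy forms of a peptide appear on the m/z axis. It assumes every lysine in the peptide is fully labeled; the function name and its parameters are illustrative, not part of the original slides.

```python
C13_MINUS_C12 = 1.0033548  # mass difference between 13C and 12C (Da)


def silac_mz_spacing(charge, n_lysines=1, carbons_per_lysine=6):
    """m/z distance between light and heavy peptide for 13C-labeled lysine."""
    mass_shift = n_lysines * carbons_per_lysine * C13_MINUS_C12
    return mass_shift / charge


# One lysine per peptide, as expected for most fully cleaved tryptic peptides.
for z in (1, 2, 3):
    print(f"charge {z}+: spacing {silac_mz_spacing(z):.3f} m/z")
```

A doubly charged peptide with one labeled lysine therefore shows its two forms about 3 m/z units apart, even though the mass difference is about 6 Da.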
Profile of Proteins Secreted by Adipocytes
[Figure: Day 1 samples are mixed with Day 4, Day 9 and Day 7 samples (Mix Days 1+4, Mix Days 1+9, Mix Days 1+7); each mix gives a mass spectrum (m/z vs. relative intensity, 0 to 100).
Panels at m/z 770-778: isotope clusters beginning near m/z 772.4 and 775.4; ratios 1:0.5, 1:0.4 and 1:0.24.
Panels at m/z 800-808: isotope clusters beginning near m/z 802.4 and 805.4; ratios 1:2.4, 1:2.7 and 1:1.6.]
Fibronectin expression is downregulated
Collagen alpha 3 expression is upregulated
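The ratios reported on the slide can be turned into fold changes in a few lines. This is a minimal sketch: it assumes each ratio is written as reference:sample with Day 1 as the reference, and the 1.5-fold cutoff is an arbitrary illustrative threshold, not a value from the slides.

```python
import math


def parse_ratio(text):
    """Convert a 'reference:sample' string such as '1:0.5' into sample/reference."""
    reference, sample = (float(part) for part in text.split(":"))
    return sample / reference


def classify(ratio, threshold=1.5):
    """Label a ratio as up, down or unchanged using a symmetric fold cutoff."""
    if ratio >= threshold:
        return "up"
    if ratio <= 1.0 / threshold:
        return "down"
    return "unchanged"


# Ratios as printed on the slide for the two peptides.
for text in ("1:0.5", "1:0.4", "1:0.24", "1:2.4", "1:2.7", "1:1.6"):
    ratio = parse_ratio(text)
    print(f"{text}: log2 = {math.log2(ratio):+.2f}, {classify(ratio)}")
```

Working in log2 space makes a halving and a doubling symmetric around zero, which is convenient when comparing down- and upregulated proteins.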
Studying Dynamics Using SILAC
[Figure: mass spectrum of the peptide VSHLLGINVTDFTR, m/z 786-798 vs. intensity (×10^5), with labeled peaks at m/z 786.2, 786.7, 787.3, 788.2, 789.2, 789.7, 790.2, 790.7, 791.2, 791.7, 792.2, 792.7 and 793.1, and spacings annotated as 6 Da and 4 Da.]
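The 6 Da and 4 Da annotations can be related to the labeled peaks with a short calculation. The sketch below estimates the charge state from isotope peak spacing and converts m/z offsets between clusters into mass shifts. Reading the clusters as starting at m/z 786.2, 789.2 and 791.2 is an assumption made here, since the figure itself did not survive extraction.

```python
def charge_from_spacing(isotope_peaks):
    """Estimate charge from the mean gap between adjacent isotope peaks (about 1/z)."""
    gaps = [b - a for a, b in zip(isotope_peaks, isotope_peaks[1:])]
    return round(1.0 / (sum(gaps) / len(gaps)))


def mass_shift(mz_low, mz_high, charge):
    """Convert an m/z offset between two clusters into a mass difference (Da)."""
    return (mz_high - mz_low) * charge


cluster = [789.2, 789.7, 790.2, 790.7]   # adjacent peaks read from the slide
z = charge_from_spacing(cluster)

starts = [786.2, 789.2, 791.2]           # assumed first peak of each cluster
for low, high in zip(starts, starts[1:]):
    print(f"{low} -> {high}: {mass_shift(low, high, z):.1f} Da at charge {z}+")
```

Under these assumptions the peaks are 0.5 m/z apart, which points to a doubly charged peptide, and the cluster offsets of 3.0 and 2.0 m/z correspond to 6 Da and 4 Da.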
Biomarker Discovery Using Proteomics
• Ideal targets for biomarker
– Protein (differentially expressed proteins)
– DNA (mutations, methylation)
– RNA (differentially expressed genes)
• Biological specimen
– Tissue (whole tissue or isolated tumor cells)
– Pancreatic juice
– Serum
– Plasma
Pancreatic juice
Physiological proteome of human pancreatic juice
• Collection of pancreatic juice (cancer) during surgery
• Run on 1D gel
• LC-MS/MS analysis
• Bioinformatic analysis of identified proteins
• Compare data with known microarray data
Differential Proteomic Analysis of Pancreatic Cancer Secretome
• In-vivo labeling with both arginine and lysine
• Collect conditioned media and concentrate with Centricon (3,000 Da MWCO)
• Normalize and mix in 1:1 ratio
• Resolve proteins by 1D gel electrophoresis
• Excise bands and digest proteins by trypsin
• Identify proteins by nanoLC-MS/MS (2x30 bands)
• Verify identified proteins (manually)
• Relative quantitation of identified proteins (manually)
Quantitation of Secreted Proteins
Post-translational Modifications
• Peptides can have a number of modifications
• During database searching, a variable modification has to be specified; otherwise, there is no ‘hit’
• Common PTMs are phosphorylation, acetylation, ubiquitination, glycosylation, etc.
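The point about variable modifications can be illustrated with a toy precursor-mass match. This is a simplified sketch, not how any particular search engine is implemented: the peptide mass, the tolerance and the function names are hypothetical, and only the phosphorylation mass shift (+79.96633 Da, HPO3) is a standard value.

```python
PHOSPHO = 79.96633  # monoisotopic mass added by one phosphate group, HPO3 (Da)


def candidate_masses(base_mass, n_sites, delta=PHOSPHO):
    """Masses to consider when a modification is variable on n_sites residues."""
    return [base_mass + k * delta for k in range(n_sites + 1)]


def matches(observed, candidates, tol_da=0.01):
    """True if the observed mass is within tolerance of any candidate mass."""
    return any(abs(observed - mass) <= tol_da for mass in candidates)


base = 1000.5000            # hypothetical unmodified peptide mass (Da)
observed = base + PHOSPHO   # the same peptide carrying one phosphate

print("no variable modification:", matches(observed, [base]))
print("phospho as variable modification:", matches(observed, candidate_masses(base, 2)))
```

Without the variable modification the only candidate is the unmodified mass, so the phosphorylated peptide is missed; allowing zero to two phosphates on the candidate list recovers it.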
Protein Phosphorylation
• One-third of all cellular proteins are phosphorylated at one time or another
• Phosphoamino acid content of a vertebrate cell: Serine - 90%; Threonine - 10%; Tyrosine - 0.05%
• Ser:Thr:Tyr - 1800:200:1
• Tyrosine phosphorylation is tightly regulated
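As a quick consistency check, the 1800:200:1 ratio can be converted into percentages and compared with the figures quoted for serine, threonine and tyrosine. The variable names below are illustrative.

```python
# Ser:Thr:Tyr ratio as given on the slide.
ratio = {"Ser": 1800, "Thr": 200, "Tyr": 1}

total = sum(ratio.values())
percent = {aa: 100.0 * count / total for aa, count in ratio.items()}

for aa, value in percent.items():
    print(f"{aa}: {value:.2f}%")
```

The ratio and the rounded percentages agree: tyrosine accounts for roughly one phosphorylated residue in two thousand.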
Why is Phosphorylation Analysis Difficult?
• Stoichiometry of phosphorylation is low
• Complete coverage of proteins is difficult to obtain
• Phosphorylated serine and threonine residues are labile whereas phosphotyrosines are more stable
• Phosphoserine and phosphothreonine residues can be subjected to a beta-elimination reaction but not phosphotyrosine residues
• Antibodies to enrich for serine and threonine phosphorylated proteins are not available
• Phosphopeptides are ‘suppressed’ in a mass spectrum
Phosphotyrosine Immonium Ion (216.043 Da)
[Figure: chemical structures of the phosphotyrosine residue and its immonium ion.]
Immonium Ion of Phosphotyrosine as a Reporter Ion
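The reporter ion mass can be reproduced from standard monoisotopic values. The sketch below uses the usual relationship that an immonium ion equals the residue mass minus CO plus one proton; the constant and function names are illustrative.

```python
TYR_RESIDUE = 163.06333  # tyrosine residue mass (Da)
HPO3 = 79.96633          # mass added by phosphorylation (Da)
CO = 27.99491            # carbon monoxide lost on immonium ion formation (Da)
PROTON = 1.007276        # charge carrier (Da)


def immonium_mz(residue_mass):
    """m/z of a singly charged immonium ion for a residue of the given mass."""
    return residue_mass - CO + PROTON


print(f"Tyr immonium ion:  {immonium_mz(TYR_RESIDUE):.3f}")
print(f"pTyr immonium ion: {immonium_mz(TYR_RESIDUE + HPO3):.3f}")
```

The calculated value for phosphotyrosine agrees with the 216.043 quoted on the slide to within about 0.001.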
Phosphotyrosine-Specific Precursor Ion Scanning – Bcr/Abl
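[Figure: gel of an anti-pTyr IP from control and p185 Bcr/Abl samples (molecular weight markers: 200, 97, 68, 43, 28; bands marked with asterisks), shown with an MS scan and a phosphotyrosine-specific precursor ion scan (m/z 500-900, with expanded regions around m/z 538-544, 590-610 and 608-624).]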
Large scale IP with an anti-pSer/Thr antibody
[Figure: Western blot (WB: anti-pSer/Thr) of lysates; lanes labeled by calyculin A treatment.]
In Vivo Labeling with 32P
[Figure: IP: anti-pSer/Thr; calyculin A: - and +; phosphopeptide RLpSPAPQLGP (residues 10 to 19).]
Identification of Phosphorylated Ser/Thr residues
MS-Based Identification of a 130 kDa Protein in the EGF Receptor Signaling Pathway
[Figure: silver stained gel; IP: anti-pTyr; lanes labeled by EGF treatment; molecular weight markers (kDa): 200, 110, 73, 47, 28.]
>KIAA0229 (1180 residues) FRAGMENT
SWGKGREGVVSPAGLGGALPGDGKFGSPSRLGCSLGEGVQRVAALGMGKEQELLRAARTGHLPAVEKLLSGKRLSSGFGGGGGGGSGGGGGGSGGGGGGLGSSSHPLSSLLSMWRGPNVNCVDSTGYTPLHHAALNGHHRRSSSSRSQDSAEGQDGQVPEQFSGLLHGSSPVCEVGQDPFQLLCTAGQSHPDGSPQQGACHKASMQLEETGVHAPGASQPSALDQSKRVGYLTGLPTTNSRSHPETLTHTASPHPGGAEEGDRSGAR
Assignment of the initiator methionine in a cDNA ‘fragment’ based on an N-terminal peptide
>KIAA0229 (1180 residues) FRAGMENT
SWGKGREGVVSPAGLGGALPGDGKFGSPSRLGCSLGEGVQRVAALGMGKEQLLRAARTGHLPAVEKLLSGKRLSSGFGGGGGGGSGGGGGGSGGGGGGLGSSSHPLSSLLSMWRGPNVNCVDSTGYTPLHHAALNGHHRRSSSSRSQDSAEGQDGQVPEQFSGLLHGSSPVCEVGQDPFQLLCTAGQSHPDGSPQQGACHKASMQLEETGVHAPGASQPSALDQSKRVGYLTGLPTTNSRSHPETLTHTASPHPGGAEEGDRSGAR
Assignment of the initiator methionine in a cDNA ‘fragment’ based on an N-terminal peptide
[Figure: structure of the N-terminal peptide; labels as extracted: R, CH3-C(=O), H2N, and residues Q, L, L, G, K, E.]
Lysine Modifications
Residue                  Mass (Da)   Mass gain
Lysine                   128.095     -
Mono-methylated lysine   142.111     14.016 Da
Di-methylated lysine     156.127     28.032 Da
Tri-methylated lysine    170.143     42.048 Da
Acetylated lysine        170.105     42.010 Da
[Figure: chemical structures of lysine and of its mono-, di- and tri-methylated and acetylated forms.]
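Tri-methylation and acetylation of lysine differ by only a few hundredths of a dalton, so telling them apart depends on mass accuracy. The sketch below computes both mass gains from monoisotopic element masses and the accuracy, in ppm, needed to separate them. The 1500 Da peptide mass is a hypothetical example, and the computed gains differ slightly from the rounded slide values.

```python
# Monoisotopic element masses (Da).
C, H, O = 12.0, 1.00782503, 15.99491462

TRIMETHYL = 3 * (C + 2 * H)   # three methyl groups add C3H6
ACETYL = 2 * C + 2 * H + O    # one acetyl group adds C2H2O


def ppm_needed(peptide_mass):
    """Mass accuracy (ppm) at which the two modifications differ by one tolerance width."""
    return abs(TRIMETHYL - ACETYL) / peptide_mass * 1e6


print(f"tri-methylation: +{TRIMETHYL:.4f} Da")
print(f"acetylation:     +{ACETYL:.4f} Da")
print(f"difference:       {TRIMETHYL - ACETYL:.4f} Da")
print(f"accuracy needed at 1500 Da: about {ppm_needed(1500.0):.0f} ppm")
```

The required accuracy tightens as peptide mass grows, since the same absolute difference becomes a smaller fraction of the total mass.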
Lysine Modifications
Di-glycyl lysine: two glycines (Gly-Gly) attached to the lysine residue through an isopeptide bond. Mass gain: 114.04 Da
[Figure: chemical structure of di-glycyl lysine.]
Tryptic Digest of Ubiquitylated Peptide
[Figure: MS/MS spectrum (m/z 100-1400 vs. relative intensity, 0 to 100) of LIFAGKQLEDGR with GG attached to K, with labeled peaks at m/z 175.12, 232.14, 347.17, 476.21, 589.30, 717.35, 959.49, 1016.51, 1087.55, 1234.62 and 1347.72. The 242.14 Da gap corresponds to K carrying GG.]
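The labeled peaks on this slide can be compared with a calculated y-ion ladder. The sketch below assumes the peptide is LIFAGKQLEDGR with the di-glycine remnant on the lysine, which is an interpretation of the slide labels; the function name and the `mods` argument are illustrative.

```python
# Monoisotopic residue masses (Da) for the residues in this peptide.
RESIDUE_MASS = {
    "L": 113.08406, "I": 113.08406, "F": 147.06841, "A": 71.03711,
    "G": 57.02146, "K": 128.09496, "Q": 128.05858, "E": 129.04259,
    "D": 115.02694, "R": 156.10111,
}
WATER = 18.010565
PROTON = 1.007276
DIGLY = 2 * RESIDUE_MASS["G"]  # two glycine residues left on the lysine


def y_ions(peptide, mods=None):
    """Singly charged y ions, y1 first; mods maps 0-based position to added mass."""
    mods = mods or {}
    ions = []
    running = WATER + PROTON
    for index in range(len(peptide) - 1, -1, -1):
        running += RESIDUE_MASS[peptide[index]] + mods.get(index, 0.0)
        ions.append(running)
    return ions


PEPTIDE = "LIFAGKQLEDGR"
ladder = y_ions(PEPTIDE, {PEPTIDE.index("K"): DIGLY})

for n, mz in enumerate(ladder, start=1):
    print(f"y{n}: {mz:.2f}")
print(f"y7 - y6 = {ladder[6] - ladder[5]:.2f} Da (lysine + di-glycine)")
```

Under this assumption the step from y6 to y7 is the mass of a lysine residue plus two glycines, rather than a plain residue mass, which is what localizes the modification to that lysine.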
Signature of a Ubiquitylated Peptide
[Figure: low-mass region (m/z 100-158) of MS/MS spectra for a di-glycine containing peptide and a non di-glycine containing peptide, with an immonium ion annotated. Labeled peaks in one panel: 112.08, 116.06, 120.07, 121.08, 129.10, 141.10 and 158.09; in the other: 112.08, 120.07, 129.10 and 158.09.]