data processing and database search - itqb · 2011-11-29 · data processing and database search de...
Data Processing and Database Search
De novo sequencing
Renata Soares, PhD
Miguel Ventosa, MSc
2-4 November 2011
Summary
• 1. Manually analyze fragmentation spectra
• 2. Protein identification using Protein Pilot – combined search
Peptide Fragmentation
• Fragmentation is directed by protonation
• Proton can jump along the peptide backbone
• Protonated peptide bond weakened
• When CID occurs (fragmentation by collision with a gas), the bond breaks; this occurs at the C-N bond (which is usually the weakest)
Peptide Fragmentation
Peptide fragmentation from C-terminus: y-ions
Peptide fragmentation from N-terminus: b-ions
Example
MS/MS spectra are complex
Neutral loss
• Peptides can lose water (-18 mass units) – y0, b0, a0 – Ser, Thr and also Glu, Asp
• Peptides can lose ammonia (-17 mass units) – y*, b*, a* – Asn, Gln, Arg, Lys
• b-ions can lose C=O (-28 mass units) – a ion
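The three shifts above can be turned into a quick lookup of where satellite peaks should appear. This is only a sketch using the nominal values from the bullets and assuming singly charged fragments; the function name and the example m/z are hypothetical.

```python
# Nominal neutral-loss shifts from the bullets above (singly charged ions assumed).
NEUTRAL_LOSSES = {"water": 18, "ammonia": 17, "CO": 28}

def neutral_loss_peaks(mz):
    """Return the m/z expected after each neutral loss from a fragment at `mz`."""
    return {name: round(mz - shift, 1) for name, shift in NEUTRAL_LOSSES.items()}

# Hypothetical fragment at m/z 500.0
print(neutral_loss_peaks(500.0))
```

Note that the CO loss is only meaningful for b-ions (it gives the a ion), so that entry should be ignored for y-ions.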
Immonium ion
• An internal fragment with just a single side chain
• Diagnostic for the presence of a specific amino acid
De novo strategy hints
1. Determine the parent ion mass
2. Get the N-terminus
3. Get the C-terminus
4. Fill the rest of the peptide sequence
5. Check
C-terminal
• Assuming a tryptic peptide:
– Ends in K: y1 = 128+18+1 = 147
– Ends in R: y1 = 156+18+1 = 175
• High mass region for bn-1 ion:
– Ends in K: bn-1 = Ion mass – 18 – 128
– Ends in R: bn-1 = Ion mass – 18 – 156
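The C-terminal arithmetic above can be written as two small helpers. This is a minimal sketch using the nominal masses from the slide (K = 128, R = 156, water = 18, proton = 1); the function names and the parent mass of 1000.5 are hypothetical.

```python
# Nominal masses taken from the slide.
C_TERM_RESIDUES = {"K": 128, "R": 156}
WATER = 18
PROTON = 1

def y1_mz(residue):
    """y1 ion of a tryptic peptide: C-terminal residue + water + proton."""
    return C_TERM_RESIDUES[residue] + WATER + PROTON

def b_n_minus_1_mz(parent_mh, residue):
    """b(n-1) ion: singly charged parent ion minus water minus the C-terminal residue."""
    return round(parent_mh - WATER - C_TERM_RESIDUES[residue], 1)

print(y1_mz("K"), y1_mz("R"))
# Hypothetical parent ion [M+H]+ = 1000.5
print(b_n_minus_1_mz(1000.5, "K"), b_n_minus_1_mz(1000.5, "R"))
```

Finding both the y1 peak and the matching bn-1 peak for the same residue is stronger evidence for the C-terminus than either peak alone.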
De novo strategy hints
b ion less 18
y ion gain 18
Example 1: Determine the peptide sequence
- Hint: only contains y-series ions of a tryptic peptide
- Start from the y1 and see amino acid differences for most intense peaks
[Spectrum: labelled peaks at m/z 175.1, 232.0, 343.3, 360.3, 456.4, 473.5, 603.6, 620.7, 719.3]
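Reading the ladder from y1 upwards can be sketched as below. Two things here are assumptions rather than statements from the slide: that the y-series is 175.1, 232.0, 360.3, 473.5, 620.7, 719.3, and that the peaks about 17 units below three of them (343.3, 456.4, 603.6) are ammonia-loss satellites. The residue table uses nominal masses, and the 0.5 tolerance is an arbitrary choice.

```python
# Nominal residue masses; L/I and K/Q cannot be told apart at this precision.
RESIDUES = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L/I": 113, "N": 114, "D": 115, "K/Q": 128, "E": 129, "M": 131,
    "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

def read_ladder(peaks, tol=0.5):
    """For each gap between consecutive ladder peaks, list the residues that fit."""
    peaks = sorted(peaks)
    calls = []
    for low, high in zip(peaks, peaks[1:]):
        gap = high - low
        calls.append([aa for aa, mass in RESIDUES.items() if abs(gap - mass) <= tol])
    return calls

# Assumed y-series from the spectrum (satellite peaks left out).
y_ladder = [175.1, 232.0, 360.3, 473.5, 620.7, 719.3]
print(read_ladder(y_ladder))
```

Because y-ions grow from the C-terminus, the residues come out in C-to-N order, starting next to the C-terminal Arg implied by y1 = 175.1; reverse the list to write the sequence in the usual N-to-C direction.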
Example 2: More complex data
• Hint: contains both b- and y-series ions
• Determine the parent ion mass
Get the N-terminus region
1. Identify the a2/b2 combo in the low mass region (28 mass units apart)
2. Identify in the high mass region the yn-1
yn-1 = [M+H]+ – first aa
• [b2]+ = 219
– Possible combinations:
A + F
M + S
D + C
– To determine the correct aa, see the corresponding yn-1 ion:
(A) 1186.6 – 71 = 1115.6
(F) 1186.6 – 147 = 1039.6
(M) 1186.6 – 131 = 1055.6
(S) 1186.6 – 87 = 1099.6
(D) 1186.6 – 115 = 1071.6
(C) 1186.6 – 103 = 1083.6
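The b2 search above can be reproduced by enumerating residue pairs whose masses sum to b2 minus one proton, then computing the yn-1 value each candidate first residue would produce. This is a sketch with nominal masses; "L" stands for Leu/Ile and "K" for Lys/Gln, and the function names are hypothetical.

```python
from itertools import combinations_with_replacement

# Nominal residue masses; "L" covers Leu/Ile and "K" covers Lys/Gln.
RESIDUES = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "N": 114, "D": 115, "K": 128, "E": 129, "M": 131,
    "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

def b2_pairs(b2_mz, tol=0.5):
    """Residue pairs whose summed mass plus one proton matches the b2 ion."""
    target = b2_mz - 1
    return [
        (a, b)
        for a, b in combinations_with_replacement(RESIDUES, 2)
        if abs(RESIDUES[a] + RESIDUES[b] - target) <= tol
    ]

def y_n_minus_1(parent_mh, first_residue):
    """y(n-1) ion expected if `first_residue` is the N-terminal amino acid."""
    return round(parent_mh - RESIDUES[first_residue], 1)

for pair in b2_pairs(219):
    for residue in pair:
        print(residue, y_n_minus_1(1186.6, residue))
```

Whichever of the six printed yn-1 values is actually present in the high mass region identifies the first residue; the other member of that pair is then the second.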
Get the C-terminus region
– Check the low mass region for y1
• If it is a tryptic peptide:
(K) y1=128+18+1=147
(R) y1=156+18+1=175
– Check the high mass region for bn-1
(K) bn-1=1186.6-18-128 = 1040.6
(R) bn-1=1186.6-18-156 = 1012.6
• Determine the rest of the sequence – y-series ions (in red)
• m/z of 310.1 is composed of 2 aa, one of them being C-terminal Lys (K)
– m/z y-series = 310.1 = aa+K+18+1, so aa = 310.1 – 128 – 18 – 1 = 163.1 = Tyr (Y)
• Determine the rest of the sequence – b-series ions (in blue)
• Verify the mass of the peptide ion: sum of aa + 18 + 1 (should be 1186.6, as determined in step 1)
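The final check can be automated: sum the residue masses of the proposed sequence, add water (18) and a proton (1), and compare with the measured parent ion. The sketch below uses nominal masses, so expect a difference of a few tenths of a mass unit against real data. The peptide "GASK" is a hypothetical example, not the answer to Example 2.

```python
# Nominal residue masses; "L" covers Leu/Ile and "K" covers Lys/Gln.
RESIDUES = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "N": 114, "D": 115, "K": 128, "E": 129, "M": 131,
    "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

def peptide_mh(sequence):
    """Nominal [M+H]+ of a peptide: residue masses + water + proton."""
    return sum(RESIDUES[aa] for aa in sequence) + 18 + 1

def matches_parent(sequence, parent_mh, tol=1.0):
    """True if the proposed sequence accounts for the measured parent ion."""
    return abs(peptide_mh(sequence) - parent_mh) <= tol

# Hypothetical peptide, for illustration only.
print(peptide_mh("GASK"))
print(matches_parent("GASK", 362.2))
```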
Amino acid dipeptides
Now it's your turn!
Good luck
AXEXFR
Protein identification using MS+MS/MS data file
Software Protein Pilot
What is a good ProtScore?
• Default is 1.3 (95 % confidence)
• Decrease the threshold – high number of false positives
• To reduce the false positives – increase the threshold
Guidelines:
• Score greater than 2 – generally true
• Score greater than 1 – generally not true (more data)
• Review results to select proteins with high ProtScores
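The pairing of 1.3 with 95 % confidence is consistent with a logarithmic scale, ProtScore = -log10(1 - confidence). Assuming that relationship holds (it is not stated on the slide), thresholds can be converted in both directions as sketched below; the function names are hypothetical.

```python
import math

def prot_score(confidence_percent):
    """ProtScore for a confidence level, assuming score = -log10(1 - confidence)."""
    return -math.log10(1 - confidence_percent / 100)

def confidence_percent(score):
    """Inverse conversion: confidence (in %) implied by a ProtScore."""
    return 100 * (1 - 10 ** (-score))

print(round(prot_score(95), 2))
print(round(confidence_percent(2), 1))
print(round(confidence_percent(1), 1))
```

Under this assumption a score of 2 corresponds to 99 % confidence and a score of 1 to 90 %, which fits the guideline that scores above 2 are generally reliable.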