data processing and database search - itqb · 2011-11-29 · data processing and database search de...


Data Processing and Database Search

De novo sequencing

Renata Soares, PhD

Miguel Ventosa, MSc

2-4 November 2011


Summary

• 1. Manually analyze fragmentation spectra

• 2. Protein identification using Protein Pilot – combined search


Peptide Fragmentation

• Fragmentation is directed by protonation

• Proton can jump along the peptide backbone

• A protonated peptide bond is weakened

• When CID occurs (fragmentation by collision with a gas), the bond breaks; this occurs at the C-N bond (which is usually the weakest)


Peptide Fragmentation


Peptide fragmentation from C-terminus: y-ions


Peptide fragmentation from N-terminus: b-ions


Example


MS/MS spectra are complex


Neutral loss

• Peptides lose water (-18 mass units) – y°, b°, a° – Ser, Thr and also Glu, Asp

• Peptides can lose ammonia (-17 mass units) – y*, b*, a* – Asn, Gln, Arg, Lys

• b-ions can lose C=O (-28 mass units) – a ion
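
As a rough illustration of the neutral losses listed here, the sketch below computes where the satellite peaks of a singly charged fragment would be expected. The function name is hypothetical, and it uses the nominal (integer) offsets from the slide rather than exact masses.

```python
# Nominal neutral-loss offsets taken from the slide (mass units, charge 1+).
NEUTRAL_LOSSES = {
    "water": 18,    # loss of H2O
    "ammonia": 17,  # loss of NH3
    "CO": 28,       # loss of C=O, turns a b ion into an a ion
}

def neutral_loss_peaks(mz, ion_type):
    """Return {label: m/z} of satellite peaks for a singly charged ion.

    ion_type is "b" or "y". The CO loss is only reported for b ions,
    because it converts a b ion into the corresponding a ion.
    """
    peaks = {
        f"{ion_type}-H2O": mz - NEUTRAL_LOSSES["water"],
        f"{ion_type}-NH3": mz - NEUTRAL_LOSSES["ammonia"],
    }
    if ion_type == "b":
        peaks["a"] = mz - NEUTRAL_LOSSES["CO"]
    return peaks

print(neutral_loss_peaks(219, "b"))
```

Whether a given loss is actually observed depends on the residues present in the fragment, which this sketch does not check.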


Immonium ion

• An internal fragment with just a single side chain

• Diagnostic for the presence of a specific amino acid
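
The m/z of an immonium ion can be estimated from the residue mass. The relation used below (residue mass minus CO plus one proton, i.e. residue - 27 in nominal masses) is general ion chemistry and is not stated on the slide; the function name and the choice of residues are illustrative assumptions.

```python
# Nominal residue masses for a few amino acids with well-known immonium ions.
RESIDUE_MASS = {"P": 97, "L": 113, "I": 113, "H": 137, "F": 147, "Y": 163, "W": 186}

def immonium_mz(residue):
    """Nominal m/z of the immonium ion: residue mass - CO (28) + H+ (1)."""
    return RESIDUE_MASS[residue] - 28 + 1

for aa in "PLHFYW":
    print(aa, immonium_mz(aa))
```

Leu and Ile give the same value, so an immonium ion cannot tell these two residues apart.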


De novo strategy hints

1. Determine the parent ion mass

2. Get the N-terminus

3. Get the C-terminus

4. Fill the rest of the peptide sequence

5. Check


De novo strategy hints: C-terminal

• Assuming a tryptic peptide:

– Ends in K: y1 = 128 + 18 + 1 = 147

– Ends in R: y1 = 156 + 18 + 1 = 175

• High mass region for the bn-1 ion:

– Ends in K: bn-1 = parent ion mass ([M+H]+) – 18 – 128

– Ends in R: bn-1 = parent ion mass ([M+H]+) – 18 – 156

• In these formulas the b ion loses 18 (water) and the y ion gains 18 (water)


Example 1: Determine the peptide sequence

- Hint: only contains y-series ions of a tryptic peptide

- Start from the y1 and see amino acid differences for most intense peaks

[Spectrum: labelled peaks at m/z 719.3, 620.7, 603.6, 473.5, 456.4, 360.3, 343.3, 232.0 and 175.1]
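
One way to work through Example 1 is to take the differences between successive y ions and look each one up in a residue mass table. The sketch below does that for the peak labels of this spectrum. It is only an illustration under stated assumptions: the function name, the 0.5 tolerance, and the decision to treat any peak lying about 17 below another peak as an ammonia-loss satellite are all choices made here, not part of the original exercise.

```python
# Monoisotopic residue masses (Da). I/L are isobaric; K/Q differ by ~0.04.
RESIDUES = {
    "G": 57.021, "A": 71.037, "S": 87.032, "P": 97.053, "V": 99.068,
    "T": 101.048, "C": 103.009, "L": 113.084, "I": 113.084, "N": 114.043,
    "D": 115.027, "Q": 128.059, "K": 128.095, "E": 129.043, "M": 131.040,
    "H": 137.059, "F": 147.068, "R": 156.101, "Y": 163.063, "W": 186.079,
}
AMMONIA = 17.027

def read_y_ladder(peaks, tol=0.5):
    """Read residues from a y-ion ladder, starting at the lowest m/z (y1).

    Peaks lying one ammonia (about 17) below another peak are treated as
    neutral-loss satellites and skipped (an assumption of this sketch).
    Returns one string per step listing every residue that fits the gap.
    """
    peaks = sorted(peaks)
    ladder = [p for p in peaks
              if not any(abs(q - p - AMMONIA) <= tol for q in peaks)]
    steps = []
    for low, high in zip(ladder, ladder[1:]):
        gap = high - low
        hits = sorted(aa for aa, m in RESIDUES.items() if abs(gap - m) <= tol)
        steps.append("/".join(hits) if hits else f"?({gap:.1f})")
    return steps

peaks = [719.3, 620.7, 603.6, 473.5, 456.4, 360.3, 343.3, 232.0, 175.1]
print(read_y_ladder(peaks))
```

The list is read from the C-terminus outwards, starting after the residue that accounts for y1 (175.1 matches R). Ambiguous steps such as K/Q or I/L cannot be resolved from nominal mass differences alone.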


Example 2: More complex data

• Hint: contains both b- and y-series ions

• Determine the parent ion mass


Get the N-terminus region

1. Identify the a2/b2 pair in the low mass region (the two peaks are 28 mass units apart)

2. Identify the yn-1 ion in the high mass region

yn-1 = [M+H]+ – mass of the first aa


• [b2]+ = 219

– Possible combinations:

A + F
M + S
D + C

– To determine the correct aa, see the corresponding yn-1 ion:

(A) 1186.6 – 71 = 1115.6
(F) 1186.6 – 147 = 1039.6
(M) 1186.6 – 131 = 1055.6
(S) 1186.6 – 87 = 1099.6
(D) 1186.6 – 115 = 1071.6
(C) 1186.6 – 103 = 1083.6
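
The enumeration above can be reproduced with a short script. This is a minimal sketch that assumes nominal residue masses and the usual relation b ion = sum of residues + 1 (the proton); the function names are hypothetical.

```python
from itertools import combinations_with_replacement

# Nominal residue masses as used on the slides.
NOMINAL = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128, "E": 129,
    "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

def b2_pairs(b2_mz):
    """Residue pairs whose masses sum to b2 - 1 (b ion = residues + proton)."""
    target = b2_mz - 1
    return [(x, y)
            for x, y in combinations_with_replacement(sorted(NOMINAL), 2)
            if NOMINAL[x] + NOMINAL[y] == target]

def yn1_candidates(parent_mh, pairs):
    """Expected y(n-1) m/z for each residue that could be N-terminal."""
    residues = sorted({aa for pair in pairs for aa in pair})
    return {aa: round(parent_mh - NOMINAL[aa], 1) for aa in residues}

pairs = b2_pairs(219)
print(pairs)
print(yn1_candidates(1186.6, pairs))
```

The pair order within each tuple carries no meaning; which residue is actually N-terminal is decided by finding one of the candidate yn-1 values in the spectrum.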


Get the C-terminus region

– Check the low mass region for y1

• If it is a tryptic peptide:

(K) y1=128+18+1=147

(R) y1=156+18+1=175

– Check the high mass region for bn-1

(K) bn-1=1186.6-18-128 = 1040.6

(R) bn-1=1186.6-18-156 = 1012.6
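
The same arithmetic, written as a small helper so it can be reused for other parent masses. This is a sketch with nominal masses as on the slide; the function and dictionary names are illustrative.

```python
# Nominal masses from the slide: water, proton and the two tryptic residues.
WATER, PROTON = 18, 1
TRYPTIC_C_TERM = {"K": 128, "R": 156}

def c_terminal_ions(parent_mh):
    """Expected y1 and b(n-1) m/z for a tryptic peptide ending in K or R."""
    ions = {}
    for aa, mass in TRYPTIC_C_TERM.items():
        ions[aa] = {
            "y1": mass + WATER + PROTON,                 # low mass region
            "bn-1": round(parent_mh - WATER - mass, 1),  # high mass region
        }
    return ions

print(c_terminal_ions(1186.6))
```

Finding both the y1 and the matching bn-1 peak for the same residue gives more confidence in the C-terminal assignment than either peak alone.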


• Determine the rest of the sequence

– y-series ions (in red)


• m/z of 310.1 is composed of 2 aa, one of them being the C-terminal Lys (K)

– y-series m/z = 310.1 = aa + K + 18 + 1, so aa = 310.1 – 128 – 18 – 1 = 163.1 = Tyr (Y)


• Determine the rest of the sequence

– b-series ions (in blue)


• Verify the mass of the peptide ion: sum of the aa residue masses + 18 + 1 (should be 1186.6, as determined in step 1)
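
This final check is easy to automate. The sketch below assumes nominal residue masses and the relation [M+H]+ = sum of residues + 18 (water) + 1 (proton); the function names, the 1.0 tolerance and the test sequence "AFYK" are hypothetical and are not the solution to Example 2.

```python
# Nominal residue masses as used on the slides.
NOMINAL = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128, "E": 129,
    "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

def peptide_mh(sequence):
    """Nominal [M+H]+: sum of residue masses + water (18) + proton (1)."""
    return sum(NOMINAL[aa] for aa in sequence) + 18 + 1

def matches_parent(sequence, observed_mh, tol=1.0):
    """True if the proposed sequence explains the observed parent ion.

    The tolerance is loose because nominal masses drift from the measured
    monoisotopic value by roughly 0.5 per 1000 mass units.
    """
    return abs(peptide_mh(sequence) - observed_mh) <= tol

# "AFYK" is a made-up sequence used only to exercise the functions.
print(peptide_mh("AFYK"), matches_parent("AFYK", 528.3))
```

A sequence that fails this check usually has a wrong, missing or extra residue somewhere in the ladder.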


Amino acid dipeptides


Now it’s your turn!

Good luck


AXEXFR


Protein identification using MS+MS/MS data file

Software Protein Pilot


What is a good ProtScore?

• Default is 1.3 (95 % confidence)

• Decrease the threshold – high number of false positives

• To reduce the false positives – increase the threshold

Guidelines:

• Score greater than 2 – generally true

• Score greater than 1 – generally not true (more data)

• Review results to select protein with high ProtScores
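
To relate a ProtScore threshold to a confidence level, the sketch below assumes the commonly quoted relation ProtScore = -log10(1 - confidence). That formula is not given on the slide, so treat it as an assumption; it does reproduce the stated default of 1.3 for 95 % confidence. The function names are illustrative.

```python
import math

def protscore_from_confidence(confidence):
    """ProtScore for a fractional confidence, assuming -log10(1 - confidence)."""
    return -math.log10(1.0 - confidence)

def confidence_from_protscore(score):
    """Inverse of the assumed relation: confidence = 1 - 10**(-score)."""
    return 1.0 - 10.0 ** (-score)

for score in (1.0, 1.3, 2.0):
    print(score, round(100 * confidence_from_protscore(score), 1))
```

Under this assumption a score of 2 corresponds to 99 % confidence and a score of 1 to 90 %, which is one way to read the guideline thresholds.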