Part II. Tandem MS

Upload: travis-bicknell

Post on 31-Mar-2015


Page 1: Part II. Tandem MS. Mass filter; complete spectrum is obtained by scanning whole range Ions are lost Mass range 10- 4,000 Da Mass Analyzer (2) – Quadrupole

Part II.

Tandem MS

Page 2

Mass Analyzer (2) – Quadrupole

• Mass filter; a complete spectrum is obtained by scanning the whole range.
• Ions outside the transmitted m/z window are lost during the scan.
• Mass range: 10–4,000 Da.

Page 3

Hybrid Quadrupole/Time-of-Flight (Q-TOF) MS

[Figure: instrument schematic with labels Q1 (selection), Q2 (collision), pusher, TOF with reflectron, detector.]

Page 4

Electrospray MS and MS/MS of Proteins

Page 5

Sample Preparation

[Figure: workflow with labels tissue, fraction, gel; add trypsin to obtain peptides (MPSER……, GTDIMRPAK, ……); HPLC; to MS/MS.]

Page 6

Tandem Mass Spectrometer

[Figure: ESI Q-TOF schematic. A quadrupole mass analyzer selects parent ions (MPSER, SG…, PAK+); collision produces fragment ions (P+, K+, PA+, AK+, PAK+); a TOF mass analyzer and an ion detector record them. The fragment ions are used for peptide sequencing.]

Page 7

How Does a Peptide Fragment?

$m(y_1) = 19 + m(A_4)$
$m(y_2) = 19 + m(A_4) + m(A_3)$
$m(y_3) = 19 + m(A_4) + m(A_3) + m(A_2)$

$m(b_1) = 1 + m(A_1)$
$m(b_2) = 1 + m(A_1) + m(A_2)$
$m(b_3) = 1 + m(A_1) + m(A_2) + m(A_3)$
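As a rough illustration of the b- and y-ion formulas on this slide, the Python sketch below computes singly charged fragment masses for a peptide. It is not taken from the slides: the function name `b_y_ions` and the residue-mass table are illustrative, and the nominal offsets 1 and 19 are replaced by the monoisotopic proton mass and the water + proton mass.

```python
# Monoisotopic residue masses in Da (standard values; the table itself is illustrative).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728   # replaces the nominal "+1" of the b-ion formula
WATER = 18.01056   # WATER + PROTON replaces the nominal "+19" of the y-ion formula


def b_y_ions(peptide):
    """Return (b, y): b[i-1] is the mass of b_i, y[i-1] is the mass of y_i.

    b_i covers the first i residues, y_i the last i residues.
    The full-length ions b_n and y_n are omitted.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b, y = [], []
    prefix = 0.0
    for m in masses[:-1]:            # prefixes of length 1 .. n-1
        prefix += m
        b.append(prefix + PROTON)
    suffix = 0.0
    for m in reversed(masses[1:]):   # suffixes of length 1 .. n-1
        suffix += m
        y.append(suffix + WATER + PROTON)
    return b, y


b, y = b_y_ions("LGER")
for i, (bm, ym) in enumerate(zip(b, y), start=1):
    print(f"b{i} = {bm:.3f}   y{i} = {ym:.3f}")
```

Each pair b_i and y_(n-i) splits the peptide at the same bond, so their masses add up to the peptide mass plus water plus two protons.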

Page 8

How MS/MS corresponds to peptide

[Figure: two spectra over m/z for the peptide LGER. The b-ion peaks b1, b2, b3 read the sequence L, G, E, R from the N-terminus; the y-ion peaks y1, y2, y3 read R, E, G, L from the C-terminus.]

Page 9

Put both together

[Figure: one spectrum over m/z with the b-ion ladder (L, G, E, R) and the y-ion ladder (R, E, G, L) overlaid.]

In practice, there are many more peaks than just the b and y peaks, and many b and y peaks may be missing.

Page 10

Matching Sequence with Spectrum

Page 11

[Figure: a peptide sequence (LGSSEVEQVQLVVDGVK) gives an MS/MS spectrum through tandem mass spectrometry; the sequence LGSSEVEQVQLVVDGVK is recovered from the spectrum by de novo sequencing or with a database.]

Page 12

Database Search Methods

• Mascot
  – Matrix Science
  – General software
• Sequest
  – John Yates et al.
  – Distributed by Thermo Finnigan.
  – Works for Thermo's LTQ.
• PEAKS
  – Bin Ma et al.
  – Distributed by Bioinformatics Solutions Inc.
  – General software

Page 13

Mascot

Page 14

PEAKS

Page 15

De Novo Sequencing

• De novo sequencing (Dancik et al., JCB 6:327-342):
  – Given a spectrum and a mass value M, compute a sequence P such that m(P) = M and the matching score is maximized.
• We take the matching score of P to be the sum of the scores of the matched peaks.

Page 16

Spectrum Graph Approach

• Convert the peak list to a graph. A peptide sequence corresponds to a path in the graph.

• Bartels (1990), Biomed. Environ. Mass Spectrom 19:363-368.

• Taylor and Johnson (1997). Rapid Comm. Mass Spec. 11:1067-1075. (Lutefisk)

• Dancik et al. (1999), JCB 6:327-342.

• Chen et al. (2001), JCB 8:325-337.

• ……
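The idea of turning a peak list into a graph can be sketched as follows. This is a minimal illustration, not the construction used in any of the cited papers: it assumes the peaks have already been converted to prefix residue masses (including 0 and the total mass), and the function name, tolerance and three-letter residue table are hypothetical.

```python
def spectrum_graph(node_masses, residue_masses, tol=0.02):
    """Connect two nodes when their mass difference matches a residue mass.

    Returns a list of edges (lower_mass, higher_mass, residue).
    A peptide then corresponds to a path from the smallest to the largest node.
    """
    nodes = sorted(node_masses)
    edges = []
    for i, lo in enumerate(nodes):
        for hi in nodes[i + 1:]:
            gap = hi - lo
            for aa, m in residue_masses.items():
                if abs(gap - m) <= tol:
                    edges.append((lo, hi, aa))
    return edges


# Illustrative subset of monoisotopic residue masses (Da).
residues = {"P": 97.05276, "A": 71.03711, "K": 128.09496}
edges = spectrum_graph([0.0, 97.05, 168.09, 296.19], residues)
print(edges)
```

With a real spectrum the tolerance admits several residues per gap and missing peaks remove edges, which is where the difficulties discussed on the next slide come from.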

Page 17

Difficulties

• The spectrum graph approach has difficulty handling errors:
  – Missing ions break a path.
  – Too many peaks within a small error tolerance create too many edges connecting to the same peak (reducing efficiency).
  – Error accumulation.
  – A peak is used as both a y-ion and a b-ion.
• It is still possible to solve these problems under the spectrum graph scheme.
  – E.g. the y-b overlap problem was addressed by Dancik et al. (1999) and Chen et al. (2001).
  – But things get complicated.
  – Reliable signal preprocessing is required.

Page 18

PEAKS' approach

• It is more natural and handles errors and noise more easily.
  – Less dependent on signal preprocessing.
  – Solves the missing-ion and y-b overlap problems naturally.
  – Showed great success on real-life lab data.
  – Has been licensed by tens of research labs in the public and private sectors.

Page 19

A simplified case – Counting Only Y-ions

Page 20

The Score of a Suffix

• Let Q be a suffix of the peptide. It determines some y-ions.
• score(Q) is the sum of the scores of those y-ions of Q.
• $DP(m) = \max_{Q:\, m(Q) = m} score(Q)$

[Figure: a peptide suffix Q and its y-ions y1, y2, y3; label: 19.]

Page 21

Recursive Computation of DP(m)

• Suppose Q is such that DP(m) = score(Q), and write Q = aQ', where a is the first residue of Q.
• $score(Q) = h(m + 19) + score(Q')$, where h(m + 19) is the score of the y-ion of Q itself.
• $score(Q') = DP(m(Q'))$
• $DP(m) = h(m + 19) + DP(m - m(a))$
• Do not know a? Maximize over all residues: $DP(m) = h(m + 19) + \max_a DP(m - m(a))$

[Figure: the suffix Q split into its first residue a and the remaining suffix Q'; label: 19.]

Page 22

Dynamic Programming

1. for m from 0 to M: $DP(m) = h(m + 19) + \max_a DP(m - m(a))$

2. backtracking to decide the optimal peptide.
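The two steps can be sketched in Python as below. This is a simplified illustration of the y-ion-only case, not the PEAKS implementation: it assumes integer (nominal) masses, takes h(x) to be the score stored for a peak at mass x (0 if there is none), and sets DP(0) = 0 for the empty suffix. The function and parameter names are hypothetical.

```python
def de_novo_y_only(peaks, total_mass, residue_masses):
    """Find a peptide of mass total_mass maximizing the summed y-ion peak scores.

    peaks: dict mapping an integer peak mass to its score h.
    residue_masses: dict mapping a residue letter to its integer mass.
    Returns (sequence, score), or (None, -inf) if no sequence has that mass.
    """
    neg_inf = float("-inf")
    dp = [neg_inf] * (total_mass + 1)   # dp[m] = best score of a suffix of mass m
    choice = [None] * (total_mass + 1)  # first residue a of the best suffix of mass m
    dp[0] = 0.0                         # empty suffix
    for m in range(1, total_mass + 1):
        best, best_a = neg_inf, None
        for a, ma in residue_masses.items():
            if ma <= m and dp[m - ma] > best:
                best, best_a = dp[m - ma], a
        if best_a is not None:
            # DP(m) = h(m + 19) + max_a DP(m - m(a))
            dp[m] = peaks.get(m + 19, 0.0) + best
            choice[m] = best_a
    if dp[total_mass] == neg_inf:
        return None, neg_inf
    # Backtracking: each step peels off the first residue of the current suffix.
    seq, m = [], total_mass
    while m > 0:
        a = choice[m]
        seq.append(a)
        m -= residue_masses[a]
    return "".join(seq), dp[total_mass]


# Toy example with a four-letter alphabet and the three y-ions of PAK.
alphabet = {"P": 97, "A": 71, "K": 128, "S": 87}
print(de_novo_y_only({147: 1.0, 218: 1.0, 315: 1.0}, 296, alphabet))
```

With real data the masses are not integers, so the table would be indexed by mass bins and h would look up peaks within an error tolerance.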

Page 23

PEAKS – The Software

Page 24

Comparison

• LCQ data (ion trap instrument):
  – Generously provided by Dr. Richard Johnson. 144 spectra.
• Micromass Q-Tof data:
  – Measured in UWO's Protein ID lab. 61 spectra.
• Sciex Q-Star data:
  – Provided by U. Victoria's Genome BC Proteomics Centre. 13 good/okay spectra.

Page 25

PEAKS vs. Lutefisk

• completely correct sequences:
  – 38/144 vs. 15/144
• correct amino acids:
  – 1067/1702 vs. 767/1702
• partially correct sequences with 5 or more contiguous correct amino acids:
  – 94/144 vs. 64/144

Page 26

PEAKS vs. Micromass PLGS

• completely correct sequences:
  – 23/61 vs. 7/61
• correct amino acids:
  – 559/764 vs. 232/764
• partially correct sequences with 5 or more contiguous correct amino acids:
  – 50/61 vs. 24/61

Page 27

PEAKS vs. Sciex BioAnalyst

• completely correct sequences:
  – 7/13 vs. 1/13
• correct amino acids:
  – 115/150 vs. 86/150
• partially correct sequences with 5 or more contiguous correct amino acids:
  – 12/13 vs. 7/13

Page 28

Post Translational Modification (PTM)

Page 29

PTM

• PTMs are important to the functions of proteins.
• There are more than 500 types of PTMs included in the Unimod PTM database.
• For example: reversible phosphorylation of proteins is an important regulatory mechanism. Many enzymes are switched "on" or "off" by phosphorylation and dephosphorylation, through the structural change caused by the PTM.

Page 30

Phosphorylation

[Figure: chemical structures of S, T and Y and of their phosphorylated forms pS, pT and pY.]

Monoisotopic mass change: PO3H = 79.966

Page 31

PTM increases complexity

• Most protein databases do not contain PTM information. Therefore, when a PTM is present, one has to try different PTM possibilities to match a peptide with a spectrum.

• For the peptide LGSSEVTMVYLK, if only phosphorylation is considered, there are 16 possibilities: each of the four sites (S, S, T, Y) is either phosphorylated or not, giving 2^4 = 16.
  – What if there are 10 possible PTM sites?

• PTMs of this type are called variable PTMs.
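The count of 16 can be checked by enumerating the variants. The sketch below is only an illustration; the function name and the "p" prefix notation for a phosphorylated residue are assumptions borrowed from the pS/pT/pY labels used earlier in the slides.

```python
from itertools import product


def phospho_variants(peptide, sites="STY"):
    """List every combination of phosphorylated / unmodified S, T and Y residues."""
    # Each modifiable residue has two options; every other residue has one.
    options = [(aa, "p" + aa) if aa in sites else (aa,) for aa in peptide]
    return ["".join(combo) for combo in product(*options)]


variants = phospho_variants("LGSSEVTMVYLK")
print(len(variants))   # 2**4 variants for the four S/T/Y sites
print(variants[:3])
```

The number of variants doubles with every additional site, so 10 possible sites already give 2^10 = 1024 candidates per peptide.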

Page 32

Fixed PTM

• Some PTMs are known to be present all the time.
  – These are called fixed PTMs.

• Oxidation of M. Mass +16.
  – It happens automatically in the air, so people often make sure that all of the M are oxidized.

• Carboxyamidomethyl cysteine (CamC). Mass +57.02.
  – These are added intentionally to break the disulphide bonds.

• Fixed PTMs are easier.

Page 33

Variable PTM in DB Search and De Novo

• For DB search, one has to try different combinations.

• For de novo, each variable PTM is like adding a new amino acid.
  – For example, if pS, pT, pY are variable, then instead of having 20 characters in the alphabet, we have 23.
  – But too many variable PTMs will reduce the accuracy of the de novo sequencing.

Page 34

Peptide Identification v.s. Protein Identification

Page 35

Common procedure for protein ID

[Figure: proteins are digested into peptides; MS/MS and peptide sequencing identify peptides such as PAK, MSER, LSIMQDK, HIHGCDK, EIIGNEVR, ...; the peptides are matched against database entries (>protein A PAKGTIRHIHGCDKRGAPWPAS…, >protein B MSERNHLREIIGNEVR……, >protein C LSIMQDKDYSASFIS……) to obtain the protein ID.]

Page 36

Problems

• A peptide appears in several proteins.
• A protein family may share many peptides.
  – Usually only one of them is true.
• A protein may have only one peptide or two weak peptides. Is it a true or a false positive?
  – The "one hit wonder".

Page 37

Estimate False Positives

• Suppose you have a score for each identified protein. You want to choose a score threshold T.
  – Score > T: positive (keep)
  – Score <= T: negative (discard)

• It is important to estimate the false positive rate for each given result.

• False Positive Rate
  – In statistics, FPR = #false positives / #actual negatives = FP / (FP + TN).
  – We care more about FPR = #false positives / #results reported as positives = FP / (FP + TP), often called the false discovery rate (FDR).

                      Positive (prediction)   Negative (prediction)
Positive (reality)    TP                      FN
Negative (reality)    FP                      TN

The two definitions are different!
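To see how far apart the two definitions can be, the following sketch computes both from the four cells of the table. The function name and the counts in the example are made up for illustration.

```python
def error_rates(tp, fp, tn, fn):
    """Return (statistical FPR, fraction of reported positives that are false)."""
    fpr = fp / (fp + tn) if (fp + tn) else 0.0   # false positives among actual negatives
    fdr = fp / (fp + tp) if (fp + tp) else 0.0   # false positives among reported positives
    return fpr, fdr


# Hypothetical counts: 100 proteins reported, 10 of them wrong, 900 correctly rejected or missed.
fpr, fdr = error_rates(tp=90, fp=10, tn=890, fn=10)
print(f"FP/(FP+TN) = {fpr:.4f}, FP/(FP+TP) = {fdr:.4f}")
```

Because a database search rejects far more candidates than it accepts, the first rate can look tiny even when a noticeable share of the reported proteins is wrong.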

Page 38

Decoy Database Method

• Choose a decoy database:
  – for example, reverse the database.

• Anything from this database is false.
• Search in a real database and a decoy database separately.
  – For the same T, if there are x proteins in the decoy database with score > T, then perhaps there are x false proteins in the real database with score > T.

• Threshold T:
  – the real db has 497 proteins > T,
  – the decoy db has 7 proteins > T,
  – the false positive rate is 7/497 = 1.4%.
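A minimal sketch of this estimate is shown below. It assumes the search engine has already produced one score per protein for both searches; the function names, the `DECOY_` prefix and the toy scores are hypothetical.

```python
def reverse_decoy(database):
    """Build a decoy database by reversing every protein sequence."""
    return {"DECOY_" + name: seq[::-1] for name, seq in database.items()}


def estimated_false_rate(target_scores, decoy_scores, threshold):
    """Estimate the share of false hits among target proteins scoring above threshold."""
    n_target = sum(s > threshold for s in target_scores)
    n_decoy = sum(s > threshold for s in decoy_scores)
    return n_decoy / n_target if n_target else 0.0


print(reverse_decoy({"protein A": "PAKGTIR"}))

# Toy scores reproducing the counts on the slide: 497 target and 7 decoy hits above T.
target = [10.0] * 497 + [1.0] * 100
decoy = [10.0] * 7 + [1.0] * 590
print(f"{estimated_false_rate(target, decoy, threshold=5.0):.1%}")
```

Sweeping the threshold and recomputing the estimate is a common way to pick T for a desired error level.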

Page 39

Problems

• Only works for large datasets.
  – Not statistically significant when the dataset is small.

• Does not care how many proteins are actually kept.
  – Keeping only the true results is not our only goal; we also want to keep as many true results as possible.

• A decoy database is only good for validation and cannot substitute for a good scoring method.

Page 40

SPIDER – listen to both parties!

The solution when there is no protein database and no perfect MS/MS.

"Listen to both sides and you will be enlightened; heed only one side and you will be benighted."

Page 41

[Figure: three routes with their example outputs: de novo sequencing (EISGNEVR), database search in a protein DB (ESIGNEVR), and homology search in a protein DB (ESIGSEVR); label: SI.]

PEAKS: Ma et al., Rapid Comm. Mass Spec. 2003

PatternHunter: Ma, Tromp and Li, Bioinformatics. 2002

SPIDER: Han, Ma and Zhang, JBCB. 2005

Page 42

Two purposes of our research

1. Given de novo sequence with errors, find homolog of the real sequence. (searching)

2. Using the de novo sequence and the homolog as input, compute the real sequence. (sequencing)

Page 43

"Listen to both sides and you will be enlightened; heed only one side and you will be benighted."

[Figure: the de novo sequence LSCFAV and the homolog DACFKAV are combined to obtain EACFAV.]

Page 44

Homology mutations

• Sequence alignment

EACF-AVQR
DACFKAV-R

• Also called edit distance

$dist = cost(E, D) + 2 \cdot indel$
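For reference, the edit distance between two sequences can be computed with the standard dynamic program sketched below. The unit substitution cost is a placeholder assumption; a real search would plug in a cost derived from a substitution matrix.

```python
def edit_distance(x, z, indel=1.0, cost=lambda a, b: 0.0 if a == b else 1.0):
    """Weighted edit distance between sequences x and z."""
    n, k = len(x), len(z)
    # D[i][j] = distance between x[:i] and z[:j]
    D = [[0.0] * (k + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i * indel
    for j in range(1, k + 1):
        D[0][j] = j * indel
    for i in range(1, n + 1):
        for j in range(1, k + 1):
            D[i][j] = min(
                D[i - 1][j] + indel,                           # x[i] aligned to a gap
                D[i][j - 1] + indel,                           # z[j] aligned to a gap
                D[i - 1][j - 1] + cost(x[i - 1], z[j - 1]),    # x[i] aligned to z[j]
            )
    return D[n][k]


# The alignment on this slide: one substitution (E/D) and two indels (K, Q).
print(edit_distance("EACFAVQR", "DACFKAVR"))
```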

Page 45

Common de novo sequencing errors

• Same-mass replacement: AN? NA? GAG?
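The three candidates cannot be told apart by mass alone because N weighs almost exactly as much as GG. The sketch below checks this with monoisotopic residue masses; the helper names and the 0.01 Da tolerance are illustrative.

```python
from itertools import combinations

# Monoisotopic residue masses in Da (only the residues needed here).
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "N": 114.04293}


def mass(seq):
    """Sum of residue masses of a short sequence."""
    return sum(RESIDUE_MASS[aa] for aa in seq)


def same_mass(a, b, tol=0.01):
    """True if the two segments are indistinguishable within tol Da."""
    return abs(mass(a) - mass(b)) <= tol


for a, b in combinations(["AN", "NA", "GAG"], 2):
    print(a, b, f"{mass(a):.5f}", f"{mass(b):.5f}", same_mass(a, b))
```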

Page 46

Two exercises

(denovo)  X: LSCFV
(real)    Y: EACFV
(homolog) Z: DACFV

m(LS) = m(EA) = 200.1 mu

(denovo)  X: LSCFAV
(real)    Y: SLCFAV
(homolog) Z: SLCF-V

BLOSUM62

Page 47

More formally

• Let $d_s(X, Y)$ = cost of sequencing errors between X and Y, and $d_e(Y, Z)$ = edit distance between Y and Z.

• Sequencing: Given de novo sequence X and homolog Z, find Y such that $d_s(X, Y) + d_e(Y, Z)$ is minimized.

• Let $d(X, Z) = \min_Y \left( d_s(X, Y) + d_e(Y, Z) \right)$.

• Searching: search a database for Z such that d(X,Z) is minimized.

[Figure: X is linked to Y by seqError, and Y is linked to Z by editDist.]

Page 48

How to compute ds(X,Y)

• Easily align X and Y together (according to mass).

(denovo) X: LSCFAV
(real)   Y: EACFAV

• For each erroneous mass block with mass $m_i$, define the cost to be $f(m_i)$.

• Define $d_s(X, Y) = \sum_i f(m_i)$.

[Figure: X is linked to Y by seqError, and Y is linked to Z by editDist.]
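The mass alignment and the block cost can be sketched as follows. This is an illustration under stated assumptions rather than SPIDER's code: the slides do not give the form of f, so a constant cost of 1 per erroneous block is used as a placeholder, and the tolerance and function names are hypothetical.

```python
# Monoisotopic residue masses in Da (only the residues needed for the example).
RESIDUE_MASS = {
    "A": 71.03711, "S": 87.03203, "V": 99.06841, "C": 103.00919,
    "L": 113.08406, "E": 129.04259, "F": 147.06841,
}


def mass_blocks(x, y, tol=0.05):
    """Align x and y by mass and return the blocks where they disagree.

    Each returned block is (segment of x, segment of y, block mass).
    """
    blocks = []
    i = j = 0
    while i < len(x) and j < len(y):
        start_i, start_j = i, j
        mx, my = RESIDUE_MASS[x[i]], RESIDUE_MASS[y[j]]
        i, j = i + 1, j + 1
        # Extend the lighter side until both segments have the same mass.
        while abs(mx - my) > tol:
            if mx < my:
                if i == len(x):
                    raise ValueError("sequences cannot be aligned by mass")
                mx += RESIDUE_MASS[x[i]]
                i += 1
            else:
                if j == len(y):
                    raise ValueError("sequences cannot be aligned by mass")
                my += RESIDUE_MASS[y[j]]
                j += 1
        if x[start_i:i] != y[start_j:j]:
            blocks.append((x[start_i:i], y[start_j:j], mx))
    if i != len(x) or j != len(y):
        raise ValueError("sequences have different total mass")
    return blocks


def sequencing_error_cost(x, y, f=lambda m: 1.0):
    """d_s(X, Y) = sum of f(m_i) over the erroneous mass blocks (f is a placeholder)."""
    return sum(f(m) for _, _, m in mass_blocks(x, y))


print(mass_blocks("LSCFAV", "EACFAV"))
print(sequencing_error_cost("LSCFAV", "EACFAV"))
```

For the slide's example the only erroneous block is LS against EA, with a mass of about 200.1.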

Page 49

How to compute d(X,Z)

• A multiple alignment can be built from alignments (X,Y) and (Y,Z).

(denovo)  X: LSCF-AV
(real)    Y: EACF-AV
(homolog) Z: DACFKAV

• Lemma: $d(X, Z) = \sum_i d(X_i, Z_i)$, where $X_i$ and $Z_i$ are the aligned blocks of X and Z.

• Dynamic Programming!

• Let $D(i, j) = d(X[1..i], Z[1..j])$.

[Figure: X is linked to Y by seqError, and Y is linked to Z by editDist.]

Page 50

Four cases of the last Block

(A) $D(i, j) = D(i, j-1) + indel$

(B) $D(i, j) = D(i-1, j) + indel$

(C) no sequencing error: $D(i, j) = D(i-1, j-1) + dist(X[i], Z[j])$

(D) $D(i, j) = D(i'-1, j'-1) + \alpha(X[i'..i], Z[j'..j])$, where $\alpha(A, C)$ is the cost of a last block in which segment A of X contains sequencing errors and is matched to segment C of Z.

D(i,j) is the minimum of the four cases.

[Figure: the four cases drawn on the prefixes X[1..i], Z[1..j-1] and Z[1..j].]

Page 51

How to compute $\alpha(A, C)$

With $m = m(A)$:

$\alpha(A, C) = f(m) + \min_{B:\, m(B) = m} d_e(B, C) = f(m) + \delta(m, C)$,

where $\delta(m, C) = \min_{B:\, m(B) = m} d_e(B, C)$.

[Figure: segment A of X, a segment B with m(B) = m(A), and segment C of Z.]

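Page 52

Three cases of the alignment

Let $\delta(m, C)$ be the minimum edit distance between C and any sequence B of mass m, let n = |C|, and let b range over the possible last residues of B:

$\delta(m, C) = \min \begin{cases} \delta(m, C[1..n-1]) + indel & (1) \\ \min_b \delta(m - m(b), C) + indel & (2) \\ \min_b \delta(m - m(b), C[1..n-1]) + dist(b, C[n]) & (3) \end{cases}$

[Figure: the three cases: (1) C[n] aligned to a gap; (2) b aligned to a gap, leaving mass m - m(b) against C; (3) b aligned to C[n], leaving mass m - m(b) against C[1..n-1].]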

Page 53

The algorithm for computing $\delta(m, Z[i..j])$

1. for m from 0 to m(X) step Δ
     for i from 0 to |Z|
       for j from i to |Z|

$\delta(m, Z[i..j]) = \min \begin{cases} \delta(m, Z[i..(j-1)]) + indel \\ \min_y \delta(m - m(y), Z[i..j]) + indel \\ \min_y \delta(m - m(y), Z[i..(j-1)]) + dist(y, Z[j]) \end{cases}$

Time complexity: $O(M n^2)$
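The quantity computed on this slide, the smallest edit distance between a segment of Z and any sequence of a given mass, can be sketched for a single segment as below. This is a simplified illustration rather than SPIDER's implementation: it uses nominal integer masses with step 1, unit substitution and indel costs, and a hypothetical function name.

```python
from math import inf

# Nominal (integer) residue masses; L/I and K/Q coincide at this resolution.
NOMINAL_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128, "E": 129,
    "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}


def min_dist_for_mass(m, segment, indel=1.0,
                      dist=lambda a, b: 0.0 if a == b else 1.0):
    """Minimum edit distance between `segment` and any sequence of mass m.

    Returns inf when no residue sequence has mass m.
    """
    n = len(segment)
    # table[mm][k] = best distance between mass mm and segment[:k]
    table = [[inf] * (n + 1) for _ in range(m + 1)]
    for mm in range(m + 1):
        for k in range(n + 1):
            if mm == 0 and k == 0:
                table[0][0] = 0.0
                continue
            best = inf
            if k > 0:
                # Last residue of the segment aligned to a gap.
                best = min(best, table[mm][k - 1] + indel)
            for y, my in NOMINAL_MASS.items():
                if my <= mm:
                    # Residue y of the unknown sequence aligned to a gap.
                    best = min(best, table[mm - my][k] + indel)
                    if k > 0:
                        # Residue y aligned to the last residue of the segment.
                        best = min(best, table[mm - my][k - 1] + dist(y, segment[k - 1]))
            table[mm][k] = best
    return table[m][n]


# A block of nominal mass 200 (e.g. LS or EA) against the homolog segment "DA".
print(min_dist_for_mass(200, "DA"))
```

Running this for every segment Z[i..j] and every mass up to m(X) gives the table used by the final algorithm.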

Page 54

The algorithm for computing d(X,Z) and Y

1. for i from 1 to |X|
     for j from 1 to |Z|

$D(i, j) = \min \begin{cases} D(i-1, j) + indel \\ D(i, j-1) + indel \\ D(i-1, j-1) + dist(X[i], Z[j]) \\ \min_{i', j'} \left[ D(i'-1, j'-1) + \delta(m(X[i'..i]), Z[j'..j]) + f(m(X[i'..i])) \right] \end{cases}$

   where $\delta(m, C)$ is the minimum edit distance between C and any sequence of mass m.

2. output D(|X|,|Z|) as d(X,Z).

3. backtracking to get the best middle sequence Y.

Time complexity: $O(n^4)$

Total time complexity: $O(n^4 + M n^2)$

Page 55

Experiment

• 28 spectra from ALBU_BOVIN.
• PEAKS de novo sequencing gives 13 correct and 15 partially correct sequences.
• SPIDER found good peptide homologues in the human protein DB for all.
• 24 constructed correct peptide sequences.

[Figure: example with PEAKS de novo output EAEGNEVR, human DB homolog ESIGSEVR, and SPIDER output ESIGNEVR from ALBU_BOVIN. Counts shown: 28 spectra; PEAKS 13+15; homologs 28; SPIDER 24+4.]

Page 56

Two exemplary results

(denovo)  X: CCQ[W ]DAEAC[AF]<NN><PG>K
(real)    Y: CCK AD DAEAC FA VE GP K
(homolog) Z: CCK[AD]DKETC[FA]<EE><GK>K

(denovo)  X: FVE<RDG>LVTD[TL]K
(real)    Y: FVE VTK LVTD LT K
(homolog) Z: FAE<VSK>LVTD[LT]K

Figure labels: homology mutations; sequencing errors.

Page 57

Four modes in SPIDER

• Homology mode
• Non-gapped homology mode
  – Assume sequencing errors and homology mutations do not overlap.
• Segment match mode
  – Assume no homology mutations.
• Exact match mode
  – Assume no sequencing errors or homology mutations.

Page 58

Experiment

• 144 ion trap MS/MS spectra, lower quality spectra.
• The proteins are all in Swissprot but not in the human database.
• PEAKS 2.0 was used for de novo sequencing.
• SPIDER searches the Swissprot and human databases, respectively.

Page 59

People like SPIDER

• Best Paper Award at CSB2004
• Some random emails we received
  – "I'm a big SPIDER fan!" Shinichi Iwamoto, Shimadzu Corporation
  – "The results I've been getting have been consistently very good. Thank you for this great piece of software!" Jason W. H. Wong, University of Oxford
  – "Your software is by far the fastest and more user-friendly I have found." Juan Luis, University of Georgia
  – ……
  – "I plan to teach SPIDER in my Advanced Bioinformatics class. I wonder if your powerpoint slides are available?" Pavel Pevzner, Ronald R. Taylor Professor of Computer Science, UCSD
• Included in PEAKS as both a separate tool and an intermediate step in protein candidate generation.
• The best is yet to come
  – People have just started using the de novo + homology approach.