Post on 18-Dec-2015
Andreas Bender - Research Group Gisbert Schneider - Goethe-University Frankfurt 1
Analysis of mitochondrial transit peptides
of Plasmodium falciparum
Andreas Bender
Diplomarbeit
Research Group Gisbert Schneider
April 2002 - September 2002
Goethe-University, Frankfurt
Contents
• Why … ?
• Our results – in short
• Biological background
• Data coding and analysis
• Detailed results
• P. falciparum and other organisms
• Summary and outlook
Why … ?
• Why P. falciparum?
  – It causes malaria
  – Genome sequencing recently completed
  – "Apicoplastic pressure"
  – Closely related to Toxoplasma gondii etc.
Why … ?
• Why mitochondrial transit peptides?
  – Recent related work for apicoplast exists
  – Major compartment
  – Failure of established tools
Our results – in short
• Artificial neural network results:
  – Matthews correlation coefficient cc = 0.74 (test set), corresponding to ~90% correct predictions
  – 381 to 1177 mTPs found in 5334 annotated genes (7% to 22%) of Plasmodium falciparum
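For reference, the Matthews correlation coefficient can be computed from the four confusion-matrix counts. The sketch below is illustrative only: the function names and the example counts are hypothetical and are not taken from the thesis data.

```python
import math


def matthews_cc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def fraction_correct(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all examples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical toy counts, chosen only to make the arithmetic easy to follow.
toy = dict(tp=8, tn=8, fp=2, fn=2)
toy_cc = matthews_cc(**toy)
toy_correct = fraction_correct(**toy)
```

Unlike the plain fraction of correct predictions, cc stays informative when the positive and negative classes differ strongly in size.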
Biological background
[Figure: female Anopheles]
Biological background
[Figure courtesy of Mark F. Wiser, Tulane University]
Biological background - Targeting
[Figure courtesy of the Division of Biological Sciences, University of Montana]
Biological background
• Mitochondrial targeting signals – Characteristics
  – N-terminal, internal, C-terminal
  – Matrix-targeting or IMS-targeting (bipartite)
  – No sequence conservation
  – On average 20-30 amino acids
  – Net positive charge, forms an α-helix
  – Distinct cleavage site (Arg at -2 or -3, …)
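The charge and cleavage-site characteristics can be checked on a sequence with a few lines of Python. This is a rough sketch under stated assumptions: Lys and Arg count as +1, Asp and Glu as -1, His is ignored, and the example sequence and cleavage position are made up for illustration.

```python
POSITIVE = set("KR")  # assumption: Lys and Arg carry +1
NEGATIVE = set("DE")  # assumption: Asp and Glu carry -1


def net_charge(sequence: str, n_terminal: int = 24) -> int:
    """Net charge of the first n_terminal residues (His ignored)."""
    window = sequence.upper()[:n_terminal]
    plus = sum(res in POSITIVE for res in window)
    minus = sum(res in NEGATIVE for res in window)
    return plus - minus


def has_arg_near_cleavage(sequence: str, cleavage_index: int) -> bool:
    """True if Arg sits at position -2 or -3 relative to the cleavage site.

    cleavage_index is the 0-based index of the first residue after the cut,
    so position -1 corresponds to sequence[cleavage_index - 1].
    """
    for offset in (2, 3):
        pos = cleavage_index - offset
        if 0 <= pos < len(sequence) and sequence[pos].upper() == "R":
            return True
    return False


# Hypothetical sequence, not a real transit peptide.
toy_peptide = "MLSRKAARSLAS"
toy_charge = net_charge(toy_peptide)
toy_has_arg = has_arg_near_cleavage(toy_peptide, cleavage_index=9)
```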
Data coding and analysis
• Three lengths: N-terminal 24, 31, 42 residues
• Redundancy reduction
• Two representations:
  – Relative amino acid frequencies (20-dim.)
  – Physicochemical properties (19-dim.)
• SOM (self-organizing map)
• ANN (artificial neural network)
• Variable selection
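A minimal sketch of the relative amino acid frequency encoding, assuming the 20 standard residues in alphabetical one-letter order and normalisation by the number of standard residues in the window. The function name, the ordering, and the example sequence are assumptions, not the original implementation.

```python
from collections import Counter

# Assumed ordering of the 20 input dimensions (alphabetical one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def relative_frequencies(sequence: str, n_terminal: int = 24) -> list[float]:
    """20-dimensional relative amino acid frequencies of the N-terminal window."""
    window = sequence.upper()[:n_terminal]
    # Non-standard symbols (e.g. X) are skipped rather than counted.
    counts = Counter(res for res in window if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical sequence, encoded once per window length used in the talk.
toy_sequence = "MLSRKAARSLASRLLNKKNYSTFSAVLGHHTPEKWQ"
vectors = {n: relative_frequencies(toy_sequence, n) for n in (24, 31, 42)}
```

Each vector sums to 1, so sequences shorter than the window remain comparable with full-length ones.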
Data coding and analysis
Data coding and analysis
• Three-layer feed-forward perceptrons
• Input data
  – N-terminal 24, 31 and 42 amino acids
  – Coded in relative amino acid frequency and in physicochemical space
• All parameters varied one at a time
• 10-fold cross-validation, 40 positive examples, 135 negative
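The 10-fold split of 40 positive and 135 negative examples can be sketched as follows. Whether the original folds were stratified by class is not stated; stratification, the seed, and the function name are assumptions made for this illustration.

```python
import random


def stratified_folds(labels: list[int], k: int = 10, seed: int = 0) -> list[list[int]]:
    """Split example indices into k folds with near-equal class proportions."""
    rng = random.Random(seed)
    folds: list[list[int]] = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin over the folds.
        for j, idx in enumerate(indices):
            folds[j % k].append(idx)
    return folds


labels = [1] * 40 + [0] * 135  # 40 positive, 135 negative examples
folds = stratified_folds(labels)

for test_fold in folds:
    held_out = set(test_fold)
    train = [i for i in range(len(labels)) if i not in held_out]
    # A network would be trained on `train` and evaluated on `test_fold` here.
```

With these class sizes every fold holds 4 positives and 13 or 14 negatives.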
Data coding and analysis
[Figure: best Matthews coefficient (y-axis, 0.6 to 0.85) versus N-terminal length (24, 31, 42) for two encodings: relative amino acid frequency and physicochemical properties]
Data coding and analysis
• Two ANNs
  – Best cc: 1177 of 5334 annotated genes have mTPs (~22%)
  – High penalty for overpredictions: 381 of 5334 annotated genes have mTPs (~7%)
  – Arabidopsis thaliana: 8% mTPs
  – Saccharomyces cerevisiae: 11% mTPs
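The two percentages follow directly from the gene counts on this slide. A short check, with a hypothetical helper name:

```python
ANNOTATED_GENES = 5334  # annotated P. falciparum genes, as stated on the slide


def percent_predicted(n_predicted: int, n_total: int = ANNOTATED_GENES) -> float:
    """Share of annotated genes predicted to carry an mTP, in percent."""
    return 100.0 * n_predicted / n_total


best_cc_share = percent_predicted(1177)      # network with the best cc
conservative_share = percent_predicted(381)  # network penalising overprediction
```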
Data coding and analysis
Tool         Matthews cc   Sensitivity   Selectivity
MitoProtII   0.49          0.80          0.47
TargetP      0.60          0.55          0.81
PlasMit      0.74          0.94          0.68
P. falciparum and other organisms
[Figure: relative amino acid frequency in P. falciparum (y-axis, 0 to 3.5) per amino acid (one-letter code): G A V L I M F W P S T C Y N Q D E K R H]
P. falciparum and other organisms
• 25% G+C content in coding regions (sample of chromosomes 2 and 3)
• In good agreement with the work of Lobry on 50 bacterial genomes
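G+C content is the fraction of G and C among the unambiguous bases of a sequence. The sketch below shows the calculation; the function name and the toy sequence are hypothetical, and ambiguous symbols such as N are excluded by assumption.

```python
def gc_content(dna: str) -> float:
    """Fraction of G and C among the A, C, G and T bases of a sequence."""
    seq = dna.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = gc + sum(seq.count(base) for base in "AT")
    if total == 0:
        return 0.0  # no unambiguous bases to count
    return gc / total


# Hypothetical AT-rich toy sequence, not taken from chromosome 2 or 3.
toy_dna = "ATGAAATTTGCAAATTTA"
toy_gc = gc_content(toy_dna)
```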
Summary
• Failure of established tools for mTP prediction
• There are general differences in AA usage between P. falciparum and other eukaryotes
• Low G+C content of coding regions
• New tool PlasMit outperforms existing algorithms
Outlook
• Question: Why are there so many positive predictions in P. falciparum?
• Using PlasMit for assembling putative metabolic pathways in the mitochondria will now be possible
• Final goal: Full map of P. falciparum's metabolism
Thank you!