Prediction of protein localization and membrane protein topology
![Page 1: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/1.jpg)
Prediction of protein localization and membrane protein topology
Gunnar von Heijne
Department of Biochemistry and Biophysics
Stockholm Bioinformatics Center
Stockholm University
![Page 2: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/2.jpg)
Stockholm Bioinformatics Center
www.sbc.su.se
![Page 3: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/3.jpg)
Protein localization
![Page 4: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/4.jpg)
Protein sorting in a eukaryotic cell
![Page 5: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/5.jpg)
The 'canonical' signal peptide
[Figure: a signal peptide divided into n-, h- and c-regions, with positions -3 and -1 marked]
n-region: positively charged
h-region: hydrophobic
c-region: more polar, small residues in -1, -3
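The preference for small residues at -3 and -1 can be written as a simple check. This is a minimal sketch, not how any of the predictors discussed here work: the set of "small" residues (A, G, S), the helper name `fits_minus3_minus1` and the toy sequence are all assumptions made for illustration.

```python
# Assumption: which residues count as "small" is illustrative, not taken from the slides.
SMALL = set("AGS")

def fits_minus3_minus1(seq: str, cleavage: int) -> bool:
    """Return True if positions -3 and -1 before the cleavage site are small.

    `cleavage` is the 0-based index of the first residue of the mature protein,
    so position -1 is seq[cleavage - 1] and position -3 is seq[cleavage - 3].
    """
    if cleavage < 3 or cleavage > len(seq):
        return False
    return seq[cleavage - 1] in SMALL and seq[cleavage - 3] in SMALL

# Made-up sequence: n-region "MKK", h-region "LLLLLLLL", c-region "SASA".
toy = "MKKLLLLLLLLSASAQE"
print(fits_minus3_minus1(toy, 15))
print(fits_minus3_minus1(toy, 16))
```

A fixed rule like this only screens candidate cleavage sites; it says nothing about the n- and h-regions.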
![Page 6: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/6.jpg)
mTPs are rich in R & K and can form amphiphilic helices
(Abe et al., Cell 100:551)
mTP bound to Tom20
![Page 7: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/7.jpg)
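Typical chloroplast transit peptide
[Figure: schematic of a cTP, from the N-terminal MA- to the motif (I/V)-X-A-A and the mature protein. Region annotations: no G, P, K, R; no D, E and high S, T (repeated for three regions); high R.]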
![Page 8: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/8.jpg)
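A simple artificial neural network (ANN)
[Figure: an input layer of 12 units (A C G T for each of three sequence positions), in which the sequence AAG is encoded as 1 0 0 0 | 1 0 0 0 | 0 0 1 0, and an output layer with two units, "ACG" and "not ACG"]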
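The input encoding on this slide, four units per sequence position with a single 1 marking the base, can be sketched as follows. The function name `one_hot` is an assumption; the alphabet order A, C, G, T follows the slide.

```python
ALPHABET = "ACGT"  # unit order per position, as on the slide

def one_hot(seq: str) -> list[int]:
    """Encode a DNA sequence as a flat list with four 0/1 units per position."""
    bits = []
    for base in seq:
        bits.extend(1 if base == letter else 0 for letter in ALPHABET)
    return bits

print(one_hot("AAG"))  # [1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0]
```

For protein sequences the same scheme would use 20 units per position.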
![Page 9: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/9.jpg)
Artificial neural networks: a summary
- a high-quality dataset (positive and negative examples)
- an ANN architecture (can be optimized)
- all internal parameters in the ANN are systematically optimized during a training session
- evaluate the predictive performance using cross-validation
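The cross-validation step in the list above can be illustrated by how the dataset is partitioned. This is a minimal sketch using only the standard library; the function name `k_fold_indices`, the choice of five folds and the fixed seed are assumptions, and the slides do not say which scheme was used.

```python
import random

def k_fold_indices(n_examples: int, k: int, seed: int = 0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes into the same fold.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

for train, test in k_fold_indices(10, 5):
    print(len(train), len(test))
```

Each example is used for testing exactly once, and the network is retrained on the remaining folds each time.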
![Page 10: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/10.jpg)
ChloroP (Prot. Sci. 8:978)
[Figure: network score (-0.2 to 1.0) and MEME score (-30 to 10) plotted against residue position (0-100)]
![Page 11: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/11.jpg)
TargetP - a four-state SP/mTP/cTP/other predictor
(JMB 300:1105)
![Page 12: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/12.jpg)
TargetP sensitivity/specificity

| | sens | spec |
|---|---|---|
| SP | 0.91 | 0.96 |
| mTP | 0.82 | 0.90 |
| cTP | 0.85 | 0.69 |
| other | 0.85 | 0.78 |

sens = tp/(tp+fn)
spec = tp/(tp+fp)
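The two measures can be computed directly from prediction counts. The counts below are hypothetical and are not TargetP results. Note that specificity as defined on this slide, tp/(tp+fp), is the quantity often called precision or positive predictive value.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true members of a class that are predicted as such."""
    return tp / (tp + fn)

def specificity(tp: int, fp: int) -> float:
    """Fraction of predictions for a class that are correct (slide definition)."""
    return tp / (tp + fp)

# Hypothetical counts for one class.
tp, fn, fp = 90, 10, 30
print(sensitivity(tp, fn), specificity(tp, fp))  # 0.9 0.75
```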
![Page 13: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/13.jpg)
Other ways to predict localization
- amino acid composition
- sequence homology
- domain structure
- phylogenetic profiles
- expression profiles
![Page 14: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/14.jpg)
Popular prediction programs
SignalP (NN, HMM)
ChloroP
TargetP
LipoP
-------
MitoProt
PSORT
www.cbs.dtu.dk
![Page 15: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/15.jpg)
Membrane protein topology
![Page 16: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/16.jpg)
A simulated lipid bilayer (Grubmüller et al.)
![Page 17: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/17.jpg)
Only two basic structures (Quart. Rev. Biophys. 32:285)
Helix bundle, β-barrel
![Page 18: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/18.jpg)
Most MPs are synthesized at the ER
![Page 19: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/19.jpg)
The basic model (courtesy Bill Skach)
![Page 20: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/20.jpg)
Topology prediction
![Page 21: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/21.jpg)
TM helix lengths are typically 20-30 residues
(Bowie, JMB 272:780)
![Page 22: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/22.jpg)
Trp & Tyr are enriched in the region near the lipid headgroups
(Prot.Sci. 6:808; 7:2026)
![Page 23: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/23.jpg)
Loops tend to be short (Tusnady & Simon, JMB 283:489)
![Page 24: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/24.jpg)
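The 'positive inside' rule (EMBO J. 5:3021; EJB 174:671, 205:1207; FEBS Lett. 282:41)
[Figure: a membrane protein with N- and C-termini, the two sides of the membrane labeled "in" and "out", and positive charges (+ + +) marked]

| Membrane | in | out |
|---|---|---|
| Bacterial IM | 16% KR | 4% KR |
| Eukaryotic PM | 17% KR | 7% KR |
| Thylakoid membrane | 13% KR | 5% KR |
| Mitochondrial IM | 10% KR | 3% KR |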
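Percentages like those above are the fraction of Lys and Arg residues among loop residues on each side of the membrane. A minimal sketch of that count, using made-up loop sequences; the function name `kr_fraction` is an assumption.

```python
def kr_fraction(loops: list[str]) -> float:
    """Fraction of K and R residues among all residues in the given loops."""
    total = sum(len(loop) for loop in loops)
    if total == 0:
        return 0.0
    positives = sum(loop.count("K") + loop.count("R") for loop in loops)
    return positives / total

# Made-up loop sequences, not taken from any real protein.
inside_loops = ["MKRS", "GKTRA"]   # 4 of 9 residues are K or R
outside_loops = ["SDGT", "NEQKP"]  # 1 of 9 residues is K or R

print(f"in: {kr_fraction(inside_loops):.0%}  out: {kr_fraction(outside_loops):.0%}")
```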
![Page 25: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/25.jpg)
The positive-inside rule applies to all organisms
(Nilsson, Persson & von Heijne, submitted)
[Figure: bar chart of number of genomes (0-110) for each amino acid (A C D E F G H I K L M N P Q R S T V W Y) and for the groups (D+E), (K+R) and (W+Y)]
![Page 26: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/26.jpg)
Topology can be manipulated (Nature 341:456)
Lep constructs expressed in E. coli
[Figure: membrane topologies of Lep wt, Lep' and Lep'-inv between periplasm and cytoplasm, showing helices H1 and H2, domains P1 and P2, and the positive and negative charges flanking them. Sequence labels: f-Met-Ala-Asn-Met-Phe-, Ala-Asn-Met-(Lys)-Phe-, QSLNASASE.]
![Page 27: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/27.jpg)
Topology prediction - a classical problem in bioinformatics
MDSQRNLLVIALLFVSFMIWQAWE....
![Page 28: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/28.jpg)
Three important characteristics
- ~20 hydrophobic residues
- 'Positive inside' rule
- Trp, Tyr
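The first characteristic, a stretch of about 20 hydrophobic residues, is what a sliding-window hydrophobicity scan looks for. The sketch below uses the Kyte-Doolittle scale with a 19-residue window and a cutoff of 1.6; the scale, window, cutoff and function name are assumptions chosen for illustration and are not taken from the slides. The example sequence is the fragment shown on the preceding slide.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydrophobic_windows(seq: str, window: int = 19, threshold: float = 1.6):
    """Return (start, mean hydropathy) for every window at or above the threshold."""
    hits = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        mean = sum(KD[aa] for aa in segment) / window
        if mean >= threshold:
            hits.append((start, mean))
    return hits

# Fragment from the slide "Topology prediction - a classical problem in bioinformatics".
for start, mean in hydrophobic_windows("MDSQRNLLVIALLFVSFMIWQAWE"):
    print(start, round(mean, 2))
```

A scan like this only proposes candidate TM segments; orientation needs the other two characteristics.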
![Page 29: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/29.jpg)
Popular topology predictors
TMHMM (HMM)
HMMTOP (HMM)
TopPred (h-plot + PI-rule)
MEMSAT (dynamic programming)
TMAP (h-plot, mult. alignment)
PHD (NN, mult. alignment)
![Page 30: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/30.jpg)
TopPred (JMB 225:487)
http://bioweb.pasteur.fr/seqanal/interfaces/toppred.html
[Figure: hydrophobicity plot, <H> (-3 to 3) versus position (0-400), for E. coli LacY, with the number of positively charged residues in each loop for two candidate topologies, ∆+ = 17 and ∆+ = 9]
- construct all possible topologies
- rank based on ∆+
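Ranking candidate topologies by the charge difference can be sketched as below. This is a simplified illustration of the idea, not the TopPred implementation: the loop counts are hypothetical, the function names are assumptions, and refinements such as ignoring long loops are left out.

```python
def delta_plus(loop_charges: list[int]) -> tuple[int, str]:
    """Return (charge difference, predicted N-terminal side) for one topology.

    loop_charges[i] is the number of K+R residues in loop i, counted from the
    N-terminus. Loops alternate sides, so even-indexed loops lie on the same
    side as the N-terminus. The more positive side is taken as "in".
    """
    even = sum(loop_charges[0::2])
    odd = sum(loop_charges[1::2])
    n_term = "in" if even >= odd else "out"
    return abs(even - odd), n_term

def rank_topologies(candidates: dict[str, list[int]]):
    """Sort candidate topologies by decreasing charge difference."""
    scored = [(name, *delta_plus(charges)) for name, charges in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical K+R counts per loop for two candidate topologies of one protein.
candidates = {
    "4 helices": [3, 0, 4, 1, 2],
    "3 helices": [3, 0, 5, 2],
}
print(rank_topologies(candidates))
# [('4 helices', 8, 'in'), ('3 helices', 6, 'in')]
```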
![Page 31: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/31.jpg)
TMHMM (Sonnhammer et al., ISMB 6:175; Krogh et al., JMB 305:567)
A hidden Markov model-based method
www.cbs.dtu.dk
www.sbc.su.se
![Page 32: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/32.jpg)
HMMTOP (Tusnady & Simon, JMB 283:489)
![Page 33: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/33.jpg)
Helix & loop models in TMHMM
![Page 34: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/34.jpg)
TMHMM performance (Krogh et al., JMB 305:567; Melén et al., JMB 327:735)
Discrimination globular/membrane: sens & spec > 98%
Correct topology: 55-60%
Single TM identification: sensitivity 96%, specificity 98%
Training set: 160 membrane proteins, 650 globular proteins
![Page 35: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/35.jpg)
Can performance be improved?
Consensus predictions
Multiple alignments
Experimental constraints
![Page 36: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/36.jpg)
'Consensus' predictions indicate reliability
(FEBS Lett. 486:267)
[Figure: fraction correct and coverage (0 to 1) versus majority level (5/0, 4/1, 3/2 & 3/1/1, 2/1/1/1) for 60 E. coli proteins]
5 prediction methods used
46% of 764 predicted E. coli IM proteins are in the 5/0 or 4/1 classes
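The majority level describes how the five methods split between the topologies they predict. A minimal sketch of that bookkeeping; the topology labels and the function name `majority_level` are hypothetical.

```python
from collections import Counter

def majority_level(predictions: list[str]) -> str:
    """Summarize agreement among methods, e.g. '5/0', '4/1' or '3/1/1'."""
    counts = sorted(Counter(predictions).values(), reverse=True)
    if len(counts) == 1:
        counts.append(0)  # full agreement is written as 5/0
    return "/".join(str(c) for c in counts)

# Hypothetical topology labels predicted by five methods for one protein.
preds = ["in-6TM", "in-6TM", "in-6TM", "in-6TM", "out-5TM"]
print(majority_level(preds))  # 4/1
```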
![Page 37: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/37.jpg)
TMHMM reliability scores (Melén et al., JMB 327:735)
TMHMM output:
1. Mean probability p_mean
2. Minimum probability p_min(label)
3. P_bestPath/P_allPaths

| Sequence | M | C | Y | G | K | C | I |
|---|---|---|---|---|---|---|---|
| p(i) | 0.78 | 0.78 | 0.78 | 0.76 | 0.76 | 0.08 | 0.03 |
| p(h) | 0.00 | 0.00 | 0.02 | 0.02 | 0.15 | 0.85 | 0.93 |
| p(o) | 0.22 | 0.22 | 0.20 | 0.20 | 0.08 | 0.07 | 0.04 |
| Label | i | i | i | i | i | h | h |
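The first two scores can be computed from the per-residue probabilities listed above, reading them as the mean and the minimum of the probability assigned to the predicted label at each position. That reading is an assumption made for this sketch. The third score needs the full HMM and is not reproduced here.

```python
# Posterior probabilities and labels for the seven residues shown on the slide.
p = {
    "i": [0.78, 0.78, 0.78, 0.76, 0.76, 0.08, 0.03],
    "h": [0.00, 0.00, 0.02, 0.02, 0.15, 0.85, 0.93],
    "o": [0.22, 0.22, 0.20, 0.20, 0.08, 0.07, 0.04],
}
label = "iiiiihh"

# Probability of the predicted label at each position.
label_probs = [p[state][pos] for pos, state in enumerate(label)]

p_mean = sum(label_probs) / len(label_probs)
p_min = min(label_probs)
print(round(p_mean, 3), p_min)  # 0.806 0.76
```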
![Page 38: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/38.jpg)
TMHMM (score 3): prediction accuracy vs. coverage
[Figure: percent correct (60-100) versus coverage (0-100) for 92 bacterial proteins, annotated with ~70% and ~45%]
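An accuracy versus coverage curve of this kind can be built by ranking proteins by reliability score and tracking the fraction correct as lower-scoring proteins are included. A minimal sketch with made-up data; the function name `accuracy_vs_coverage` is an assumption.

```python
def accuracy_vs_coverage(scored: list[tuple[float, bool]]):
    """Return (coverage %, percent correct) after including each protein.

    `scored` holds (reliability score, prediction correct?) pairs. Proteins are
    added from the highest score down.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    curve, n_correct = [], 0
    for n, (_, correct) in enumerate(ranked, start=1):
        n_correct += int(correct)
        curve.append((100 * n / len(ranked), 100 * n_correct / n))
    return curve

# Made-up scores and outcomes for four proteins.
toy = [(0.95, True), (0.90, True), (0.60, False), (0.40, True)]
for coverage, correct in accuracy_vs_coverage(toy):
    print(f"{coverage:.0f}% coverage: {correct:.0f}% correct")
```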
![Page 39: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/39.jpg)
"Experimentally known topologies" is a biased sample
[Figure: percent of proteins (0-40) in each score interval (0-0.25, 0.25-0.5, 0.5-0.75, 0.75-1) for the test set, C. elegans, S. cerevisiae and E. coli]
![Page 40: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/40.jpg)
Correlation between accuracy and TMHMM S3 score
[Figure: percent correct (0-100) versus mean score (0-1)]
![Page 41: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/41.jpg)
Expected TMHMM performance on proteomes
[Figure: percent correct (40-100) versus coverage (0-100) for the test set, E. coli, S. cerevisiae and C. elegans]
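One way to estimate performance on a whole proteome is to weight the accuracy observed in each score interval by the share of the proteome that falls in that interval. The sketch below shows only this reweighting idea: all numbers are hypothetical, the function name is an assumption, and it may differ from the procedure used in the cited paper.

```python
def expected_accuracy(bin_accuracy: list[float], bin_fraction: list[float]) -> float:
    """Weight per-interval accuracy by the fraction of proteins in each interval."""
    if abs(sum(bin_fraction) - 1.0) > 1e-9:
        raise ValueError("bin fractions must sum to 1")
    return sum(acc * frac for acc, frac in zip(bin_accuracy, bin_fraction))

# Hypothetical values for the score intervals 0-0.25, 0.25-0.5, 0.5-0.75, 0.75-1.
test_set_accuracy = [0.40, 0.50, 0.70, 0.90]  # fraction correct in the test set
proteome_fraction = [0.30, 0.20, 0.20, 0.30]  # share of proteome per interval

print(round(expected_accuracy(test_set_accuracy, proteome_fraction), 2))  # 0.63
```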
![Page 42: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/42.jpg)
Experimental information helps (JMB 327:735)
Original TMHMM prediction, one TM helix missing
TMHMM prediction with C-terminus fixed to inside
![Page 43: Prediction of protein localization and membrane protein topology](https://reader036.vdocument.in/reader036/viewer/2022081505/568157d6550346895dc55d03/html5/thumbnails/43.jpg)
Experimental information helps (JMB 327:735)
When the location of the C-terminus is known, the correct topology is predicted for an estimated ~70% of all membrane proteins (~55% when not known).