protein threading algorithms 1.genthreader jones, d. t. jmb(1999) 287, 797-815 2.protein fold...
Post on 19-Dec-2015
Protein threading algorithms
1. GenTHREADER. Jones, D. T. JMB (1999) 287, 797-815
2. Protein Fold Recognition by Prediction-based Threading. Rost, B., Schneider, R. & Sander, C. JMB (1997) 270, 471-480
Presented by Jian Qiu
Why do we need protein threading?
To detect remote homologues (genome annotation):
Structures are better conserved than sequences.
Remote homologues with low sequence similarity may share significant structural similarity.
To predict protein structure from a structural template:
If protein A shares structural similarity with protein B, we can model the structure of protein A using the structure of protein B as a starting point.
A successful example by GenTHREADER
ORF MG276 from Mycoplasma genitalium was predicted to share structural similarity with 1HGX, although the two share only low sequence similarity (10% sequence identity).
Supporting evidence:
MG276 is annotated as an adenine phosphoribosyltransferase, based on high sequence similarity to the Escherichia coli protein; 1HGX is a hypoxanthine-guanine-xanthine phosphoribosyltransferase from the protozoan parasite Tritrichomonas foetus.
Four functionally important residues in 1HGX are conserved in MG276.
The secondary structure prediction for ORF MG276 agrees very well with the observed secondary structure of 1HGX.
Structure of 1HGX
Functional residue conservation between 1HGX and MG276
GenTHREADER Protocol
Sequence alignment
For each template structure in the fold library, related sequences were collected with the program BLASTP.
A multiple sequence alignment of these sequences was generated with a simplified version of MULTAL.
The optimal alignment between the target sequence and the sequence profile of each template structure was obtained by dynamic programming.
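The dynamic programming step can be illustrated with a minimal sketch. It assumes a global (Needleman-Wunsch style) alignment with a linear gap penalty, and a profile stored as one score dictionary per template position. The function name `align_to_profile`, the gap value, and the data layout are all hypothetical choices for illustration, not details taken from GenTHREADER.

```python
def align_to_profile(target, profile, gap=-2.0):
    """Best global alignment score of `target` against `profile`.

    profile[j] maps a residue letter to its score at template position j
    (residues missing from the dictionary score 0.0). The linear gap
    penalty is an assumption made for this sketch.
    """
    n, m = len(target), len(profile)
    # score[i][j]: best score for aligning target[:i] with profile[:j]
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = score[i - 1][j - 1] + profile[j - 1].get(target[i - 1], 0.0)
            skip_template = score[i - 1][j] + gap  # target residue against a gap
            skip_target = score[i][j - 1] + gap    # template position against a gap
            score[i][j] = max(match, skip_template, skip_target)
    return score[n][m]


# Toy three-position profile (made-up scores).
toy_profile = [{"A": 2.0}, {"C": 2.0}, {"D": 2.0}]
full_score = align_to_profile("ACD", toy_profile)
gapped_score = align_to_profile("AD", toy_profile)
```

With the toy profile, the full sequence matches every position, while the shorter sequence has to pay one gap penalty to skip the middle position.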
Threading Potentials
Pairwise potential (the pairwise model family):
$k$: sequence separation
$s$: distance interval
$m_{ab}$: number of pairs $ab$ observed with sequence separation $k$
$\sigma$: weight given to each observation
$f_k(s)$: frequency of occurrence of all residue pairs
$f_k^{ab}(s)$: frequency of occurrence of residue pair $ab$
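The quantities defined above are the ingredients of a knowledge-based (inverse-Boltzmann) pair potential. The sketch below assumes the sparse-data-corrected form introduced by Sippl, which these definitions appear to describe. The function name, the default observation weight `sigma=0.02`, and the use of RT as the energy unit are assumptions made for illustration.

```python
import math


def pair_energy(m_ab, f_ab, f_all, sigma=0.02, rt=1.0):
    """Pairwise energy for residue pair ab in one distance interval.

    Assumed form (Sippl-style sparse-data correction):
        dE = RT ln(1 + m_ab * sigma)
             - RT ln(1 + m_ab * sigma * f_ab / f_all)

    m_ab  : number of ab pairs observed at this sequence separation
    f_ab  : frequency of pair ab in this distance interval
    f_all : frequency of all residue pairs in this distance interval
    sigma : weight given to each observation (the value is an assumption)
    """
    return (rt * math.log(1.0 + m_ab * sigma)
            - rt * math.log(1.0 + m_ab * sigma * f_ab / f_all))


favourable = pair_energy(m_ab=500, f_ab=0.12, f_all=0.06)
neutral = pair_energy(m_ab=500, f_ab=0.06, f_all=0.06)
unobserved = pair_energy(m_ab=0, f_ab=0.0, f_all=0.06)
```

A pair seen more often than average at a given distance gets a negative (favourable) energy. A pair type with no observations contributes nothing, which is the purpose of the observation weight.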
Solvation potential (the profile model family):
$r$: the degree of residue burial, i.e. the number of other Cβ atoms located within 10 Å of the residue's Cβ atom
$f^{a}(r)$: frequency of occurrence of residue $a$ with burial $r$
$f(r)$: frequency of occurrence of all residues with burial $r$
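As a rough illustration of these definitions, the sketch below counts neighbouring atoms within 10 Å to obtain the burial, then converts the two frequencies into an energy. It assumes the atoms counted are Cβ atoms and that the potential has the usual inverse-Boltzmann form, -RT ln(f_a(r) / f(r)). The function names and the toy coordinates are hypothetical.

```python
import math


def burial(cb_coords, i, cutoff=10.0):
    """Number of other C-beta atoms within `cutoff` angstroms of residue i."""
    return sum(
        1
        for j, xyz in enumerate(cb_coords)
        if j != i and math.dist(cb_coords[i], xyz) <= cutoff
    )


def solvation_energy(f_a_r, f_r, rt=1.0):
    """Assumed form: dE = -RT ln(f_a(r) / f(r)), in units of RT by default."""
    return -rt * math.log(f_a_r / f_r)


# Toy coordinates in angstroms (made up for the example).
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (20.0, 0.0, 0.0)]
burial_counts = [burial(coords, i) for i in range(len(coords))]
preferred = solvation_energy(f_a_r=0.2, f_r=0.1)
```

A residue type that occurs at a given burial level more often than residues in general receives a negative energy at that burial level.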
Variables considered to predict the relationship
Pairwise energy score
Solvation energy score
Sequence alignment score
Sequence alignment length
Length of the structure
Length of the target sequence
Artificial Neural Network
A node
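A single node can be sketched as a weighted sum of its inputs passed through an activation function. The logistic (sigmoid) activation and the weight values below are assumptions for illustration; the slides do not state which activation or weights GenTHREADER uses.

```python
import math


def node(inputs, weights, bias=0.0):
    """Output of one node: sigmoid of the weighted input sum plus a bias."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Six made-up, pre-scaled inputs, one per variable considered by the network.
example_inputs = [0.4, 0.1, 0.7, 0.5, 0.3, 0.6]
example_weights = [1.5, 0.5, 2.0, 0.2, -0.3, -0.1]
output = node(example_inputs, example_weights, bias=-1.0)
```

The output always lies strictly between 0 and 1, which makes it convenient to read as a score for whether the target and template are related.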
Neural network architecture in GenTHREADER
The effects of sequence alignment score and pairwise potential on the Network output
Confidence level with different network scores
Low, Medium (80%), High (99%), Certain (100%)
Genome analysis of Mycoplasma genitalium
All 468 ORFs were analyzed within one day.
Distribution of protein folds in M. genitalium
PHD: Predict 1D structure from sequence
Sequence → MaxHom → multiple sequence alignment → PHDsec and PHDacc
PHDsec, secondary structure: H (helix), E (strand), L (rest)
PHDacc, solvent accessibility: buried (<15%), exposed (>=15%)
Threading Protocol
Similarity matrix in dynamic programming
Purely structural similarity matrix: six states (the combinations of three secondary structure states and two solvent accessibility states)
Purely sequence similarity matrix: McLachlan or Blosum62
Combination of structure and sequence similarity matrices: $M_{ij} = \kappa \, M_{ij}^{\text{1D structure}} + (100 - \kappa) \, M_{ij}^{\text{sequence}}$
$\kappa = 0$: sequence alignment only; $\kappa = 100$: 1D structure alignment only
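The six-state encoding and the mixing of the two matrices can be sketched as follows. The name `kappa` for the percentage weight, the two-letter state labels, and the numeric scores are assumptions made for illustration; only the three secondary structure states, the 15% accessibility threshold, and the weighted-sum form come from the slides.

```python
def six_state(ss, rel_acc):
    """Combine a secondary structure state with a two-state accessibility.

    ss      : 'H' (helix), 'E' (strand) or 'L' (rest)
    rel_acc : relative solvent accessibility in percent
    Returns a two-letter label such as 'Hb' (buried helix) or 'Le'.
    """
    if ss not in ("H", "E", "L"):
        raise ValueError(f"unknown secondary structure state: {ss!r}")
    return ss + ("b" if rel_acc < 15 else "e")


def combined_score(struct_score, seq_score, kappa):
    """Weighted mix of a 1D structure score and a sequence score.

    kappa is a percentage: 0 uses the sequence score only,
    100 uses the 1D structure score only.
    """
    if not 0 <= kappa <= 100:
        raise ValueError("kappa must lie between 0 and 100")
    return kappa * struct_score + (100 - kappa) * seq_score


states = [six_state("H", 10), six_state("E", 15), six_state("L", 80)]
mixed = combined_score(struct_score=2.0, seq_score=1.0, kappa=50)
```

Because the weights sum to 100 rather than 1, the combined scores are on a scale 100 times larger than the input matrices, which leaves the optimal alignment unchanged as long as gap penalties are scaled to match.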
Performance of the algorithm
Results on the 11 targets of CASP1
The remote homologue was correctly detected at first rank in four cases.
Average percentage of correctly aligned residues: 21%.
Average shift: nine residues.
Best performing methods in CASP1:
Expert-driven usage of THREADER by David Jones and colleagues detected five out of nine proteins correctly at first rank.
The best alignments of the potential-based threading method by Manfred Sippl and colleagues were clearly better than the best ones of this algorithm.