protein structure prediction using neural...
TRANSCRIPT
![Page 1: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/1.jpg)
Protein Structure PredictionUsing Neural Networks
Martha MercaldiKasia Wilamowska
Literature ReviewDecember 16, 2003
![Page 2: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/2.jpg)
The Protein Folding Problem
![Page 3: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/3.jpg)
Evolution of Neural Networks
• Neural networks originally designed to approximate connections betweenneurons in the brain
![Page 4: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/4.jpg)
Evolution of Neural Networks
![Page 5: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/5.jpg)
Why use Neural Nets for ProteinFolding?
• Successful applications in:– Secondary structure prediction
– Solvent access
• No “inherent shortcoming” yet found
• Can incorporate evolutionary informationvia multiple alignments
• Detect previous misclassifications
![Page 6: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/6.jpg)
Protein Secondary Structure PredictionBased on Denoeux Belief Neural Network
• Purpose– Using neural nets, effectively predict the
secondary structure of proteins.
• Current best for secondary structureprediction is SSpro8 with accuracy in therange of 62-63%
![Page 7: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/7.jpg)
Protein Secondary Structure PredictionBased on Denoeux Belief Neural Network
• Input to the system– Can choose to use DNA or
amino acid sequences
– SSpro8 uses amino acidsequences
– The authors’ system,UTMPred, uses DNA
• Output - forms consistingof alpha helices, betasheets and loopsexpanded to eightstructure forms
Protein Secondary Structure Forms
![Page 8: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/8.jpg)
Protein Secondary Structure Prediction Based on Denoeux Belief Neural Network
![Page 9: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/9.jpg)
x - input tothe network
Protein Secondary Structure Prediction Based on Denoeux Belief Neural Network
![Page 10: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/10.jpg)
y – numberof inputnodes
Protein Secondary Structure Prediction Based on Denoeux Belief Neural Network
![Page 11: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/11.jpg)
p – prototype,representing a number ofk nearest previouslytrained input to the currenttested input
Protein Secondary Structure Prediction Based on Denoeux Belief Neural Network
![Page 12: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/12.jpg)
i – number of prototypesin y-dimensional featurespace
Protein Secondary Structure Prediction Based on Denoeux Belief Neural Network
![Page 13: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/13.jpg)
s – the activation function
Protein Secondary Structure Prediction Based on Denoeux Belief Neural Network
![Page 14: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/14.jpg)
M – number of outputclasses
Protein Secondary Structure Prediction Based on Denoeux Belief Neural Network
![Page 15: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/15.jpg)
ujq – a weight which
represents the degree ofmembership forprototype j to outputclass q
Protein Secondary Structure Prediction Based on Denoeux Belief Neural Network
![Page 16: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/16.jpg)
mjq – the BAA mass,
product of weight mjq and
activation function sj
Protein Secondary Structure Prediction Based on Denoeux Belief Neural Network
![Page 17: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/17.jpg)
hjq – the conjunctive
combination of the BBA’s
Protein Secondary Structure Prediction Based on Denoeux Belief Neural Network
![Page 18: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/18.jpg)
Protein Secondary Structure PredictionBased on Denoeux Belief Neural Network
• The DBNN input are DNAsequences converted tobinary format prior to use
• The sequences are:– 88 Escheichia coli proteins
– 25 yeast Saccharomycescerevisiae proteins
– 166 mammalian proteins(80 of which are human)
![Page 19: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/19.jpg)
Protein Secondary Structure PredictionBased on Denoeux Belief Neural Network
• The input window sizefor UTMPred is set to7 codons, whichresults in 84 inputnodes and 8 outputnotes which representthe expandedstructural forms.
![Page 20: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/20.jpg)
Protein Secondary Structure PredictionBased on Denoeux Belief Neural Network
• UTMPred used 200 prototypes and after the training wascompleted, the system was able to predict H and Eforms with accuracy above 75%. At the same time, thesystem had difficulty predicting form I, due to a smallamount of data in the training samples.
![Page 21: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/21.jpg)
Assignment of Protein Sequence to Functional FamilyUsing Neural Network and Dempster-Shafer Theory
• Purpose– Using neural networks, efficiently predict
protein function
• Using databases such as Prosite, Pfam,and Prints, either query the databases formotifs within a protein in question, orquery for an absence or presence ofarbitrary combinations of motifs.
![Page 22: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/22.jpg)
Assignment of Protein Sequence to Functional FamilyUsing Neural Network and Dempster-Shafer Theory
• Given a training set,induce a classifier able toassign novel proteinsequences to one of theprotein familiesrepresented in thetraining set
• Once trained, theclassifier will be able topredict novel proteins intospecific functionalfamilies based on itsknowledge of the trainingset
![Page 23: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/23.jpg)
Assignment of Protein Sequence to Functional FamilyUsing Neural Network and Dempster-Shafer Theory
• Input data– From the Prosite database containing over 1100
entries. Each entry describes a function shared bysome proteins. In the experiment one Prositedocumentation entry corresponded to a protein class,and each protein class could, in turn, be characterizedby one or more motif patterns/profiles. Only motifsconsidered significant matches by profileScan werechosen.
• DBNN was used as the classifier.
![Page 24: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/24.jpg)
Assignment of Protein Sequence to Functional FamilyUsing Neural Network and Dempster-Shafer Theory
• 585 proteins belonging toone of ten classes wereused, out of whichsubsets of varying sizewere picked randomly tobecome the training set.
• Once the DBNN wastrained, all 585 proteinswere used as the test setto determine accuracy
With only 10% of the total trainingsamples, DBNN could be constructed toclassify proteins with a 95% accuracy.
![Page 25: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/25.jpg)
Assignment of Protein Sequence to Functional FamilyUsing Neural Network and Dempster-Shafer Theory
• The number of falsepositives generated byDBNN were significantlylower than those resultingfrom a Prosite search.
• As the size of the data setapproaches 100%, thefalse positives discoveredby DBNN approacheszero.
The number of false positives resultingfrom the use of the DBNN trained usingtraining sets of different sizes.
![Page 26: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/26.jpg)
Assignment of Protein Sequence to Functional FamilyUsing Neural Network and Dempster-Shafer Theory
• A second data set of 73protein sequences drawn fromfive classes were used to builda DBNN classifier
• Using the DBNN classifier builtby random sized datasets, theoutput exceeded 96%accuracy when the training setwas greater or equal to 22
• Once the input contained morethan 80% (58 or moresequences) of the dataset, allsequences were correctlypredicted Result of classifying proteins
containing common motifs
![Page 27: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/27.jpg)
Future Work
• Ultimate solution to “protein folding” will probablybe a hybrid
• Neural networks likely to be included due to theirsuccessful application to related problems– Secondary structure– Solvent access– Distance between residues in final structure– Protein interface recognition
• In addition, neural nets can combine knowledgefrom multiple sources
![Page 28: Protein Structure Prediction Using Neural Networkscourses.cs.washington.edu/courses/cse527/03au/pt-mm-kw.pdfProtein Secondary Structure Prediction Based on Denoeux Belief Neural Network](https://reader033.vdocument.in/reader033/viewer/2022053000/5f0499087e708231d40ec2de/html5/thumbnails/28.jpg)
Bibliography
• B. Rost. “Neural networks for protein structure prediction: hype orhit?” Artificial intelligence and heuristic methods for bioinformatics(2003): 34-50.
• S.N.V. Arjunan, S. Deris, R.M. Illias. “Protein Secondary StructurePrediction Based on Denoeux Belief Neural Network.” ICAIETProceedings (2002): 554-560.
• N.M.Zaki, S. Deris, S.N.V. Arjunan. “Assignment of ProteinSequence to Functional Family Using Neural Network andDempster-Shafer Theory” Journal of Theoretics 5-1 (2003).
Background information
• S.N.V Arkimam, S. Deris, R.M.Illias. “Prediction of ProteinSecondary Structure” Jurnal Teknologi 35(C) (2001): 81-90.
• T.Wessels, C.W. Omlin.”Refining Hidden Markov Models withRecurrent Neural Networks”.