Bioinformatics and Protein Structural Analysis
Surabhi Agarwal
The molecular structures of proteins are complex and can be defined at various levels. These structures can
also be predicted from their amino-acid sequences. Protein structure prediction is one of the most
widespread fields of research in bioinformatics.
Master Layout (Part 1)
This animation consists of 2 parts:
Part 1: Protein Structural Databases
Part 2: Uses of Structural Databases
Different types of data and the organization of data in a Structural Database
Search the Database for Protein Structures
Definitions of the components: Part 1 – Protein structural databases
1. Query Peptide: The unknown protein or peptide whose sequence is first determined and on which further analysis is performed. This protein sequence is compared with other known protein sequences in existing databases.
2. Protein sequence: The linear chain of amino acids that forms the structural unit of a protein is known as the protein sequence. This sequence is unique to each protein and is also known as the primary structure of the protein.
3. Sequence similarity: The process by which the amino acid sequences of two proteins are aligned linearly to evaluate their similarities.
4. 3-D structural alignment: Three-dimensional structural alignment is the process of superimposing two given protein structures. This can be done with suitable software by entering the protein identifiers or their atomic coordinates.
5. Geometry of Protein Structure: Geometry of a protein structure refers to the three dimensional coordinates of its atoms and the angles between their bonds. These are essential to simulate the protein structure on computers.
6. Biology of Protein Structure: Information regarding the biological source of the protein and its metabolic roles within the cell and organism is referred to as the biology of protein structure.
7. SCOP classification: SCOP stands for “Structural Classification of Proteins” and aims to provide a detailed description of the various structural and evolutionary relationships between all proteins that have been structurally characterized. SCOP Classification can be done at four levels - Class, Fold, Superfamily and Family.
8. CATH classification: CATH stands for “Class Architecture Topology and Homologous Superfamily” and provides a semi-automatic, hierarchical classification of protein domains. The levels for CATH classification are Class, Architecture, Topology and Homologous Superfamily.
Step 1: Protein Structure Database: Search
Protein Structural Database
Enter Protein ID or text query: Capsid
Optional Inputs
Structure Features: Macromolecule type, Number of Chains, Number of models, Molecular Weight, Secondary Structure Content, Secondary Structure Length, SCOP classification, CATH classification (selected: Number of Chains, 10)
Biology: Source Organism, Expression Organism, Enzyme Classification, Biological Process, Cellular component (selected: Source Organism, Retro Transcribing Viruses)
Experiment: Experimental method, Resolution, Crystal Properties, Detectors used, Experimental Data Available (selected: Experimental method, X-RAY CRYSTALLOGRAPHY)
Sequence Features: Sequence, Translated Nucleotide Sequence, Sequence Length, Sequence Motif (selected: Sequence Length, < 500)
Search
http://www.pdb.org/pdb/search/advSearch.do
Description of the action: Follow the steps as shown in the animations. First show the basic layout of the database. Then input the text “Capsid” in the text box at the top of the page. For each of the 4 categories, when the drop-down is clicked, announce the options as the mouse hovers over them. The drop-down in the animation should look like a drop-down on a web page. Re-create all images.
Audio narration: The protein structural databases provide a basic search box that takes an identifier of the protein. This identifier can be the protein name, a keyword, an ID, an author name, etc. In this example, we take the case of viral capsid proteins. These databases also have advanced search features, which are optional but help make the query very specific. The general options fall into 4 broad classes: Structural Features, Biology, Sequence Data and Experimental Details.
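The optional filters can be pictured as extra conditions applied on top of the text query. The following is a minimal sketch under stated assumptions: the `Entry` class, its field names and the `search` function are hypothetical illustrations, not the database's actual interface, and the single record simply reuses values shown for entry 1AUM elsewhere in this presentation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    """Hypothetical record holding a few searchable fields of one structure."""
    pdb_id: str
    title: str
    n_chains: int          # Structure Features
    source_organism: str   # Biology
    seq_length: int        # Sequence Features
    method: str            # Experiment


def search(entries: List[Entry], text: str,
           max_length: Optional[int] = None,
           method: Optional[str] = None,
           source_organism: Optional[str] = None) -> List[Entry]:
    """Return entries whose title contains `text` and that pass every
    optional filter that was supplied (None means 'not specified')."""
    hits = []
    for e in entries:
        if text.lower() not in e.title.lower():
            continue
        if max_length is not None and e.seq_length >= max_length:
            continue
        if method is not None and e.method.lower() != method.lower():
            continue
        if (source_organism is not None
                and source_organism.lower() not in e.source_organism.lower()):
            continue
        hits.append(e)
    return hits


# Values taken from the 1AUM pages shown in this presentation.
entries = [
    Entry("1AUM", "HIV CAPSID C-TERMINAL DOMAIN (CAC146)", 1,
          "Human immunodeficiency virus 1", 70, "X-RAY CRYSTALLOGRAPHY"),
]

hits = search(entries, "Capsid", max_length=500,
              method="x-ray crystallography")
print([e.pdb_id for e in hits])
```

Leaving a filter as `None` mirrors leaving an optional input blank in the search form; only the text query is mandatory in this sketch.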
http://www.pdb.org/pdb/search/advSearch.do
Step 2.a: Protein Structure database: Output
Schematic for Database functioning:
Protein Structural Database
Number of Hits: 67
1. HIV CAPSID C-TERMINAL DOMAIN (CAC146)
2. X-RAY CRYSTAL STRUCTURE OF EQUINE INFECTIOUS ANEMIA VIRUS (EIAV) CAPSID PROTEIN P26
3. ROUS SARCOMA VIRUS CAPSID PROTEIN: N-TERMINAL DOMAIN
4. STRUCTURE OF HIV1 PROTEASE AND AKC4P_133A COMPLEX.
Showing 1 to 4 of 67 Next
Description of the action: Follow the steps as shown in the animations. Re-create all images. Show “67” displayed next to the tab titled “Number of Hits”. Then show the figure under the 2nd horizontal line. Show a clicking effect on the 1st hit. This slide and the 8 that follow it are part of the same animated webpage.
Audio narration: The search for the query protein returned 67 structures in the database that match the criteria given by the user in the search options. The first page of the results shows the titles of all the hits. The user then selects the protein structure of interest to study in detail. Here we select the structure titled “HIV CAPSID C-TERMINAL DOMAIN (CAC146)” for further study.
http://www.pdb.org/pdb/explore/explore.do?structureId=1AUM
Step 2.b - Protein Structure database: Output
Schematic for Database functioning:
Protein Structural Database
Tabs: Summary, Sequence data, Sequence similarity, 3D similarity, Biology, Methods, Geometry, Derived data
1. 1AUM
2. Molecule: HIV CAPSID; Structure Weight: 7970.16; Type: polypeptide(L); Chains: A; Length: 70; Classification: Viral Protein
3. Scientific Name: Human immunodeficiency virus 1; Expression System: Escherichia coli bl21(de3)
4. “Structure of the carboxyl-terminal dimerization domain of the HIV-1 capsid protein”, Science, 1997
Description of the action: Follow the steps as shown in the animations. Re-create all images. This slide and the 7 slides that follow it are part of the same webpage. The mouse pointer should be shown clicking on each of the 8 tabs one by one, with the text below changing accordingly. Always highlight the active tab with a different color, as done on websites. As each of the four headings is narrated in the audio narration, that particular text must be highlighted in the animation.
Audio narration: The summary page shows all the general information pertaining to the basic features of the protein. This includes:
1. Protein identifier
2. Molecule name, structure weight, polymer type, number of chains, length of the molecule and its classification
3. Source organism and expression organism
4. Journal, paper and author name
http://www.pdb.org/pdb/explore/explore.do?structureId=1AUM
Step 2.c - Protein Structure database: Output
Description of the action: Follow the steps as shown in the animations. Re-create all images. This is a follow-up slide to slide #8, as described there.
Audio narration: The sequence data tab contains all the information related to the amino acid sequence of the protein under consideration:
1. FASTA sequence for all chains in the polypeptide
2. Type of chain, such as polypeptide, glyco-peptide, lipo-peptide, etc.
3. Diagrammatic representation of the classification and secondary structure of this chain, assigning residues to helix, sheet or turn
Schematic for Database functioning:
1. FASTA
>1AUM:A|PDBID|CHAIN|SEQUENCE
LDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTETLLVQNANPDCKTILKALGPGATLEEMMTACQG
2. Chain Type: polypeptide(L)
3. Diagram of the sequence of amino acid residues and their positions. Labels in the diagram: Cysteine Residues (two), Di-sulphide bridge, Domain of the protein. Secondary structure legend: Alpha Helix; Hydrogen Bonded Turn; No assigned secondary structure.
http://www.pdb.org/pdb/explore/explore.do?structureId=1AUM
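The FASTA record shown for 1AUM can be read with a few lines of code. This is a minimal sketch, assuming a single-record FASTA string with a `>` header line followed by the sequence; the function name `parse_fasta_record` is a hypothetical helper, not part of any database tool.

```python
from typing import Tuple


def parse_fasta_record(text: str) -> Tuple[str, str]:
    """Split a single-record FASTA string into (header, sequence)."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA record must start with '>'")
    header = lines[0][1:]            # drop the leading '>'
    sequence = "".join(lines[1:])    # sequence may span several lines
    return header, sequence


record = """>1AUM:A|PDBID|CHAIN|SEQUENCE
LDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTETLLVQNANPDCKTILKALGPGATLEEMMTACQG
"""

header, sequence = parse_fasta_record(record)
pdb_id, chain = header.split("|")[0].split(":")   # "1AUM", "A"

# 1-based positions of cysteine residues within this chain's sequence
# (positions in the FASTA string, not the numbering of the full protein).
cys_positions = [i + 1 for i, aa in enumerate(sequence) if aa == "C"]

print(pdb_id, chain, len(sequence), cys_positions)
```

The sequence length of 70 agrees with the length shown on the summary page, and the two cysteines correspond to the pair marked in the schematic.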
Step 2.d - Protein Structure database: Output
Description of the action: Follow the steps as shown in the animations. Re-create all images. This is a follow-up slide to slide #8, as described there.
Audio narration: The sequence similarity tab shows information from comparative studies of the protein's sequence against other sequences.
1. Option to perform a BLAST search.
2. A list of clusters of proteins is produced. These clusters are formed and ranked based on the resolution of the structures within them: the better the quality (resolution), the higher the rank.
When the user clicks on a particular cluster, the component proteins within the cluster are displayed along with supporting information.
Schematic for Database functioning:
BLAST: Perform BLAST of the sequence of the retrieved protein
Table for cluster of similar proteins where the structure has been determined:
Cluster Similarity Cut-off    Rank
100%                          1
95%                           3
PDB ID    Name of the Protein
1A8O      HIV CAPSID
2ONT      Capsid protein p24
1AUM      HIV CAPSID
http://www.pdb.org/pdb/explore/explore.do?structureId=1AUM
Step 2.e -Protein Structure database: Output
Description of the action: Follow the steps as shown in the animations. Re-create all images. This is a follow-up slide to slide #8, as described there.
Audio narration: The structural similarity tab shows information from a comparative study of two structures. It establishes equivalences based on the 3D conformations of both proteins. The default visualization tool for PDB is Jmol. Structural alignment is covered in more detail in the second part of this animation.
Schematic for Database functioning
http://www.pdb.org/pdb/workbench/showPrecalcAlignment.do?action=pw_fatcat&mol=1A8O.A&mol=1BAJ.A
HIV capsid alignment with GAG polyprotein
HIV CAPSID (colored orange)
GAG POLYPROTEIN (colored blue)
Step 2.f - Protein Structure database: Output
Audio narration: The Methods tab provides details of the methodology used in the experiments that determined the structure. This includes:
1. Crystallization methods, pH, temperature and other details of the experiment
2. Crystal data (space group, unit cell dimensions)
3. Diffraction source, diffraction protocol and diffraction detectors
4. Data related to resolution and refinement details
5. Software, programs and computing utilized
A brief summary of this result is shown in this animation. For details visit
http://www.pdb.org/pdb/explore/materialsAndMethods.do?structureId=1AUM#
Description of the action: All tables have to be re-drawn by the animator. Follow the steps as shown in the animations. This is a follow-up slide to slide #8, as described there.
Schematic for Database functioning:
Crystallization Experiments
Method: vapor diffusion - sitting drop
pH: 8
Space Group Name: I 41
Diffraction Detector: CCD
Computing
Data Reduction (intensity integration): DENZO
Data Reduction (data scaling): SCALEPACK
Structure Solution: X-PLOR 3.843
Structure Refinement: X-PLOR 3.843
http://www.pdb.org/pdb/explore/geometryDisplay.do?structureId=1AUM
Step 2.g - Protein Structure database: Output
Description of the action: All tables have to be re-drawn by the animator. Follow the steps as shown in the animations. This is a follow-up slide to slide #8, as described there.
Audio narration: The Geometry tab contains all the spatial information about the molecule, so that it can be simulated in a virtual environment. This includes:
Bond lengths: number of occurrences and their positions in the chains
Bond angles: number of occurrences and their positions in the chains
Dihedral angles: number of occurrences and their positions in the chains
Ramachandran plot, Fold Deviation Scores and other structural details
http://www.pdb.org/pdb/explore/geometryDisplay.do?structureId=1AUM
Schematic for Database functioning:
Bond lengths: the position, total number and range of the covalent bond lengths between two adjacent atoms in a protein molecule.
Bond angles: the angle formed by 3 consecutive atoms in the native conformation of a protein, and their statistics.
Dihedral angles: the angle between the 2 consecutive planes defined by 4 linearly bonded atoms, with their occurrence, positions and other statistics.
Ramachandran map: shows the residues that lie in the favored region (outlined in dark blue) and the permitted region (outlined in light blue). 67/68 residues lie in the favored region and none of the residues lie in the disallowed region.
Values for Fold Deviation Score (FDS). For a specific reference value, FDS is a multiple of the standard deviation.
Residue    Value
LEU1       1.29
ASP2       0.56
ILE3       1.19
ARG4       1.73
GLN5       1.29
GLY6       1.85
PRO7       0.65
LYS8       0.73
GLU9       1.27
PRO10      1.53
PHE11      0.41
Plot for Fold Deviation Score: the x-axis has the residue positions and the y-axis has the FDS values.
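A dihedral angle can be computed directly from the coordinates of 4 consecutively bonded atoms. The sketch below uses the standard atan2 formulation on the three bond vectors; the function name `dihedral_deg` and the test points are hypothetical illustrations and are not taken from the 1AUM entry.

```python
import math
from typing import Sequence

Vec = Sequence[float]


def _sub(a: Vec, b: Vec):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0: Vec, p1: Vec, p2: Vec, p3: Vec) -> float:
    """Dihedral angle (degrees) between the plane through p0, p1, p2
    and the plane through p1, p2, p3."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)   # normal of the first plane
    n2 = _cross(b2, b3)   # normal of the second plane
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical points: the central bond lies along the x-axis.
cis = dihedral_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
trans = dihedral_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
print(cis, trans)
```

Applied to the backbone atoms of each residue, angles of this kind give the phi and psi values that are plotted in a Ramachandran map.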
http://www.pdb.org/pdb/explore/geometryDisplay.do?structureId=1AUM
Step 2.h - Protein Structure database: Output
Description of the action: Follow the steps as shown in the animations. Re-create all images. This is a follow-up slide to slide #8, as described there.
Audio narration: The biology tab contains information about the significance of the molecule at the biological and cellular level. This includes:
1. Molecule type
2. Formula weight
3. Monomers and linkages
4. Source method
5. Ligands and prosthetic groups
6. Gene details and genome information
7. Keywords
Schematic for Database functioning:
Protein Details
Description: HIV CAPSID
Fragment: C-TERMINAL DOMAIN, RESIDUES 146 - 231
Nonstandard Linkage: no
Nonstandard Monomers: no
Polymer Type: polypeptide(L)
Formula Weight: 7970.2
Source Method: genetically manipulated
Entity Name: CAC146
SWS/UNP ID: POL_HV1N5
SWS/UNP Accession(s): P12497
Gene Details
Scientific Name: Human immunodeficiency virus 1
Genus: Lentivirus
Cell Line: Bl21
Host Scientific Name: Escherichia coli bl21(de3)
Host Genus: Escherichia
Host Species: Escherichia Coli
Host Strain: Bl21 (de3)
Host Vector: Pet11a
Host Plasmid Name: WISP97-7
http://www.pdb.org/pdb/explore/geometryDisplay.do?structureId=1AUM
Step 2.i - Protein Structure database: Output
Description of the action: Follow the steps as shown in the animations. Re-create all images. This is a follow-up slide to slide #8, as described there.
Audio narration: The derived data tab provides data for the same protein from other resources, such as SCOP, CATH and PFAM classification details. For more detailed analysis visit http://www.pdb.org/pdb/explore/derivedData.do?structureId=1AUM
Schematic for Database functioning:
SCOP classification
Domain Info: d1auma_
Class: All alpha proteins
Fold: Acyl carrier protein
Super-Family: Retrovirus capsid dimerization domain-like
Family: Retrovirus capsid protein C-terminal domain
Domain: HIV capsid protein, dimerisation domain
Species: Human immunodeficiency virus type 1 [TaxId: 11676]
CATH classification
Domain: 1aumA00
Class: Mainly Alpha
Architecture: Orthogonal Bundle
Topology: Non-ribosomal Peptide Synthetase Peptidyl Carrier Protein; Chain A
PFAM classification
Chain: A
PFAM Accession: PF00607
PFAM ID: Gag_p24
Description: gag gene protein p24 (core nucleocapsid protein)
Type: Family
http://www.pdb.org/pdb/explore/geometryDisplay.do?structureId=1AUM
Master Layout (Part 2)
Functional Annotation
Protein Structural Alignment
Secondary Structure Prediction
Definitions of the components: Part 2 – Uses of structural databases
1. Protein Structural Alignment: The geometry of two given protein structures can be compared by means of available software tools that analyse their three-dimensional similarity to each other.
2. Protein Structure Prediction: The prospective secondary structures of peptides or proteins can be predicted from a given stretch of amino acid residues by using machine learning algorithms.
3. Machine Learning Algorithms: These are computer algorithms that can be trained on a given classified dataset. During training, these programs adjust their parameters in such a way that they can then classify new data. The most widely used machine learning algorithms in bioinformatics include Artificial Neural Networks, Hidden Markov Models and Support Vector Machines.
4. Functional Annotation: For novel proteins that are yet to be characterized, the potential functions can be predicted by techniques such as Homology Modelling, which provides an initial insight into the protein’s properties.
5. Gene Ontology: Also known as GO terms, these are identifiers that represent a gene’s functional properties, categorized into three domains: “cellular component”, “molecular function” and “biological process”.
6. Root Mean Square Deviation (RMSD): Quantification of the average distance between the atoms of the super-imposed proteins. The higher the RMSD value, the lower the similarity.
7. Protein Structural Alignment Server: Web-based servers that help determine the structural similarity of two given proteins by superimposing them and calculating various comparative parameters. A large number of web-based servers are currently available for this task. A few examples are DALI (Distance Matrix Alignment), MAMMOTH (Matching Molecular Models Obtained from Theory), CE/CE-MC (Combinatorial Extension -- Monte Carlo), SSAP (Sequential Structure Alignment Program) and ProFit (Protein least-squares Fitting).
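The RMSD in definition 6 is the square root of the mean squared distance between paired atoms. The sketch below is a minimal illustration, assuming the two structures are already super-imposed and their atoms are already paired one-to-one (it does not perform the superposition itself); the function name and the coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root mean square deviation between two equally long lists of
    paired atom coordinates that are already super-imposed."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates: structure_b is structure_a shifted by 1 along z.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(structure_a, structure_b))
```

Identical structures give an RMSD of 0, and the value grows as the paired atoms move apart, which is why a lower RMSD indicates higher similarity.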
Step 1: Structure Alignment - Input
Protein Structural Alignment Server (DALI)
Description of the action: Follow the steps as shown in the animations. Re-create all images. Enter the 2 IDs in the text boxes. Follow this with a clicking effect on the “Submit” button. Show the action-in-progress effect as shown in the slide. Follow it with the two simple structures being superimposed and highlight the non-aligned areas. Follow this with the actual output in the next slide.
Audio narration: Two given proteins can be structurally aligned to evaluate the similarity between them. The server requires as input two protein structures or their IDs, which are then simulated and aligned based on their 3D coordinates, bond angles and dihedral angles. Some of the servers available for this are DALI, MAMMOTH, CE/CE-MC, SSAP and ProFit.
Web-Tool functioning:
Enter the first PDB ID and Chain (or Upload a Protein Structure): 1A8O
Enter the second PDB ID and Chain (or Upload a Protein Structure): 1BAJ
Submit
Running the Server… 3D Superimposition
Non-aligned regions on super-imposed structures
Step 2: Structure Alignment - Output
Protein Structural Alignment Server (DALI)
1A8O 1BAJ
P-value: 0.00e+00. The probability of obtaining an alignment at least this good by chance between unrelated structures. P-value < 0.05 indicates significant similarity.
Score: 190.92. The raw score of the alignment is used to compare other similarity matches with the same proteins.
RMSD: 0.75. In super-imposed proteins, the average of the distances between the atoms.
%Id: 94.0%. Percentage of identical residues in the sequences of the alignment.
http://www.pdb.org/pdb/workbench/showPrecalcAlignment.do?action=pw_fatcat&mol=1A8O.A&mol=1BAJ.A
Description of the action: Follow the steps as shown in the animations. Mention the definitions of the results in the audio narration as well as in written format. Re-create all images.
Audio narration: The results are:
1. P-value: The probability of obtaining an alignment at least this good by chance between unrelated structures. A P-value < 0.05 indicates significant similarity.
2. Raw score: It is used to compare other similarity matches with the same proteins.
3. RMSD: Measure of the average distance between the atoms of the super-imposed proteins.
4. Percentage sequence identity in the alignment.
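The percentage sequence identity can be computed from the two aligned sequences. This is a minimal sketch that assumes gaps are written as `-` and that the denominator is the number of aligned (gap-free) columns; alignment servers may use a different denominator, and the function name and example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues among the columns where
    both aligned sequences have a residue (no gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue              # skip columns containing a gap
        aligned += 1
        if a == b:
            identical += 1
    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned


# Hypothetical aligned fragments with one mismatch in four columns.
print(percent_identity("ACDE", "ACDF"))
```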
Step 3: Structure Prediction
Protein Structural Prediction Server
Enter the sequence of amino acids (primary structure of protein)
DAHKSEVAHRFKDLGEENFKALVLIAFAQYLQQCPFEDHVKLVNEVTEFAKTCVADESAENCDKSLHTLFGDKLCTVATLRETYGEMADCCAKQEPERNECFLQHKDDNPNLPRLVRPEVDVMCTAFHDNEETFLKKYLYEIARRHPYFYAPELLFFAKRYKAAFTECCQAADKAACLLPKLDELRDEGKASSAKQRLKCASLQKFGERA
Predicted Secondary Structure (legend: Alpha Helix, Beta Sheets, Coils)
http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_gor4.html
Description of the action: Follow the steps as shown in the animations. Re-create all images.
Audio narration: Once the amino acid sequence of the protein is known, its secondary and tertiary structures can be predicted using prediction algorithms, which utilize information from previously structurally characterized sequences. In the secondary structure prediction:
1. “h” represents Alpha Helix
2. “e” represents Beta Sheets
3. “c” represents Coils
Since not all known proteins have been structurally characterized yet, this provides a useful bioinformatics analysis tool for researchers. Servers for structure prediction include GOR, HNN, PredictProtein, NNPredict and SSpro.
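A prediction written in the “h”/“e”/“c” notation can be summarized with a short script. The sketch below is illustrative only: the prediction string is hypothetical (it is not the server's output for the sequence above), and the function names are assumptions made for this example.

```python
from collections import Counter
from itertools import groupby
from typing import Dict, List, Tuple

STATE_NAMES = {"h": "Alpha Helix", "e": "Beta Sheets", "c": "Coils"}


def composition(prediction: str) -> Dict[str, float]:
    """Percentage of residues predicted in each secondary structure state."""
    if not prediction:
        raise ValueError("prediction string is empty")
    counts = Counter(prediction)
    unknown = set(counts) - set(STATE_NAMES)
    if unknown:
        raise ValueError(f"unexpected state symbols: {sorted(unknown)}")
    n = len(prediction)
    return {STATE_NAMES[s]: 100.0 * counts.get(s, 0) / n for s in STATE_NAMES}


def segments(prediction: str) -> List[Tuple[str, int, int]]:
    """Consecutive runs as (state, start, end) with 1-based inclusive positions."""
    runs = []
    pos = 1
    for state, run in groupby(prediction):
        length = len(list(run))
        runs.append((state, pos, pos + length - 1))
        pos += length
    return runs


# Hypothetical per-residue prediction for a 16-residue peptide.
prediction = "cchhhhhhcceeeecc"
print(composition(prediction))
print(segments(prediction))
```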
Step 4: Functional Annotation
Protein Functional Annotation Server
Description of the action: Follow the steps as shown in the animations. Re-create all images.
Audio narration: Given a particular amino acid sequence, the cellular components, molecular functions and biological processes associated with the sequence can be predicted using functional annotation servers. These are represented by a unique set of identifiers called “Gene Ontology Terms” or “GO Terms”. A GO term can be a word or an alphanumeric identifier, and includes a definition with cited sources and a namespace indicating the domain to which it belongs. Servers for this include DbAli Annolite, PFP, ProteomeAnalyst, GOPET, SpearMint and ProKnow.
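The output of a functional annotation server can be modelled as a list of GO terms, each with a namespace and a probability. The sketch below is illustrative: the `GoPrediction` class and `confident` function are hypothetical, and the identifiers are copied as they appear in this animation's schematic, which is shortened compared with real GO accessions (`GO:` followed by seven digits).

```python
from dataclasses import dataclass
from typing import List

NAMESPACES = ("molecular_function", "biological_process", "cellular_component")


@dataclass(frozen=True)
class GoPrediction:
    go_id: str
    description: str
    namespace: str       # one of NAMESPACES
    probability: float   # in percent


def confident(predictions: List[GoPrediction], namespace: str,
              min_probability: float) -> List[GoPrediction]:
    """Predictions in one namespace at or above a probability threshold,
    most probable first."""
    if namespace not in NAMESPACES:
        raise ValueError(f"unknown namespace: {namespace}")
    selected = [p for p in predictions
                if p.namespace == namespace and p.probability >= min_probability]
    return sorted(selected, key=lambda p: p.probability, reverse=True)


# Rows as read from the schematic; the sixth row is omitted because its
# identifier is cut off in the slide.
predictions = [
    GoPrediction("GO0549", "Vitamin D Binding", "molecular_function", 100.0),
    GoPrediction("GO0543", "Water Binding", "molecular_function", 97.0),
    GoPrediction("GO0189", "C21 Steroid Hormone Metabolism", "biological_process", 100.0),
    GoPrediction("GO0243", "Vitamin Transport", "biological_process", 97.0),
    GoPrediction("GO0432", "Membrane", "cellular_component", 89.0),
]

for p in confident(predictions, "molecular_function", 98.0):
    print(p.go_id, p.description, p.probability)
```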
Enter the sequence of amino acids (primary structure of protein)
DAHKSEVAHRFKDLGEENFKALVLIAFAQYLQQCPFEDHVKLVNEVTEFAKTCVADESAENCDKSLHTLFGDKLCTVATLRETYGEMADCCAKQEPERNECFLQHKDDNPNLPRLVRPEVDVMCTAFHDNEETFLKKYLYEIARRHPYFYAPELLFFAKRYKAAFTECCQAADKAACLLPKLDELRDEGKASSAKQRLKCASLQKFGERA
Web-Tool functioning:
Functional Prediction
Molecular Functions
Probability    GO Term    Description
100%           GO0549     Vitamin D Binding
97%            GO0543     Water Binding
Biological Functions
Probability    GO Term    Description
100%           GO0189     C21 Steroid Hormone Metabolism
97%            GO0243     Vitamin Transport
Cellular Component
Probability    GO Term    Description
89%            GO0432     Membrane
74%            GO0        Intra-cellular organelle
http://www.pdb.org/pdb/explore/remediatedSequence.do?structureId=1AO6, http://kiharalab.org/web/pfp.php
Interactivity option 1: Predict the 3 Dimensional Structure of Human Serum Albumin and cross-validate
Interactivity Type: Arrange the steps in the order to be performed. Remove the step number from the bottom of the tab.
Options: Remove the step numbers shown on the tabs in “yellow” color. Show all the steps in mixed order. The user must click on the tabs in order. If the user clicks on a tab that is not in the right order, flash a message saying “try again”.
Results: All the tabs must be arranged in the right order.
Steps (in the correct order):
1. Input the term “human serum albumin” in a structural database
2. Click on the hit which matches your query
3. Go to the “sequence details” tab and retrieve the FASTA sequence of the protein
4. Go to the 3D structure details and save the actual coordinates and the 3D structure of the protein, derived from experimental details
5. Predict the tertiary structure from the amino-acid sequence and save the predicted structure coordinates
6. Select a structural alignment tool and superimpose the predicted structure on the actual structure derived from the database
7. Check the quality of the alignment. If the RMSD value is low, the structural alignment is good, and therefore the structure prediction was correct
Interactivity option 2.a - True/False - Questions
Interactivity Type: True or False
Options: Flash the questions one at a time. The user needs to press either the “Green tab” marked “TRUE” or the “Red Tab” marked “FALSE”. If the answer is correct, flash “Tick”. If the answer is incorrect, flash “Cross”. For all questions whose answer is “False”, also show the correct answer as given in the next slide.
Questions:
1. GO stands for “Genetic Oncology”
2. DALI is a server for Protein Structural Alignment
3. SCOP is a classification scheme for Nucleic Acids
4. p-value is one of the results from Structural Alignment
5. In protein secondary structure, “e” stands for coil
6. RMSD stands for “Root Mean Square Distance”
Buttons: TRUE, FALSE
Interactivity option 2.b - True/False - Correct Answers
Interactivity Type: True or False
Options: Flash the questions one at a time. The user needs to press either the “Green tab” marked “TRUE” or the “Red Tab” marked “FALSE”. If the answer is correct, flash “Tick”. If the answer is incorrect, flash “Cross”.
Results: The questions are followed by their correct answers.
1. GO stands for “Genetic Oncology”: FALSE. GO stands for “Gene Ontology”.
2. DALI is a server for Protein Structural Alignment: TRUE
3. SCOP is a classification scheme for Nucleic Acids: FALSE. SCOP is a classification scheme for Proteins.
4. p-value is one of the results from Structural Alignment: TRUE
5. In protein secondary structure, “e” stands for coil: FALSE. In protein secondary structure, “e” stands for beta sheets.
6. RMSD stands for “Root Mean Square Distance”: FALSE. RMSD stands for “Root Mean Square Deviation”.
Interactivity option 2.c - True/False - Example
Interactivity Type: True or False
Options: Flash the questions one at a time. The user needs to press either the “Green tab” marked “TRUE” or the “Red Tab” marked “FALSE”. If the answer is correct, flash “Tick”. If the answer is incorrect, flash “Cross” and the correct answer as mentioned in the next slide.
This is an example slide to show the various cases of answers.
GO stands for “Genetic Oncology” (TRUE / FALSE). The correct answer is “False”. GO stands for “Gene Ontology”.
DALI is a server for Protein Structural Alignment
SCOP is a classification scheme for Nucleic Acids. SCOP is a classification scheme for Proteins.
Questionnaire
1. Which is the server for Protein Structure Prediction?
Answers: a) ProtParam b) PeptideMass c) nnPREDICT d) DALI
2. Which is the server for Functional annotation of Proteins?
Answers: a) DALI b) GOR c) SSAP d) ProteomeAnalyst
3. Which amongst these is NOT the output for Functional annotation?
Answers: a) GO Term b) Source Organism c) Probability of annotation d) Description of Function
4. By default, PDB structures appear in which visualization tool?
Answers: a) VMD b) NAMD c) Jmol d) None of the above
5. PDB is primarily which Database?
Answers: a) Protein b) Nucleotide c) Gene d) None of the above
Links for further reading
Reference websites
http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_gor4.html
http://cubic.bioc.columbia.edu/predictprotein/
http://ekhidna.biocenter.helsinki.fi/dali_lite/start
http://kiharalab.org/web/pfp.php
http://pa.cs.ualberta.ca:8080/pa/index.html
http://www.ebi.ac.uk/Tools/clustalw2/index.html
http://www.pdb.org/pdb/home/home.do
http://expasy.org/sprot/
http://expasy.org/prosite/
http://webdocs.cs.ualberta.ca/~bioinfo/PA/
The following URLs are used for animations:
http://www.pdb.org/pdb/search/advSearch.do
http://www.pdb.org/pdb/explore/explore.do?structureId=1AUM
http://www.pdb.org/pdb/workbench/showPrecalcAlignment.do?action=pw_fatcat&mol=1A8O.A&mol=1BAJ.A
http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_gor4.html
http://www.pdb.org/pdb/explore/remediatedSequence.do?structureId=1AO6
http://kiharalab.org/web/pfp.php
Published Literature
Alexey G. Murzin, Steven E. Brenner, Tim Hubbard and Cyrus Chothia. SCOP: A Structural Classification of Proteins Database for the Investigation of Sequences and Structures. J. Mol. Biol. (1995) 247, 536–540
CA Orengo, AD Michie, S Jones, DT Jones, MB Swindells and JM Thornton. CATH – a hierarchic classification of protein domain structures. Structure 1997, Vol 5 No 8
Books:
Bioinformatics: Sequence and Genome Analysis by David Mount