Protein Sequence Analysis - Overview -
NIH Proteomics Workshop 2007
Raja Mazumder
Scientific Coordinator, PIR
Research Assistant Professor, Department of Biochemistry and Molecular Biology
Georgetown University Medical Center
Topics
Proteomics and protein bioinformatics (protein sequence analysis)
Why do protein sequence analysis?
Searching sequence databases
Post-processing search results
Detecting remote homologs
Clinical proteomics
From Petricoin et al., Nature Reviews Drug Discovery (2002) 1, 683-695
Single protein and shotgun analysis
Adapted from: McDonald et al. (2002). Disease Markers 18:99-105
[Figure: two workflows starting from a mixture of proteins and ending in protein bioinformatics]
Single protein analysis: gel-based separation, spot excision and digestion, peptides from a single protein, MS analysis
Shotgun analysis: digestion of protein mixture, LC or LC/LC separation, peptides from many proteins, MS/MS analysis
Protein bioinformatics: protein sequence analysis
Helps characterize protein sequences in silico and allows prediction of protein structure and function
Statistically significant BLAST hits usually signify sequence homology
Homologous sequences may or may not have the same function, but (with very few exceptions) they share the same structural fold
Protein sequence analysis allows protein classification
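To make "statistically significant" concrete: BLAST reports an expect value (E-value) for each hit, the number of hits with at least that score expected by chance alone. The following is a minimal sketch, assuming the standard Karlin-Altschul relation E = m * n * 2^(-S'), where S' is the bit score. The function name and the example lengths are hypothetical, and real BLAST uses edge-corrected "effective" lengths rather than raw ones.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Expected number of chance hits scoring at least `bit_score`.

    Uses E = m * n * 2**(-S'), with m the query length, n the total
    database length (both in residues) and S' the bit score.
    Assumption: raw lengths stand in for BLAST's effective lengths.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical search: a 300-residue query against 10**9 database residues.
for score in (30, 50, 100):
    print(score, expect_value(score, 300, 10**9))
```

In this sketch every additional bit of score halves the E-value, which is why a modest increase in score can separate a chance match from a likely homolog.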
Development of protein sequence databases
Atlas of Protein Sequence and Structure – Dayhoff (1965), the first sequence database (pre-bioinformatics). Continued today as the Protein Information Resource (PIR)
Protein Data Bank (PDB) – structural database (1971); remains the most widely used database of structures
UniProt – The Universal Protein Resource (2003) is a central database of protein sequence and function created by joining the forces of the Swiss-Prot, TrEMBL and PIR protein database activities
Comparative protein sequence analysis and evolution
Patterns of conservation in sequences allow us to determine which residues are under selective constraint (and thus likely important for protein function)
Comparative analysis of proteins is more sensitive than comparing DNA
Homologous proteins have a common ancestor
Different proteins evolve at different rates
Protein classification systems based on evolution: PIRSF and COG
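One simple way to see which positions look constrained is to score each alignment column by the frequency of its most common residue. The following is a minimal sketch, assuming the sequences are already aligned to equal length. The function name and the toy alignment are hypothetical, and real analyses typically use entropy or substitution-matrix-based scores instead.

```python
from collections import Counter


def column_conservation(alignment: list[str], gap: str = "-") -> list[float]:
    """Fraction of sequences sharing the most common residue, per column.

    Gaps count toward the column size but never as the conserved residue.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != gap)
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / len(column))
    return scores


# Toy alignment (hypothetical): columns 2 and 4 are fully conserved.
print(column_conservation(["ABACD", "AB-CD", "XBACE"]))
```

Columns scoring 1.0 are invariant across the set; with only three sequences that says little, so the score becomes informative only as more, and more divergent, homologs are added.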
PIRSF and large-scale annotation of proteins
PIRSF is a protein classification system based on the evolutionary relationships of whole proteins
As part of the UniProt project, PIR has developed this classification strategy to assist in the propagation and standardization of protein annotation
Comparing proteins
Amino acid sequence of protein generated from proteomics experiment
e.g. protein fragment DTIKDLLPNVCAFPMEKGPCQTYMTRWFFNFETGECELFAYGGCGGNSNNFLRKEKCEKFCKFT
The amino acids of two sequences can be aligned, and the number of identical residues (or an index of similarity) then serves as a measure of relatedness
Protein structures can be compared by superimposition
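Counting identical residues in an alignment can be sketched as follows. This is a minimal illustration, assuming two already aligned strings of equal length. The function name is hypothetical, and the denominator used here (columns without gaps) is one of several conventions in use; others divide by the full alignment length or by the shorter sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free alignment columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy sequences (hypothetical): 3 of 5 compared columns are identical.
print(percent_identity("ABACD", "XBACE"))  # 60.0
```

Because the result depends on the chosen denominator, it is worth stating which convention was used whenever identity values are compared.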
Protein sequence alignment
Pairwise alignment
a b a c d
a b _ c d
Multiple sequence alignment provides more information
a b a c d
a b _ c d
x b a c e
MSA is difficult to do for distantly related proteins
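The pairwise alignment above can be reproduced with dynamic programming. The following is a minimal sketch of Needleman-Wunsch global alignment, assuming simple match, mismatch and gap scores (the values +1, -1, -1 are hypothetical choices). Real protein aligners use substitution matrices such as BLOSUM62 and affine gap penalties.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -1) -> tuple[str, str, int]:
    """Needleman-Wunsch global alignment with a linear gap penalty."""

    def sub(i: int, j: int) -> int:
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal path.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("_")  # gap in b, written as on the slide
            i -= 1
        else:
            out_a.append("_")  # gap in a
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[-1][-1]


top, bottom, total = global_align("abacd", "abcd")
print(top)     # abacd
print(bottom)  # ab_cd
print(total)   # 3
```

When several alignments share the best score, this traceback prefers a diagonal step, then a gap in the second sequence; other tie-breaking orders are equally valid.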
Protein sequence analysis overview
Protein databases
PIR (pir.georgetown.edu) and UniProt (www.uniprot.org)
Searching databases
Peptide search, BLAST search, text search
Information retrieval and analysis
Protein records at UniProt and PIR
Multiple sequence alignment
Secondary structure prediction
Homology modeling
Universal Protein Resource
http://www.uniprot.org/
[Figure: UniProt components and their source data]
Source data: Swiss-Prot, PIR-PSD, TrEMBL, RefSeq, GenBank/EMBL/DDBJ, EnsEMBL, PDB, patent data, other data
UniParc (UniProt Archive)
UniProtKB (UniProt Knowledgebase): literature-based annotation and automated annotation
UniRef (UniProt NREF): UniRef100, UniRef90, UniRef50; automated merging of sequences, clustering at 100, 90 and 50%
Peptide Search
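A peptide search looks for exact occurrences of a short peptide inside database sequences. The following is a minimal sketch over an in-memory dictionary; the function name, entry identifiers and sequences are hypothetical, and a production service would query an indexed database rather than scan every entry.

```python
def peptide_search(peptide: str, database: dict[str, str]) -> dict[str, list[int]]:
    """Return, per matching entry, the 1-based start positions of exact matches."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    peptide = peptide.upper()
    hits: dict[str, list[int]] = {}
    for entry_id, sequence in database.items():
        sequence = sequence.upper()
        positions = []
        start = sequence.find(peptide)
        while start != -1:
            positions.append(start + 1)  # report 1-based positions
            start = sequence.find(peptide, start + 1)  # allow overlapping matches
        if positions:
            hits[entry_id] = positions
    return hits


# Toy database (hypothetical identifiers and sequences).
toy_db = {"P1": "MKTAYIAKQR", "P2": "AKAKA", "P3": "GGGG"}
print(peptide_search("AK", toy_db))  # {'P1': [7], 'P2': [1, 3]}
```

Exact matching suits peptides identified by mass spectrometry, where the query is short and expected to occur verbatim; tolerating substitutions calls for a similarity search such as BLAST.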
ID mapping
Query Sequence
Unknown sequence is Q9I7I7
BLAST Q9I7I7 against the UniProt Knowledgebase (http://www.uniprot.org/search/blast.shtml)
Analyze results
BLAST results
Text search
Any Field: not specific
Text search results: display options
Move PubMed ID, Pfam ID and PDB ID into “Columns in Display”
specific
Text search results: add input box
Text search result with null/not null
UniProt beta site
http://beta.uniprot.org/
UniProtKB protein record
SIR2_HUMAN protein record
Are Q9I7I7 and SIR2_HUMAN homologs?
Check BLAST results
Check pairwise alignment
Protein structure prediction
Programs can predict secondary structure with about 70% accuracy
Homology modeling - prediction of a ‘target’ structure from a closely related ‘template’ structure
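Accuracy figures for secondary structure prediction are commonly reported as a Q3 score: the percentage of residues whose predicted state (helix H, strand E, coil C) matches the state observed in a known structure. The following is a minimal sketch of that calculation; the function name and the example strings are hypothetical.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Percent of residues whose predicted 3-state label matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("need at least one residue")
    unexpected = (set(predicted) | set(observed)) - set("HEC")
    if unexpected:
        raise ValueError(f"unexpected state labels: {sorted(unexpected)}")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical 10-residue example: 8 of 10 states agree.
print(q3_accuracy("HHHHCCEEEE", "HHHCCCEEEC"))  # 80.0
```

A Q3 value averages over all residues, so a prediction can score well overall while still misplacing the ends of individual helices or strands.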
Secondary structure prediction
http://bioinf.cs.ucl.ac.uk/psipred/
Secondary structure prediction results
Sir2 structure
Homology modeling
http://www.expasy.org/swissmod/SWISS-MODEL.html
Homology model of Q9I7I7
Blue - excellent
Green - so so
Red - not good
Yellow - beta sheet
Red - alpha helix
Grey - loop
Sequence features: SIR2_HUMAN
Multiple sequence alignment
Multiple sequence alignment
Q9I7I7, Q82QG9, SIR2_HUMAN
Sequence features: CRAA_RABIT
Identifying Remote Homologs
Structure-guided sequence alignment