The UniProt knowledgebase: a hub of integrated protein data
![Page 1: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/1.jpg)
Swiss-Prot group, Geneva
SIB Swiss Institute of Bioinformatics
The UniProt knowledgebase
www.uniprot.org
a hub of integrated protein data
![Page 2: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/2.jpg)
Science cover, February 2011
![Page 3: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/3.jpg)
Data, knowledge
Protein sequence, functional information
![Page 4: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/4.jpg)
UniProt consortium
- EBI: European Bioinformatics Institute (UK)
- SIB: Swiss Institute of Bioinformatics (CH)
- PIR: Protein Information Resource (US)
![Page 5: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/5.jpg)
www.uniprot.org
![Page 6: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/6.jpg)
UniProt databases
![Page 7: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/7.jpg)
- UniProtKB: protein sequence knowledgebase with 2 sections, UniProtKB/Swiss-Prot and UniProtKB/TrEMBL (query, Blast, download) (~15 million entries)
- UniParc: protein sequence archive (the ENA equivalent at the protein level). Each entry contains a protein sequence with cross-links to the other databases where the sequence is found (active or not). Not annotated (query, Blast, download) (~25 million entries)
- UniRef: 3 sets of clusters of protein sequences at 100%, 90% and 50% identity; useful to speed up sequence similarity searches (BLAST) (query, Blast, download) (UniRef100: 10 million entries; UniRef90: 7 million entries; UniRef50: 3.3 million entries)
- UniMES: protein sequences derived from metagenomic projects (mostly the Global Ocean Sampling, GOS) (download) (8 million entries, included in UniParc)
![Page 8: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/8.jpg)
UniProt databases
The central piece
![Page 9: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/9.jpg)
UniProtKB
an encyclopedia on proteins
composed of 2 sections:
- UniProtKB/TrEMBL: unreviewed, automatically annotated
- UniProtKB/Swiss-Prot: reviewed, manually annotated
released every 4 weeks
![Page 10: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/10.jpg)
UniProtKB
Origin of protein sequences
UniProtKB protein sequences are mainly derived from:
- INSDC (translated submitted coding sequences, CDS)
- Ensembl (gene prediction) and RefSeq sequences
- Sequences of PDB structures
- Direct submission or sequences scanned from literature
Notes:
- UniProt does not do any gene prediction.
- Most non-germline immunoglobulins and T-cell receptors, most patent sequences, highly over-represented data (e.g. viral antigens) and pseudogene sequences are excluded from UniProtKB but stored in UniParc.
- Data from the PIR database have been integrated in UniProtKB since 2003.
Slide figure labels: 15 %, 85 %
![Page 11: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/11.jpg)
EMBL → TrEMBL → Swiss-Prot
- EMBL to TrEMBL: automated extraction of protein sequence (translated CDS), gene name and references. Automated annotation
- TrEMBL to Swiss-Prot: manual annotation of the sequence and associated biological information
![Page 12: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/12.jpg)
UniProtKB/TrEMBL
unreviewed
Automatic annotation
released every 4 weeks
![Page 13: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/13.jpg)
UniProtKB/TrEMBL entry (www.uniprot.org)
- One protein sequence, one species
- Automated annotation: keywords and Gene Ontology
- Automated annotation: function, subcellular location, catalytic activity, sequence similarities…
- Automated annotation: transmembrane domains, signal peptide…
- Cross-references to over 125 databases
- References
- Protein and gene names, taxonomic information
![Page 14: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/14.jpg)
UniProtKB/TrEMBL
Automatic annotation
Protein sequence
- The quality of the protein sequences depends on the information provided by the submitter of the original nucleotide entry (CDS) or by the gene prediction pipeline (e.g. Ensembl).
- 100% identical sequences (same length, same organism) are merged automatically.
Biological information: sources of annotation
- Provided by the submitter (EMBL, PDB, TAIR…)
- From automated annotation: automatically generated annotation rules (e.g. SAAS) and/or manually generated annotation rules (e.g. UniRule)
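The merging rule for identical sequences can be pictured as grouping records by organism and sequence. The sketch below is only an illustration of that idea: the record layout, the identifiers and the function name `merge_identical` are assumptions, not the actual UniProt pipeline.

```python
from collections import defaultdict

def merge_identical(records):
    """Group (source_id, organism, sequence) records sharing organism and sequence."""
    merged = defaultdict(list)
    for source_id, organism, sequence in records:
        # Identical strings necessarily have the same length, so the
        # (organism, sequence) pair is enough to act as the merge key.
        merged[(organism, sequence)].append(source_id)
    return dict(merged)

# Hypothetical toy records, not real database entries.
records = [
    ("cds_1", "Escherichia coli", "MSKEKF"),
    ("cds_2", "Escherichia coli", "MSKEKF"),   # identical sequence, same organism
    ("cds_3", "Homo sapiens", "MSKEKF"),       # same sequence, other organism
    ("cds_4", "Escherichia coli", "MSKEKFE"),  # different length
]
entries = merge_identical(records)
for (organism, sequence), sources in entries.items():
    print(organism, sequence, sources)
```

Only `cds_1` and `cds_2` end up in the same entry; the other two records stay separate because either the organism or the sequence differs.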
![Page 15: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/15.jpg)
![Page 16: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/16.jpg)
![Page 17: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/17.jpg)
Example of fully automatic annotation: SAAS
• Rules are derived from the UniProtKB/Swiss-Prot manual annotation.
• Fully automated rule generation based on C4.5 decision tree algorithm.
• One annotation, one rule.
• High stringency: a rule must reach an estimated precision of 99% or greater to generate annotation (tested on UniProtKB/Swiss-Prot)
• Rules are produced, updated and validated at each release.
UniProtKB/TrEMBL
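Two ingredients named on the SAAS slide can be sketched in a few lines: the gain ratio that C4.5 uses to choose a split, and the 99% precision threshold a rule must pass. This is a toy illustration under stated assumptions (a single boolean feature, hypothetical data, invented function names); it is not the SAAS implementation.

```python
import math
from collections import Counter

MIN_PRECISION = 0.99  # threshold quoted on the slide

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())

def gain_ratio(feature_values, labels):
    """C4.5 split criterion: information gain divided by split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(feature_values)
    return gain / split_info if split_info > 0 else 0.0

def rule_precision(predicted, actual):
    """Fraction of entries matched by a rule that really carry the annotation."""
    matched = [a for p, a in zip(predicted, actual) if p]
    return sum(matched) / len(matched) if matched else 0.0

# Hypothetical training data: does an entry carry a signature, and does the
# manually annotated entry carry the keyword we want to predict?
has_signature = [True, True, True, False, False, False]
has_keyword = [True, True, True, False, False, False]

ratio = gain_ratio(has_signature, has_keyword)
precision = rule_precision(has_signature, has_keyword)
keep_rule = precision >= MIN_PRECISION
print(ratio, precision, keep_rule)
```

A rule whose matches include even one wrong entry out of four (precision 0.75) would be rejected by the same threshold.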
![Page 18: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/18.jpg)
UniProtKB/Swiss-Prot
reviewed
manually annotated
released every 4 weeks
![Page 19: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/19.jpg)
MSKEKFERTKPHVNVGTIGHVDHGKTTLTAAITTVLAKTYGGAARAFDQIDNAPEEKARGITINTSHVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGPMPQTREHILLGRQVGVPYIIVFLNKCDMVDDEELLELVEMEVRELLSQYDFPGDDTPIVRGSALKALE GDAEWEAKILELAGFLDSYIPEPERAIDKPFLLPIEDVFSISGRGTVVTGRVERGIIKVGEEVEIVGIKETQKSTCTGVEMFRKLLDEGRAGENVGVLLRGIKREEIERGQVLAKPGTIKPHTKFESEVYILSKDEGGRHTPFFKGYRPQFYFRTTDVTGTIELPEGVEMVMPGDNIKMV VTLIHPIAMDDGLRFAIREGGRTVGAGVVAKVLG
UniProtKB/Swiss-Prot entry (www.uniprot.org)
- One protein sequence, one gene, one species
- Manual annotation: keywords and Gene Ontology
- Manual annotation: function, subcellular location, catalytic activity, disease, tissue specificity, pathway…
- Manual annotation: post-translational modifications, variants, transmembrane domains, signal peptide…
- Cross-references to over 125 databases
- References
- Protein and gene names, taxonomic information
- Alternative products: protein sequences produced by alternative splicing, alternative promoter usage, alternative initiation…
![Page 20: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/20.jpg)
UniProtKB/Swiss-Prot
Manual annotation
1. Protein sequence (merge available CDS, annotate sequence discrepancies, report sequencing mistakes…)
2. Biological information (sequence analysis, extract literature information, ortholog data propagation, …)
![Page 21: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/21.jpg)
UniProtKB/Swiss-Prot
1- Protein sequence curation
![Page 22: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/22.jpg)
The displayed protein sequence: …canonical, representative, consensus…
+alternative sequences (described within the entry)
1 entry <-> 1 gene (1 species)
UniProtKB/Swiss-Prot
a gene-centric view of the protein space
![Page 23: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/23.jpg)
What is the current status?
• At least 20% of Swiss-Prot entries required a minimal amount of curation effort to obtain the “correct” sequence.
• Typical problems
– unsolved conflicts
– uncorrected initiation sites
– frameshifts
– wrong gene prediction
– other ‘problems’
![Page 24: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/24.jpg)
UCSC genome browser
examples of CDS annotation submitted to INSDC…
![Page 25: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/25.jpg)
UniProtKB/Swiss-Prot
2- Biological data curation
![Page 26: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/26.jpg)
Extract literature information and protein sequence analysis
UniProtKB/Swiss-Prot gathers data from multiple sources:
- publications (literature/PubMed)
- prediction programs (Prosite, TMHMM, …)
- contacts with experts
- other databases
- nomenclature committees
An evidence attribution system makes it easy to trace the source of each annotation.
Maximum usage of controlled vocabulary
![Page 27: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/27.jpg)
Protein and gene names
![Page 28: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/28.jpg)
…enable researchers to obtain a summary of what is known about a protein…
General annotation (Comments)
www.uniprot.org
![Page 29: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/29.jpg)
Human protein manual annotation: some statistics (June 2011)
![Page 30: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/30.jpg)
Sequence annotation (Features)
…enable researchers to obtain a summary of what is known about a protein…
www.uniprot.org
![Page 31: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/31.jpg)
Non-experimental qualifiers
UniProtKB/Swiss-Prot considers both experimental and predicted data and makes a clear distinction between the two.

| Type of evidence | Qualifier |
|---|---|
| Strong experimental evidence | None or Ref.X |
| Light experimental evidence | Probable |
| Inferred by similarity with homologous protein | By similarity |
| Inferred by prediction | Potential |
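A small sketch of how these qualifiers could be used to separate annotations with strong experimental evidence from the rest. It assumes the qualifier appears in parentheses at the end of the annotation text, as in `Cytoplasm (By similarity).`; that layout, the sample strings and the function names are illustrative assumptions.

```python
import re

# The three qualifiers listed on the slide.
_QUALIFIER = re.compile(r"\((Probable|By similarity|Potential)\)\.?\s*$")

def qualifier_of(annotation):
    """Return the trailing qualifier of an annotation, or None if there is none."""
    match = _QUALIFIER.search(annotation)
    return match.group(1) if match else None

def has_strong_experimental_evidence(annotation):
    """No qualifier means strong experimental evidence, following the slide."""
    return qualifier_of(annotation) is None

# Hypothetical annotation strings.
annotations = [
    "Cytoplasm",
    "Cytoplasm (By similarity).",
    "Phosphoserine (Potential)",
    "Nucleus (Probable).",
]
experimental_only = [a for a in annotations if has_strong_experimental_evidence(a)]
print(experimental_only)
```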
![Page 32: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/32.jpg)
Find all the proteins localized in the cytoplasm (experimentally proven) which are phosphorylated on a serine (experimentally proven)
![Page 33: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/33.jpg)
‘Protein existence’ tag
• The ‘Protein existence’ tag indicates the evidence for the existence of a given protein.
• Different qualifiers:
1. Evidence at protein level (~18%) (MS, western blot (tissue specificity), immuno (subcellular location), …)
2. Evidence at transcript level (~19%)
3. Inferred from homology (~58%)
4. Predicted (~5%)
5. Uncertain (mainly in TrEMBL)
http://www.uniprot.org/docs/pe_criteria
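The five levels form an ordered scale, so they map naturally onto an integer enumeration. The sketch below filters a list of entries down to those with evidence at protein or transcript level; the class name, function name and entry labels are hypothetical.

```python
from enum import IntEnum

class ProteinExistence(IntEnum):
    """The five 'Protein existence' levels, numbered as on the slide."""
    PROTEIN_LEVEL = 1
    TRANSCRIPT_LEVEL = 2
    HOMOLOGY = 3
    PREDICTED = 4
    UNCERTAIN = 5

def protein_or_transcript_evidence(entries):
    """Keep entries whose level is 1 (protein) or 2 (transcript)."""
    return [name for name, level in entries
            if level <= ProteinExistence.TRANSCRIPT_LEVEL]

# Hypothetical entries: (label, protein existence level).
entries = [
    ("entry_A", ProteinExistence.PROTEIN_LEVEL),
    ("entry_B", ProteinExistence.HOMOLOGY),
    ("entry_C", ProteinExistence.TRANSCRIPT_LEVEL),
    ("entry_D", ProteinExistence.UNCERTAIN),
]
selected = protein_or_transcript_evidence(entries)
print(selected)
```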
![Page 34: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/34.jpg)
![Page 35: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/35.jpg)
UniProtKB
Additional information can be found in the cross-references (to more than 140 databases)
![Page 36: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/36.jpg)
- 2D gel: 2DBase-Ecoli, ANU-2DPAGE, Aarhus/Ghent-2DPAGE (no server), COMPLUYEAST-2DPAGE, Cornea-2DPAGE, DOSAC-COBS-2DPAGE, ECO2DBASE (no server), OGP, PHCI-2DPAGE, PMMA-2DPAGE, Rat-heart-2DPAGE, REPRODUCTION-2DPAGE, Siena-2DPAGE, SWISS-2DPAGE, UCD-2DPAGE, World-2DPAGE
- Family and domain: Gene3D, HAMAP, InterPro, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPFAM, TIGRFAMs
- Organism-specific: AGD, ArachnoServer, CGD, ConoServer, CTD, CYGD, dictyBase, EchoBASE, EcoGene, euHCVdb, EuPathDB, FlyBase, GeneCards, GeneDB_Spombe, GeneFarm, GenoList, Gramene, H-InvDB, HGNC, HPA, LegioList, Leproma, MaizeGDB, MGI, MIM, neXtProt, Orphanet, PharmGKB, PseudoCAP, RGD, SGD, TAIR, TubercuList, WormBase, Xenbase, ZFIN
- Protein family/group: Allergome, CAZy, MEROPS, PeroxiBase, PptaseDB, REBASE, TCDB
- Genome annotation: Ensembl, EnsemblBacteria, EnsemblFungi, EnsemblMetazoa, EnsemblPlants, EnsemblProtists, GeneID, GenomeReviews, KEGG, NMPDR, TIGR, UCSC, VectorBase
- Enzyme and pathway: BioCyc, BRENDA, Pathway_Interaction_DB, Reactome
- Other: BindingDB, DrugBank, NextBio, PMAP-CutDB
- Sequence: EMBL, IPI, PIR, RefSeq, UniGene
- 3D structure: DisProt, HSSP, PDB, PDBsum, ProteinModelPortal, SMR
- PTM: GlycoSuiteDB, PhosphoSite, PhosSite
- Proteomic: PeptideAtlas, PRIDE, ProMEX
- PPI: DIP, IntAct, MINT, STRING
- Phylogenomic dbs: eggNOG, GeneTree, HOGENOM, HOVERGEN, InParanoid, OMA, OrthoDB, PhylomeDB, ProtClustDB
- Polymorphism: dbSNP
- Gene expression: ArrayExpress, Bgee, CleanEx, Genevestigator, GermOnline
- Ontologies: GO
UniProtKB/Swiss-Prot: 129 explicit links and 14 implicit links!
![Page 37: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/37.jpg)
The UniProt web site www.uniprot.org
• Powerful search engine, Google-like and easy to use, that also supports very directed field searches
• Scoring mechanism presenting relevant matches first
• Entry views, search result views and downloads are customizable
• The URL of a result page reflects the query; all pages and queries are bookmarkable, supporting programmatic access
• Search, Blast, Align, Retrieve, ID mapping
![Page 38: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/38.jpg)
Search
A very powerful text search tool, with autocompletion and refinement options, for finding UniProt entries and documentation by biological information
![Page 39: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/39.jpg)
Find all human proteins located in the nucleus
![Page 40: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/40.jpg)
The search interface guides users with helpful suggestions and hints
![Page 41: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/41.jpg)
![Page 42: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/42.jpg)
Advanced Search
A very powerful search tool
To be used when you know in which entry section the information is stored
![Page 43: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/43.jpg)
Find all the proteins localized in the cytoplasm (experimentally proven) which are phosphorylated on a serine (experimentally proven)
![Page 44: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/44.jpg)
Result pages: highly customizable
![Page 45: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/45.jpg)
Result pages: downloadable
![Page 46: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/46.jpg)
![Page 47: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/47.jpg)
The URL can be bookmarked and manually modified.
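Because the query lives in the URL, changing one parameter is enough to get the same result set in another form. The sketch below edits a single parameter with the standard library only. The example URL, and the parameter names `query` and `format`, are assumptions about the site layout at the time of the talk and may not match the current website.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs, urlencode

def with_param(url, name, value):
    """Return a copy of `url` with query parameter `name` set to `value`."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    params[name] = [value]
    return urlunsplit(parts._replace(query=urlencode(params, doseq=True)))

# Assumed bookmark for a search on the human taxonomy identifier 9606.
bookmarked = "http://www.uniprot.org/uniprot/?query=organism%3A9606&format=tab"
as_fasta = with_param(bookmarked, "format", "fasta")
print(as_fasta)
```

No request is sent here; the function only rewrites the string, which mirrors what a user does when editing the address bar by hand.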
![Page 48: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/48.jpg)
Blast
A sequence similarity search tool, with the standard options, for searching sequences in different UniProt databases and data sets
![Page 49: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/49.jpg)
Blast: customize the result display
![Page 50: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/50.jpg)
Blast: local alignment with sequence annotation highlighting option
![Page 51: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/51.jpg)
Align
A ClustalW multiple alignment tool with
sequence annotation highlighting option
![Page 52: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/52.jpg)
Align
sequence annotation highlighting option
![Page 53: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/53.jpg)
Retrieve
A UniProt-specific tool for retrieving a list of entries from identifiers given in several standard formats.
You can then query your ‘personal database’ with the UniProt search tool.
![Page 54: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/54.jpg)
Query your own dataset
![Page 55: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/55.jpg)
ID Mapping
Maps the identifiers of a given protein between different databases
![Page 56: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/56.jpg)
These identifiers all point to a TP53 (p53) protein sequence!
P04637, NP_000537, NP_001119584.1, NP_001119585.1, NP_001119584.1, NP_001119584.1, NP_001119584.1, NP_001119584.1, ENSG00000141510, CCDS11118, UPI000002ED67, IPI00025087, etc.
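Before mapping, it helps to know which database each identifier comes from. The sketch below removes repeated identifiers and guesses the source from the identifier's shape. The regular expressions are simplified assumptions that are good enough for the identifiers on this slide; they are not official format definitions, and the function name is hypothetical.

```python
import re

# Simplified, assumed patterns; checked in order.
PATTERNS = [
    ("UniProtKB accession", re.compile(
        r"^([OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2})$")),
    ("RefSeq protein", re.compile(r"^NP_\d+(\.\d+)?$")),
    ("Ensembl gene", re.compile(r"^ENSG\d{11}$")),
    ("CCDS", re.compile(r"^CCDS\d+(\.\d+)?$")),
    ("UniParc", re.compile(r"^UPI[0-9A-F]{10}$")),
    ("IPI", re.compile(r"^IPI\d{8}(\.\d+)?$")),
]

def classify(identifier):
    """Return the name of the first pattern that matches, or 'unknown'."""
    for name, pattern in PATTERNS:
        if pattern.match(identifier):
            return name
    return "unknown"

slide_ids = [
    "P04637", "NP_000537", "NP_001119584.1", "NP_001119585.1",
    "NP_001119584.1", "NP_001119584.1", "NP_001119584.1", "NP_001119584.1",
    "ENSG00000141510", "CCDS11118", "UPI000002ED67", "IPI00025087",
]
unique_ids = list(dict.fromkeys(slide_ids))  # drop repeats, keep order
by_source = {identifier: classify(identifier) for identifier in unique_ids}
for identifier, source in by_source.items():
    print(identifier, source)
```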
![Page 57: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/57.jpg)
![Page 58: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/58.jpg)
Download
![Page 60: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/60.jpg)
Canonical and isoform sequences (fasta format)
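A downloaded FASTA file mixes canonical and isoform sequences, so a reader usually has to tell them apart. The sketch below assumes headers of the form `>sp|ACCESSION|ENTRY_NAME description`, with isoforms written as the accession plus a numeric suffix such as `P04637-2`. The sequences are shortened placeholders and the function names are hypothetical.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def accession_and_isoform(header):
    """Split 'sp|P04637-2|P53_HUMAN ...' into ('P04637', 2); canonical gives None."""
    accession = header.split("|")[1]
    base, _, isoform = accession.partition("-")
    return base, (int(isoform) if isoform else None)

# Placeholder sequences, shortened for illustration only.
example = """>sp|P04637|P53_HUMAN Cellular tumor antigen p53
MEEPQSDPSV
EPPLSQETFS
>sp|P04637-2|P53_HUMAN Isoform 2 of Cellular tumor antigen p53
MEEPQSDPSV
"""
records = parse_fasta(example)
parsed = [accession_and_isoform(header) for header, _ in records]
print(parsed)
```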
![Page 61: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/61.jpg)
A few words on the UniProt ‘complete proteome’
sequence sets…
![Page 62: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/62.jpg)
UniProtKB - complete proteomes
2’747 complete proteomes
- Genome completely sequenced
- Proteins mapped to the genome
- Entries tagged with the KW ‘Complete proteome’
- UniProtKB/Swiss-Prot isoform sequences are available in FASTA format only
Review status:
- Fully manually reviewed (e.g. S. cerevisiae)
- Partially manually reviewed (e.g. Homo sapiens)
- Unreviewed (e.g. Acinetobacter baumannii (strain 1656-2))
![Page 63: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/63.jpg)
UniProtKB - complete proteomes
Can be downloaded:
- From our complete proteome page: www.uniprot.org/taxonomy/complete-proteomes
- From the ‘ftp download’ page
- By querying UniProtKB + download. Query: organism:93062 AND keyword:"complete proteome"
Additional information: www.uniprot.org/faq/15
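The query given above can be turned into a download link by URL-encoding it. The sketch below only builds the string and sends no request. The base URL and the `query` and `format` parameter names are assumptions about the site layout at the time of the talk; check the current UniProt documentation before relying on them.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.uniprot.org/uniprot/"  # assumed endpoint, may have changed

def download_url(query, fmt="fasta"):
    """Build a download link for `query` in the requested format."""
    return BASE_URL + "?" + urlencode({"query": query, "format": fmt})

# Query string taken from the slide.
proteome_query = 'organism:93062 AND keyword:"complete proteome"'
url = download_url(proteome_query)
print(url)
```

Spaces, colons and double quotes are escaped by `urlencode`, which is the part that is easy to get wrong when assembling such links by hand.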
![Page 64: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/64.jpg)
Query UniProtKB + download
![Page 65: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/65.jpg)
![Page 66: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/66.jpg)
Human proteome: ~20’200 genes
Query for ‘homo sapiens’ (August 2011)
• UniProtKB: 110’056 entries + alt sequences (~15’435) = 125’491
• UniProtKB/Swiss-Prot: 20’244 entries + alt sequences (~15’435) = 35’679
• UniProtKB/TrEMBL: 89’834 entries
• RefSeq: 32’898 sequences
• Ensembl: 90’720 sequences
Query for ‘homo sapiens’ + Complete proteome (KW-181)
• UniProtKB: 56’392 + alt sequences (15’435) = 71’827
• UniProtKB/Swiss-Prot: 20’238 + alt sequences (15’435) = 35’673
• UniProtKB/TrEMBL: 36’154
92% of human entries are linked with at least one RefSeq entry…
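The totals quoted above are simple sums of entries and alternative sequences, which can be re-checked in a few lines. The numbers are copied from the slide; the variable and function names are made up for this sketch.

```python
ALT_SEQUENCES = 15_435

# (label, entries, total stated on the slide)
stated_totals = [
    ("UniProtKB, all human", 110_056, 125_491),
    ("Swiss-Prot, all human", 20_244, 35_679),
    ("UniProtKB, complete proteome", 56_392, 71_827),
    ("Swiss-Prot, complete proteome", 20_238, 35_673),
]

def mismatches(rows, alt=ALT_SEQUENCES):
    """Return the labels whose entries + alt sequences differ from the stated total."""
    return [label for label, entries, total in rows if entries + alt != total]

bad_rows = mismatches(stated_totals)

# Complete proteome: Swiss-Prot plus TrEMBL entries should give the UniProtKB count.
complete_proteome_sum = 20_238 + 36_154
print(bad_rows, complete_proteome_sum)
```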
![Page 67: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/67.jpg)
Summary
![Page 68: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/68.jpg)
![Page 70: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/70.jpg)
The UniProt Consortium
SIB: Ioannis Xenarios, Lydie Bougueleret, Andrea Auchincloss, Kristian Axelsen, Delphine Baratin, Marie-Claude Blatter, Brigitte Boeckmann, Jerven Bolleman, Laurent Bollondi, Emmanuel Boutet, Lionel Breuza, Alan Bridge, Edouard de Castro, Lorenzo Cerutti, Elisabeth Coudert, Béatrice Cuche, Mikael Doche, Dolnide Dornevil, Severine Duvaud, Anne Estreicher, Livia Famiglietti, Marc Feuermann, Sebastien Gehant, Elisabeth Gasteiger, Alain Gateau, Vivienne Gerritsen, Arnaud Gos, Nadine Gruaz-Gumowski, Ursula Hinz, Chantal Hulo, Nicolas Hulo, Janet James, Florence Jungo, Guillaume Keller, Vicente Lara, Philippe Lemercier, Damien Lieberherr, Xavier Martin, Patrick Masson, Anne Morgat, Salvo Paesano, Ivo Pedruzzi, Sandrine Pilbout, Sylvain Poux, Monica Pozzato, Manuela Pruess, Nicole Redaschi, Catherine Rivoire, Bernd Roechert, Michel Schneider, Christian Sigrist, Karin Sonesson, Sylvie Staehli, Eleanor Stanley, André Stutz, Shyamala Sundaram, Michael Tognolli, Laure Verbregue, Anne-Lise Veuthey
EBI: Rolf Apweiler, Maria Jesus Martin, Claire O'Donovan, Michele Magrane, Yasmin Alam-Faruque, Ricardo Antunes, Benoit Bely, Mark Bingley, David Binns, Lawrence Bower, Wei Mun Chan, Emily Dimmer, Francesco Fazzini, Alexander Fedotov, John Garavelli, Leyla Garcia Castro, Rachael Huntley, Julius Jacobsen, Michael Kleen, Duncan Legge, Wudong Liu, Jie Luo, Sandra Orchard, Samuel Patient, Klemens Pichler, Diego Poggioli, Nikolas Pontikos, Steven Rosanoff, Tony Sawford, Harminder Sehra, Edward Turner, Matt Corbett, Mike Donnelly and Pieter van Rensburg
PIR: Cathy H. Wu, Cecilia N. Arighi, Leslie Arminski, Winona C. Barker, Chuming Chen, Yongxing Chen, Pratibha Dubey, Hongzhan Huang, Kati Laiho, Raja Mazumder, Peter McGarvey, Darren A. Natale, Thanemozhi G. Natarajan, Jules Nchoutmboube, Natalia V. Roberts, Baris E. Suzek, Uzoamaka Ugochukwu, C. R. Vinayaka, Qinghua Wang, Yuqi Wang, Lai-Su Yeh and Jian Zhang
www.uniprot.org
![Page 71: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/71.jpg)
UniProt is mainly supported by the National Institutes of Health (NIH) grant 1 U41 HG006104-01. Additional support for the EBI's involvement in UniProt comes from the NIH grant 2P41 HG02273-07. Swiss-Prot activities at the SIB are supported by the Swiss Federal Government through the Federal Office of Education and Science and the European Commission contracts SLING (226073), Gen2Phen (200754) and MICROME (222886). PIR activities are also supported by the NIH grants 5R01GM080646-04, 3R01GM080646-04S2, 1G08LM010720-01, and 3P20RR016472-09S2, and NSF grant DBI-0850319.
![Page 72: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/72.jpg)
www.isb-sib.ch
![Page 73: The UniProt knowledgebase a hub of integrated protein data](https://reader036.vdocument.in/reader036/viewer/2022081422/56816504550346895dd773d7/html5/thumbnails/73.jpg)
Thank you for your attention
http://education.expasy.org/cours/Prague2011/