Dr. Dieter Maier, Manchester Ontologies Workshop, 23/24.3.02
Biomax Informatics AG Bioinformatics designed with you in mind.
FunCat™, a controlled vocabulary encompassing the biology of prokaryotes, plants and animals from cellular to systemic level
Dr. Dieter Maier
Manchester Ontologies Workshop 23/24.3.02
Biomax Informatics AG, Lochhamer Str. 11, 82152 Martinsried, Germany
Outline
• Objectives
• Structure
• Content
• Development
• Use
Objectives
• Automatic data management
• No prior knowledge of vocabulary required
• Group genes by functional categories
• Extensible
• Organism independent
• Compatible with other ontologies
Disclaimer
What the FunCat is not:
- Tool for the complete description of functions on a single gene level
Structure
• Organized hierarchically
• Related functions grouped on different levels
• Internally consistent
=> Provides a data warehouse
- overview of the available selection
- progress from general to specific
- infer from specific to general
Hierarchical structure
[Figure: category tree for Transcription, with the nodes rRNA-transcription, rRNA-processing, tRNA-transcription, mRNA-transcription, mRNA-processing and 5´-end processing]
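The two directions of travel named under Structure (from general to specific, and from specific to general) can be sketched on the transcription branch. This is only an illustration: the dotted codes and the exact nesting below are assumptions made for the example, and the function names `ancestors` and `children` are hypothetical, not part of any FunCat tool.

```python
# Hypothetical dotted codes; the nesting is assumed for illustration only.
CATEGORIES = {
    "04": "Transcription",
    "04.01": "rRNA-transcription",
    "04.01.04": "rRNA-processing",
    "04.03": "tRNA-transcription",
    "04.05": "mRNA-transcription",
    "04.05.05": "mRNA-processing",
    "04.05.05.01": "5'-end processing",
}


def ancestors(code):
    """Specific to general: every proper prefix of a code is an ancestor."""
    parts = code.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def children(code):
    """General to specific: direct sub-categories of a code."""
    depth = code.count(".") + 1
    return sorted(
        c for c in CATEGORIES
        if c.startswith(code + ".") and c.count(".") == depth
    )


print(ancestors("04.05.05.01"))  # ['04', '04.05', '04.05.05']
print([CATEGORIES[c] for c in children("04")])
```

Encoding the path in the identifier means neither direction needs a separate parent table; a prefix comparison is enough.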
Content
• Covers cellular processes, systemic physiology, development and anatomy, from prokaryotes to humans
• 25 main categories with ~1500 sub-categories
• Categories are independent of organism
• Genes can belong to multiple categories
Biological process: 1061
- Metabolism: 247
- Energy: 60
- Cell cycle and DNA processing: 54
- Transcription: 31
- Protein synthesis (Translation): 11
- Protein fate (folding, modification, destination): 25
- Cellular transport: 32
- Cellular communication: 47
- Cell rescue, defense and virulence: 50
- Regulation / interaction with cellular environment: 45
- Cell fate: 54
- Systemic regulation / interaction with environment: 89
- Development (systemic): 51
- Transposable elements, viral and plasmid proteins: 8
- Control of cellular organisation: 57
- Cell type differentiation: 69
- Tissue differentiation: 40
- Organ differentiation: 91
Molecular function: 122
- Enzymatic activity => EC ~ 4400
- Protein activity regulation: 23
- Protein with binding function / cofactor requirement: 49
- Transport facilitation: 49
Localisation: 256
- Subcellular localisation: 63
- Cell type localisation: 69
- Tissue localisation: 41
- Organ localisation: 91
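As a quick arithmetic check, the per-category counts above can be summed and compared with the stated group totals. The numbers are copied from the slide; the variable names are ours. The eighteen process-level categories add up exactly to the stated 1061, while the four localisation rows add up to 264 against a stated 256, a difference the slide does not explain.

```python
# Counts copied from the slide; names of the variables are illustrative.
biological_process = {
    "Metabolism": 247,
    "Energy": 60,
    "Cell cycle and DNA processing": 54,
    "Transcription": 31,
    "Protein synthesis (Translation)": 11,
    "Protein fate": 25,
    "Cellular transport": 32,
    "Cellular communication": 47,
    "Cell rescue, defense and virulence": 50,
    "Regulation / interaction with cellular environment": 45,
    "Cell fate": 54,
    "Systemic regulation / interaction with environment": 89,
    "Development (systemic)": 51,
    "Transposable elements, viral and plasmid proteins": 8,
    "Control of cellular organisation": 57,
    "Cell type differentiation": 69,
    "Tissue differentiation": 40,
    "Organ differentiation": 91,
}
localisation = {
    "Subcellular localisation": 63,
    "Cell type localisation": 69,
    "Tissue localisation": 41,
    "Organ localisation": 91,
}
STATED = {"Biological process": 1061, "Localisation": 256}

process_sum = sum(biological_process.values())
localisation_sum = sum(localisation.values())

print(process_sum, STATED["Biological process"])  # 1061 1061
print(localisation_sum, STATED["Localisation"])   # 264 256
```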
Development
• Historical
• Pathways
• Thesaurus
• Complex relations
Structural development
• Proven flexibility – easy to extend
• Stable overall structure
• Compatible with other ontologies such as
- Enzyme Catalogue
- Gene Ontology
- EcoCyc
Development in numbers
Scope                                  Year  Main categories  Depth  Total
S. cerevisiae                          1996  16               4      182
Plant (A. thaliana) and procaryotes    1998  20               6      528
Animals (Human)                        2001  25               6      1448
Integrating Pathways into processes
The hierarchical structure allows:
- Univocal attribution
- Test for completeness
- Test for consistency
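One possible reading of these checks can be sketched in a few lines. The slide does not define the tests, so the definitions here are assumptions: "completeness" is taken to mean that every sub-category has its parent in the catalogue, and "univocal" that no label is attached to more than one code. The codes, the sub-category labels and the function names are all made up for the example.

```python
from collections import Counter


def missing_parents(catalogue):
    """Assumed completeness test: codes whose parent code is absent."""
    return sorted(
        code for code in catalogue
        if "." in code and code.rsplit(".", 1)[0] not in catalogue
    )


def ambiguous_labels(catalogue):
    """Assumed univocality test: labels attached to more than one code."""
    counts = Counter(catalogue.values())
    return sorted(label for label, n in counts.items() if n > 1)


# Toy catalogue with two deliberate defects (hypothetical codes and labels).
catalogue = {
    "01": "Metabolism",
    "01.05": "Carbohydrate metabolism",
    "01.05.01": "Glycolysis",
    "02.01": "Glycolysis",  # parent "02" is missing and the label is reused
}

print(missing_parents(catalogue))   # ['02.01']
print(ambiguous_labels(catalogue))  # ['Glycolysis']
```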
Integrating additional information
• Create a dynamic ontology from existing ontologies, keywords and linguistic extraction of descriptors from the literature
• Semi-automatic mapping of the dynamic ontology to FunCat
Enabling complex relations
• Intensify multidimensionality
• Enable if ... then ... relations
Use
• Manual annotation
• Automatic annotation
• Data mining
Manual annotation
- multidimensional
- stepwise
Four dimensions
Manual annotation
• 17 manually annotated genomes (5 eukaryotes, 12 prokaryotes)
• Eukaryotes: H. sapiens, A. thaliana, S. cerevisiae, N. crassa; proprietary: A. niger
• Prokaryotes: B. subtilis, T. acidophilum, Listeria, 6 public prokaryotes in progress; proprietary: C. glutamicum, C. pneumoniae, 1 undisclosed
• Used for annotation of transcriptomes
Automatic Annotation
Sequence similarity to manually annotated proteins (distinguishing experimentally verified from similarity-associated function):
- H. sapiens
- A. thaliana
- S. cerevisiae
- B. subtilis
- T. acidophilum
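The idea of transferring categories by similarity while keeping track of the kind of evidence can be sketched as follows. This is a minimal illustration, not the pipeline used for the genomes above: the identity threshold, the data layout and the function name `transfer_annotations` are assumptions, and the reference entries are invented.

```python
def transfer_annotations(hits, reference, min_identity=0.4):
    """Return {category: evidence} for one query protein.

    hits:      iterable of (reference_id, identity) pairs, identity in 0..1
    reference: {reference_id: [(category, evidence), ...]} where evidence is
               "experimental" or "similarity"
    The threshold of 0.4 is an arbitrary placeholder.
    """
    result = {}
    for ref_id, identity in hits:
        if identity < min_identity:
            continue  # hit too weak to transfer anything
        for category, evidence in reference.get(ref_id, []):
            # Keep experimental support once any qualifying hit provides it.
            if result.get(category) != "experimental":
                result[category] = evidence
    return result


# Invented reference annotations and hits.
reference = {
    "refA": [("Transcription", "experimental")],
    "refB": [("Transcription", "similarity"), ("Energy", "similarity")],
    "refC": [("Metabolism", "experimental")],
}
hits = [("refB", 0.62), ("refA", 0.55), ("refC", 0.21)]

print(transfer_annotations(hits, reference))
# {'Transcription': 'experimental', 'Energy': 'similarity'}
```

Recording the evidence type alongside each transferred category lets a later query keep apart functions backed by an experimentally verified reference from those that rest on similarity alone.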
PEDANT Genome Database
Currently more than 170 genomes (600 000 ORFs)
[Figure: phylogenetic tree of the three domains. Bacteria: Thermotogales, Flavobacteria, Cyanobacteria, Proteobacteria, Gram positives, Green non-sulfur bacteria. Archaea: Pyrodictium, Thermoproteus, Methanococcus, Methanobacterium, Methanosarcina, Extreme halophiles. Eucarya: Entamoeba, Slime molds, Animals, Fungi, Plants, Ciliates, Flagellates, Trichomonads, Microsporida, Diplomonads]
Data mining
• Retrieval
• Visualisation
• Mining
• Integration
Queries using the FunCat: group level
- Looking for groups of genes:
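A group-level query of this kind can be sketched as a lookup that returns every gene annotated with a category or any of its sub-categories. The gene names, the dotted codes and the function name `genes_in` are hypothetical; the sketch also relies on the point made under Content that a gene may belong to several categories.

```python
# Hypothetical annotations: each gene carries a set of category codes.
gene_categories = {
    "geneA": {"04.05.05", "06.01"},
    "geneB": {"04.01"},
    "geneC": {"01.05"},
}


def genes_in(category, annotations):
    """All genes annotated with the category or one of its sub-categories."""
    prefix = category + "."
    return sorted(
        gene for gene, codes in annotations.items()
        if any(c == category or c.startswith(prefix) for c in codes)
    )


print(genes_in("04", gene_categories))     # ['geneA', 'geneB']
print(genes_in("04.05", gene_categories))  # ['geneA']
```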
Single molecule level
- Retrieving protein entries:
The human FunCat
[Figure: the human FunCat. Category labels: Unclassified, Metabolism, Energy, Cell cycle, Transcription, Translation, Protein fate, Intracellular transport, Defense, Signalling, Cell physiology]
Comparing genomes
Sequence similarity „functional homology“
Identification of organism specific functions
Comparing H.sapiens – B.subtilis
[Figure: bar chart comparing H. sapiens and B. subtilis across the categories Metabolism, Protein fate, Cellular communication and Interaction with cellular environment; axis scale 0 to 30]
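A comparison like this one can be sketched by computing, for each genome, the share of genes annotated with each main category. This assumes the chart plots a percentage of genes per category, which the surviving labels do not confirm. The two toy genomes and the function name `category_shares` are invented and do not reproduce the values in the figure.

```python
from collections import Counter


def category_shares(gene_categories):
    """Percentage of genes annotated with each main category."""
    counts = Counter()
    for categories in gene_categories.values():
        counts.update(set(categories))  # count each gene once per category
    n = len(gene_categories)
    return {cat: 100.0 * k / n for cat, k in counts.items()}


# Invented toy genomes; a gene may appear in several categories.
genome_x = {
    "g1": ["Metabolism"],
    "g2": ["Metabolism", "Protein fate"],
    "g3": ["Cellular communication"],
    "g4": ["Metabolism"],
}
genome_y = {
    "h1": ["Metabolism"],
    "h2": ["Protein fate"],
}

shares_x = category_shares(genome_x)
shares_y = category_shares(genome_y)
for cat in sorted(set(shares_x) | set(shares_y)):
    print(cat, shares_x.get(cat, 0.0), shares_y.get(cat, 0.0))
```

Because genes can belong to multiple categories, the shares of one genome need not sum to 100.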
Integrative analysis
[Figure: diagram with the labels Functional catalogue, Gene expression data, Protein-protein interaction data and Protein expression data]
Limitations
Co-expression is no proof of functional association.
Integrate evidence from multiple sources.
Integration with annotation
Analyse gene expression data using integration with annotation catalogues.
Functional catalogue
Phenotypes
Interaction
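One common way to combine expression clusters with a functional catalogue is an over-representation test: given a cluster of co-expressed genes, ask whether a category appears in it more often than chance would suggest. The slides do not name a statistic, so the hypergeometric test below is our choice for illustration, and the function name and the numbers are hypothetical.

```python
from math import comb


def enrichment_p(cluster_size, hits_in_cluster, category_size, genome_size):
    """P(X >= hits_in_cluster) under the hypergeometric distribution.

    X is the number of category members among cluster_size genes drawn
    without replacement from genome_size genes, category_size of which
    belong to the category.
    """
    upper = min(cluster_size, category_size)
    total = comb(genome_size, cluster_size)
    tail = sum(
        comb(category_size, k)
        * comb(genome_size - category_size, cluster_size - k)
        for k in range(hits_in_cluster, upper + 1)
    )
    return tail / total


# Invented example: 100 genes, 10 in the category, cluster of 5 with 3 hits.
p = enrichment_p(cluster_size=5, hits_in_cluster=3,
                 category_size=10, genome_size=100)
print(round(p, 4))  # 0.0066
```

In line with the Limitations slide, a small p-value here only flags a category for closer inspection; it is not proof of functional association.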
FunCat
Tool to structure information
Tool to connect information