Other biological databases

Upload: ownah

Post on 27-Jan-2016


Page 1: Other biological databases

Other biological databases

Page 2: Other biological databases

[Diagram of the data types that describe biological systems: sequence data, whole genome data, protein families and domains, protein folding and 3D structure, pathways and networks, small molecules, taxonomic data, literature, ontologies (GO)]

Page 3: Other biological databases

Other Biological Databases

• Transcription factor binding sites: TRANSFAC

• Protein structure databases: PDB, SCOP, CATH

• Protein family databases: Pfam, PRINTS, PROSITE etc.

• Chemicals and small molecules: ChEBI

• Gene expression databases: GEO, ArrayExpress

• Metabolic pathways: Reactome, KEGG

• Genome databases: Ensembl, FlyBase, WormBase etc.

• Human genetics-related databases: HapMap, dbSNP

Page 4: Other biological databases

Transcription factor binding sites

• TRANSFAC: database of eukaryotic transcription factors (http://www.gene-regulation.com/pub/databases.html#transfac)

• TESS (Transcription Element Search System): predicts transcription factor binding sites using TRANSFAC (http://www.cbi.upenn.edu/tess)

• TFSEARCH: searches for transcription factor binding sites (http://www.cbrc.jp/research/db/TFSEARCH.html)

Page 5: Other biological databases

Protein structure databases

• The main resource is the Protein Data Bank (PDB): http://www.rcsb.org/pdb/

• Contains the atomic coordinates of macromolecules whose 3D structure has been determined by X-ray crystallography or NMR

• Proteins represent more than 90% of available structures (others are DNA, RNA, sugars, viruses, protein/DNA complexes…)

• Can search by PDB code
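
To make the idea of stored atomic coordinates concrete, here is a minimal sketch that reads ATOM/HETATM records from text in the legacy fixed-column PDB file format and checks whether a string looks like a classic four-character PDB code. The names `Atom`, `is_pdb_code` and `parse_atoms` are hypothetical helpers invented for this illustration; the column positions follow the legacy PDB format, and newer formats such as mmCIF are not handled.

```python
import re
from typing import List, NamedTuple

# Assumption: classic PDB codes are four characters, a digit followed by
# three letters or digits (e.g. "1TUP").
PDB_CODE = re.compile(r"[0-9][A-Za-z0-9]{3}")


class Atom(NamedTuple):
    name: str      # atom name, e.g. "CA"
    residue: str   # residue name, e.g. "MET"
    chain: str     # chain identifier, e.g. "A"
    x: float       # orthogonal coordinates in angstroms
    y: float
    z: float


def is_pdb_code(code: str) -> bool:
    """Return True if the string has the shape of a classic PDB code."""
    return PDB_CODE.fullmatch(code) is not None


def parse_atoms(pdb_text: str) -> List[Atom]:
    """Extract atoms from ATOM/HETATM lines of legacy PDB-format text."""
    atoms = []
    for line in pdb_text.splitlines():
        if line[:6] not in ("ATOM  ", "HETATM"):
            continue  # skip headers, remarks, END, etc.
        atoms.append(Atom(
            name=line[12:16].strip(),     # columns 13-16
            residue=line[17:20].strip(),  # columns 18-20
            chain=line[21:22].strip(),    # column 22
            x=float(line[30:38]),         # columns 31-38
            y=float(line[38:46]),         # columns 39-46
            z=float(line[46:54]),         # columns 47-54
        ))
    return atoms
```

Calling `parse_atoms` on the text of a downloaded entry gives a list of atoms that can be passed to distance or geometry calculations.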

Page 6: Other biological databases

Searching MSD

Search by PDB code at http://www.ebi.ac.uk/msd

Page 7: Other biological databases

Protein structure-related databases

• Structural family databases based on PDB: SCOP (http://scop.mrc-lmb.cam.ac.uk/scop/) and CATH (http://www.biochem.ucl.ac.uk/bsm/cath/)

• Predicted structures in SWISS-MODEL (http://swissmodel.expasy.org//SWISS-MODEL.html)

Page 8: Other biological databases

Protein family databases

• Databases that produce signatures for identifying protein families or domains

• Used for functional classification of proteins

• E.g. Pfam, PROSITE, PRINTS, SMART, TIGRFAMs etc.

• Integrated into a single resource, InterPro (http://www.ebi.ac.uk/interpro)
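
One kind of signature is the PROSITE-style pattern, which can be translated into a regular expression and scanned against a sequence. The sketch below is a simplified illustration only: `prosite_to_regex` and `find_matches` are hypothetical names, the pattern in the usage note is a zinc-finger-like pattern written in PROSITE syntax for demonstration, and rarer syntax (such as anchors inside square brackets) is not handled. Other member databases use different signature types, for example the profile hidden Markov models in Pfam, which this approach does not cover.

```python
import re
from typing import List, Tuple


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Handled syntax: '-' separators, 'x' for any residue, [ABC] allowed
    residues, {ABC} forbidden residues, (n) and (n,m) repeats, and the
    '<' / '>' sequence-end anchors.
    """
    regex = pattern.rstrip(".")                          # optional trailing period
    regex = regex.replace("-", "")                       # element separators
    regex = regex.replace("x", ".")                      # any amino acid
    regex = regex.replace("{", "[^").replace("}", "]")   # forbidden residues
    regex = regex.replace("(", "{").replace(")", "}")    # repeat counts
    regex = regex.replace("<", "^").replace(">", "$")    # N-/C-terminal anchors
    return regex


def find_matches(pattern: str, sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping hit."""
    compiled = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in compiled.finditer(sequence)]
```

For example, `prosite_to_regex("C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H")` yields `C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H`, which `find_matches` then applies to a protein sequence string.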

Page 9: Other biological databases

InterProScan sequence search

Stand-alone version available

Page 10: Other biological databases

InterPro text search

Search by keyword, protein accession or InterPro accession

Page 11: Other biological databases

Results for protein accession

Page 12: Other biological databases

Example InterPro entry

Page 13: Other biological databases

Chemicals and small molecules

• Chemical Abstracts: http://www.cas.org/

• ChEBI: http://www.ebi.ac.uk/chebi

• KEGG (part of it includes chemicals): http://www.genome.jp/kegg

• ChemIDplus (chemicals cited in NLM databases): http://chem2.sis.nlm.nih.gov/chemidplus/chemidlite.jsp

• MSD-Chem: ligands and chemicals in MSD

Page 14: Other biological databases

ChEBI example entry

Page 15: Other biological databases

Hierarchy for chemicals

Page 16: Other biological databases

Gene expression databases

• NCBI Gene Expression Omnibus (GEO) http://www.ncbi.nlm.nih.gov/geo/

• ArrayExpress http://www.ebi.ac.uk/arrayexpress/

• Stanford Microarray Database http://genome-www5.stanford.edu/

• Can usually search for experiments or particular expression profiles

Page 17: Other biological databases

GEO search page

Page 18: Other biological databases

Profiles search results

Page 19: Other biological databases

Specific entry and experiment info

Page 20: Other biological databases

ArrayExpress search results

Page 21: Other biological databases

What does the data look like?

• Info on experiment, array used, etc.

• Raw or processed tab-delimited file containing spots and their intensities (Cy3/Cy5 ratios) across different samples

• Files with metadata, e.g. sample info, annotation and coordinates of each spot on the array
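
As a small illustration of working with such a file, the sketch below reads tab-delimited text with one row per spot and computes a log2 Cy5/Cy3 ratio for each. The column names `spot_id`, `cy5` and `cy3` are assumptions made for this example; real downloads from GEO or ArrayExpress use their own headers and usually need background correction and normalisation, which are not shown.

```python
import csv
import io
import math
from typing import Dict


def log2_ratios(tsv_text: str,
                id_col: str = "spot_id",
                cy5_col: str = "cy5",
                cy3_col: str = "cy3") -> Dict[str, float]:
    """Return {spot id: log2(Cy5 / Cy3)} for a tab-delimited intensity table.

    Column names are hypothetical defaults; pass the real header names
    of the file being read.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    ratios = {}
    for row in reader:
        cy5 = float(row[cy5_col])
        cy3 = float(row[cy3_col])
        if cy5 <= 0 or cy3 <= 0:
            continue  # a log ratio is undefined for non-positive intensities
        ratios[row[id_col]] = math.log2(cy5 / cy3)
    return ratios
```

A positive value means the spot is brighter in the Cy5 channel, a negative value means it is brighter in the Cy3 channel.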

Page 22: Other biological databases

Proteomics: SWISS-2DPAGE

Page 23: Other biological databases

Enzymes and metabolic pathways

• Contain information describing enzymes, biochemical reactions and metabolic pathways;

• ENZYME and BRENDA: nomenclature databases that store information on enzyme names and reactions;

• IntEnz: Integrated relational Enzyme database

Page 24: Other biological databases

Enzyme nomenclature

• E.C. (Enzyme Commission) numbers assigned based on reactions they catalyze

• Hierarchy, high-level groups:
– EC 1: Oxidoreductases
– EC 2: Transferases
– EC 3: Hydrolases
– EC 4: Lyases
– EC 5: Isomerases
– EC 6: Ligases
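
The hierarchy can be read directly off an EC number, since each dot-separated field narrows the classification. The following is a minimal sketch with hypothetical helper names (`ec_class`, `ec_levels`); it covers only the six top-level classes listed on this slide.

```python
from typing import List

# Top-level classes exactly as listed on the slide.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def _strip_prefix(ec_number: str) -> str:
    """Remove an optional leading 'EC' and surrounding whitespace."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    return text


def ec_class(ec_number: str) -> str:
    """Return the top-level class name for an EC number such as '1.1.1.1'."""
    first_field = _strip_prefix(ec_number).split(".")[0]
    return EC_CLASSES[int(first_field)]


def ec_levels(ec_number: str) -> List[str]:
    """Return the chain of parents, from class down to the full number."""
    fields = _strip_prefix(ec_number).split(".")
    return [".".join(fields[:i]) for i in range(1, len(fields) + 1)]
```

For example, `ec_levels("EC 3.4.21.4")` walks from the class `3` through `3.4` and `3.4.21` to the full number.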

Page 25: Other biological databases

EC example

Page 26: Other biological databases

Metabolic pathway databases

• PATHGUIDE: >200 pathways

• KEGG (Kyoto Encyclopedia of Genes and Genomes): http://www.genome.jp/kegg, includes:
– Database of chemicals, genes and networks (metabolic, regulatory etc.)
– Well-curated and quite specific

• EcoCyc (Encyclopedia of E. coli K12 genes and metabolism): curation of the entire genome (http://ecocyc.org)

• Reactome: curated biological pathways (http://www.reactome.org/)

• GenMAPP: pathways contributed by users

Page 27: Other biological databases

http://www.genome.ad.jp/kegg

Different pathways in different species -> comparison

Page 28: Other biological databases

Pathway in Reactome

Page 29: Other biological databases

Example of a pathway in BioCyc

Page 30: Other biological databases

Protein-protein interaction databases

• Protein-protein interaction databases store pairwise interactions or complexes

• Can get 1 to more than 20,000 interactions per publication

• IntAct: http://www.ebi.ac.uk/intact

• DIP (Database of Interacting Proteins): http://dip.doe-mbi.ucla.edu/

• BIND (Biomolecular Interaction Network Database): http://submit.bind.ca:8080/bind/
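
The distinction between pairwise interactions and complexes matters once the data are turned into a network. As a rough sketch, the code below builds an undirected network from pairwise records and expands a complex into all pairs of its members (often called the "matrix" model). The function names and protein identifiers are hypothetical, and none of the databases above is claimed to store or export data in exactly this way.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, List, Set, Tuple


def build_network(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected interaction network as {protein: set of partners}."""
    network = defaultdict(set)
    for a, b in pairs:
        network[a].add(b)
        network[b].add(a)
    return dict(network)


def expand_complex(members: Iterable[str]) -> List[Tuple[str, str]]:
    """Expand a complex into every pair of distinct members ('matrix' model)."""
    return list(combinations(sorted(set(members)), 2))
```

A complex of n proteins expands to n(n-1)/2 pairs, which is one reason a single publication can contribute a very large number of interactions.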

Page 31: Other biological databases

Protein-protein interactions in IntAct

Page 32: Other biological databases

Integrated functional interactions in STRING

Page 33: Other biological databases

Genome browsers

• Integrate sequence & functional data for a genome

• Ensembl: genome browser for major eukaryotic genomes, e.g. human, mouse etc. (http://www.ensembl.org)

• UCSC browser (http://genome.ucsc.edu/)

• FlyBase: Drosophila genome database (http://www.ebi.ac.uk/flybase)

• WormBase: C. elegans (http://www.wormbase.org)

• PlasmoDB: Plasmodium (malaria) (http://plasmodb.org)

• Etc.

Page 34: Other biological databases

Ensembl genome browser

Page 35: Other biological databases

Ensembl gene view 1

Page 36: Other biological databases

Ensembl gene view 2

Page 37: Other biological databases

Gene within context on chromosome

Page 38: Other biological databases

Human genetics databases

• GeneCards (http://www.genecards.org/)

• HapMap (http://hapmap.ncbi.nlm.nih.gov/)

• OMIM (http://www.ncbi.nlm.nih.gov/omim)

• HGDP, Human Genome Diversity Project (http://hagsc.org/hgdp/files.html)

Page 39: Other biological databases

Mutation/polymorphism databases

Most of the databases are disease- or gene-centric, e.g. p53

Page 40: Other biological databases

dbSNP (http://www.ncbi.nlm.nih.gov/SNP/)

Repository of all known mutations (human and other organisms)

Page 41: Other biological databases

Where to find the databases

• Table of addresses for major databases and tools

• Nucleic Acids Research Database issue, published in January each year

• Nucleic Acids Research Software issue (new)

• ExPASy list of tools: http://ca.expasy.org/links.html

Page 42: Other biological databases

Large scale data retrieval

• Programmatic access to many databases

• MySQL access to some

• BioMart access: public and private

• FTP sites: large data downloads
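
As one hedged example of programmatic access, the sketch below builds a request to the Ensembl REST service for a single stable identifier and decodes the JSON response. The server address, the `lookup/id` endpoint and the example identifier in the usage note are assumptions based on the public Ensembl REST service as generally documented, so check the current documentation before relying on them. `lookup_url` and `lookup` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

# Assumption: address of the public Ensembl REST server.
ENSEMBL_REST = "https://rest.ensembl.org"


def lookup_url(stable_id: str, server: str = ENSEMBL_REST) -> str:
    """Build the URL for looking up one Ensembl stable ID as JSON."""
    query = urllib.parse.urlencode({"content-type": "application/json"})
    return f"{server}/lookup/id/{urllib.parse.quote(stable_id)}?{query}"


def lookup(stable_id: str) -> dict:
    """Fetch and decode the record for a stable ID (needs network access)."""
    with urllib.request.urlopen(lookup_url(stable_id), timeout=30) as response:
        return json.load(response)
```

Calling `lookup("ENSG00000141510")` would return a dictionary describing that gene if the service is reachable. For bulk retrieval, BioMart or the FTP sites are usually a better fit than issuing one request per identifier.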

Page 43: Other biological databases

Other tutorials

• http://www.ensembl.org/info/website/tutorials/index.html

• http://www.ebi.ac.uk/training/online/

• http://www.ebi.ac.uk/2can/home.html