Databases
Protein Structure and Bioinformatics Group
7 Oct 2016 2
Purpose of the lecture
● provide an overview of available databases
● what are they for?
● the contents of the most important databases
● how to query these databases
● make you aware of drawbacks and pitfalls
Overview
● intro on databases
● database models
● overview of biological databases
● details of often used databases and/or providers
● some remarks on data quality
Why databases?
● Exponential growth of:
– sequences
– structures
– literature
● Need for efficient storage and management tools
● Need for standardization
Solution: databases
● coherent, consistent, designed for special purpose
● data model: clearly defined data structure
● database management system: easy access and management
What is a database
● any organized collection of data
– card filing system
– telephone book
● now: a collection of information organized in such a way that a computer program can quickly select desired pieces of data
● you need: Database Management System (DBMS)
Database models
logical structure of a database
● flat file
● relational model (most used)
● other:
– object-oriented, XML, hierarchical, network
● Database Management Systems (DBMS) include: MySQL, PostgreSQL, SQLite, Microsoft SQL Server, Oracle, SAP, dBASE, FoxPro, IBM DB2, LibreOffice Base and FileMaker Pro
Flat file
● written in plain text, in a standard, defined format
● often tab-delimited or comma-separated text files
● each line is a record
● fields are separated by delimiters: tabs, commas
● searching is sequential only
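The points above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original lecture: the file content, the column names and the helper `find_records` are all hypothetical. It shows that a search in a flat file has to read the records one after another.

```python
import csv
import io

# Hypothetical tab-delimited flat file: one record per line, fields separated
# by tabs. The values are illustrative only.
FLAT_FILE = (
    "gene\torganism\tlength\n"
    "ATM\tHomo sapiens\t3056\n"
    "BTK\tHomo sapiens\t659\n"
)

def find_records(handle, field, value):
    """Scan the file record by record and keep those where field equals value."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if row[field] == value]

# io.StringIO stands in for an open file so the sketch is self-contained.
matches = find_records(io.StringIO(FLAT_FILE), "gene", "BTK")
print(matches)
```

Every lookup walks the whole file, which is why flat files become slow to search as they grow.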
DNA and protein sequences in FASTA format
>gi|71902539|ref|NM_000051.3| Homo sapiens ataxia telangiectasia mutated (ATM), mRNA
CCGGAGCCCGAGCCGAAGGGCGAGCCGCAAACGCTAAGTCGCTGGCCATTGGTGGACATGGCGCAGGCGCGTTTGCTCCGACGGGCCGAATGTTTTGGGGCAGTGTTTTGAGCGCGGAGACCGCGTGATACTGGATGCGCATGGGCATACCGTGCTCTGCGGCTGCTTGGCGTTGCTTCTTCCTCCAGAAGTGGGCGCTGGGCAGTCACGCAGGGTTTGAACCGGAAGCGGGAGTAGGTAGCTGCGTGGCTAACGGAGAAAAGAAGCCGTGGCCGCGGGAGGAGGCGAGAGGAGTCGGGATCTGCGCTGCAGCCACCGCCGCGGTTGATACTACTTTGACCTTCCGAGTGCAGTGACAGTGATGTGTGTTCTGAAATTGTGAACCATGAGTCTAGTACTTAATGATCTGCTTATCTGCTGCCGTCAACTAGAACATGATAGAGCTACAGAACGAAAGAAAGAAGTTGAGAAATTTAAGCGCCTGATTCGAGATCCTGAAACAATTAAACATCTAGATCGGCATTCAGATTCCAAACAAGGAAAATATTTGAATTGGGATG
>gi|71902540|ref|NP_000042.3| serine-protein kinase ATM [Homo sapiens]
MSLVLNDLLICCRQLEHDRATERKKEVEKFKRLIRDPETIKHLDRHSDSKQGKYLNWDAVFRFLQKYIQKETECLRIAKPNVSASTQASRQKKMQEISSLVKYFIKCANRRAPRLKCQELLNYIMDTVKDSSNGAIYGADCSNILLKDILSVRKYWCEISQQQWLELFSVYFRLYLKPSQDVHRVLVARIIHAVTKGCCSQTDGLNSKFLDFFSKAIQCARQEKSSSGLNHILAALTIFLKTLAVNFRIRVCELGDEILPTLLYIWTQHRLNDSLKEVIIELFQLQIYIHHPKGAKTQEKGAYESTKWRSILYNLYDLLVNEISHIGSRGKYSSGFRNIAVKENLIELMADICHQVFNEDTRSLEISQSYTTTQRESSDYSVPCKRKKIELGWEVIKDHLQKSQNDFDLVPWLQIATQLISKYPASLPNCELSPLLMILSQLLPQQRHGERTPYVLRCLTEVALCQDKRSNLESSQKSDLLKLWNKIWCI
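FASTA records like the two above consist of a header line starting with ">" followed by one or more sequence lines. A minimal reader could look like the sketch below. The function name `parse_fasta` is hypothetical, and the example text holds only the first 60 residues of the ATM protein record, split over two lines to show that sequence lines are joined.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

example = """>gi|71902540|ref|NP_000042.3| serine-protein kinase ATM [Homo sapiens]
MSLVLNDLLICCRQLEHDRATERKKEVEKF
KRLIRDPETIKHLDRHSDSKQGKYLNWDAV
"""
records = parse_fasta(example)
for header, sequence in records:
    print(header.split("|")[3], len(sequence))
```

For real work an established library such as Biopython is usually a safer choice than a hand-written parser.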
Relational database
● database is composed of tables
● each table has records (rows)
● each record has fields (columns)
● relational:
– tables hold logically related sets of data
– each record has a unique identifier: primary key
– relations between tables through keys
Relational database
● PK = primary key, unique identifier
● FK = foreign key, connects to primary key in Customer table
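The original diagram is not reproduced here, so the following is only a sketch of the PK/FK idea using Python's built-in `sqlite3` module. The `customer` table name follows the slide; the `orders` table, all column names and the inserted rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# customer_id is the primary key (PK): a unique identifier for each customer.
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
# orders.customer_id is a foreign key (FK) pointing at the PK of customer.
conn.execute(
    """CREATE TABLE orders (
           order_id    INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
           item        TEXT NOT NULL)"""
)
conn.execute("INSERT INTO customer VALUES (1, 'Alice')")
conn.execute("INSERT INTO orders VALUES (10, 1, 'pipette tips')")

# An order that refers to a customer that does not exist violates the FK.
try:
    conn.execute("INSERT INTO orders VALUES (11, 99, 'gloves')")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(fk_rejected, order_count)
```

The foreign key keeps the two tables consistent: an order can only exist for a customer that is present in the customer table.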
Aspects of relational databases
● tables hold logically related sets of data
● order of rows irrelevant (random access!)
● rows are unique: no duplication of information
● searching is specifying what you want:
– which field(s) from which table(s) under which condition(s)
– SQL (Structured Query Language)
● searching speed can be increased by using indexes
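As a sketch of "which field(s) from which table(s) under which condition(s)" and of the effect of an index, the example below again uses Python's `sqlite3` module. The `protein` table, its columns, the index name and the rows are hypothetical and not checked against any real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE protein ("
    "accession TEXT PRIMARY KEY, gene TEXT, organism TEXT, length INTEGER)"
)
# Illustrative rows only.
conn.executemany(
    "INSERT INTO protein VALUES (?, ?, ?, ?)",
    [
        ("ACC0001", "ATM", "Homo sapiens", 3056),
        ("ACC0002", "BTK", "Homo sapiens", 659),
        ("ACC0003", "ATM", "Mus musculus", 3066),
    ],
)

# Which fields (accession, length), from which table (protein),
# under which condition (gene = ?).
query = "SELECT accession, length FROM protein WHERE gene = ? ORDER BY accession"
rows = conn.execute(query, ("ATM",)).fetchall()

# An index on the searched column lets the DBMS avoid scanning every row.
conn.execute("CREATE INDEX idx_protein_gene ON protein (gene)")
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("ATM",)).fetchall()

print(rows)
print(plan[0][-1])  # textual description of how SQLite will run the query
```

The query states what is wanted, not how to find it; the DBMS decides whether to use the index.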
Querying a database with SQL
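The query shown on the original slide is not available in this text, so here is a stand-in sketch, assuming two hypothetical tables `gene` and `variant` linked by a key. It combines both tables with a JOIN and counts the variants per gene; the variant descriptions are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data, for illustration only.
conn.executescript(
    """
    CREATE TABLE gene (
        gene_id INTEGER PRIMARY KEY,
        symbol  TEXT NOT NULL);
    CREATE TABLE variant (
        variant_id  INTEGER PRIMARY KEY,
        gene_id     INTEGER NOT NULL REFERENCES gene(gene_id),
        description TEXT NOT NULL);
    INSERT INTO gene VALUES (1, 'ATM'), (2, 'BTK');
    INSERT INTO variant VALUES
        (1, 1, 'c.100A>G'),
        (2, 2, 'c.200C>T'),
        (3, 2, 'c.300G>A');
    """
)

# Join the tables through the key and count the variants for each gene.
query = """
    SELECT gene.symbol, COUNT(variant.variant_id) AS n_variants
    FROM gene
    JOIN variant ON variant.gene_id = gene.gene_id
    GROUP BY gene.symbol
    ORDER BY gene.symbol
"""
counts = conn.execute(query).fetchall()
print(counts)
```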
How to access databases
● Web-based Graphical User Interfaces (GUI)
– you do not see the underlying database structure
– output defined by host/provider
● File Transfer Protocol (FTP)
– mostly flat files
● Application Programming Interface (API)
– you access the database programmatically through web services (SOAP/REST)
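As one possible illustration of REST-style access, the sketch below builds a request URL for the NCBI E-utilities `efetch` service. The lecture does not name this service, so its choice here is an assumption, as are the helper names `build_efetch_url` and `fetch_fasta`. Only the URL is built when the code runs; the download function needs network access and is not called.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities efetch service.
EUTILS_EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

def build_efetch_url(accession, db="nuccore", rettype="fasta"):
    """Build the URL that asks efetch for one record as plain text."""
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return EUTILS_EFETCH + "?" + urlencode(params)

def fetch_fasta(accession, db="nuccore"):
    """Download a record in FASTA format (requires network access)."""
    with urlopen(build_efetch_url(accession, db)) as response:
        return response.read().decode("utf-8")

url = build_efetch_url("NM_000051.3")
print(url)
```

When calling such services in bulk, check the provider's usage guidelines first, for example on request rates.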
Biological database providers/hosts
● EBI European Bioinformatics Institute
● SIB Swiss Institute of Bioinformatics
● NCBI National Center for Biotechnology Information
● DDBJ DNA Databank of Japan
Classification of biological databases
Primary: hold experimentally derived data
● experimental data repositories
● sequence databases
● structure databases
Classification of biological databases
Secondary: hold information derived from primary databases
● sequence related
● genome related
● structure related
● expression data (RNA, protein)
● pathway information
Experimental data repositories
● Gene Expression Omnibus (GEO)
● ArrayExpress
● European Nucleotide Archive (ENA)
Primary sequence databases
DNA/nucleotide sequences
Ensembl (EBI/Wellcome Trust Sanger Inst.)
GenBank (NCBI)
DNA Data Bank of Japan (DDBJ)
European Nucleotide Archive (EMBL-EBI)
Primary sequence databases
protein sequences
UniProtKB (UniProt Knowledge Base)
– UniProtKB/Swiss-Prot
– UniProtKB/TrEMBL
NCBI Protein
Primary structure databases
Protein Data Bank (PDB)
Nucleic Acid Database
Cambridge Structural Database
Secondary databases
● sequence related
– ProSite
– Pfam
– Enzyme
– REBase (restriction enzymes)
Secondary databases
● genome related
Online Mendelian Inheritance in Man
TRANSFAC (transcription factors)
Secondary databases
● structure related
– DSSP Database of Secondary Structure Assignments
– HSSP Homology-derived Secondary Structure of Proteins
– Dali: comparing protein structures in 3D
Secondary databases
● expression data
– Expression Atlas
– Human Protein Atlas
● pathway related
– KEGG: Kyoto Encyclopedia of Genes and Genomes
Databases on Human Genes and Diseases
● General human genetics databases
e.g. HGMD
● General polymorphism databases
e.g. NCBI SNP (dbSNP)
● Cancer gene and variant databases
e.g. COSMIC, Cancer Genome Atlas
Databases on Human Genes and Diseases
● Gene-, system- or disease-specific databases
– Locus-Specific DataBases, see e.g. HGVS http://www.hgvs.org
– Disease-specific, e.g. IDbases: locus-specific databases for immunodeficiency-causing variations http://structure.bmc.lu.se/idbase/
– System-specific, e.g. GWASCatalog: genome-wide association studies
Databases on Human Genes and Diseases
● Online Mendelian Inheritance in Man
Locus-Specific Databases (LSDBs) list at www.hgvs.org/locus-specific-mutation-databases
IDbases at structure.bmc.lu.se/idbase
BTKbase at LOVD.nl
Nucleic Acids Research
● The NAR online Molecular Biology Database Collection is published in the Database issue each year
● 2016: 1685 listings
● URL: http://www.oxfordjournals.org/nar/database/c/
Wikipedia
URL: http://en.wikipedia.org/wiki/List_of_biological_databases
PubMed
● The access point to medicine-related publications
● PubMed comprises more than 26 million citations for biomedical literature
URL: http://www.ncbi.nlm.nih.gov/pubmed
Some examples
● NCBI
● UniProtKB/Swiss-Prot
● PDB
● Ensembl
NCBI
https://www.ncbi.nlm.nih.gov/
NCBI Genetics & Medicine
NCBI Handbook
NCBI search
NCBI Gene: download settings
NCBI Gene: display settings
NCBI Gene: Genomic regions etc.
NCBI Gene: Reference sequences
Information about the fields in GenBank records can be found at:
● NCBI handbook
● https://www.ncbi.nlm.nih.gov/genbank/samplerecord/
NCBI dbSNP: short genetic variations
UniProt
www.uniprot.org
UniProtKB/Swiss-Prot
Protein Data Bank in Europe (PDBe)
Protein Data Bank (in Japan)
Protein Data Bank
Ensembl
www.ensembl.org
Ensembl variants
KEGG
integrating genomic and chemical information with systems information
KEGG Pathways
Some remarks about data quality
● how up-to-date is the database?
● is the database hand-curated by experts?
● when using data from a database, try to check these points
● be aware that there can always be errors somewhere
Example of checking data
● checking variant descriptions can be done with the Mutalyzer Name Checker tool: https://mutalyzer.nl
● Name Checker takes a complete sequence variant description (e.g. NM_000061.2:c.214A>G)
● the variant description is checked against the HGVS nomenclature rules
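To give a feel for what such a description contains, the sketch below splits the example from the slide into its parts with a regular expression. This is not how Mutalyzer works: the pattern and the function `parse_substitution` are assumptions for illustration, cover only simple coding-DNA substitutions, and check the syntax only, not whether the reference base is actually present in the reference sequence.

```python
import re

# Simplified pattern (an assumption, far from the full HGVS grammar):
# versioned accession, ":c.", a position, reference base ">" new base.
HGVS_SUBSTITUTION = re.compile(
    r"^(?P<reference>[A-Z]{2}_\d+\.\d+)"  # e.g. NM_000061.2
    r":c\.(?P<position>\d+)"              # coding-DNA position
    r"(?P<ref>[ACGT])>(?P<alt>[ACGT])$"   # substitution, e.g. A>G
)

def parse_substitution(description):
    """Return the parts of a simple substitution, or None if it does not match."""
    match = HGVS_SUBSTITUTION.match(description)
    return match.groupdict() if match else None

parsed = parse_substitution("NM_000061.2:c.214A>G")
print(parsed)
```

A description that passes this check can still be wrong, which is why a tool that compares it with the reference sequence is needed.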
Mutalyzer Name Checker
Mutalyzer Name Check result (part)
Thanks
● Protein Structure and Bioinformatics Group
● BMC B13
● [email protected]
● http://structure.bmc.lu.se