biological sequence databases

Post on 11-May-2015

TRANSCRIPT

1

BIOLOGICAL SEQUENCE DATABASES

2

NCBI

What is NCBI?

National Center for Biotechnology Information

Established in 1988

Part of the National Library of Medicine at the National Institutes of Health

Major aim: public databases

Development of software tools for sequence analysis and dissemination of biomedical information

3

Roles of NCBI

1) Maintenance of biological databases, whether primary or secondary, including GenBank

2) Provision of data retrieval systems such as Entrez

3) Provision of computational resources for the analysis of GenBank data and other biological data

4

Kinds of databases

Primary databases

Original submissions by the experimentalists who generated the data

Content is controlled by the submitters

Examples include GenBank, dbSNP and GEO

Secondary databases

Built from primary data retrieved from the primary databases

Content is controlled by a third party (NCBI)

Examples include RefSeq, RefSNP, NCBI Structure, Protein, etc.

5

NCBI homepage

6

NCBI

TOOLS

BLAST: standard BLAST, MegaBLAST, PSI-BLAST, PHI-BLAST, RPS-BLAST, BLAST 2 Sequences

Database retrieval tool

Specialized tools: ORF Finder, e-PCR, sequence submission tool BankIt, Spidey

DATABASES

Nucleotide database

Literature database

Protein database

Expression database

Structure database

7

Retrieval tool: Entrez

Integrated database search and retrieval system

Provides extensive links between and within database records

Cross references of different databases
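Entrez can also be queried programmatically through NCBI's E-utilities. The sketch below only builds request URLs for a search (ESearch) and for following cross-references between databases (ELink); it does not contact NCBI. The helper names `build_esearch_url` and `build_elink_url`, the query term, and the record ID are illustrative assumptions, not part of the slides.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities, the programmatic interface to Entrez.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=20):
    """Return an ESearch URL that searches one Entrez database for a term."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def build_elink_url(dbfrom, db, uid):
    """Return an ELink URL that follows cross-references between two databases."""
    params = {"dbfrom": dbfrom, "db": db, "id": uid}
    return f"{EUTILS_BASE}/elink.fcgi?{urlencode(params)}"


# Hypothetical query and record ID, used only to show the URL shape.
search_url = build_esearch_url("nucleotide", "BRCA1[Gene Name]", retmax=5)
link_url = build_elink_url("protein", "nuccore", "12345")
```

Fetching either URL with any HTTP client would return the matching record IDs or linked records; that step is left out so the sketch stays self-contained.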

8

Sequence submission to NCBI

Databases are constantly updated with new sequence submissions made through sequence submission tools such as:

BankIt

Sequin

9

BankIt

Web-based sequence submission tool

Connect to the NCBI home page

Follow the GenBank link in the sidebar at the left

Tool of choice for simple submissions

Can also be used to update previously submitted information

10

Sequin

Stand-alone sequence submission and updating tool

Handles multiple sequence submissions

Provides increased capacity for long sequence submissions

Multiple annotations

Phylogenetic and population studies

11

BLAST

Basic Local Alignment Search Tool

Performs sequence similarity searches against a variety of sequence databases

UniGene, Gene, MMDB, GEO
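To make "local alignment" concrete, here is a minimal sketch of the Smith-Waterman local alignment score. This is not BLAST's algorithm: BLAST uses a faster seed-and-extend heuristic, while Smith-Waterman is the exact dynamic-programming method it approximates. The scoring values (match 2, mismatch -1, gap -2) are arbitrary assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared "ACGT" core scores 4 matches x 2 = 8; the flanks do not align.
score = smith_waterman_score("TTACGTTT", "GGACGTGG")
```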

12

Kinds of BLAST

blastn, blastp, blastx, tblastn, tblastx
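The five programs differ in the type of query and database they compare. The lookup below is a small sketch of that relationship; the table name `BLAST_PROGRAMS` and the helper `candidate_programs` are illustrative and not part of any NCBI software.

```python
# Program -> (query type, database type).
# "translated nucleotide" means the nucleotide sequence is translated
# in all six reading frames before the comparison at the protein level.
BLAST_PROGRAMS = {
    "blastn": ("nucleotide", "nucleotide"),
    "blastp": ("protein", "protein"),
    "blastx": ("translated nucleotide", "protein"),
    "tblastn": ("protein", "translated nucleotide"),
    "tblastx": ("translated nucleotide", "translated nucleotide"),
}


def candidate_programs(query_molecule, database_molecule):
    """List programs usable for the given molecule types ('nucleotide' or 'protein')."""
    return [
        name
        for name, (query, database) in BLAST_PROGRAMS.items()
        if query.endswith(query_molecule) and database.endswith(database_molecule)
    ]
```

For example, a nucleotide query against a protein database leaves only blastx, while a nucleotide query against a nucleotide database can use either blastn or tblastx.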

13

SPECIALIZED TOOLS

Several sequence analysis tools are available; they are explained in the following slides:

1) ORF Finder

2) e-PCR

3) Spidey

14

ORF Finder

Open reading frame finder

Graphical analysis tool

Finds all open reading frames in the user's sequence or in a sequence already submitted to the databases

Uses standard and alternative genetic codes for the analysis of reading frames

Packaged with Sequin
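The idea of scanning reading frames can be sketched in a few lines. This is an illustrative simplification, not the ORF Finder implementation: it uses only the standard genetic code (ATG start; TAA, TAG, TGA stops), checks all six frames, and reports ORFs that run from a start codon to the first in-frame stop. The function name `find_orfs` and the `min_codons` threshold are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, frame, start, end, orf) tuples for ORFs in all six frames.

    start and end are 0-based, end-exclusive offsets on the scanned strand.
    min_codons counts every codon in the ORF, including the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, dna in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(dna) - 2, 3):
                codon = dna[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos  # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (pos + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, pos + 3, dna[start:pos + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs


orfs = find_orfs("CCATGAAATAGCC")
```

Supporting an alternative genetic code would mean swapping in a different set of start and stop codons.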

15

e-PCR

Electronic polymerase chain reaction

Searches for STSs (sequence-tagged sites)

The whole template DNA is searched for STSs

The newer database search mode searches a query sequence against a sequence database
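An STS is typically defined by a primer pair and an expected product size, so the search can be sketched as locating both primer sites on a template at a suitable distance. The code below is a simplified illustration with exact matching only, not the actual e-PCR program; the function name `find_sts_hits` and the example sequences are made up.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sts_hits(template, forward, reverse, min_size, max_size):
    """Return (start, end) spans on the template that the primer pair would amplify.

    Product size is counted from the first base of the forward primer
    to the last base of the reverse primer's binding site.
    """
    template = template.upper()
    forward = forward.upper()
    # The reverse primer binds the opposite strand, so on the template
    # strand its site appears as the reverse complement.
    reverse_site = reverse_complement(reverse.upper())
    hits = []
    start = template.find(forward)
    while start != -1:
        site = template.find(reverse_site, start + len(forward))
        while site != -1:
            end = site + len(reverse_site)
            if min_size <= end - start <= max_size:
                hits.append((start, end))
            site = template.find(reverse_site, site + 1)
        start = template.find(forward, start + 1)
    return hits


# Hypothetical template and primers.
template = "TTTACGTAC" + "G" * 10 + "CCATGGAATT"
hits = find_sts_hits(template, "ACGTAC", "TTCCATGG", min_size=20, max_size=30)
```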

16

Spidey

An mRNA-to-genome alignment tool

Searches databases via BLAST

Takes as input a single genomic sequence and one or more mRNA sequences in FASTA format

Pseudogenes and paralogues are eliminated in this search and the true gene is selected.
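Since the mRNA input is given in FASTA format, a minimal FASTA reader may help show what that input looks like. This is a sketch under simple assumptions (each record starts with a `>` header line followed by sequence lines); the record names in the example are hypothetical.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are joined without separators
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


example = ">mrna1 hypothetical transcript\nATGC\nGGTA\n>mrna2\nTTAA\n"
records = parse_fasta(example)
```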

17

Databases of NCBI

Nucleotide

Literature

Protein

Gene expression

Structure

Chemical

18

Nucleotide database: GenBank

NCBI's primary sequence database

Comprehensive public database of nucleotide sequences

Bibliographic support

Built from authors' submissions to GenBank, including ESTs

GenBank and EMBL are partners in the INSDC (International Nucleotide Sequence Database Collaboration), which also includes DDBJ

Collaborative approach: data are shared daily

19

HomoloGene

Automated detection of homologues

Covers genes of completely sequenced eukaryotic genomes

Compares the proteins of the input organisms using blastp

Taxonomic trees are built

Statistical analysis of each match is done, and orthologs and paralogs are identified

20

dbSNP

Database of single nucleotide polymorphisms

Also holds short deletion and insertion polymorphisms

SNPs are linked to 3D structures via Cn3D and MMDB

Functional variants can be matched with OMIM

21

Literature database: PMC

PubMed Central

Digital archive of peer-reviewed life sciences journals

Holds a very large collection of full-text journal articles

Full text is available immediately or within 12 months of publication

22

Protein database

Entrez Protein: the protein sequence database of NCBI

Databases such as PDB and Swiss-Prot are cross-searched

Taxonomic relations

CDD: Conserved Domain Database

23

Gene expression database

Distribution and regulation of transcriptional products

Normal and abnormal cell types

Many techniques have been developed to survey genome-wide transcript expression

24

SAGEmap

Serial Analysis of Gene Expression map

Gene expression data analysis

The tag-to-gene function maps SAGE tags to gene clusters or to a single gene

A reciprocal gene-to-tag map is also available

Updated weekly
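The relationship between the tag-to-gene map and its reciprocal gene-to-tag map can be sketched as inverting a dictionary. The tags and gene names below are invented for illustration and do not come from SAGEmap.

```python
from collections import defaultdict


def invert_tag_map(tag_to_genes):
    """Build the reciprocal gene-to-tag map from a tag-to-gene map."""
    gene_to_tags = defaultdict(list)
    for tag, genes in tag_to_genes.items():
        for gene in genes:
            gene_to_tags[gene].append(tag)
    return dict(gene_to_tags)


# Hypothetical 10-base SAGE tags; one tag may match more than one gene.
tag_to_genes = {
    "AAACCCGGGT": ["geneA"],
    "TTTGGGCCCA": ["geneA", "geneB"],
}
gene_to_tags = invert_tag_map(tag_to_genes)
```

A tag that maps to several genes, as the second one does here, is the reason the slides speak of gene clusters as well as single genes.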

25

Structural database: MMDB

Molecular Modeling Database (MMDB)

3D macromolecular structures

X-ray diffraction (XRD) and NMR are used for experimental structure determination

Evolutionary history of function

Relationships between macromolecules

26

27

28

29

DATABASES

30

Chemical database: PubChem

Database of chemical molecules

Freely accessible through a web user interface

Chemical structures

Diagnostic and therapeutic agents

Molecular mass below 2000 u

Bridge between macromolecular genomics and the small organic molecules of cellular metabolism
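As a worked check of the mass limit, the sketch below computes the molecular mass of aspirin (C9H8O4, the compound shown later in these slides) from standard atomic weights. The helper `molecular_mass` and the small weight table are assumptions for illustration; the parser handles only simple formulas without parentheses or charges.

```python
import re

# Standard atomic weights in u for a few common elements; extend as needed.
ATOMIC_WEIGHTS = {
    "C": 12.011, "H": 1.008, "N": 14.007,
    "O": 15.999, "S": 32.06, "P": 30.974,
}


def molecular_mass(formula):
    """Return the molecular mass in u for a simple formula such as 'C9H8O4'."""
    total = 0.0
    # Each match is an element symbol followed by an optional count.
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHTS[element] * (int(count) if count else 1)
    return total


aspirin_mass = molecular_mass("C9H8O4")  # about 180.16 u
is_small_molecule = aspirin_mass < 2000
```

At roughly 180 u, aspirin falls well below the 2000 u limit quoted on the slide.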

31

32

Display settings

33

Aspirin

34

35

Thanks
