GenBank Overview: GenBank® is the NIH genetic sequence database, an annotated collection of all...


Upload: sharlene-stewart

Post on 02-Jan-2016


Page 1: GenBank Overview GenBank ® is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. There are approximately

Session One: Bioinformatics

Compiled by: Masoud Rasoulabadi


GenBank Overview

• GenBank® is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences.

• There are approximately 106,533,156,756 bases in 108,431,692 sequence records in the traditional GenBank divisions and 148,165,117,763 bases in 48,443,067 sequence records in the WGS division* as of August 2009.

* Whole Genome Shotgun (WGS): Large sets of contigs, or finished sequences without annotation, from an ongoing genome project can be submitted to DDBJ/EMBL-Bank/GenBank as WGS data.
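The figures quoted above can be put in perspective with a little arithmetic. The sketch below only reuses the August 2009 numbers from the slide; the function name `mean_record_length` is illustrative and not part of any NCBI tool.

```python
# Figures quoted on the slide for August 2009: (bases, sequence records).
TRADITIONAL = (106_533_156_756, 108_431_692)
WGS = (148_165_117_763, 48_443_067)


def mean_record_length(bases: int, records: int) -> float:
    """Return the average number of bases per sequence record."""
    return bases / records


# Combined size of the traditional and WGS divisions.
total_bases = TRADITIONAL[0] + WGS[0]
total_records = TRADITIONAL[1] + WGS[1]

print(f"Total bases:   {total_bases:,}")
print(f"Total records: {total_records:,}")
print(f"Mean length, traditional divisions: {mean_record_length(*TRADITIONAL):.1f} bases")
print(f"Mean length, WGS division:          {mean_record_length(*WGS):.1f} bases")
```

On these numbers the WGS division holds more bases in fewer records, so its average record is roughly three times longer than a record in the traditional divisions.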


GenBank Overview

• GenBank is part of the International Nucleotide Sequence Database Collaboration, which comprises the DNA DataBank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL), and GenBank at NCBI.

• These three organizations exchange data on a daily basis.


Submissions to GenBank

• There are several options for submitting data to GenBank:

1. BankIt, a WWW-based submission tool for convenient and quick submission of sequence data.

2. Sequin, NCBI's stand-alone submission software for Mac, PC, and UNIX platforms, available by FTP. When using Sequin, send the output files for direct submission to GenBank by e-mail.


Submissions to GenBank

3. tbl2asn, a command-line program, automates the creation of sequence records for submission to GenBank using many of the same functions as Sequin. It is used primarily for submission of complete genomes and large batches of sequences.

4. Barcode Submission Tool, a WWW-based tool for the submission of GenBank sequences and trace data for Barcode of Life projects. Currently, only mitochondrial cytochrome c oxidase subunit I (COI) genes are being accepted with this tool.
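Because tbl2asn is a command-line program, batch submissions are usually scripted. The following is a minimal sketch of how a script might assemble a tbl2asn call. It assumes the commonly documented options `-p` (directory containing the input files), `-t` (submission template file), and `-V v` (run the validator); the directory and template names are hypothetical, and the command is only built here, not executed.

```python
import subprocess
from typing import List


def build_tbl2asn_command(input_dir: str, template: str, validate: bool = True) -> List[str]:
    """Assemble a tbl2asn argument list for a directory of input files."""
    cmd = ["tbl2asn", "-p", input_dir, "-t", template]
    if validate:
        # Ask tbl2asn to run its validator on the generated records.
        cmd += ["-V", "v"]
    return cmd


def run_tbl2asn(input_dir: str, template: str) -> None:
    """Run tbl2asn; requires the program to be installed and on PATH."""
    subprocess.run(build_tbl2asn_command(input_dir, template), check=True)


# Hypothetical file names, for illustration only.
print(" ".join(build_tbl2asn_command("submission", "template.sbt")))
```

Keeping the command construction separate from the call that runs it makes the script easy to check before any files are processed.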


Access to GenBank

• There are several ways to search and retrieve data from GenBank:

1. Search GenBank for sequence identifiers and annotations with Entrez Nucleotide, which is split into three divisions: CoreNucleotide (the main collection), dbEST (Expressed Sequence Tags), and dbGSS (Genome Survey Sequences).


Access to GenBank

2. Search and align GenBank sequences to a query sequence using BLAST (Basic Local Alignment Search Tool).

• BLAST searches CoreNucleotide, dbEST, and dbGSS independently.

3. Search, link, and download sequences programmatically using NCBI E-utilities.
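As an illustration of programmatic access, the sketch below builds E-utilities request URLs for searching (`esearch`) and downloading (`efetch`) nucleotide records. It assumes the public E-utilities base URL and the `nuccore` database name; the helper names are hypothetical, and the accession U49845 is used only as an example. No request is sent unless `fetch_text` is called.

```python
from typing import List
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities web service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term: str, db: str = "nuccore", retmax: int = 5) -> str:
    """Build a search URL that returns matching record IDs as JSON."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_url(ids: List[str], db: str = "nuccore", rettype: str = "gb") -> str:
    """Build a download URL for one or more records (GenBank flat file by default)."""
    params = {"db": db, "id": ",".join(ids), "rettype": rettype, "retmode": "text"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


def fetch_text(url: str, timeout: float = 30.0) -> str:
    """Send the request and return the response body (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


print(esearch_url("COI[Gene]"))
print(efetch_url(["U49845"]))
```

Passing `rettype="fasta"` to `efetch_url` would request the sequence in FASTA format instead of the full annotated record.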


GenBank Data Usage

• The GenBank database is designed to provide and encourage access within the scientific community to the most up-to-date and comprehensive DNA sequence information.

• NCBI places no restrictions on the use or distribution of the GenBank data.


New Developments

• NCBI is continuously developing new tools and enhancing existing ones to improve both submission and access to GenBank.

• The easiest ways to keep abreast of these and other developments are:

1. Sign up for the NCBI Announce e-mail list.

2. Read the NCBI News, available via the web.

3. Check the "What's New" section of the NCBI Web page.