Clostox: A Clostridium Toxin Database and Phylogeny Viewer for Pathema-Clostridium



Seth Schobel, Susmita Shrivastava, Erin Beck, Lauren Brinkac, Tanja Davidsen, Granger Sutton

J. Craig Venter Institute, 9704 Medical Center Dr. Rockville, MD, 20850 USA

Abstract

Pathema-Clostridium (http://pathema.jcvi.org) is a clade-specific NIAID Bioinformatics Resource Center designed to support the Clostridium biodefense and infectious disease research community. Pathema-Clostridium targets the detailed curation of Clostridium botulinum and C. perfringens, but also includes all other available sequenced Clostridium genomes. After IBRCC 2007 and extensive polling of the Clostridium research community, it was determined that a comprehensive database of all Clostridium toxin genes would be an effective extension of the Pathema resource. In response to this community feedback we have developed Clostox, a Clostridium toxin database, and integrated it with the current Pathema-Clostridium web resource. A literature search was performed to identify all the toxin genes associated with Clostridium botulinum strains. The protein sequences were downloaded from NCBI, processed through JCVI’s annotation pipeline, and curated. Additionally, each protein was searched with BLAST against all other proteins in the Pathema-Clostridium database to incorporate informative comparative analyses into the Clostox repository. To leverage the addition of Clostox to Pathema, we have developed an integrated tool for viewing the phylogeny of toxin sequences, both from the Clostox database and from a user-specified list. This allows unidentified strains to be classified when their neurotoxin sequence is known. The tool has been expanded to benefit Pathema as a whole by accepting input from the Protein vs. All Alignment page of any gene, as well as from the Gene Cart.
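The classification use case mentioned in the abstract can be illustrated with a much simpler stand-in than a full phylogeny: give an unknown neurotoxin sequence the label of its most similar reference. The sketch below is only an illustration under stated assumptions. It assumes the sequences are already aligned to equal length, the sequences themselves are made-up placeholders rather than real toxin proteins, and the helper names (`percent_identity`, `classify_by_nearest_reference`) are hypothetical. It is not the method Clostox uses, which is based on viewing the tree.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def classify_by_nearest_reference(query: str, references: dict) -> tuple:
    """Return (label, identity) for the reference most similar to the query."""
    if not references:
        raise ValueError("at least one reference sequence is required")
    best = max(references, key=lambda label: percent_identity(query, references[label]))
    return best, percent_identity(query, references[best])


# Placeholder sequences for illustration only (not real toxin sequences).
REFERENCES = {
    "serotype A": "MKLVNQWT",
    "serotype B": "MKIVDQWS",
    "serotype E": "MRIADEWS",
}

label, identity = classify_by_nearest_reference("MKLVNQWS", REFERENCES)
print(label, identity)
```

A nearest-reference lookup like this ignores the tree structure entirely, so it is best read as a quick first pass; a dendrogram shows how confidently a sequence sits inside a serotype cluster.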

Overview

Pathema-Clostridium, as a Bioinformatics Resource Center, is a community research portal designed to suit the needs of its research community. In response to the community’s request for a tool oriented toward toxin research, the J. Craig Venter Institute (JCVI) has created Clostox. Clostox is a web resource available through Pathema-Clostridium. Underlying the web resource is a database containing Clostridium toxin gene annotation data. The set of genes currently available on Clostox consists of Clostridium botulinum toxins from serotypes A, B, C, D, E, F and G, drawn from physiological groups I, II, III and IV. The toxin genes are thus divided into nine serotype groups, one for each serotype found within a physiological group: group I (A, B, F), group II (B, E, F), group III (C, D) and group IV (G). The main advantage of collecting toxin genes in one repository is that JCVI can apply consistent annotations and annotation data types to the entire set of publicly available toxin genes. This consistency allows more informative comparison between toxin annotations, because they share a common set of terms and evidence types. Furthermore, each toxin gene has been run through Pathema’s all-vs-all protein analysis, which allows Clostox to display a pre-computed functional comparison of each toxin gene against all proteins in Pathema-Clostridium. As with all genes found on Pathema, Clostox gene annotations can be updated to include community annotations. In addition to annotation data, several tools integrated into the site allow for tailored toxin research. First, the Clostox Gene Search has been adapted to display gene collections in any of the nine serotype groups. Next, Clostox supports BLAST searches against all of the Clostox protein sequences. Finally, the Clostox Phylogeny Viewer, a multiple sequence alignment and tree drawing tool, has been integrated into Clostox. This tool allows for phylogenetic comparisons of the toxin protein sequences.

The Clostox Phylogeny Viewer has three gateways and currently uses ClustalW to generate the multiple sequence alignment. The main gateway is a standalone page that lets the user choose all, one, or any combination of serotype groups and draws a rooted dendrogram. Similarly, the user may provide their own protein sequences to generate a tree together with the sequences from any or all of the nine serotype groups. Multiple sequence alignments are also available through the Protein vs. All link, accessible from the gene page for each toxin; this is the second gateway. The Gene Cart is the last gateway to the phylogeny tool. With this approach a user can place all the genes in which they are interested into their cart and generate a tree from just those protein sequences. As with all data and tools on Pathema-Clostridium, users have the option to download all gene and protein sequences, annotation data, and analysis results associated with the Clostox resource.
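The main gateway described in the overview (choose serotype groups, then draw a rooted dendrogram) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the viewer's implementation. The `SEROTYPE_GROUPS` table restates the nine serotype groups listed in the overview, but the sequences are made-up, pre-aligned placeholders, and a simple UPGMA clustering on p-distances stands in for the ClustalW alignment and tree step the viewer uses. All function names are hypothetical.

```python
from itertools import combinations

# Physiological group -> serotypes, as listed in the overview (nine serotype groups).
SEROTYPE_GROUPS = {
    "I": ("A", "B", "F"),
    "II": ("B", "E", "F"),
    "III": ("C", "D"),
    "IV": ("G",),
}


def serotype_group_labels(groups=None):
    """Labels such as 'I-A' for the chosen physiological groups (all if None)."""
    chosen = SEROTYPE_GROUPS if groups is None else groups
    return [f"{g}-{s}" for g in SEROTYPE_GROUPS if g in chosen for s in SEROTYPE_GROUPS[g]]


def p_distance(a, b):
    """Fraction of differing positions between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma_newick(seqs):
    """Cluster aligned sequences with UPGMA; return a rooted Newick topology."""
    if not seqs:
        raise ValueError("at least one sequence is required")
    clusters = {name: 1 for name in seqs}  # cluster name -> number of leaves
    dist = {frozenset(p): p_distance(seqs[p[0]], seqs[p[1]])
            for p in combinations(seqs, 2)}
    while len(clusters) > 1:
        # Closest pair first; ties are broken by name so the output is deterministic.
        pair = min(dist, key=lambda k: (dist[k], sorted(k)))
        a, b = sorted(pair)
        merged = f"({a},{b})"
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        del dist[pair]
        for other in clusters:
            d_a = dist.pop(frozenset((a, other)))
            d_b = dist.pop(frozenset((b, other)))
            # UPGMA: size-weighted average of the two old distances.
            dist[frozenset((merged, other))] = (size_a * d_a + size_b * d_b) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters)) + ";"


# Placeholder aligned sequences for illustration only (not real toxin sequences).
TOY_ALIGNMENT = {
    "I-A": "MKLVNQ",
    "I-B": "MKLVNE",
    "III-C": "MRIVDE",
    "III-D": "MRIVDQ",
}

wanted = serotype_group_labels({"I", "III"})
selected = {name: seq for name, seq in TOY_ALIGNMENT.items() if name in wanted}
print(upgma_newick(selected))
```

User-supplied sequences fit the same shape: adding an entry to `selected` before calling `upgma_newick` mirrors the option of combining one's own proteins with any of the serotype groups, provided the new sequence is aligned to the same length.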

Acknowledgments

We wish to acknowledge the efforts of Sean Daugherty from the University of Maryland School of Medicine. This project is funded by the National Institute of Allergy and Infectious Diseases (NIH-NIAID-DMID-04-34).

Future Development

In the coming months JCVI plans to expand the Clostox database by adding toxin genes from Clostridium perfringens and Clostridium butyricum. In addition to these toxin genes, neurotoxin-associated proteins (NAPs) will be added to the database across the entire Clostridium clade. Routine updates will reflect changes to current Clostox annotation and add further toxin and NAP genes as they are released publicly. The Clostox web resource is slated for two search enhancements: a synonymous accessions search, available from the Clostox Toxin Search, and a strain search. The phylogeny viewer will be enhanced with the option to use T-Coffee, in addition to ClustalW, to generate the alignments used for tree generation.

Contact Us

We are actively soliciting feedback from the community. Please e-mail [email protected] if you have any suggestions or data corrections.

Clostox Tools:
• Toxin Search
• BLAST
• Phylogeny Viewer