Comprehensive Microbial Resource: Bioinformatics Visualization Workshop, Owen White, May 30, 2002



Page 1: Comprehensive Microbial Resource  Bioinformatics Visualization Workshop Owen White May 30, 2002

Comprehensive Microbial Resource

www.tigr.org/CMR

Bioinformatics Visualization Workshop

Owen White

May 30, 2002

Page 2

Curation

Genome Annotation: Michelle Gwinn, Bob Dodson, Bob DeBoy, James Kolonay, Bill Nelson, Ramana Madupu, Sean Daugherty, Maureen Beanan, Scott Durkin, Lauren Brinkac

Bioinformatics Engineers: Jeremy Peterson, Lowell Umayam, Samuel Angiuoli

TIGRFAMs/Groups: Dan Haft, Jeremy Selengut

Maria Ermolaeva (Operons/Terminators)

Erik Ferlanti (All vs. All)

Faculty: Jonathan Eisen (DNA repair), Ian Paulsen (transporters), Steven Salzberg

Collaborators: Swiss-Prot, Monica Riley, The open source crowd, Art Delcher (Glimmer)

Page 3

Retrieval

[Figure: Caudal Fins. Labeled shapes: Heterocercal, Forked, Lunate, Emarginate, Truncate, Rounded, Pointed.]

http://web.pdx.edu/~bowersn/bi399/lecture2.html

Page 4

Caudal Fins, Dorsal Spines, Dorsal Rays

Retrieval across data types.

Page 5

Typical annotation datatypes

clone_info: Tracks information related to the parent nucleotide assembly, including its annotation status, the institution from which the sequence was derived, and whether it is part of a larger assembly such as a chromosome.

asm_feature: All major features of the parent assembly are stored here, including annotated genes, predicted genes, repetitive elements, splice sites, and all underlying components of a gene (models, transcript exons, and cds exons).

phys_ev: Attribute for each gene component within the asm_feature table. For example, each predicted and annotated gene has a model and multiple exons stored in the asm_feature table. Linking the feature to phys_ev identifies the type of feature present, i.e., glimmer, genscan+, genemarkHMM, or working (annotation). This becomes important when a single feature in the asm_feature table is shared by multiple model types.

feat_link: This table is key to the principles behind representing gene models in the database. All parent and child relationships are defined here.

evidence: The main repository for all sequence database search results. Also, it retains information regarding gene model attributes such as the best blast match and all Pfam matches.

ident: Stores attributes for the highest element of the gene component hierarchy, the transcriptional unit. Gene names, loci, EC symbols, and other attributes are available.

role_link: The role category assignments for each gene are available here. Roles include examples such as ‘transcription’, ‘DNA synthesis’, ‘translation’, ‘DNA repair’, ‘amino acid metabolism’, etc.
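
The parent/child linking described above can be sketched with a toy in-memory SQLite database. This is only an illustration under stated assumptions: the slides name the tables (asm_feature, phys_ev, feat_link) but not their columns, so every column name below (feat_name, feat_type, ev_type, parent_feat, child_feat) and every row is hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create a toy database with three of the tables named on the slide.

    All column names and rows are illustrative assumptions, not the real schema.
    """
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE asm_feature (feat_name TEXT PRIMARY KEY, feat_type TEXT);
        CREATE TABLE phys_ev (feat_name TEXT, ev_type TEXT);
        CREATE TABLE feat_link (parent_feat TEXT, child_feat TEXT);
    """)
    # One transcriptional unit (TU1) with two competing gene models.
    con.executemany("INSERT INTO asm_feature VALUES (?, ?)", [
        ("TU1", "TU"), ("M1", "model"), ("E1", "exon"), ("M2", "model"),
    ])
    # phys_ev records which method produced each model.
    con.executemany("INSERT INTO phys_ev VALUES (?, ?)", [
        ("M1", "working"), ("M2", "glimmer"),
    ])
    # feat_link holds every parent -> child relationship.
    con.executemany("INSERT INTO feat_link VALUES (?, ?)", [
        ("TU1", "M1"), ("M1", "E1"), ("TU1", "M2"),
    ])
    return con


def models_for(con, tu_name):
    """Return (model name, evidence type) pairs for one transcriptional unit."""
    rows = con.execute("""
        SELECT f.feat_name, p.ev_type
        FROM feat_link AS l
        JOIN asm_feature AS f ON f.feat_name = l.child_feat
        JOIN phys_ev AS p ON p.feat_name = f.feat_name
        WHERE l.parent_feat = ? AND f.feat_type = 'model'
        ORDER BY f.feat_name
    """, (tu_name,))
    return rows.fetchall()


if __name__ == "__main__":
    print(models_for(build_demo_db(), "TU1"))
```

Keeping the hierarchy in a separate link table, rather than as a parent column on each feature, is what allows a single feature row to be shared by several model types.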

Page 6

Omniome Content, Genes

Total # of genes: 132,998 from world-wide effort (43,311 from TIGR projects).

36,274 w/ genetic names.

15,098 genes placed into 5,451 paralogous families.

413 rRNAs.

1,311 tRNAs.

49 sRNAs.

293 IS elements.

Page 7

Omniome Content, Evidence

1,073 distinct EC#s, assigned to 17,308 genes

Rows of allVall data: 3,996,851

Rows of HMM TIGRFAM data: 91,550

Rows of HMM Pfam data: 131,963

Rows of COG data: 149,940

Rows of Interpro data: 175,760

Rows of Prosite data: 53,132

Rows of BER data: 91,899

Page 8
Page 9
Page 10
Page 11

TIGRFAM Matrix

Page 12
Page 13
Page 14

The Genome Browser: Linear Display of DNA Molecules

Page 15

Genome vs. Genome Protein Hits

Page 16

MUMmer: The Whole Genome Alignment Tool

Page 17
Page 18

Role Category Graph

Page 19
Page 20
Page 21
Page 22
Page 23
Page 24

Multi-Genome Query Tool

Query across all genomes based on different properties:

MW, pI, membrane spanning regions

Taxon, Paralogous families, TIGRFAMs, Role Category

Best Match to: organism, locus, kingdom, etc.

“Genes with >5 membrane spanning regions and MW 36,000-51,000 Da.”

“E. coli genes with best match to Archaeoglobus involved in DNA metabolism.”
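
The first example query can be sketched as a filter over gene records. This is a minimal illustration, not the tool's implementation: the Gene record, its field names, the query_genes helper, and the sample loci are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; field names are assumptions for illustration."""
    locus: str
    organism: str
    mol_weight: float    # molecular weight in daltons
    membrane_spans: int  # number of predicted membrane-spanning regions


def query_genes(genes, more_spans_than=None, mw_range=None, organism=None):
    """Return genes that satisfy every criterion that was supplied.

    more_spans_than: keep genes with strictly more spanning regions than this.
    mw_range: (low, high) molecular weight bounds in daltons, inclusive.
    organism: keep only genes from this organism.
    """
    hits = []
    for g in genes:
        if more_spans_than is not None and g.membrane_spans <= more_spans_than:
            continue
        if mw_range is not None and not (mw_range[0] <= g.mol_weight <= mw_range[1]):
            continue
        if organism is not None and g.organism != organism:
            continue
        hits.append(g)
    return hits


# Made-up records spanning two genomes.
GENES = [
    Gene("X1", "Organism A", 42000.0, 3),   # too few spanning regions
    Gene("X2", "Organism A", 45500.0, 8),   # matches both criteria
    Gene("X3", "Organism B", 60250.0, 12),  # too heavy
]

if __name__ == "__main__":
    # "Genes with >5 membrane spanning regions and MW 36,000-51,000 Da."
    for gene in query_genes(GENES, more_spans_than=5, mw_range=(36000, 51000)):
        print(gene.locus, gene.organism)
```

Leaving each criterion optional mirrors the way the slide describes combining properties freely across all genomes.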

Page 25

Pseudo-Restriction Digest and Linear Depiction of Cuts
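
As a rough idea of what a pseudo-restriction digest computes, the sketch below finds every occurrence of a recognition site in a linear sequence, reports the cut positions and fragment lengths, and draws the cuts on a text line. It is an assumption-laden illustration rather than the CMR tool's method: the function names are hypothetical, only the top strand is searched (adequate for palindromic sites), and the sample sequence is made up. The EcoRI site GAATTC, cut after the first G, is used as the example enzyme.

```python
def digest_linear(seq, site, cut_offset):
    """Cut a linear sequence at every occurrence of `site`.

    cut_offset is the number of site bases that stay to the left of the cut.
    Returns (cut_positions, fragment_lengths).
    """
    seq = seq.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)  # allow overlapping sites
    bounds = [0] + cuts + [len(seq)]
    fragments = [b - a for a, b in zip(bounds, bounds[1:])]
    return cuts, fragments


def draw_cuts(seq_len, cuts, width=40):
    """Depict the molecule as a line of '-' with '|' at each scaled cut position."""
    line = ["-"] * width
    for c in cuts:
        line[min(width - 1, c * width // seq_len)] = "|"
    return "".join(line)


if __name__ == "__main__":
    dna = "AAGAATTCTTTTGAATTCAA"  # made-up sequence with two EcoRI sites
    cuts, fragments = digest_linear(dna, "GAATTC", 1)
    print(cuts, fragments)
    print(draw_cuts(len(dna), cuts, width=20))
```

A circular molecule would need the first and last fragments joined, which this linear sketch does not attempt.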

Page 26
Page 27
Page 28

Position effect: