• PICODIV will amass a large amount of data
– cultures
– sequences
– environmental data
• Databases
– keep track of data produced
– verify the data
– avoid errors
– make data quickly available to all
• EU requirement
PICODIV databases
Databases
• Taxonomy
• Cultures
• SSU rRNA sequences
• Probes
• Environmental data
• Other ?
– Pigments
– TEM pictures
Web site
Web data interface
Taxonomy
Taxonomy: pigments
Cultures: RCC catalog
Cultures: additional information
Picture
[Figure: pigment spectrum of Synechococcus MAX 42; x-axis: wavelength, 400 to 550; y-axis: 0 to 1]
• Flow cytometry, RFLP
Cultures
• Starter cultures --> Environmental database
• Unialgal --> RCC catalog (not released)
EMBL
Sequence data bases: input
PICODIV: environmental, cultures
Access database
Automatic query
Email as fasta file
• SSU vs LSU
• Full length ?
• Taxonomy ?
VB program
Filter
ARB aligned
- phylogeny (trees)
- probe design
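The filtering step on incoming fasta files (the "Full length ?" question above) can be illustrated with a short sketch. This is not the VB program used by PICODIV; it is a minimal Python illustration, and the function names and the 1200-base threshold are assumptions chosen only for the example.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)

# Assumed cut-off for calling an SSU rRNA sequence "full length".
# The slides do not give a value; adjust to the project's own rule.
MIN_FULL_LENGTH = 1200


def parse_fasta(text: str) -> Iterator[Record]:
    """Yield (header, sequence) pairs from fasta-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_length(
    records: Iterable[Record], min_length: int = MIN_FULL_LENGTH
) -> Tuple[List[Record], List[Record]]:
    """Separate records into (full, partial) lists by sequence length."""
    full: List[Record] = []
    partial: List[Record] = []
    for header, seq in records:
        (full if len(seq) >= min_length else partial).append((header, seq))
    return full, partial
```

A length test alone cannot distinguish SSU from LSU sequences or assign taxonomy; those checks would still depend on the annotation in the source record.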
Sequence data bases: output
Raw sequences
- BLAST
Access database
Full sequences
All sequences
Web: periodic update
• Import files under EMBL format
• Mark all new sequences; aligned: date + person (e.g. 20-jun-2000 DV); pub: n or PICODIV; author: e.g. K Valentin
• Fast align by finding the closest relative with the PT-server SSU_RNA
• Quick add marked species to existing tree (use a sub-tree rather than the full tree)
• If the tree is incorrect, remove the sequence from the tree and align it again to the closest relative (either known or from a BLAST search)
• Save only changes (not whole database)
• Update PT-Server
ARB processing
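The marking convention for new sequences (aligned: date + person, pub, author) could be generated consistently with a small helper. The sketch below is only an illustration in Python; the function names are hypothetical and it does not call ARB itself. The field values would still be entered in ARB by hand or by whatever import route the project uses.

```python
from datetime import date

# Lower-case English month abbreviations, independent of the system locale.
MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")

# Values allowed for the "pub" field according to the slide.
PUB_VALUES = ("n", "PICODIV")


def aligned_stamp(day: date, initials: str) -> str:
    """Format the 'aligned' field as date + person, e.g. '20-jun-2000 DV'."""
    return f"{day.day:02d}-{MONTHS[day.month - 1]}-{day.year} {initials}"


def annotation_fields(day: date, initials: str, pub: str, author: str) -> dict:
    """Build the three annotation fields for a newly aligned sequence."""
    if pub not in PUB_VALUES:
        raise ValueError(f"pub must be one of {PUB_VALUES}, got {pub!r}")
    return {
        "aligned": aligned_stamp(day, initials),
        "pub": pub,
        "author": author,
    }


if __name__ == "__main__":
    # Reproduces the example given on the slide.
    print(aligned_stamp(date(2000, 6, 20), "DV"))  # 20-jun-2000 DV
```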
Novel sequences have not been added to the full tree (tree_all_dec98), except for mitochondrial sequences. Two subtrees have been extracted and new sequences added to them:
Tree name            Method      Sequences   Type of sequences added
tree_all_dec98       Parsimony   13804       mitochondrial
tree_euk_algae       Parsimony   1695        nuclear: only lower eukaryotes
tree_cyano_plastid   Parsimony   341         cyanobacteria and plastids
ARB trees
Probe database
Environmental data bases
• One per site
• Sampling code
• Hydrological and meteorological data
• Sampling information (volumes, protocols, etc.)
• Culture isolation data
• Measurement data
– flow cytometry
– pigments
– TEM
– probes
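One way to picture a per-site environmental database is a sample table keyed by the sampling code, with measurements pointing back to it. The sketch below uses Python's built-in sqlite3 module purely for illustration; the PICODIV databases themselves are Access databases, and every table and column name here is hypothetical.

```python
import sqlite3

# Hypothetical schema: one such database would exist per sampling site.
SCHEMA = """
CREATE TABLE sample (
    sampling_code TEXT PRIMARY KEY,   -- links all other data for the sample
    sampled_on    TEXT,
    volume_l      REAL,
    protocol      TEXT
);
CREATE TABLE measurement (
    id            INTEGER PRIMARY KEY,
    sampling_code TEXT NOT NULL REFERENCES sample(sampling_code),
    kind          TEXT NOT NULL
                  CHECK (kind IN ('flow cytometry', 'pigments', 'TEM', 'probes')),
    value         TEXT
);
"""


def open_site_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the tables and enforce the sampling-code link."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def measurements_for(conn: sqlite3.Connection, sampling_code: str) -> list:
    """Return (kind, value) pairs recorded for one sampling code."""
    rows = conn.execute(
        "SELECT kind, value FROM measurement "
        "WHERE sampling_code = ? ORDER BY id",
        (sampling_code,),
    )
    return rows.fetchall()
```

Enforcing the link in the database means a measurement cannot be stored under a sampling code that was never registered, which is one concrete way to "verify the data" and "avoid errors".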
Cultures
Sequence
Probes
Taxonomy
Environment
Interacting data bases
It is our responsibility to keep PICODIV databases updated for the benefit of all