wFleaBase Daphnia Genome Database from Common Components Daphnia Genomic Consortium Meeting, Sept. 2003 Don Gilbert, gilbertd@indiana.edu


wFleaBase Daphnia Genome Database from Common Components

Daphnia Genomic Consortium

Meeting, Sept. 2003

Don Gilbert, gilbertd@indiana.edu

http://iubio.bio.indiana.edu/daphnia

A Replicable Genome infOrmation System ( Argos )

http://eugenes.org/argos | flybase.net/flybase-ng

common/java/ ; perl/ -- program libraries and packages

servers/ -- major programs (BLAST, MySql/PostgreSQL, others)

systems/ -- OS executables of programs

daphnia/, eugenes/, flybase/ -- implemented organism genome systems

docs/ & install/ -- Argos instructions and usage

template/ -- structure for new projects

ROOT/ -- common directory of installed projects
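As a rough illustration of how the template/ folder could seed a new organism project, here is a minimal Python sketch. It is not taken from the Argos install scripts: the function name `new_project` and the sub-folders placed in the demo template are hypothetical, and the real template/ contents are not listed in these slides.

```python
import shutil
import tempfile
from pathlib import Path


def new_project(argos_root: Path, name: str) -> Path:
    """Copy the template/ skeleton to a new project folder under argos_root."""
    template = argos_root / "template"
    target = argos_root / name
    if target.exists():
        raise FileExistsError(f"project already exists: {target}")
    shutil.copytree(template, target)
    return target


def demo() -> list[str]:
    """Build a throwaway Argos-like tree and create a 'daphnia' project in it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical template contents, chosen only for this demo.
        for sub in ("conf", "data", "web"):
            (root / "template" / sub).mkdir(parents=True)
        project = new_project(root, "daphnia")
        return sorted(p.name for p in project.iterdir())
```

Calling `demo()` returns the names of the folders copied into the new project.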

Argos features

Common genome tool set

• Share benefits of “best of breed” genome tools
• Common parts are tested & maintained by others
• Minimal IT expertise (no compiles or system management)
• Choice of tools (existing or new genome DB use parts desired)

Flexible project packages

• Project needs specify tool set (compare EnsEMBL where all use one set)
• Own look’n’feel web pages, contents, functions
• Security for protected and public sections

Easy replication to any Unix computer

• ‘Live’ database system replication using rsync
• Keep remote servers up-to-date every day
• Local cluster/grid for high-volume traffic
• Works on common workstations, laptops
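The rsync-based replication mentioned above might be scripted roughly as follows. This is a sketch under assumptions, not the Argos replication script: the paths and the host name in the usage note are made up, and the flag choice (`-a` archive mode, `-z` compression, `--delete` to drop files removed at the source) is one common way to keep a mirror identical to its source.

```python
import subprocess


def rsync_command(source: str, dest: str, delete: bool = True) -> list[str]:
    """Build an rsync command line that mirrors source into dest."""
    cmd = ["rsync", "-a", "-z"]
    if delete:
        # Remove files on the mirror that no longer exist at the source.
        cmd.append("--delete")
    # A trailing slash on the source copies its contents, not the folder itself.
    cmd += [source.rstrip("/") + "/", dest]
    return cmd


def mirror(source: str, dest: str, dry_run: bool = True) -> subprocess.CompletedProcess:
    """Run the mirror; with dry_run=True rsync only reports what it would do."""
    cmd = rsync_command(source, dest)
    if dry_run:
        cmd.insert(1, "--dry-run")
    return subprocess.run(cmd, check=True)
```

For example, `mirror("/argos/daphnia", "mirror.example.org:/argos/daphnia/")` (both names hypothetical) could be run from a daily scheduled job to keep a remote server up to date.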

Argos common parts

Java common library, Ant builds, XML Tools,

Web Services (Axis), Lucene for “Google”-like searches

Perl common library of BioPerl, GBrowse, others

Servers include

Apache, Tomcat web servers

MySQL, PostgreSQL databases

BLAST (NCBI)

Systems compiled for

apple-powerpc-darwin, intel-linux, sun-sparc-solaris

wFleaBase structure

Cgi-bin -- Web programs (Perl)

Common -- Link to common, shared tools

Conf -- Site configurations for web, data

Data -- Bulk data & FTP site folder

Dbs -- Project databases: blast, lucene, mysql

Indices -- Database indices

Lib -- Program libraries

Web -- Web structure and documents

Genomics, Sequences, Maps, Literature, Stocks, Docs, other

includes Public and Protected (project member only) parts

Webapps -- Web programs (Java)

includes Search system, Secure web and editing
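A replicated copy of this layout could be sanity-checked with a few lines of Python. The sketch below assumes the folders are named in lower case on disk (the slide capitalizes them) and uses a hypothetical helper, `missing_folders`; it is not part of wFleaBase.

```python
import tempfile
from pathlib import Path

# Folder names from the slide, assumed to be lower case on disk.
EXPECTED = ("cgi-bin", "common", "conf", "data", "dbs",
            "indices", "lib", "web", "webapps")


def missing_folders(project: Path) -> list[str]:
    """Return the expected project folders that are absent, in slide order."""
    # is_dir() follows symbolic links, so a linked common/ folder counts as present.
    return [name for name in EXPECTED if not (project / name).is_dir()]


def demo() -> list[str]:
    """Check a deliberately incomplete throwaway project folder."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        for name in ("conf", "data", "web"):
            (project / name).mkdir()
        return missing_folders(project)
```

An empty list from `missing_folders` means every folder named on the slide is present.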

[Screenshots: Search wFleaBase; BLAST wFleaBase; Edit wFleaBase]

Where to put Daphnia Genome?

Database needs

• Automated annotation and curated updates
• Search and retrieve data subsets

Choices

• EnsEMBL - working now; Gramene & others use
• GMOD:Chado - in development (FlyBase, WormBase, ChlamyGenome, TIGR, others will use)
• Other choices?

Generic Model Organism Database Construction Set

• Genome+ Database (more than annotations)
• Genome visualization tools
• Genome annotation pipeline planned
• Literature curation and Gene Ontology tools
• Component system (pick and choose)
• Developing - more complete in 2004

www.gmod.org

EnsEMBL Genome Database

• Genome annotation database
• Genome visualization tools
• Genome annotation pipeline
• Comprehensive system (all or none)
• Production - useable now

www.ensembl.org

From Shawn Hoon, Fugu Informatics Group

wFleaBase issues

• Basic web system ready for genome data?

• Start with EnsEMBL for management; move to GMOD:Chado if better choice?

• Add GMOD GBrowse; Apollo Editor with genome

• Add “Self-service” database features for?
  • Easy management by scientists
  • Genome data; stocks; research literature

• Add evolutionary, ecological, environmental data

Prototype at http://iubio.bio.indiana.edu/daphnia/

[Screenshots: GBrowse Maps; Apollo Annotator]