
Post on 20-Jan-2016


GMOD in the Cloud

Genome Informatics

November 3, 2011

Scott Cain

GMOD Project Coordinator

Ontario Institute for Cancer Research

scott@scottcain.net

Introduction: GMOD is …

• A set of interoperable open-source software components for visualizing, annotating, and managing biological data.

• An active community of developers and users asking diverse questions, and facing common challenges, with their biological data.

Who uses GMOD?

Plus hundreds of others

GMOD in the Cloud

What GMOD in the cloud isn't:

Clouds

Guy getting blown up

Garry's MOD (aka gmod.com)

Several GMOD Cloud Projects

Galaxy - Web-based platform for data intensive biomedical research

CloVR - Automated and portable sequence analysis

GBrowse2 - Web-based, scalable genome browser

cloud.gmod.org - Several integrated GMOD tools

http://gmod.org/wiki/Cloud

Galaxy Cloudman

Get Galaxy without the data or usage limitations.

Combine with Cloud BioLinux to have access to MANY tools.

Create an analysis cluster in minutes.

Use autoscaling to get good performance at low cost.

http://wiki.g2.bx.psu.edu/Admin/Cloud

Deploying Galaxy cluster on AWS

[Figure: deployment steps 1 to 4; screenshots not preserved in this transcript]

Exercising elasticity with autoscaling

Fixed cluster size

5 nodes: computation time 9 hrs, computation cost $20

20 nodes: computation time 6 hrs, computation cost $50

Dynamic cluster size

1 to 16 nodes: computation time 6 hrs, computation cost $20
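
The comparison on this slide can be checked with a few lines of Python. This is only an illustrative sketch: the `Run` class and `best_tradeoff` helper are hypothetical names, and the pairing of figures (5 nodes with 9 hrs and $20, 20 nodes with 6 hrs and $50, 1 to 16 nodes with 6 hrs and $20) is an assumption read off the slide layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    label: str       # "fixed" or "dynamic" cluster size
    nodes: str       # cluster size as stated on the slide
    hours: float     # computation time in hours
    cost_usd: float  # computation cost in US dollars


# Figures as read from the slide (pairing of time and cost is an assumption).
RUNS = [
    Run("fixed", "5", 9, 20),
    Run("fixed", "20", 6, 50),
    Run("dynamic", "1 to 16", 6, 20),
]


def best_tradeoff(runs):
    """Return the runs that no other run beats on both time and cost."""
    def dominated(r):
        return any(
            o.hours <= r.hours
            and o.cost_usd <= r.cost_usd
            and (o.hours < r.hours or o.cost_usd < r.cost_usd)
            for o in runs
        )
    return [r for r in runs if not dominated(r)]


for run in best_tradeoff(RUNS):
    print(f"{run.label} cluster, {run.nodes} nodes: {run.hours} hrs, ${run.cost_usd}")
```

Under these figures only the dynamic cluster survives the filter: it matches the 20-node cluster on time and the 5-node cluster on cost.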

CloVR

Cloud Virtual Resource.

Automated pipeline for sequence analysis.

Uses 2 GMOD tools: Workflow and Ergatis.

Use a virtual machine locally to interact with resources in the cloud.

http://clovr.org/

CloVR Architecture

Why the virtual machine?

The pipeline is run from the local machine, while the heavy lifting is done in the cloud or on a cluster.

GBrowse2

Installed and configured recent release of GBrowse2.

Tools for adding rendering servers automatically.

Ability to add standard data sets.

http://gmod.org/wiki/GBrowse

GBrowse2

[Diagram: "GBrowse2 in the Cloud". Labels: Master, Render slaves, Amazon snapshots, Yeast, Fly, Worm, Human]


cloud.gmod.org

GMOD tools preinstalled:

Tripal: Drupal-based web frontend

Chado: Generic organism DB schema

GBrowse: Venerable genome browser

JBrowse: Fast, AJAX genome browser

Sample data: Saccharomyces cerevisiae

Can be run as a micro machine (albeit slowly)

A little more on Tripal

Based on the popular CMS Drupal.

Several modules written to serve as an interface for Chado:

Controlled Vocabularies

Features

Analyses

Libraries

Stocks

Integrated job management


Potential use case for Cloud GMOD

Community annotation: Just add a web-start Apollo and set the security group to allow it to connect to the database.

When WebApollo is ready, it's even easier: WebApollo is an add-on to JBrowse that allows collaborative editing.

Tripal and Drupal allow editing of most data types in Chado, and commenting on pages similar to a blog.
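
As a rough illustration of the "set the security group" step for Apollo, the sketch below builds an EC2 ingress rule that opens the database port to a range of annotator addresses. Several things here are assumptions rather than part of the talk: that Chado is served by PostgreSQL on its default port 5432, that annotators connect over IPv4, and the helper names `db_ingress_rule` and `open_database_port`. The group ID and address range are placeholders.

```python
import ipaddress

POSTGRES_PORT = 5432  # assumption: Chado served by PostgreSQL on its default port


def db_ingress_rule(annotator_cidr, port=POSTGRES_PORT):
    """Build an EC2 IpPermissions entry allowing TCP access to the database port."""
    # Validate the CIDR early; raises ValueError on malformed input.
    network = ipaddress.ip_network(annotator_cidr, strict=True)
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": str(network), "Description": "Apollo annotators"}],
    }


def open_database_port(group_id, annotator_cidr):
    """Apply the rule with boto3 (needs AWS credentials; not called in this sketch)."""
    import boto3  # imported lazily so the rule builder works without boto3

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=[db_ingress_rule(annotator_cidr)],
    )


# 203.0.113.0/24 is a documentation-only address range used as a placeholder.
rule = db_ingress_rule("203.0.113.0/24")
print(rule)
```

Restricting the rule to a known address range, instead of opening the port to every address, keeps the database reachable only by the annotators who need it.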

Why use the cloud?

Avoid installation related issues (saves you time and frustration!)

Save money (how much, of course, depends)

Availability of common genomic data sets (several projects already make these available at AWS)

Future work

Get GBrowse2 AMI public (very soon)

Add Apollo to cloud.gmod.org (relatively soon)

Add WebApollo to cloud.gmod.org (as soon as it's released)

Conclusion

http://gmod.org/wiki/Cloud for more information on GMOD work in the cloud.

http://cloud.gmod.org/ for a running example of cloud.gmod.org.

http://clovr.org/ for more info on CloVR and to download the client VM.

http://getgalaxy.org/ for more information on getting Cloudman.

Acknowledgements

• Funding agencies: NIH, USDA ARS, NSF, Ontario Ministry of Economic Development and Innovation

• Lincoln Stein, Chris Vandevelde

• Enis Afgan and the Galaxy Team

• Sam Angiuoli et al. at UofM SOM

• Stephen Ficklin and the Tripal group

• Mitch Skinner and JBrowse developers

• The rest of the GMOD community
