Creating a Community Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis (a.k.a. CAMERA) Invited Talk Honoring David Kingsbury Gordon and Betty Moore Foundation Palo Alto, CA March 18, 2009 Dr. Larry Smarr Director, California Institute for Telecommunications and Information Technology Harry E. Gruber Professor, Dept. of Computer Science and Engineering Jacobs School of Engineering, UCSD


Page 1:

Creating a Community Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis (a.k.a. CAMERA)

Invited Talk Honoring David Kingsbury

Gordon and Betty Moore Foundation

Palo Alto, CA

March 18, 2009

Dr. Larry Smarr

Director, California Institute for Telecommunications and Information Technology

Harry E. Gruber Professor,

Dept. of Computer Science and Engineering

Jacobs School of Engineering, UCSD

Page 2:

The Beards are Still Working Together Two Decades Later!

David Kingsbury and John Wooley, NSF 1987

Larry Smarr, NCSA 1985

Page 3:

PI Larry Smarr

David Kingsbury Call to LS: July 31, 2005

Grant Announced: January 17, 2006

Page 4:

The Moore Foundation Was an Early Funder As The National Consensus Emerged

“The emerging field of metagenomics, where the DNA of entire communities of microbes is studied simultaneously, presents the greatest opportunity -- perhaps since the invention of the microscope -- to revolutionize understanding of the microbial world.”

National Research Council, March 27, 2007

NRC Report: Metagenomic data should be made publicly available in international archives as rapidly as possible.

Page 5:

Calit2 Microbial Metagenomics Cluster - Next Generation Optically Linked Science Data Server

512 Processors, ~5 Teraflops

~200 Terabytes Storage

1GbE and 10GbE Switched/Routed Core

~200TB Sun X4500 Storage

10GbE

Source: Phil Papadopoulos, SDSC, Calit2

Page 6:

CAMERA Timeline, 2006 to 2009

Milestones shown: Start of CAMERA; Availability of GOS Data (0.7); CAMERA 1.0; CAMERA 1.2.6; CAMERA 1.3.2.28; Alpha Preview of CAMERA 2.0; CAMERA 2.0 Beta; CAMERA 2.0

Source: Jeff Grethe, NCMIR, CAMERA, UCSD

Page 7:

Marine Genome Sequencing Project – CAMERA Anchor Dataset Launched March 13, 2007

Measuring the Genetic Diversity of Ocean Microbes

Specify Ocean Data

Each Sample ~2000 Microbial Species

Page 8:

Moore Foundation Enabled Full Genome Sequencing of 155+ Marine Microbes

www.moore.org/microgenome

Page 9:

CAMERA Houses the Community’s Expanding Environmental Metagenomics Datasets

Rapidly Expanding to Include New Community Datasets

Now Releasing An Additional Dataset Per Week!

March 16, 2008

Page 10:

CAMERA Timeline, 2006 to 2009

Milestones shown: Start of CAMERA; Availability of GOS Data (0.7); CAMERA 1.0; CAMERA 1.2.6; CAMERA 1.3.2.28; Alpha Preview of CAMERA 2.0; CAMERA 2.0 Beta; CAMERA 2.0

Source: Jeff Grethe, NCMIR, CAMERA, UCSD

Page 11:

Current CAMERA Interface, March 17, 2009

Page 12:

The CAMERA Project Has Established a Global Marine Microbial Metagenomics Cyber-Community

2700 Registered Users From 76 Countries

Page 13:

Building the Metagenomics Community Through Annual Meetings

Page 14:

Prototyping Next Generation User Access and Analysis - Between Calit2 and U Washington

Ginger Armbrust’s Diatoms: Micrographs, Chromosomes, Genetic Assembly

Photo Credit: Alan Decker, Feb. 29, 2008

iHDTV: 1500 Mbits/sec Calit2 to UW Research Channel Over NLR

The Disease is Spreading!
• c.f. Dave Karl, Hawaii
• Ed DeLong, MIT

Page 15:

CAMERA Timeline, 2006 to 2009

Milestones shown: Start of CAMERA; Availability of GOS Data (0.7); CAMERA 1.0; CAMERA 1.2.6; CAMERA 1.3.2.28; Alpha Preview of CAMERA 2.0; CAMERA 2.0 Beta; CAMERA 2.0

Source: Jeff Grethe, NCMIR, CAMERA, UCSD

Page 16:

Calit2 is Creating CAMERA 2.0 -- Advanced Cyberinfrastructure Service Oriented Architecture

Source: CAMERA CTO Mark Ellisman

Page 17:

CAMERA Is a Contributing Member of the Genomic Standards Consortium

• Standardizing Contextual Metadata
  – Members from EU, UK, US
• Goals are to Promote
  – Standardization of Genomic Descriptions
  – Exchange & Integration of Genomic Data
• Metadata Standardization Key Enabler
  – MIMS: Min Info for Metagenomic Sample
  – GCDML: Standard format
• NSF Research Coordination Network for Genomic Standards Consortium (John Wooley = PI)
  – Allows Calit2 to Support Genomic and Metagenomic Standards
  – Extends the GSC to Broader Biocommunity
  – Provides Through CAMERA Another Channel for GBMF Investigators and CAMERA to be Central to Community Dialogue

Source: Paul Gilna, John Wooley, Calit2

Page 18:

GBMF Data Acquisition Pipeline: A New Data Submission Paradigm - Metadata First!

• Investigator submits proposal to GBMF
• Investigator submits metadata to CAMERA
• CAMERA sends acknowledgement to Investigator, Seq. Group, GBMF
• Seq. Group send barcoded sample “kit” to investigators
• Sample is Received and Sequenced
• Seq. Group Upload data to CAMERA (& Investigator)
• Data & Metadata Released in six months

Metadata now collected before sequence data: GSC-compliant

Project-ID serves as acceptance-proof

Solexa and SOLiD Next!

Webb Miller and Stephan C. Schuster, and Roche / 454 Genome Sequencer

Source: Paul Gilna, Calit2
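
The metadata-first ordering on this slide can be read as a small state machine. The Python sketch below is illustrative only: the `Submission` class, the `Stage` names, the metadata field, and the example project ID are hypothetical and are not taken from CAMERA's software. It only shows the idea that no sequencing step can happen until metadata has been submitted and a project ID issued.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Optional


class Stage(Enum):
    """Pipeline stages, in the order the slide lists them (names are assumed)."""
    PROPOSED = auto()            # proposal submitted to GBMF
    METADATA_SUBMITTED = auto()  # metadata received by CAMERA, project ID issued
    KIT_SENT = auto()            # sequencing group sent the barcoded sample kit
    SEQUENCED = auto()           # sample received and sequenced
    DATA_UPLOADED = auto()       # sequence data uploaded to CAMERA
    RELEASED = auto()            # data and metadata released


ORDER = list(Stage)  # Enum iteration preserves definition order


@dataclass
class Submission:
    investigator: str
    stage: Stage = Stage.PROPOSED
    metadata: dict = field(default_factory=dict)
    project_id: Optional[str] = None

    def submit_metadata(self, metadata: dict, project_id: str) -> None:
        """Record metadata; the project ID then acts as proof of acceptance."""
        if self.stage is not Stage.PROPOSED:
            raise ValueError("metadata is submitted once, right after the proposal")
        if not metadata:
            raise ValueError("metadata must not be empty")
        self.metadata = dict(metadata)
        self.project_id = project_id
        self.stage = Stage.METADATA_SUBMITTED

    def advance(self) -> Stage:
        """Move to the next stage; refuses to proceed without metadata."""
        if self.stage is Stage.PROPOSED:
            raise ValueError("metadata first: submit metadata before sequencing steps")
        if self.stage is Stage.RELEASED:
            raise ValueError("submission is already released")
        self.stage = ORDER[ORDER.index(self.stage) + 1]
        return self.stage


if __name__ == "__main__":
    s = Submission("example-investigator")
    s.submit_metadata({"habitat": "open ocean"}, project_id="DEMO-0001")
    while s.stage is not Stage.RELEASED:
        print(s.advance().name)
```

Calling `advance()` on a fresh `Submission` raises `ValueError`, which mirrors the slide's point that metadata is collected before any sequence data.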