iplant-highlights-pag2015
The iPlant Collaborative
Matt Vaughn
Director, Life Sciences Computing
Texas Advanced Computing Center
[email protected]
www.iplantc.org
Biology Cyberinfrastructure to Meet the Challenges of Large Datasets
What is iPlant?
The iPlant Collaborative is an NSF-sponsored, community-driven organization that builds, operates, and supports extensible and powerful cyberinfrastructure for the life sciences.
iPlant Software Products
Name | Description | Target Audience
DNA Subway | Educational interface to genomics topics | Beginning users and educators
Discovery Environment | User-friendly, petascale graphical workbench | Command-line naïve users who need to do scalable bioinformatics
Atmosphere | User-friendly, on-demand cloud computing and persistent services | Users with desktop use cases or complex software environments
Bisque | Platform to facilitate cloud-based exchange and exploration of biological images | Command-line naïve users who need to do image analysis
Spatial Data Infrastructure* | Platform for developing geospatial information systems & deploying spatial data infrastructures | Command-line naïve users who need to work with GIS data
iPlant Science APIs | RESTful interface to all iPlant capabilities | Advanced users, developers, third-party infrastructure or service providers
iPlant Data Store | Capacious, scalable, shareable storage | Shared and used by all iPlant users
iPlant Services
• Education, Outreach and Training
• Real-time user support
• Hackathons & workshops
• Extended Collaborative Support
• Powered by iPlant Program
2014-2015 Highlights: Thousand Plant Transcriptomes
• Marker paper [1] out along with several coordinated manuscripts
• 100x increase in green plant gene coverage in GenBank
• Key insights into relationships between land plants and green algae
• Original sequence reads, assemblies & downstream analyses, plus data access APIs and workflows available via iPlant [2]
1. Wickett & Mirarab et al. Proc Natl Acad Sci U S A. 2014 Nov 11;111(45):E4859-68. doi:10.1073/pnas.1323926111
2. Matasci et al. GigaScience 2014, 3:17. doi:10.1186/2047-217X-3-17
2014-2015 Highlights: iMicrobe
iMicrobe Data Commons
• Hurwitz Lab, University of Arizona
• Funded by Gordon & Betty Moore Foundation
• Aim: Make the high-value CAMERA microbial datasets available through an interactive data commons
• Required just two months of development thanks to iPlant cyberinfrastructure
– http://data.imicrobe.us/
• Being replicated to power a viral genomics platform
iPlant offers a powerful toolbox for rapidly developing next-generation community resources
2014-2015 Highlights: iPlant’s Broadening Impact
• Powered by iPlant program
• Foundation for other life sciences projects
• Adoption outside the life sciences
2014-2015 Highlights: Jetstream
• iPlant Atmosphere demonstrated value of user-provisioned cloud
• Partnership: Indiana University, TACC, UArizona, U Chicago, UTSA, Johns Hopkins & Penn State
• NSF ACI #1445604
• January 2016 via XSEDE
• ~50x capacity of iPlant Atmosphere. Same great UI. Innovative new capabilities.
A national science and engineering cloud
What’s Coming Next?
• New high performance tools and workflows
– MAKER-P and a host of assembly and expression workflows
• iPlant Data Commons
– Discoverability, persistence, provenance
• Expanded support for pro users and developers
– APIs, workshops, tutorials, and more
• New capabilities to support Science Communities
– Expanding participation and fostering cooperation
New and Continuing Peer Collaborations
• CoGe – Comparative genomics
• EPIC – CoGe extension to support epigenetics
• iAnimal – 2x USDA AFRI grants for CI
• Galaxy – Hosting usegalaxy.org
• BioExtract Server
• IBP – GCP led
• IRRI/CAS – Resequenced rice varieties
• KBase – DOE’s CI for bioenergy
• transPLANT – Elixir’s CI for plants
• TAIR – Hosting for sustainability
Scientific Achievements through iPlant’s Open Infrastructure
1KP – 1000 Plant Transcriptome Project
• Stored tens of millions of sequence reads with iPlant, all assemblies, plus data access APIs exposing 3+ million compute hours of downstream analysis
• Demonstrates TNRS, tree creation, ortholog clustering, etc.
• Claimed to create a 100-fold increase in plant genes in GenBank
• Dozens of papers out or on the way
Success Stories from our Users
Presenter | Title
David Horvath | Progress in Sequencing the Genome of an Invasive Polyploid Weed (Leafy Spurge)
Joshua Der | A Global Gene Family Classification Resource for Plants and Its Utility for Comparative Genomics, Genome Annotation, and Gene Family Studies
Kranthi Mandadi | Transcriptomic Analyses and Alternative Splicing Landscapes of Brachypodium Infected with Panicum Mosaic Virus
John Duvick | Genome Annotation in the Cloud through XGDBvm Virtual Server Instances Deployed at iPlant
Dong Xu | Soybean Knowledge Base (SoyKB): A Web Resource for Integration of Soybean Translational Genomics and Molecular Breeding
Bonnie Hurwitz | iMicrobe: Advancing Clinical and Environmental Microbial Research using the iPlant Cyberinfrastructure