Dr. Tsengdar Lee
Acting CTO for IT
August 16, 2011
Advancing Science at NASA through Cloud Computing: Examples from Nebula
Nebula Case Examples-- August 16, 2011
Nebula Pioneers a New Frontier for Cloud Computing
One of the first cloud computing platforms built by the Federal Government for the Federal Government
Over 300 users at 9 Centers + JPL + HQ
White House was first client
Nebula Pioneers a New Frontier for Cloud Computing
Nebula developed to provide:
“Instant-on” IT infrastructure
Automated provisioning capabilities, and
Quick scale-up services
All Necessary to…
Process large datasets quickly, easily share them with colleagues and ultimately store them securely at a good price
Why NASA Created Nebula
In 2008, limited commercial cloud offerings could not meet NASA requirements for:
Security
Network performance for managing data in and out of the cloud
Private cloud customization capabilities
Limiting vendor lock-in
Nebula Principles
Open and public APIs, everywhere
Open-source platform, apps, and data
Full transparency
» Open source code and documentation releases
Reference platform
» Cloud model for Federal Government
Nebula IaaS Services
Software to provision virtual machines on standard hardware at massive scale
Software to reliably store billions of objects distributed across standard hardware
Previous Options for NASA Scientists
Requirements
Science-scale application development
Very large data set processing
Compute-intensive processing
Timely sharing of results with collaborators and the public
Missions
Current Options
BUILD IT
Build my own IT infrastructure that may/may not comply with Federal/Agency IT security standards.
BUY IT
Go through a lengthy procurement and provisioning process for basic IT services.
DO NOTHING
The current basic IT services model is cost prohibitive and I cannot afford to process my data and share with collaborators and the public at large.
Supercomputing (grid)
Tasks are distributed among a subset of nodes of the supercomputer
All data is accessible to all nodes via high-speed interconnects
Failure of a node results in failure of a job
Nodes cannot be added or removed during a job
Cloud (batch)
Work separated into many individual tasks
Each task is performed with only the subset of data needed
Failed tasks can be restarted by re-issuing tasks to new node
Nodes may be added and removed as needed/available
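The cloud batch model above can be sketched in a few lines of Python. This is a minimal illustration, not Nebula's scheduler: the names `run_batch`, `flaky_sum`, and the node labels are hypothetical, and a node failure is simulated by raising `RuntimeError`.

```python
def run_batch(tasks, nodes, process, max_attempts=3):
    """Run independent tasks; re-issue a failed task to the next node."""
    results = {}
    attempts_log = []  # (task_id, node) for every attempt, including failures
    for i, (task_id, subset) in enumerate(tasks.items()):
        for attempt in range(max_attempts):
            # Simple round-robin placement (an assumption for this sketch).
            node = nodes[(i + attempt) % len(nodes)]
            attempts_log.append((task_id, node))
            try:
                # Each task sees only the subset of data it needs.
                results[task_id] = process(node, subset)
                break
            except RuntimeError:
                continue  # task failed on this node: retry elsewhere
        else:
            raise RuntimeError(f"task {task_id} failed after {max_attempts} attempts")
    return results, attempts_log


def flaky_sum(node, subset):
    """Hypothetical work function; 'node-b' always fails in this simulation."""
    if node == "node-b":
        raise RuntimeError("simulated node failure")
    return sum(subset)


tasks = {"t0": [1, 2], "t1": [3, 4], "t2": [5, 6]}
nodes = ["node-a", "node-b", "node-c"]
results, attempts_log = run_batch(tasks, nodes, flaky_sum)
print(results)
print(attempts_log)
```

In this run, task `t1` is first placed on the failing node and then re-issued to another node, while the other tasks are unaffected. In the grid model the same node failure would fail the whole job.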
Nebula Case Examples
SERVIR integrates satellite observations, ground-based data and forecast models to monitor environmental changes and improve response to natural disasters
SPoRT transitions unique NASA satellite observations and capabilities to NOAA to predict short-term weather events
iRODS is an open-source data grid software solution to manage, share, search and distribute large, diverse scientific datasets
SERVIR & SPoRT: Modeling Capabilities
Application Concept:
» Create Nebula images that are capable of supporting the research and operational goals of both SERVIR & SPoRT.
Potential Benefits:
• Rapid deployment of standard models to respond to natural disasters without disrupting other activities.
• Reduces the installation and maintenance of IT resources at remote or offsite locations.
SERVIR: Weather Forecasting
Severe weather is a natural hazard of interest to both SERVIR and SPoRT
Use the Weather Research and Forecasting (WRF) Model to produce high-resolution, short-term forecasts
Instances can be used to:
» Use one instance for a single region
» Share resources for a high-resolution run or a larger forecast domain
» Provide rapid response to new events or research opportunities without impacting other resources
A True Nebula Story…
On April 27th, tornadoes devastated parts of Central and Northern Alabama, including a large stretch of downtown Tuscaloosa.
SPoRT used Nebula to process datasets, provided to the National Weather Service through Google Earth, to verify the path length and width of the tornado. Spectral channels were combined to obtain false-color imagery of damage to vegetation and ecology.
Nebula hosted tiling application with large hi-res images
Rapidly configured
Created tiles pushed back to local web server
Made available via Google Earth
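As a rough illustration of what a tiling application does with a large image, the sketch below computes the pixel boxes that cut an image into fixed-size tiles. The slides do not say which tiling tool or tile size SPoRT used; the function name `tile_bounds`, the 256-pixel tile size, and the image dimensions are assumptions for illustration only.

```python
def tile_bounds(width, height, tile_size=256):
    """Yield (col, row, left, top, right, bottom) boxes covering the image.

    Tiles on the right and bottom edges are clipped to the image size.
    """
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            yield (
                left // tile_size,              # tile column index
                top // tile_size,               # tile row index
                left,
                top,
                min(left + tile_size, width),   # clip at right edge
                min(top + tile_size, height),   # clip at bottom edge
            )


# Hypothetical 1000 x 600 pixel image.
tiles = list(tile_bounds(1000, 600, tile_size=256))
print(len(tiles), "tiles; last tile:", tiles[-1])
```

Each box could then be cropped, written out as its own image file, and served from a web server so a viewer only fetches the tiles it needs.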
EF-4 Tornado marked in orange
EF-4 Tornado identified in red
EF-5 Tornado marked in purple
Broad view of tiled ASTER images and tornado tracks heading northeast on April 27
Comments from SERVIR & SPoRT
“Our Linux machines were busy processing data from other tasks and could not be interrupted… But even if they could… They would not have been as easily configurable as Nebula.”
“Nebula gave us the chance to ‘play in a sandbox’ where configuration testing was easy and fast and could be used without disrupting other local systems.”
That’s Fantastic!
NASA SPoRT says they’re pleased with Nebula’s scalability capabilities…
“An earlier test run of my forecast model ran for 54 consecutive days without issue before I brought it down. That’s fantastic!”
Andrew Molthan
Senior Meteorologist
SERVIR and SPoRT
NASA Marshall Space Flight Center
Technology – Integrated Rule-Oriented Data System (iRODS)
Targets large repositories and digital preservation
Supports the federation of independent, distributed collections
Supports server-side workflows that are implemented by chaining execution rules together based on data policies
Includes features such as domain-specific validation, automatic replication, and digital signature/checksum computation
Validates assertions about data such as integrity and authenticity
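The checksum-based integrity check mentioned above can be illustrated with Python's standard `hashlib` module. This sketch does not use iRODS or its API: the catalog is a plain dictionary, SHA-256 is an assumed choice of algorithm, and the names `record_checksum`, `verify_integrity`, and `sample.dat` are hypothetical.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def record_checksum(catalog: dict, name: str, data: bytes) -> None:
    """Store the checksum computed when the object enters the collection."""
    catalog[name] = sha256_of(data)


def verify_integrity(catalog: dict, name: str, data: bytes) -> bool:
    """True only if the data still matches the checksum recorded earlier."""
    return catalog.get(name) == sha256_of(data)


catalog = {}
original = b"example climate record"
record_checksum(catalog, "sample.dat", original)

ok = verify_integrity(catalog, "sample.dat", original)
tampered = verify_integrity(catalog, "sample.dat", original + b"!")
print("unchanged copy passes:", ok)
print("modified copy passes:", tampered)
```

A policy engine can run a check like this automatically, for example after each replication, so that a corrupted copy is detected without user action.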
NCCS Develops iRODS DMS for Climate Studies
NASA Center for Climate Simulation (NCCS) provides compute engines, analytics, data sharing, long-term storage, networking and other high-end computing services for Earth science community
NCCS completed a pilot project to develop an iRODS-based Data Management System (DMS) to handle massive amounts of observations and model data used in climate and weather studies
DMS team used Nebula to host DMS prototype with goal of managing and publishing climate simulation data using iRODS with a distributed set of Nebula instances
Steps for iRODS Distributed Data Storage and Management
Modern Era Retrospective-Analysis for Research and Applications (MERRA) data placed under iRODS control
MERRA data stored on file system and registered with iRODS
Registration process stored metadata about MERRA files in iRODS database
Entire catalog of monthly MERRA products resulted in ingestion of 360 files that occupy 47 GB
Data was shared between two instances
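The idea of registering files in place (data stays on the file system, only metadata goes into a database) can be sketched with the standard library. This is not the iRODS registration mechanism: the SQLite table `data_objects`, the function `register_files`, and the `merra_*.dat` file names are hypothetical stand-ins.

```python
import os
import sqlite3
import tempfile


def register_files(conn, root):
    """Record path and size for each file under root; contents are not copied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS data_objects "
        "(path TEXT PRIMARY KEY, size_bytes INTEGER)"
    )
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            conn.execute(
                "INSERT OR REPLACE INTO data_objects VALUES (?, ?)",
                (path, os.path.getsize(path)),
            )
    conn.commit()


with tempfile.TemporaryDirectory() as root:
    # Create three small placeholder files standing in for monthly products.
    for month in ("198001", "198002", "198003"):
        with open(os.path.join(root, f"merra_{month}.dat"), "wb") as fh:
            fh.write(b"\0" * 1024)

    conn = sqlite3.connect(":memory:")
    register_files(conn, root)
    count, total = conn.execute(
        "SELECT COUNT(*), SUM(size_bytes) FROM data_objects"
    ).fetchone()

print(count, "files registered,", total, "bytes")
```

Because only metadata is stored, registration stays cheap even for a catalog on the scale described above (360 files, 47 GB).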
iRODS Results
By eliminating the need to explicitly switch an iRODS client between distinct grids, federation allowed perusal or download of data from multiple iRODS repositories through a single interface
Upon completion, users could examine, search for, and download simulation data from either Nebula instance through a single iRODS web interface
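A toy version of the "single interface over multiple grids" idea is shown below. It only models the user-facing behavior and does not reflect how iRODS federation is implemented: each grid is a plain list of paths, and the instance names, zone names, and paths are hypothetical.

```python
def federated_search(grids, keyword):
    """Search every grid's catalog and return (grid_name, path) matches."""
    hits = []
    for grid_name, catalog in grids.items():
        for path in catalog:
            if keyword in path:
                hits.append((grid_name, path))
    return sorted(hits)


# Two hypothetical instances, each holding its own collection.
grids = {
    "instance-1": ["/zoneA/merra/product_198001.dat"],
    "instance-2": [
        "/zoneB/merra/product_198002.dat",
        "/zoneB/other/readme.txt",
    ],
}

hits = federated_search(grids, "merra")
for grid_name, path in hits:
    print(grid_name, path)
```

The caller issues one query and gets results labeled by origin, without having to reconnect to each grid in turn.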
Thank You