bakinam t. essawy and jonathan l. goodall - post-processing workflows using data grids to support...
Post on 01-Nov-2015
2015 CUAHSI Conference on Hydroinformatics
-
Post-processing Workflows Using Data Grids to Support Hydrologic Modeling
Bakinam T. Essawy and Jonathan L. Goodall
Department of Civil and Environmental Engineering
University of Virginia
3rd CUAHSI Conference on Hydroinformatics, July 17, 2015
-
VIC Output data set
VIC data set http://boto.ocean.washington.edu/story/show/45
VIC Output data set on the iRODS server
Variable Infiltration Capacity (VIC) Macro-scale Hydrologic Model
Example of a flux file, named Fluxes_x_y, where x = latitude and y = longitude. Flux files contain information about moisture and energy fluxes at each time step for the three soil layers (top, middle, and deep).
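The Fluxes_x_y naming convention can be decoded programmatically. The following is a minimal sketch, assuming the name has exactly the form described on the slide; the function name, the `GridCell` type, and the sample coordinates are hypothetical and are not taken from the presenters' scripts.

```python
from typing import NamedTuple


class GridCell(NamedTuple):
    latitude: float
    longitude: float


def parse_flux_filename(name: str) -> GridCell:
    """Split a name of the form Fluxes_x_y into latitude x and longitude y."""
    parts = name.rsplit("_", 2)
    if len(parts) != 3 or parts[0].lower() != "fluxes":
        raise ValueError(f"not a flux file name: {name!r}")
    # float() raises ValueError if either coordinate is not numeric.
    return GridCell(float(parts[1]), float(parts[2]))


# Hypothetical file name; the coordinates are illustrative only.
cell = parse_flux_filename("Fluxes_38.0625_-78.4375")
print(cell.latitude, cell.longitude)
```

Splitting from the right keeps the negative sign attached to the longitude, since the minus sign is not an underscore.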
-
The VIC Model
Source: Gao et al. (2009)
VIC = Variable Infiltration Capacity, a regional-scale land surface hydrology model
VIC was developed at the University of Washington and Princeton and has been applied worldwide
Spatial resolution: 1/8-degree grid cell
Three layers of soil:
top layer (Layer 0, 0-10 cm)
mid layer (Layer 1, 10-30 cm)
lower layer (Layer 2, 30-100 cm)
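When post-processing layer-specific output it can help to hold the layer depths in one place. The sketch below encodes the three depth ranges listed above and looks up the layer for a given depth; the function name and the choice to assign a boundary depth to the deeper layer are assumptions made for illustration only.

```python
from typing import Optional

# Depth ranges in cm, taken from the slide: (layer index, top, bottom).
SOIL_LAYERS = [
    (0, 0.0, 10.0),    # top layer
    (1, 10.0, 30.0),   # mid layer
    (2, 30.0, 100.0),  # lower layer
]


def layer_for_depth(depth_cm: float) -> Optional[int]:
    """Return the index of the layer containing depth_cm.

    Assumption: a boundary depth belongs to the layer below it
    (for example, 10 cm falls in Layer 1). Depths outside 0-100 cm
    return None.
    """
    for index, top, bottom in SOIL_LAYERS:
        if top <= depth_cm < bottom:
            return index
    return None
```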
-
County-level population data extracted from the Terra Populus website
-
Integrated Rule-Oriented Data System (iRODS)
The iRODS-enabled Data Federation Consortium (DFC) is an NSF project that provides support for both federation of resources and services.
This work is funded by the DFC project, and uses a DFC data grid for storage of and long-term access to the stored datasets across heterogeneous resources.
The DFC data grid also supports sharing of workflows, which enables the model results to be reproduced.
-
Workflow Structured Object (WSO)
Within the iRODS data grid, a Workflow Structured Object (WSO) enables the execution of a workflow, while capturing provenance information and archiving results.
The workflow, the input files, and the output files can be shared.
The workflow can be re-executed with new input files, and versions of the output files are saved automatically.
-
Objectives
Demonstrate how different data transfer approaches can be used to connect cyberinfrastructure systems developed by different groups.
Demonstrate how iRODS can provide federation across data grids.
-
Objectives
Demonstrate the use of Amazon Web Services (AWS) for computing, and how public repositories such as SEAD allow sharing and uniquely identifying the data and modeling resources used within analyses.
Work toward an approach for model reproducibility in which scientists can easily share their models, inputs, and outputs so that others can benefit from them.
-
Main components and data flow in the post-processing system
-
WSO files used for creating the visualization:
Shell script
Python scripts
Parameter file
Workflow file
Visualization
-
Two main directories for storing all files required by the WSO on the iRODS server
The location where the shell script and the Python scripts are stored on the iRODS server
Contents of one runDir associated with the WSO, such as the data staged in or out, the CSV files output by the Python scripts, and the stdout
The parameter file, the generated run file, and the output runDir created each time the run file is accessed
The mounted collection
The location of the mounted WSO on the iRODS server
-
The execution of the WSO installed on the hydrology grid from the client machine
The user logs in to the client machine where the icommands are installed
ils lists all the collections under the path /hydrology/home/bakinam
Running the WSO through the iget command, which runs the generated .run file. The output message indicates that the WSO executed successfully
icd changes to the collection where the WSO files are located
Listing the mounted collection
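The client-side steps can be collected into a small script. This is a minimal sketch that only builds and prints the `ils`, `icd`, and `iget` command lines by default; the WSO collection name and the `.run` file name are hypothetical placeholders (the slides do not give them), and the ordering of the steps is an assumption. Full paths are passed to `ils` and `iget` so the sketch does not depend on the working collection set by `icd`.

```python
import subprocess
from typing import List

HOME_COLLECTION = "/hydrology/home/bakinam"  # path shown on the slide


def wso_commands(wso_collection: str, run_file: str) -> List[List[str]]:
    """Build the icommands invocations for listing and running a WSO."""
    return [
        ["ils", HOME_COLLECTION],                   # list the home collection
        ["icd", wso_collection],                    # change to the WSO collection
        ["ils", wso_collection],                    # list the mounted collection
        ["iget", f"{wso_collection}/{run_file}"],   # getting the .run file runs the WSO
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command, or execute it when dry_run is False."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


# Hypothetical collection and run-file names, for illustration only.
run_all(wso_commands(HOME_COLLECTION + "/wso_mount", "example.run"))
```

Calling `run_all(..., dry_run=False)` would execute the commands, which requires the icommands to be installed and an authenticated iRODS session on the client machine.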
-
Conclusion
Reproducible data visualizations on large hydrological data collections
Using strong and weak federation of data across communities (e.g., TerraPop interoperability example)
Publishing data along with workflow-produced metadata (e.g., SEAD interoperability example) using unique identifiers
-
Future Plans
Replace SEAD with HydroShare for sharing the datasets, and create a HydroShare resource type from the WSO.
-
Bakinam T. Essawy
Department of Civil and Environmental Engineering
University of Virginia
Questions