Post on 22-Dec-2015
A Kepler-based Three Tier Architecture applied to LiDAR Interpolation and Analysis
Efrat Frank, Ilkay Altintas
San Diego Supercomputer Center, UCSD
[Figure: LiDAR workflow phases. Labels: Configuration phase; Subset (DB2 query on DataStar); Interpolate (Grass RST, Grass IDW, GMT…); Analyze (move, process); Visualize (move, render, display; Global Mapper, FlederMaus, ArcIMS); Portal; Grid; Scheduling/Output Processing; Monitoring/Translation]
Example of LiDAR data acquired along the Northern San Andreas fault in Sonoma County, California. Left: Hillshade produced from the first return surface DEM (Digital Elevation Model) derived from the LiDAR data. In this heavily forested region the first return surface largely shows the tree canopy top. Right: Hillshade of the last return surface DEM for the same area shown in the left image. The multiple returns offered by the LiDAR workflow allow for “virtual deforestation” and the creation of a “bare-earth” model of the ground surface. Note the San Andreas fault and the roads, which are not visible in the first return hillshade. LiDAR data represents an important new tool for the study of the earth’s surface, especially in regions where heavy vegetation makes traditional techniques such as aerial photography ineffective. (Source: Christopher J. Crosby, J. Ramon Arrowsmith, GEON, ASU)
R. Haugerud, U.S.G.S
D. Harding, NASA
LiDAR Introduction
[Figure: LiDAR data pipeline. Stage labels: Survey; Point Cloud (x, y, zn, …); Process & Classify; Interpolate / Grid; Analyze / “Do Science”]
•LiDAR (Light Detection And Ranging, a.k.a. ALSM, Airborne Laser Swath Mapping) produces point cloud datasets with high point density that require high performance processing.
The Computational Challenge:
•LiDAR generates massive data volumes - billions of returns are common.
•Distribution of these volumes of point cloud data to users via the internet represents a significant challenge.
•Processing and analysis of these data require significant computing resources not available to most geoscientists.
•Interpolation of these data challenges typical GIS/interpolation software.
  •Our tests indicate that ArcGIS, Matlab and similar software packages struggle to interpolate even a small portion of these data.
•Traditionally: Popularity > Resources
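To make the interpolation step concrete, here is a minimal sketch of inverse distance weighting (IDW), one of the gridding methods named on this poster. It is a brute-force illustration only, not the GRASS implementation: the function names `idw_value` and `idw_grid`, the `power` parameter default, and the sample points are all assumptions made for this example.

```python
import math


def idw_value(points, qx, qy, power=2.0):
    """Estimate elevation at (qx, qy) as a distance-weighted mean of all returns.

    points: iterable of (x, y, z) tuples (hypothetical LiDAR returns).
    """
    num = 0.0
    den = 0.0
    for x, y, z in points:
        d = math.hypot(x - qx, y - qy)
        if d == 0.0:
            return z  # the query location coincides with a sample
        w = 1.0 / d ** power  # closer returns get larger weights
        num += w * z
        den += w
    return num / den


def idw_grid(points, x0, y0, cell, ncols, nrows, power=2.0):
    """Interpolate onto a regular grid, one value per cell centre.

    Row 0 starts at y0 and rows increase with y (an assumption of this sketch).
    """
    return [
        [
            idw_value(points, x0 + (c + 0.5) * cell, y0 + (r + 0.5) * cell, power)
            for c in range(ncols)
        ]
        for r in range(nrows)
    ]


# Four made-up returns at the corners of a 2 x 2 square.
cloud = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 30.0), (2.0, 2.0, 40.0)]
dem = idw_grid(cloud, 0.0, 0.0, 1.0, 2, 2)
```

Every grid cell visits every return, so the cost grows as cells × returns. With billions of returns this naive form is impractical on a desktop, which is the kind of pressure that motivates moving the gridding step to a compute cluster.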
GEON’s Solution: A Three-Tier Architecture for LiDAR Processing
•GOAL: Efficient three-tier architecture for LiDAR interpolation and analysis using GEON infrastructure and tools
•GEON Portal - front end layer
•Kepler Scientific Workflow System - control layer
  •Kepler is used as a batch execution engine
•GEON Grid - computation layer
•Use scientific workflows to glue/combine different tools and the infrastructure
•The architecture provides efficient and reliable LiDAR data analysis
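The division of labor between the three tiers can be sketched roughly as follows. This is an illustration under stated assumptions, not the GEON code: `PortalRequest`, `build_workflow`, `plan_grid_jobs`, and the XML element names are hypothetical, and the XML is a toy format rather than Kepler's own MoML workflow format.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class PortalRequest:
    """Front end tier: what a user might submit from the portal (hypothetical fields)."""
    bbox: tuple  # (xmin, ymin, xmax, ymax)
    algorithm: str  # e.g. "idw" or "spline"
    cell_size: float


def build_workflow(req):
    """Control tier: turn the user's parameters into a workflow description."""
    xmin, ymin, xmax, ymax = req.bbox
    root = ET.Element("workflow")
    ET.SubElement(root, "subset", xmin=str(xmin), ymin=str(ymin),
                  xmax=str(xmax), ymax=str(ymax))
    ET.SubElement(root, "interpolate", algorithm=req.algorithm,
                  cell=str(req.cell_size))
    return ET.tostring(root, encoding="unicode")


def plan_grid_jobs(workflow_xml):
    """Computation tier stand-in: list the steps a cluster would be asked to run."""
    root = ET.fromstring(workflow_xml)
    return [(step.tag, dict(step.attrib)) for step in root]
```

The point of the separation is that the portal only collects parameters, the workflow layer decides which steps run and in what order, and the grid layer only executes them, so each tier could be replaced without touching the others.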
[Figure: architecture diagram of the Kepler LiDAR workflow. Labels: Client / GEON Portal (Map and Attributes; Grass Functions and Parameters; submit); Parameter xml; Create Workflow Description; KEPLER WORKFLOW; DB2 Spatial query (x, y, z and attribute raw data); NFS Mounted Disk; Compute Cluster (Map onto the grid; Grass surfacing algorithms: Spline, IDW, block mean, …); process output (Binary grid, ASCII grid); ArcInfo, ArcSDE, ArcIMS (Render Map); Download data (Text file, Tiff/Jpeg/Gif, ASCII grid)]
http://geongrid.org
Kepler includes contributors from GEON, SEEK, SDM Center, Ptolemy II, ROADNet, CIPRes and Resurgence supported by NSF ITRs 0225673 (GEON), 022567 (SEEK), DOE DE-FC02-01ER25486 (SciDAC/SDM), and DARPA F33615-00-C-1703 (Ptolemy).
Future Plans
• Improve overall performance using advanced processing tools
  •Parallel interpolation, enhanced visualization
• Extend built-in failure recovery and reporting features
• Additional portal execution and registration support
• Utilize provenance information for workflow product registration
• Create graphical illustration of job progress / location in the workflow to demonstrate the distributed nature of the system
ULTIMATE GOAL: Make it useful to a wide range of earth science users!
Contributors
Efrat Jaeger-Frank, Ilkay Altintas, Chaitan Baru, Ashraf Memon, Viswanath Nandigam (GEON, San Diego Supercomputer Center, UCSD)
Christopher J. Crosby, Jefferey S. Conner, J. Ramon Arrowsmith (GEON, ASU)
•An extensible, easy to use workflow design and prototyping tool
•On-the-fly creation of workflow instances from workflow templates
•Integrating heterogeneous local and remote tools in a single interface:
  •Gridding and Imaging services via Web and Grid services
  •GIS services
  •Remote tools via SSH, SCP and GridFTP
  •Relational and spatial database access
  •Direct access to data and tools from remote repositories
  •Reusable generic and domain specific actors
•Support for High Performance Computations:
  •Job submission and monitoring
  •Logging of execution trace and registering intermediate products
  •Data provenance and failure recovery
•Portal accessibility
  •The GEON LiDAR Workflow (GLW) is deployed on the GEON portal
•Reverse engineering of traditional approach
•GLW is exposed to a high risk of component failures
  •Long running process
  •Distributed computational resources under diverse controlling authorities
•Kepler provides transparent/background error handling using provenance data
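One way to picture error handling driven by provenance data is a step runner that records each finished step and, on a rerun, reuses the recorded results instead of repeating the work. The sketch below is an assumption made for illustration and not Kepler's actual provenance mechanism: `run_with_provenance`, the JSON log format, and the `max_retries` parameter are all hypothetical.

```python
import json
import os


def run_with_provenance(steps, log_path, max_retries=2):
    """Run named steps in order, skipping those already recorded in the log.

    steps: list of (name, func) pairs; each func takes the previous step's
    result and returns a JSON-serializable result.
    """
    record = {}
    if os.path.exists(log_path):
        with open(log_path) as fh:
            record = json.load(fh)  # results of steps finished in earlier runs

    result = None
    for name, func in steps:
        if name in record:
            result = record[name]  # recover from the log, do not recompute
            continue
        for attempt in range(max_retries + 1):
            try:
                result = func(result)
                break
            except Exception:
                if attempt == max_retries:
                    raise  # give up; earlier steps stay recorded in the log
        record[name] = result
        with open(log_path, "w") as fh:
            json.dump(record, fh)  # persist after every completed step
    return result
```

If a long-running job dies at a late step, calling the function again with the same log path resumes from the last recorded step, which matters when the steps run on resources controlled by different authorities.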
LiDAR Job Management and Monitoring
•A unified interface to follow up on the status of submitted jobs
  •View job metadata
  •Zoom to a specific bounding box location
  •Track errors
  •Modify a job and re-submit
  •View the processing results
•In the future, register desired workflow products
  •Useful for publication
Increasing Usage of Technology in Geosciences
•Online data acquisition and access
•Managing large databases
  •Indexing data on spatial and temporal attributes
  •Quick subsetting operations
•Large scale resource sharing and management
•Collaborative and distributed applications
•Parallel gridding algorithms on large data sets using high performance computing
•Integrate data with other related data sets, e.g. geologic maps and hydrology models
•Provide easy-to-use user interfaces from portals and scientific workflow environments
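The link between spatial indexing and quick subsetting can be illustrated with a simple tile index: points are bucketed by tile, so a bounding box query only inspects the tiles it overlaps. This is a teaching sketch, not how a spatial database indexes its data; the `TileIndex` class, its method names, and the tile size are assumptions.

```python
from collections import defaultdict


class TileIndex:
    """Bucket (x, y, z) points into square tiles for bounding box queries."""

    def __init__(self, tile_size):
        self.tile_size = tile_size
        self.tiles = defaultdict(list)

    def _key(self, x, y):
        # Floor division keeps negative coordinates in the correct tile.
        return (int(x // self.tile_size), int(y // self.tile_size))

    def insert(self, x, y, z):
        self.tiles[self._key(x, y)].append((x, y, z))

    def query(self, xmin, ymin, xmax, ymax):
        """Return all points inside the inclusive bounding box."""
        kx0, ky0 = self._key(xmin, ymin)
        kx1, ky1 = self._key(xmax, ymax)
        hits = []
        for kx in range(kx0, kx1 + 1):
            for ky in range(ky0, ky1 + 1):
                # .get avoids creating empty tiles while querying
                for p in self.tiles.get((kx, ky), ()):
                    if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax:
                        hits.append(p)
        return hits
```

A query over a small box touches only a handful of tiles regardless of how many points the full survey holds, which is what makes subsetting a region of interest fast.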
LiDAR Processing via Kepler