Enabling User-Oriented Data Access in a Satellite Data Portal Rajesh Kalyanam Lan Zhao Taezoon Park Carol X. Song RCAC, Purdue University, West Lafayette, IN 47907 Larry Biehl PTO, Purdue University, West Lafayette, IN 47907

Post on 29-Dec-2015

Enabling User-Oriented Data Access in a Satellite Data Portal

Rajesh Kalyanam, Lan Zhao, Taezoon Park, Carol X. Song

RCAC, Purdue University, West Lafayette, IN 47907

Larry Biehl

PTO, Purdue University, West Lafayette, IN 47907

Outline

• Background

• Motivation

• System Design

• Data Production

• Data Subscription

• Data Delivery

• Future Work

Background

• Overview of Purdue Terrestrial Observatory (PTO)

– Remote-sensing research facility

– GOES-12 GVAR, AVHRR, and MVISR sensor systems

– AQUA/TERRA satellites

– Component of the TeraGrid data provider framework

• Satellite data products

– Land, ocean and atmosphere data

– Provide trends on local or continental scales

– Used in climatology, hydrology, agriculture and transportation

March 20, 2007

Example MODIS Products

• Level 1A (MOD01)

• Level 1B (MOD02) with/without bowtie correction

• Geolocation (MOD03)

• Aerosol (MOD04)

• Water Vapor (MOD05)

• Clouds (MOD06)

• Atmospheric Profiles (MOD07)

• Reflectance (MOD09)

• Snow (MOD10)

• Fire Detection (MOD14)

• Ocean Color (MOD18)

• Sea Surface Temperature (MOD28)

• Sea Ice (MOD29)

• Cloud Mask (MOD35)

• Also multiday composites of the above

Note that each data product may contain anywhere from a few to many variables.

Variables in MOD04 Product

Aerosol_Type_Land, Angstrom_Exponent_1_Ocean, Angstrom_Exponent_2_Ocean, Angstrom_Exponent_Land, Asymmetry_Factor_Average_Ocean, Asymmetry_Factor_Best_Ocean, Backscatter_Ratio_Average_Ocean, Backscatter_Ratio_Best_Ocean, Cloud_Condensation_Nuclei_Ocean, Cloud_Fraction_Land, Cloud_Fraction_Ocean, Cloud_Mask_QA, Continental_Optical_Depth_Land, Corrected_Optical_Depth_Land, Critical_Reflectance_Land, Effect_Optical_Depth_Ave_Ocean, Effect_Optical_Depth_Best_Ocean, Effect_Radius_Ocean, Error_Critical_Reflectance_Land, Error_Path_Radiance_Land, Estimated_Uncertainty_Land, Least_Squares_Error_Ocean, Mass_Concentration_Land, Mass_Concentration_Ocean, Mean_Reflectance_Land, Mean_Reflectance_Land_All, Mean_Reflectance_Ocean, Number_Pixels_Percentile_Land,

Number_Pixels_Used_Ocean, OptDepth_Ratio_Small_Land, OptDepth_Ratio_Small_Land_Ocean, OptDepth_Ratio_Small_Ocean, Optical_Depth_Land_And_Ocean, Optical_Depth_Large_Ave_Ocean, Optical_Depth_Large_Best_Ocean, Optical_Depth_Small_Ave_Ocean, Optical_Depth_Small_Best_Ocean, Optical_Depth_by_models_ocean, Path_Radiance_Land, QualityWt_Critical_Reflect_Land, QualityWt_Path_Radiance_Land, Quality_Assurance_Crit_Ref_Land, Quality_Assurance_Land, Quality_Assurance_Ocean, Reflected_Flux_Average_Ocean, Reflected_Flux_Best_Ocean, Reflected_Flux_Land, Reflected_Flux_Land_And_Ocean, STD_Reflectance_Land, STD_Reflectance_Ocean, Scan_Start_Time, Scattering_Angle, Sensor_Azimuth, Sensor_Zenith, Solar_Azimuth, Solar_Zenith,

Solution_Index_Ocean_Large, Solution_Index_Ocean_Small, Std_Dev_Reflectance_Land_All, Transmitted_Flux_Average_Ocean, Transmitted_Flux_Best_Ocean, Transmitted_Flux_Land, Latitude, Longitude

Motivation

• User Requirements

– Custom-tailored data configurations

– Continuous data updates

– Real-time or near-real-time access

• Current Systems

– Impossible to generate the complete range of data products

– Requests have to be routed through the support staff

– Manual process that is time-consuming and error-prone

Motivation

“Web-based data configuration, subscription and delivery system”

System Design

• Processing and Storage Backbone

– PTO infrastructure

– PTO data processing cluster

– SDSC SRB middleware

• Publish-Subscribe manager

– Interface between the client side and the data processing backend

– Manages user subscriptions

– Handles enabling/disabling data production

• Client-side applications

– Subscription interface

– Data access portal

System Design

[Architecture diagram. Layers: Data Manufacturer, Data Management, Pub-Sub Manager, Portlets. Components shown:]

• PTO Satellite Ground Station: tracking antenna, stationary antenna

• PTO Processing Cluster: TeraScan system

• SRB Data Grid: MCAT, SRB server

• Subscription Information Manager: user information table, product table, predicate mapper

• Subscription Workflow: Web Services modules, pre-composed workflow

• Data Services: enable/disable data subscription; SRB data access and query; monitoring component

• SATPro Portal: GridSphere, JSR 168 portlets; Web 2.0 (AJAX, tagging, RSS)

• Predicate Sharing: tagging

• Visualization: animation, QuickView, GoogleEarth

• Access: HTTP/FTP, email, RSS

• Discovery: metadata search

• Subscription: on demand, user controlled

System Design

• User-driven publish/subscribe model

– Dynamic data generation

– User specifies, controls, and receives custom-tailored data

– Continuous data updates in near-real-time

– Multiple ways to access the data

Data Production

• Data production software

– SeaSpace TeraScan software

– Configuration variables

– Various projections and output formats

• On-demand data production

– Production driven by user choices

– “configproc” file mechanism

– Automatic enabling and disabling

– scp-based data transfer to the SRB archive and the webserver
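The slides do not say how automatic enabling and disabling is decided. One plausible reading is that a product is generated only while at least one user is subscribed to it. The sketch below illustrates that assumption; the class name `ProductionRegistry` and its methods are hypothetical and are not part of the PTO system or of TeraScan.

```python
from collections import defaultdict


class ProductionRegistry:
    """Tracks subscribers per product (hypothetical helper, not PTO code).

    Assumption: production is enabled while a product has at least
    one subscriber, and disabled when the last subscriber leaves.
    """

    def __init__(self):
        self._subscribers = defaultdict(set)

    def subscribe(self, product_key, userid):
        """Add a subscriber; return True if production should be enabled now."""
        users = self._subscribers[product_key]
        was_empty = not users
        users.add(userid)
        return was_empty

    def unsubscribe(self, product_key, userid):
        """Remove a subscriber; return True if production should be disabled now."""
        users = self._subscribers[product_key]
        was_subscribed = userid in users
        users.discard(userid)
        return was_subscribed and not users

    def is_enabled(self, product_key):
        """Report whether any user is currently subscribed to the product."""
        return bool(self._subscribers.get(product_key))
```

In such a design, the return values of `subscribe` and `unsubscribe` would tell the caller when to write or remove the corresponding production configuration.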

Data Production

• Example configproc file

input_directory: products/tdf/Local/modis/ndvi
input_files: %yyyy.%mmdd.%hhmm.%satel.MYD_NDVI
image_variable: EVI
image_format: jpeg
scale_range: -0.25 1.00
color_palette: modis_ndvi
grid_delta: 0
boundaries: dcw.coast dcw.states
max_width: 256
output_template: %yyyy.%mmdd.t_evi.jpg
save_directory: products/images/modis
save_files: 20??.????.t_evi.jpg
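To make the structure of the example concrete, here is a minimal Python sketch that reads such a file into a dictionary and expands the output template for a given acquisition time. It assumes the file is a flat list of `key: value` lines and that `%yyyy`, `%mmdd`, `%hhmm` and `%satel` stand for year, month-day, hour-minute and satellite name; both are guesses from the example, not documented TeraScan behaviour. The function names are hypothetical.

```python
from datetime import datetime

SAMPLE = """\
input_directory: products/tdf/Local/modis/ndvi
input_files: %yyyy.%mmdd.%hhmm.%satel.MYD_NDVI
image_variable: EVI
image_format: jpeg
scale_range: -0.25 1.00
output_template: %yyyy.%mmdd.t_evi.jpg
save_directory: products/images/modis
"""


def parse_configproc(text):
    """Parse 'key: value' lines into a dict (assumed file layout)."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(":")
        config[key.strip()] = value.strip()
    return config


def expand_template(template, when, satellite=""):
    """Substitute the date/satellite tokens (assumed token meanings)."""
    replacements = {
        "%yyyy": when.strftime("%Y"),
        "%mmdd": when.strftime("%m%d"),
        "%hhmm": when.strftime("%H%M"),
        "%satel": satellite,
    }
    for token, value in replacements.items():
        template = template.replace(token, value)
    return template


config = parse_configproc(SAMPLE)
filename = expand_template(config["output_template"], datetime(2007, 3, 20, 14, 30))
print(filename)  # 2007.0320.t_evi.jpg
```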

Data Subscription

• Data Subscription Components

– Publish-Subscribe based subscription manager

– Subscription Interface

• Publish-Subscribe subscription manager

– Simulates operation of a PubScribe system

– Implemented through an Apache Axis web service

• Subscription Interface

– Available on a web-based scientific gateway portal

– Naïve and advanced user interfaces

Data Subscription

• Advanced user interface

– Requires knowledge of variables involved in data product

– Choice-list based configuration

– AJAX dynamic filtering of choice lists

– Will allow advanced configuration variables with strict logical composition rules

• Naïve user interface

– Plain English description: “bimonthly composite of vegetation data”

– Scoring mechanism for selecting possible products

– Learning mechanism for improving performance over time

– Work in progress

Data Subscription

• Predicate matching

– Keyword definitions for each data product: “BIMONTHLY COMPOSITE of VEGETATION data”

– Score captures the degree of correlation between descriptions and products

– Additional keywords are added to a list for further consideration; scores are updated based on repetition frequency

– Successful product descriptions are tagged

– Tags can be reused by other users to search for common products
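The scoring idea above can be illustrated with a small keyword-overlap sketch. The slides do not give the actual scoring formula, so the fraction-of-keywords-matched score used here is an assumption, as are the product keys and keyword sets in `PRODUCT_KEYWORDS`.

```python
import re

# Hypothetical keyword definitions; the real product table is not shown in the slides.
PRODUCT_KEYWORDS = {
    "vegetation_bimonthly": {"bimonthly", "composite", "vegetation", "ndvi"},
    "sst_daily": {"daily", "sea", "surface", "temperature"},
    "snow_daily": {"daily", "snow", "cover"},
}


def tokenize(description):
    """Lower-case the description and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", description.lower()))


def score_products(description, product_keywords):
    """Rank products by the fraction of their keywords found in the description."""
    words = tokenize(description)
    scores = {
        product: len(words & keywords) / len(keywords)
        for product, keywords in product_keywords.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def unmatched_words(description, product_keywords):
    """Words not in any keyword definition: candidates for the 'further consideration' list."""
    known = set().union(*product_keywords.values())
    return tokenize(description) - known


ranked = score_products("bimonthly composite of vegetation data", PRODUCT_KEYWORDS)
print(ranked[0])  # ('vegetation_bimonthly', 0.75)
```

A learning step, as the slide describes it, could then promote frequently repeated unmatched words into a product's keyword set. That part is left out of this sketch.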

Data Subscription

• Subscription Manager

– Subscription data management

– Receives updates from data generator

– Distributes notifications to subscribed users

– Enabling and disabling data generation

• Subscription data management

– MySQL database

– Product information: product key, generation frequency, configuration variables, filename pattern, webserver path

– User subscription information: userid, product key, date range, email address
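The two kinds of records listed above map naturally onto two tables. The sketch below shows one possible layout and the lookup a notifier would need. The slides name MySQL; this example uses Python's built-in `sqlite3` as a stand-in so that it runs without a server, and all table names, column names and types are assumptions rather than the actual PTO schema.

```python
import sqlite3


def create_schema(conn):
    """Create hypothetical product and subscription tables."""
    conn.executescript(
        """
        CREATE TABLE product (
            product_key TEXT PRIMARY KEY,
            generation_frequency_minutes INTEGER,
            config_variables TEXT,
            filename_pattern TEXT,
            webserver_path TEXT
        );
        CREATE TABLE subscription (
            userid TEXT,
            product_key TEXT REFERENCES product(product_key),
            start_date TEXT,  -- ISO dates, so string comparison orders them
            end_date TEXT,
            email TEXT
        );
        """
    )


def subscribers_for(conn, product_key, on_date):
    """Return email addresses subscribed to a product on the given ISO date."""
    rows = conn.execute(
        "SELECT email FROM subscription "
        "WHERE product_key = ? AND start_date <= ? AND end_date >= ? "
        "ORDER BY email",
        (product_key, on_date, on_date),
    )
    return [row[0] for row in rows]
```

A notifier would call `subscribers_for(conn, product_key, date)` whenever a new file for that product appears.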

Data Subscription

• Pull-based notifications

– Simpler approach

– Perl script tracks updates to data repository

– Loops through all data products based on the highest generation frequency

– Trade-off between performance and notification delays

[Diagram: pull-based notification flow. Components shown: Satellite, Ground Station, PTO Cluster, SeaSpace Data Processing Daemon, Config Files, Remote Sensing Data, Pub-Sub Manager, WS1, WS2, Subscription Database, Monitoring Agent, Web Server, Web Dir, Sub Form, User Workstation, P1+.]
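The pull-based approach amounts to polling the repository and comparing each listing against what has already been seen. The slides mention a Perl script; the Python sketch below only illustrates the idea, and its function names are hypothetical. It assumes that "highest generation frequency" means the polling period is set by the most frequently generated product.

```python
import fnmatch
import os


def polling_interval(frequencies_minutes):
    """Poll as often as the most frequently generated product (assumed policy)."""
    return min(frequencies_minutes.values())


def find_new_files(directory, pattern, seen):
    """Return files matching the product's filename pattern that were not seen before.

    `seen` is updated in place so the next poll reports only newer files.
    """
    current = {name for name in os.listdir(directory) if fnmatch.fnmatch(name, pattern)}
    new_files = sorted(current - seen)
    seen |= current
    return new_files
```

A monitoring loop would sleep for `polling_interval(...)` minutes between passes and call `find_new_files` once per product. This is where the trade-off noted above shows up: a shorter interval lowers notification delay but costs more directory scans.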

Data Subscription

• Push-based notifications

– Requires tight integration with data generation process

– Included as an entry in the configproc file

– Product name argument is used to query list of users

– Constraints on the execution node and environment

[Diagram: push-based notification flow. Components shown: Satellite, Ground Station, PTO Cluster, SeaSpace Data Processing Daemon, Config Files, Remote Sensing Data, Pub-Sub Manager, WS1, WS2, WS3, Subscription Database, Monitoring Agent, Web Server, Web Dir, Sub Form, User Workstation, P1+.]

Data Delivery

• HTTP access

– Users can download images off the webserver

– Cannot verify if they are interested in the image

– Images cannot be stored for a long time on the webserver

• RSS feed based access

– Thumbnails are sent as RSS feeds when new images are available

– Users can download the actual image from the feed link based on the thumbnail

• Data portal access of archive data

– Can access archived data from the SRB server

– Difficult to sift through the large number of images
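The RSS option above can be sketched as a feed whose items carry a thumbnail in the description and link to the full image. This is a minimal illustration using Python's standard `xml.etree.ElementTree` and plain RSS 2.0 elements. The function name, the dictionary keys and the example.org URLs are placeholders; the slides do not show the actual feed layout.

```python
import xml.etree.ElementTree as ET


def build_feed(channel_title, channel_link, images):
    """Build an RSS 2.0 document with one item per new image.

    Each entry in `images` is a dict with hypothetical keys
    'name', 'url' (full image) and 'thumbnail_url'.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = "Newly generated data products"
    for image in images:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = image["name"]
        # The item link points at the full-size image.
        ET.SubElement(item, "link").text = image["url"]
        # The thumbnail is embedded as HTML; ElementTree escapes it on output.
        ET.SubElement(item, "description").text = '<img src="%s" alt="%s"/>' % (
            image["thumbnail_url"],
            image["name"],
        )
    return ET.tostring(rss, encoding="unicode")


feed_xml = build_feed(
    "Example product feed",
    "http://example.org/products/",
    [
        {
            "name": "2007.0320.t_evi.jpg",
            "url": "http://example.org/products/2007.0320.t_evi.jpg",
            "thumbnail_url": "http://example.org/thumbs/2007.0320.t_evi.jpg",
        }
    ],
)
```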

RSS Feed notification

Future Work

• Future Direction

– Explore advantages of standard PubScribe models

– Utilise the current state of the art in ontology-based methods for predicate mapping

– Performance studies for scalability

– Transfer data automatically to a user-specified location

Conclusion

“A user-oriented subscription framework that will encourage broader access from the grid user community”

Acknowledgements

This work was made possible by the National Science Foundation, TeraGrid Resource Partners grant OCI-0503992.

References

• C. Baru, R. Moore, A. Rajasekar, M. Wan, "The SDSC Storage Resource Broker," Proc. CASCON’98 Conference, 1998.

• “Content Standard for Digital Geospatial Metadata” (CSDGM) Version 2 (FGDC-STD-001-1998), http://www.fgdc.gov/standards/documents/standards/metadata/v2_0698.pdf.

• Content Standard for Digital Geospatial Metadata: Extensions for Remote Sensing Metadata (FGDC-STD-012-2002), http://www.fgdc.gov/standards/documents/standards/remote_sensing/MetadataRemoteSensingExtens.pdf.

• C. Pautasso, "JOpera: An Agile Environment for Web Service Composition with Visual Unit Testing and Refactoring," VL/HCC 2005.

• Earth System Grid (ESG), http://www.earthsystemgrid.org/.

• J. Novotny, M. Russell, O. Wehrens, "GridSphere: An Advanced Portal Framework," EUROMICRO 2004, pp. 412-419.

• JSR 168: Portlet Specification http://www.jcp.org/jsr/detail/168.jsp.

• L. Zhao, T. Park, R. Kalyanam, S. Goasguen, "Purdue Multidisciplinary Data Management Framework Using SRB", SRB Workshop, Vol. 1, pp. 6-11, February 2006.

• LEAD Portal, http://lead.ou.edu.

• MODIS portal from the Oregon State University direct broadcast station, http://sugar.coas.oregonstate.edu/MODIS/.

• M. E. Pierce, G. C. Fox, H. Yuan, and Y. Deng, "Cyberinfrastructure and Web 2.0," Proceedings of HPC2006, July 4 2006, Cetraro Italy.

• M. E. Pierce, G. C. Fox, M. S. Aktas, G. Aydin, H. Gadgil, Z. Qi, and A. Sayar, "The QuakeSim Project: Web Services for Managing Geophysical Data and Applications," PAGEOPH Special Issue for 5th ACES International Workshop, Island of Maui, Hawaii.

• nanoHUB, http://www.nanohub.org.

• NEES portal, http://neesforge.nees.org/projects/simportal/.

• Purdue Terrestrial Observatory, http://www.itap.purdue.edu/pto/.

• R. Kalyanam, L. Zhao, T. Park and S. Goasguen, "A Service-Enabled Distributed Workflow System for Scientific Data Processing," Proceedings of IEEE Int’l Workshop on Future Trends of Distributed Computing Systems (FTDCS’07), Sedona, AZ, March, 2007.

• SeaSpace Corporation, http://www.seaspace.com.

• U. Nambiar, B. Ludaescher, K. Lin, C. Baru, "The GEON portal: accelerating knowledge discovery in the geosciences," Workshop On Web Information And Data Management Archive, Proceedings of the eighth ACM international workshop on Web information and data management, 2006.

• Java Message Service, http://java.sun.com/products/jms

Questions?