Post on 21-Jan-2016
A Cyberinfrastructure Framework for Discovery,
Integration, and Analysis of Earth Science Data
A Prototype System
A. K. Sinha*, Z. Malik*, A. Rezgui*, A. Dalton*, K. Lin**
* Virginia Tech
** San Diego Supercomputer Center
2
Hypothesis Evaluation: Are A-Type Rocks in Virginia Related to a Hot Spot Trace?
Spatio-Temporal Distribution of Igneous Rocks
[Figure labels: Laurentian Crust and Lithosphere; Plume Head; Hot Spot Trace?]
3
GEON’s DIA Engine
Evaluating a hypothesis requires:
- Discovery: access to data
- Integration of data: provide data products
- Analysis of data: verify the hypothesis
4
Data Discovery
Registration of Data: Prerequisite for Data Discovery
- Level 1 Registration: Keywords
- Level 2 Registration: Ontologic Classes
- Level 3 Registration: Item Detail Level
5
Registration of Data: Key to Discovery, Integration and Analysis
- Level 1: Discovery of data resources (e.g., gravity, geologic maps) requires registration through the use of high-level index terms. GEON has deployed an extension of the AGI index terms, which will be cross-indexed to others such as GCMD and AGU.
- Level 2: Discovering item-level databases requires registration to data-level ontologies (e.g., bulk rock geochemistry, gravity database).
- Level 3: Item-detail-level registration (e.g., the column in a geochemical database that represents the SiO2 measurement). This level of registration is a requirement for semantic integration.
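The three registration levels can be pictured as progressively finer index entries for the same resource. The following is a minimal sketch, not GEON's actual registry: the `Registration` class, the `discover` function, the resource names, and the class labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Registration:
    """One registered resource. All field names are illustrative assumptions."""
    name: str
    keywords: set[str]                                   # Level 1: index terms
    classes: set[str] = field(default_factory=set)       # Level 2: ontologic classes
    items: dict[str, str] = field(default_factory=dict)  # Level 3: column -> concept


def discover(registry, keyword=None, ontology_class=None, concept=None):
    """Return the names of resources that match every criterion supplied."""
    hits = []
    for reg in registry:
        if keyword is not None and keyword not in reg.keywords:
            continue
        if ontology_class is not None and ontology_class not in reg.classes:
            continue
        if concept is not None and concept not in reg.items.values():
            continue
        hits.append(reg.name)
    return hits


# Hypothetical registry contents.
registry = [
    Registration("va_gravity", {"gravity"}, {"GravityDatabase"}),
    Registration(
        "va_igneous_rocks",
        {"geochemistry", "igneous rocks"},
        {"BulkRockGeochemistry"},
        {"sio2_wt": "SiO2"},
    ),
    Registration("state_geologic_map", {"geologic maps"}),
]

print(discover(registry, keyword="gravity"))
print(discover(registry, concept="SiO2"))
```

A resource registered only at Level 1 (the geologic map here) can be found by keyword but cannot take part in a concept-level search, which is why the slides treat Level 3 as the requirement for semantic integration.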
6
AGI Index Terms
GEON Index Ontology
http://www.geoscienceworld.org/
Level 1 Registration
7
Ontological Look at Virginia Tech Igneous Rock Database
[Figure: ontological view of the database. Labels shown: Rock, Geologic Images, Methods & References, Isotope, Location, Mineral, Structure. Tables shown: MapReference, References, FeTreatmentMinerals, BulkRockGeochemMethods, AnalyticalMethods, BodyShapes, Fractures, Fabric, RockGeoChemistry, ModelComposition, Images, GeologicLocation, MineralChemistry, Rb_Sr_Isotope_Whole_Rock, Sm_Nd_Isotope_Whole_Rock, U_Th_Pb_Isotope_Whole_Rock, Rb_Sr_Isotope_Mineral, Sm_Nd_Isotope_Mineral, U_Th_Pb_Isotope_Mineral.]
Level 2: Registration at the Item Level
Classes: Mineral, Rock, Element, Isotope, Structure, Location
Level 2 Registration
8
A Section from Planetary Material Ontology
[Diagram: class AnalyticalOxideConcentration, with an association of multiplicity 1 to 0..n]
AnalyticalOxideConcentration
- analyticalOxide : AnalyticalOxide
- concentration : ValueWithUnit
- errorOfConcentration : ValueWithUnit
The GEON approach of registering data to concepts removes structural (format) and semantic heterogeneity.
Level 3 Registration
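To make the heterogeneity claim concrete, here is a hedged sketch of two sources that store the same oxide under different column names and units, both mapped onto the `AnalyticalOxideConcentration` concept from the slide. The source names (`lab_a`, `lab_b`), their columns, and the `REGISTRATIONS` table are assumptions made up for this example; only the class and attribute names come from the ontology section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueWithUnit:
    value: float
    unit: str


@dataclass(frozen=True)
class AnalyticalOxideConcentration:
    analytical_oxide: str
    concentration: ValueWithUnit
    error_of_concentration: ValueWithUnit


# Hypothetical Level 3 registrations: source column -> (oxide, error column, unit).
REGISTRATIONS = {
    "lab_a": {"SiO2_pct": ("SiO2", "SiO2_err", "wt%")},
    "lab_b": {"silica_ppm": ("SiO2", "silica_sd", "ppm")},
}

# Divisors that convert each registered unit to weight percent (10,000 ppm = 1 wt%).
PER_WT_PERCENT = {"wt%": 1.0, "ppm": 10000.0}


def to_concept(source, row):
    """Translate one source row into ontology-level concentration objects."""
    out = []
    for column, (oxide, err_column, unit) in REGISTRATIONS[source].items():
        divisor = PER_WT_PERCENT[unit]
        out.append(
            AnalyticalOxideConcentration(
                oxide,
                ValueWithUnit(row[column] / divisor, "wt%"),
                ValueWithUnit(row[err_column] / divisor, "wt%"),
            )
        )
    return out


# Invented sample rows, for illustration only.
a = to_concept("lab_a", {"SiO2_pct": 72.5, "SiO2_err": 0.5})[0]
b = to_concept("lab_b", {"silica_ppm": 725000.0, "silica_sd": 5000.0})[0]
print(a.analytical_oxide == b.analytical_oxide)
print(abs(a.concentration.value - b.concentration.value) < 1e-9)
```

Once both rows are expressed as the same concept in the same unit, a tool can compare or merge them without knowing either source's layout.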
9
DIA Engine (1): How does GEON discover data?
- Keywords, Resource Type, Temporal, Spatial
- Invoke the GEON protocol for discovering databases
- Discovery, Integration and Analysis Engine
- Retrieve the discovered data from registered databases
- Emphasize geospatial and aspatial discoveries (not everything needs to be done through a map-based browser)
10
DIA Engine (2)
Geoscience Templates: Geologic Map (USA), Geologic Map (States), Terrane Map, Geologic Provinces, Geophysical Map
- Experimental Databases
- Tools
[Diagram labels: Geospatial Engine, Aspatial Engine]
11
High-Level View of the DIA Engine
User specifies class of data for analysis
The DIA Engine derives and retrieves the different data sets needed for the requested analysis
The DIA Engine applies processing and filtering techniques to generate the requested data product
Data products and Query Steps can be saved
[Diagram labels: Raw Data, Query Tool, Data Product, Modeling, Computation]
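The flow described on this slide (retrieve raw data, filter it into a data product, and save the query steps) might be sketched as follows. `QuerySession` and its methods are hypothetical names, and the sample rows are invented; the DIA Engine's real interfaces are not described in the slides.

```python
import json

# Comparison operators a saved step may refer to.
OPS = {
    "==": lambda a, b: a == b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
}


class QuerySession:
    """Hypothetical session that records every filtering step it applies."""

    def __init__(self, raw_data):
        self.records = list(raw_data)
        self.steps = []

    def filter(self, field, op, value):
        self.records = [r for r in self.records if OPS[op](r[field], value)]
        self.steps.append({"field": field, "op": op, "value": value})
        return self

    def data_product(self):
        return list(self.records)

    def save_steps(self):
        """Serialize the query steps so they can be stored and reused."""
        return json.dumps(self.steps)

    @classmethod
    def replay(cls, raw_data, saved_steps):
        """Rebuild a data product by re-running saved steps on raw data."""
        session = cls(raw_data)
        for step in json.loads(saved_steps):
            session.filter(step["field"], step["op"], step["value"])
        return session


# Invented sample rows, for illustration only.
raw = [
    {"body": "B1", "rock_type": "A-Type", "region": "Mid-Atlantic"},
    {"body": "B2", "rock_type": "Other", "region": "Mid-Atlantic"},
    {"body": "B3", "rock_type": "A-Type", "region": "Texas"},
]

session = (
    QuerySession(raw)
    .filter("rock_type", "==", "A-Type")
    .filter("region", "==", "Mid-Atlantic")
)
saved = session.save_steps()
print(session.data_product())
print(QuerySession.replay(raw, saved).data_product() == session.data_product())
```

Saving the steps rather than only the result lets the same query be rerun later against updated raw data.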
12
Data products (1)
Data products can be in the form of interactive maps, interactive filtering diagrams, or Excel data files.
Examples:
- A map showing the A-Type bodies in the Mid-Atlantic region
- An Excel file giving the ages of those A-Type bodies
- A gravity database table spatially related to A-Type bodies, saved as a contoured gravity map
13
Data products (2)
Data products can be:
- Pre-packaged: quickly queried, but not flexible, and they provide little support for complex scientific discovery
- Created dynamically: may require extensive on-the-fly query processing, but enables far richer possibilities for scientific discovery; requires semantic integration
14
Data Integration (1)
Semantic integration of data products requires:
- Ontologies: a common language to interpret data from different sources
- Data sharing: requires data registration
Fine-grained (i.e., item-level) registration is necessary to enable the automatic processing (by tools) of shared data.
15
Data Integration (2)
[Diagram: data owners register raw data against ontologies (geochemistry ontology, geophysics ontology). Integration within an ontological class: ontologically registered data sets 1 and 2, query tool QT 1, data product DP 1. Integration across ontological classes: ontologically registered geochemistry and geophysics data, query tool QT 2, data product DP 2. Integration class: Location.]
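Integration across ontological classes relies on a shared integration class, here Location. As a rough sketch under stated assumptions (planar coordinates in kilometres, hypothetical field names, invented sample values), geochemical samples can be paired with nearby gravity readings like this:

```python
from math import hypot


def integrate_by_location(geochem, geophys, max_km):
    """Attach to each geochemical sample the gravity readings within max_km."""
    product = []
    for sample in geochem:
        nearby = [
            g["gravity_mgal"]
            for g in geophys
            if hypot(g["x_km"] - sample["x_km"], g["y_km"] - sample["y_km"]) <= max_km
        ]
        product.append({**sample, "nearby_gravity_mgal": nearby})
    return product


# Invented records, for illustration only.
geochem = [{"sample": "S1", "x_km": 0.0, "y_km": 0.0, "SiO2_wt_pct": 72.0}]
geophys = [
    {"station": "G1", "x_km": 3.0, "y_km": 4.0, "gravity_mgal": -20.0},
    {"station": "G2", "x_km": 30.0, "y_km": 40.0, "gravity_mgal": -35.0},
]

product = integrate_by_location(geochem, geophys, max_km=10.0)
print(product[0]["nearby_gravity_mgal"])
```

A production system would use geographic coordinates and a spatial index rather than a nested scan; the point is only that both data sets must be registered to the same Location concept before they can be joined.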
16
Limitations of Current Data Sharing Approaches
- Each research group adopts its own acronyms, notations, conventions, units, etc.
- Data sharing is of limited scope
  - Data discovery is ad hoc
  - Only a small community of scientists may be aware of and share a given data set
- Integration is difficult
  - Extensive conversion efforts may be needed
  - Absence of streamlined integration leads to poor ability to answer complex scientific questions
Solution: Ontology-based Data Registration
17
Query Building
Menu-based (used in the demo)
- The GUI lets the user select only specific items, which in turn queries only a subset of the data
- A robust system informs the user of any incorrect input and guides them in the right direction
- Results are guaranteed, as the query is always answered
Text-based
- The entire database can be queried
- Result sets may be empty
- Even a small mistake in the query can return incorrect results, without the user being able to pinpoint the error
18
Menu-based Query Building
- In a selected “region of interest” the user is provided with a number of options (the menu)
- The user clicks through the different menus to build an exact query
- Click history is maintained to enable future referencing
[Diagram labels: Menu #1, Menu #2, Menu #3, Menu #4, Menu #5]
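The menu-driven approach can be sketched as a builder that only accepts options a menu actually offers and keeps the click history. This is an illustrative sketch: `MenuQueryBuilder` and the `MENUS` table are hypothetical, with option values borrowed from region and diagram names that appear elsewhere in the slides.

```python
# Hypothetical menus; the demo's real menus are not listed in the slides.
MENUS = {
    "region": ["Mid-Atlantic", "Wyoming", "Texas"],
    "diagram": ["10000*Ga/Al vs. Zr", "FeO*/MgO vs. Zr+Nb+Ce+Y", "Y vs. Nb"],
}


class MenuQueryBuilder:
    def __init__(self, menus):
        self.menus = menus
        self.history = []  # every accepted click, in order

    def choose(self, menu, option):
        """Record a click, rejecting anything the menu does not offer."""
        if menu not in self.menus:
            raise ValueError(f"Unknown menu {menu!r}; available: {sorted(self.menus)}")
        if option not in self.menus[menu]:
            raise ValueError(
                f"{option!r} is not offered by {menu!r}; choose one of {self.menus[menu]}"
            )
        self.history.append((menu, option))
        return self

    def build(self):
        """Return the exact query; the latest click per menu wins."""
        selected = dict(self.history)
        missing = [m for m in self.menus if m not in selected]
        if missing:
            raise ValueError(f"Selections still needed for: {missing}")
        return selected


builder = MenuQueryBuilder(MENUS)
builder.choose("region", "Texas").choose("region", "Mid-Atlantic")
builder.choose("diagram", "Y vs. Nb")
print(builder.build())
print(builder.history)
```

Because every value comes from a fixed menu, the query cannot contain a misspelled term, which is the guarantee the text-based alternative lacks.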
19
Query Tool Selection
- Tools provided by GEON can be used to answer a query
- OR other geologic tools can be incorporated (invocation interfaces need to be defined)
- Example: GCD-Kit can be used for classification, geotectonic, and normative calculations for igneous rocks
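One way to read "invocation interfaces need to be defined" is that built-in and external tools should present the same call signature. The sketch below assumes a hypothetical `QueryTool` protocol and adapter; it does not model GCD-Kit's real interface, and the wrapped callable is only a stand-in for an external tool.

```python
from typing import Callable, Protocol


class QueryTool(Protocol):
    """Hypothetical invocation interface shared by all tools."""
    name: str

    def run(self, records: list[dict]) -> list[dict]: ...


class FieldFilter:
    """Stand-in for a built-in tool: keep records whose field equals value."""

    def __init__(self, name: str, field: str, value):
        self.name = name
        self.field = field
        self.value = value

    def run(self, records):
        return [r for r in records if r.get(self.field) == self.value]


class ExternalToolAdapter:
    """Wraps any callable so that it satisfies the same interface."""

    def __init__(self, name: str, func: Callable[[list[dict]], list[dict]]):
        self.name = name
        self._func = func

    def run(self, records):
        return self._func(records)


TOOLS: dict[str, QueryTool] = {}


def register(tool: QueryTool) -> None:
    TOOLS[tool.name] = tool


def answer(tool_name: str, records: list[dict]) -> list[dict]:
    return TOOLS[tool_name].run(records)


register(FieldFilter("igneous_only", "rock_class", "igneous"))
register(ExternalToolAdapter("sort_by_sio2", lambda rs: sorted(rs, key=lambda r: r["SiO2"])))

# Invented records, for illustration only.
rocks = [
    {"id": 1, "rock_class": "igneous", "SiO2": 74.0},
    {"id": 2, "rock_class": "sedimentary", "SiO2": 60.0},
    {"id": 3, "rock_class": "igneous", "SiO2": 68.0},
]
print([r["id"] for r in answer("sort_by_sio2", answer("igneous_only", rocks))])
```

With a common interface, the query layer can chain a GEON-provided tool and an incorporated third-party tool without knowing which is which.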
20
Analysis
The data product(s) generated can be analyzed using various techniques:
- Modeling
- Computation
21
Workflow Associated with the Demo
[Workflow diagram. Query: A-Type polygons in a region R using discrimination diagram D? Components shown: user with a Java/VB Script-enabled Web browser; Web server; SDSC; GEON server at Virginia Tech; discrimination functions; rock classification ontology; US National Gazetteer; geospatial data server holding geospatial data; geochemical data server 1 at Virginia Tech (Mid-Atlantic), geochemical data server 2 (Wyoming), and geochemical data server 3 (Texas), each holding geochemical data. Technologies shown: Java/VB Script, ASP.net, VB.net, Visual Basic, ESRI ArcSDE, ESRI ArcGIS Server, MS SQL Server (three instances). Discrimination diagrams shown: 10000*Ga/Al vs. Zr; FeO*/MgO vs. Zr+Nb+Ce+Y; Y vs. Nb.]
22
Used Technologies
- User interface: Java / VB Script, ASP.net, VB.net
- Back end: ESRI ArcGIS Server 9.1, ESRI ArcSDE 9.1 (spatial database), Microsoft SQL Server (geochemical database)
- Functionality coding: Visual Basic (to code the discrimination filters)
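A discrimination filter of the kind mentioned above can be illustrated with the 10000*Ga/Al vs. Zr diagram. The demo coded its filters in Visual Basic; the Python sketch below is not that code. The field names are hypothetical, and the default cutoffs (2.6 and 250 ppm) are values often quoted for this diagram, used here as assumptions because the slides do not state which cutoffs the demo applied.

```python
def ga_al_index(ga_ppm, al_ppm):
    """Compute 10000*Ga/Al from Ga and Al concentrations in the same unit."""
    return 10000.0 * ga_ppm / al_ppm


def is_a_type(sample, ga_al_cutoff=2.6, zr_cutoff=250.0):
    """Rectangular test on the 10000*Ga/Al vs. Zr diagram.

    Cutoff defaults are assumptions, not values taken from the slides.
    """
    index = ga_al_index(sample["Ga_ppm"], sample["Al_ppm"])
    return index > ga_al_cutoff and sample["Zr_ppm"] > zr_cutoff


def filter_a_type(samples, **cutoffs):
    """Keep only the samples that plot in the A-Type field."""
    return [s for s in samples if is_a_type(s, **cutoffs)]


# Invented analyses, for illustration only.
samples = [
    {"id": "S1", "Ga_ppm": 25.0, "Al_ppm": 63500.0, "Zr_ppm": 400.0},
    {"id": "S2", "Ga_ppm": 15.0, "Al_ppm": 80000.0, "Zr_ppm": 120.0},
]
print([s["id"] for s in filter_a_type(samples)])
```

Keeping the cutoffs as parameters lets the same filter be reused with whichever field boundaries a given study adopts.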
23
Demo Starts Here
24
Current Tool Sharing Approaches
- Each research group develops its own tools
- Tools developed by a research group are rarely used by other groups
  - Redundancy of development efforts
- Little interoperability amongst tools
  - Interaction amongst different tools is often not possible or requires extensive (re)coding
Solution: Wrap tools as Web services accessible to the scientific community worldwide
25
The Future: Integration through Ontologies and Web Services
Benefits of Web Services
- Facilitate integration
  - Tools developed independently may easily be integrated into new applications
  - Example: discrimination tools may be made available as Web services
- Provide high reusability
  - More tools available to the research community
  - Reduce development time, effort, and cost
26
Web Services Explained (1)
[Diagram: service providers 1 to 3 expose functions 1 to 3 as Web services over the Web; application providers 1 and 2 serve users. Steps: (1) publish the Web service, with its WSDL service description, to the UDDI registry; (2) discover the Web service in the UDDI registry; (3) invoke the Web service using SOAP messages.]
WS Standards
- WSDL: Web Services Description Language
- UDDI: Universal Description, Discovery, and Integration
- SOAP: Simple Object Access Protocol
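The publish, discover, invoke cycle can be mimicked in a few lines with an in-memory stand-in for the registry. This is only a conceptual sketch: `ServiceRegistry` is a hypothetical class, a real UDDI registry stores service descriptions and network addresses rather than Python callables, and the invocation would travel as SOAP messages.

```python
class ServiceRegistry:
    """In-memory stand-in for a service registry (not a UDDI implementation)."""

    def __init__(self):
        self._services = {}

    def publish(self, name, description, endpoint):
        """Step 1: a provider lists a service with a text description."""
        self._services[name] = (description, endpoint)

    def discover(self, keyword):
        """Step 2: a consumer searches the descriptions."""
        keyword = keyword.lower()
        return sorted(
            name
            for name, (description, _) in self._services.items()
            if keyword in description.lower()
        )

    def invoke(self, name, *args, **kwargs):
        """Step 3: the consumer calls the service it found."""
        _, endpoint = self._services[name]
        return endpoint(*args, **kwargs)


def trace_element_sum(zr, nb, ce, y):
    """Hypothetical service: the Zr+Nb+Ce+Y axis of a discrimination diagram."""
    return zr + nb + ce + y


registry = ServiceRegistry()
registry.publish("trace_element_sum", "Discrimination axis: Zr+Nb+Ce+Y", trace_element_sum)
registry.publish("echo", "Returns its input unchanged", lambda x: x)

found = registry.discover("discrimination")
print(found)
print(registry.invoke(found[0], zr=400, nb=30, ce=120, y=60))
```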
27
Web Services Explained (2)
WSDL (the service provider describes the service using WSDL)
- An XML-based language to describe the capabilities of Web services
- The capabilities of a Web service are described as a set of endpoints that can exchange messages
- WSDL is a separate specification from UDDI; a UDDI registry entry can point to a service's WSDL description
UDDI (the service provider publishes the service using UDDI)
- A Web-based directory where service providers may list their services and where service consumers may retrieve the services published by the providers (like yellow pages)
SOAP (clients and services communicate using SOAP)
- An XML-based protocol used to encode the messages (requests and responses) exchanged between a Web service and its clients
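To show what "an XML-based protocol used to encode the messages" looks like in practice, the sketch below builds and parses a SOAP 1.1 request envelope with Python's standard library. The envelope namespace is the standard SOAP 1.1 one; the service namespace, the `ClassifyRock` operation, and its parameters are hypothetical and stand in for a discrimination tool wrapped as a Web service.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/geon/discrimination"   # hypothetical service namespace


def build_request(operation, params):
    """Encode an operation call as a SOAP envelope string."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Decode the operation name and parameters on the service side."""
    root = ET.fromstring(xml_text)
    body = root.find(f"{{{SOAP_NS}}}Body")
    call = body[0]  # first child of Body is the operation element
    operation = call.tag.split("}", 1)[1]
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


message = build_request("ClassifyRock", {"Ga_ppm": 25.0, "Zr_ppm": 400.0})
print(message)
print(parse_request(message))
```

In a deployed service the envelope would be sent over HTTP to the endpoint named in the service's WSDL description; parameter values arrive as text and must be converted back to numbers by the receiver.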
28
Hypothesis Evaluation: Are A-Type Rocks in Virginia Related to a Hot Spot Trace?
[Summary diagram. Stages: Discovery (geospatial query, aspatial query), Integration, Analysis. Within the same ontologic class: ontologically registered geochemical data for VA, WY, and TX yield a data product (A-Type identification). Between different ontologic classes: ontologically registered geochemical, geophysics, and geologic time data yield a data product.]