agINFRA science gateway for workflows and integrated services
07/02/2012
Robert Lovas
MTA SZTAKI
Why workflows?
• ‘Orchestration of tasks’ (not only in sequence; organized in Directed Acyclic Graphs)
• Data-driven
• To exploit large computational resources / process large data sets
• To make complex applications run faster
– By applying parallelization techniques to them
• Parallelization techniques:
– Independent tasks can be executed concurrently
– Execute tasks against LARGE datasets: parameter study, domain decomposition, etc.
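A minimal sketch of the idea that independent tasks in a Directed Acyclic Graph can run concurrently. It uses only the Python standard library (Python 3.9+); it is not the gUSE workflow engine. The function `run_dag` and the task names `split`, `a`, `b`, `merge` are hypothetical and chosen for illustration only.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_dag(dependencies, tasks, max_workers=4):
    """Run tasks in dependency order; tasks whose predecessors are done run concurrently.

    dependencies: {task_name: set of predecessor task names}
    tasks: {task_name: callable taking a dict of predecessor results}
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    results = {}
    running = {}  # future -> task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every task whose predecessors have all finished.
            for name in sorter.get_ready():
                inputs = {p: results[p] for p in dependencies.get(name, ())}
                running[pool.submit(tasks[name], inputs)] = name
            # Block until at least one running task finishes.
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                results[name] = future.result()
                sorter.done(name)  # may unlock successor tasks
    return results


# Hypothetical four-task workflow: "a" and "b" are independent of each other.
dependencies = {"split": set(), "a": {"split"}, "b": {"split"}, "merge": {"a", "b"}}
tasks = {
    "split": lambda inputs: [1, 2, 3, 4],
    "a": lambda inputs: sum(inputs["split"][:2]),
    "b": lambda inputs: sum(inputs["split"][2:]),
    "merge": lambda inputs: inputs["a"] + inputs["b"],
}
results = run_dag(dependencies, tasks)
```

Here `a` and `b` become ready at the same time once `split` finishes, so they are submitted together; `merge` is released only after both are done.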
Parameter study / domain decomposition
[Figure: a GEN job generates the input parameter space, parallel SEQ jobs run as parameter sweep jobs, and a COLL job evaluates the results of the simulation]
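The generator / parameter sweep / collector pattern can be sketched locally as follows. This is an illustration under stated assumptions, not portal code: a thread pool stands in for the Grid resources, the toy `simulate` function stands in for a real simulation, and the names `generate`, `simulate`, `collect` and the parameter names `x`, `y` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def generate(grid):
    """GEN: expand a grid of parameter values into individual parameter sets."""
    names = sorted(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]


def simulate(params):
    """SEQ: one parameter sweep job; a toy formula stands in for the simulation."""
    return {"params": params, "score": (params["x"] - 2) ** 2 + params["y"]}


def collect(outputs):
    """COLL: evaluate all results, here by picking the lowest score."""
    return min(outputs, key=lambda output: output["score"])


grid = {"x": [0, 1, 2, 3], "y": [0.5, 1.5]}
parameter_sets = generate(grid)  # 4 x 2 = 8 parameter sets

# Each parameter set is an independent job, so the sweep can run in parallel.
with ThreadPoolExecutor(max_workers=4) as pool:
    outputs = list(pool.map(simulate, parameter_sets))

best = collect(outputs)
```

The sweep jobs share no state, which is what makes this pattern easy to distribute: only the generator and the collector see the whole parameter space.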
Liferay-based WS-PGRADE/gUSE portal for agINFRA:
– http://aginfra-portal.lpds.sztaki.hu/liferay-portal-6.0.5/
– Open registration for project participants
– X.509 certificate authentication required to be able to submit jobs (NEW: robot certificates)
– agINFRA VO is accessible (ca. 5000 CPU cores, >50 TB storage)
Access modes
[Figure: access modes A to E to the gUSE workflow engine]
• User interfaces: WS-PGRADE WF Developer UI, WS-PGRADE End-User UI, customized (application-specific) UI, other existing application-specific UI
• Interfaces: Remote API, ASM API, BES interface
• Back end: gUSE workflow engine, gUSE DCI Bridge, DCI 1 … DCI n
• Resources: agINFRA VO, volunteers’ computers
Workflow building blocks (glossary)
• “Jobs” operating on data
• Jobs can be:
– Grid-enabled applications (e.g. AgrovocTagging)
– Web services
– NEW: REST services
– Other workflows (embedded)
• “Ports” representing inputs and outputs for the jobs
– Available port types:
• Value
• Local file, e.g. from the scientist’s laptop
• Remote file (gsiftp, lfc) in the Grid
• Database queries (SQL)
• …
– Extensions or improvements might be required (e.g. Drupal / Dublin Core / SPARQL / CIARD RING support?)
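The glossary can be read as a small data model: a job has a kind and a set of typed input and output ports. The sketch below is only an illustration of that reading; it is not the gUSE or WS-PGRADE data model, and the class names, the port names, and the file locations are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class JobKind(Enum):
    GRID_APPLICATION = "grid-enabled application"
    WEB_SERVICE = "web service"
    REST_SERVICE = "REST service"
    EMBEDDED_WORKFLOW = "embedded workflow"


class PortType(Enum):
    VALUE = "value"
    LOCAL_FILE = "local file"
    REMOTE_FILE = "remote file"
    SQL_QUERY = "database query"


@dataclass
class Port:
    name: str
    type: PortType
    source: str = ""  # value, path, URL, or query text, depending on the type


@dataclass
class Job:
    name: str
    kind: JobKind
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


# Hypothetical description of a tagging job; the file locations are made up.
tagger = Job(
    name="AgrovocTagging",
    kind=JobKind.GRID_APPLICATION,
    inputs=[Port("document", PortType.LOCAL_FILE, "paper.txt")],
    outputs=[Port("tags", PortType.REMOTE_FILE, "lfc:/example/tags.txt")],
)
```

Keeping the port type separate from the job kind mirrors the slide: the same job can read from a laptop file in one run and from Grid storage in another by changing only its ports.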
agINFRA overview
Services to be integrated
Harvesting, validation, transformation
The Organic.Edunet Ingest Workflow
Schematic Representation of the AGRIS workflow
Cross-community workflows identified at the Athens
DEMONSTRATION I.
AgroTagger
First demo application
First demo application - details
Job details
Inputs and outputs
Monitoring of execution
Successful execution of AgrovocTagging application
DEMONSTRATION II.
Harvesting workflow
ARIADNE aggregation panel
- Select scheduling
- Link to the workflow interface
- Invoke the workflow
- Check the status of the workflow
- Stop the workflow
- Add metadata for the aggregation
gUSE WS-PGRADE
[Figure: harvesting workflow with the decision points “Target schema?” and “Valid?” (Yes / No branches)]
- Harvesting: add parameters of the agDataHarvesters web service
- Metadata validation vs. target schema: add parameters of the agMetadataValidation web service
- Target validation: add parameters of the agTargetValidation web service
- Transformation: get parameters for the transformation and invoke the agMetadataTransformation web service
- Store metadata on the Grid (agINFRA VO)
- Stop the process for the specific target; send a message, or store the logs and send them through the gUSE API
starting a pre-defined procedure as an agINFRA workflow
Integration of multiple components/services
Details in demo…
Plan: Continue the development of aggregation workflow
Plan: Further integration
+ volunteers…
Questions?