gtlab: grid tag libraries supporting workflows within science gateways

23
GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways Mehmet A. Nacar, Marlon E. Pierce, Geoffrey C. Fox Community Grids Lab Indiana University

Upload: deiter

Post on 17-Jan-2016

28 views

Category:

Documents


0 download

DESCRIPTION

GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways. Mehmet A. Nacar, Marlon E. Pierce, Geoffrey C. Fox Community Grids Lab Indiana University. Grid-style portal as used in Earthquake Grid. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

Mehmet A. Nacar, Marlon E. Pierce, Geoffrey C. Fox

Community Grids LabIndiana University

Page 2: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

Now to Portals22

Grid-style portal as used in Earthquake GridThe Portal is built from portlets

– providing user interface fragments for each service that are composed into the full interface – uses OGCE technology as does planetary science VLAB portal with University of Minnesota

Page 3: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

Introduction to GTLAB• Grid Tag Libraries and Beans (GTLAB)• Encapsulates clients to common Grid services as XML tag libraries and

server side Java beans.– Embedded by portlet developers in their portlet pages to invoke common tasks– Specification of the composite action you want to occur when a user hits the

submit button.– Allows portal developers to concentrate on the user interface components.

• Tags can be arranged in directed acyclic graphs (dependency chains).– These represent simple workflows.

• Based on extensions to Java Server Faces (JSF) and the Java CoG Kit.• Implements a component model for UI’s that is more powerful than

portlets

Page 4: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

Motivation• OGCE Grid portlets typically wrap each single Grid capability in a

separate portlet – GridFTP-->GridFTP Portlet

• Gateway portlets encapsulate sophisticated but specialized functionality.– Such as submitting multiple linked jobs– Each functionality requires a separate portlet

• We need a middle way to “automatically” build complex functionality from component parts– Traditional portlets don’t work as no agreed way to combine component portlets

into an integrated user capability– We need a component model for portlets: reusable portlet parts

• JSF is our starting point– Removes dependencies on the Servlet API.– Backend software is just beans, so can be reused more easily outside of web and

portlet applications.• JSF also provides an extensible framework (tag libraries)• Apache JSF portlet bridge allows you to convert standalone JSF

applications (development phase) into portlets (deployment phase).

Page 5: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

ArchitectureOverview

• Grid portals are client to backend codes through Web/Grid services.• Grid tags are part of user interface tier and embedded into portlet container. • Grid tags use local services in Apache Tomcat to manage sessions and handlers.• Grid beans implement a layer on top of Grid client APIs such as Java CoG• Implies that “portal” is quite sophisticated as must support integration• Note CoG kit is here to act as a broker to hide complexities of evolving services (e.g. different versions of Globus)• “violates service model” in that core software centralized in portal application

5

Page 6: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

GTLAB Example

<html><body> < f:view><!-- Other user interface tags go here-->

<f:form> <o:submit id=”test” action=”next_page”>

<o:myproxy id=”pr” hostname=”gf1.ucs.indiana.edu” port=”7512” lifetime=”2” username=“mnacar” password=”***” />

<o:jobsubmit id=”task” hostname=”cobalt.ncsa.teragrid.org” provider=”GT4” executable=”/bin/ls” stdout=”tmp/result

stderr=”tmp/error” /> </o:submit> </f:form>

</f:view></body> </html>

• Grid tags are associated with Grid services via Grid beans• Grid Beans wrap the Java COG Kit (version 4)

• We show an example JSF page section below.• This allows you to develop new Grid portlets with no additional Java code.

Page 7: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

GTLAB features• GTLAB provides common components for building portlets using tags and

reusable parts.• The goal of GTLAB to simplify Grid portlet development

– Enable rapid development from reusable components• GTLAB capabilities include Grid operations with XML based tags within

JSF framework.• Grid tag libraries are built using JSF custom component development

techniques• Grid tags are interfaces to backing Grid beans

– End users pass values to Grid beans by using tag attributes.• Each backing Grid bean has equal capability with a portlet application in

case of Grid portlet approach.– GridFtp tag generates GridFTP Portlet

• But tags allow service backend component model to map into an intermediate UI component models – each tag set is a component– The components tags can easily be composed into a rich UI which portlets cannot

do7

Page 8: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

Grid Tags Associated Grid Beans Features

<submit/> ComponentBuilderBean Creating components, job handlers, submitting jobs. This is visually rendered as HTML

<handler/> MonitorBean Handling monitoring page actions

<multitask/> MultitaskBean Constructing simple workflow

<dependency/> MultitaskBean Defining dependencies among sub jobs

<myproxy/> MyproxyBean Retrieving myproxy credential

<fileoperation/> FileOperationBean Providing Gridftp operations

<jobsubmission/> JobSubmitBean Providing GRAM job submissions

<filetransfer/> FileTransferBean Providing Gridftp file transfer

(Other JSF UI Tags)

ResourceBean Describes common properties among all tags and beans. Passing values given by standard visual JSF components.

Page 9: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

VLAB Computational Chemistry Portal

Page 10: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

VLAB: The Virtual Laboratory for Earth and Planetary Materials

• Primarily a traditional job submission, monitoring, and management portal.– Originally tried to use first generation of OGCE where all capabilities are portlets

• Collaborative Grid services and portals support computational material science.

• VLAB Challenges:– Generic GRAM job submission portlet does not fit with VLab requirements

• Quantum Espresso requires parameter entries and checking

– Grid Portlets must be easy to develop using component libraries.• Transfer data files in and out of the desired remote host.• Run one or more executables.• Keep track of job progress• Store all of the information as “job archive” for reproducibility.

10

Page 11: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

Vlab portlet pages

• VLab portlets include job preparation and submission, job monitoring• Vlab portlet pages can handle sophisticated user interfaces

Page 12: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

Case Study: VLAB Portal

12

Page 13: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

Encoding DAGs in Portlets

depends

depends

Input

Multi-staged task

Task DFile Transfer

Task AFile Operation

(mkdir)

Task BFile Transfer

Task CJob Submission

Output

depends

• Multitask provides a simple Directed Acyclic Graph (DAG)

• This example demonstrates a composite Grid job using multi-staged multitask

• GTLAB handles lifecycle of DAG within JSF application

Page 14: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

<o:submit id=”test” action=”next_page” /> <o:multitask id=”mytask” taskname=”test” persistent=”true” > <o:myproxy id=”pr” hostname=”gf1.ucs.indiana.edu” port=”7512” lifetime=”2”

username=“nacar” password=”***” /> <o:fileoperation id=”taskA” command=”mkdir” hostname=”cobalt.ncsa.teragrid.org” path=”/home/manacar/tmp/” /> <o:filetransfer id=”taskB”

from=”gridftp://gf1.ucs.indiana.edu:2811/home/manacar/input_file” to=”gridftp://cobalt.ncsa.teragrid.org:2811/home/manacar/input_file” /> <o:jobsubmit id=”taskC” hostname=”cobalt.ncsa.teragrid.org” provider=”GT4”

executable=”/bin/execute” stdin=”tmp/input_file” stdout=”tmp/result” stderr=”tmp/error” /> <o:filetransfer id=”taskD”

from=”gridftp://cobalt.ncsa.teragrid.org:2811/home/manacar/tmp/result” to=” gridftp://gf1.ucs.indiana.edu:2811/home/manacar/result” /> <o:dependency id=”dep1” task=”taskB” dependsOn=”taskA” /> <o:dependency id=”dep2” task=”taskC” dependsOn=”taskB” /> <o:dependency id=”dep3” task=”taskD” dependsOn=”taskC” /> </o:multitask></o:submit>

DAG Example JSF Page

This encodes the DAG on the previous page.

Page 15: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

• Grid tags are embedded into JSF view pages and decorated with standard JSF form, input, output and button components– Grid components are non-visual.

15

GTLAB

Standard JSF

Page 16: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

DAG extensions: Condor DAGMan• GTLAB architecture is extensible

– Besides Globus Grid services support, Grid community also use other common Grid services such as Condor

• We extend GTLAB to support Condor capabilities– Condor DAGMan is a tool for complex application

workflows on Condor– Birdbath is Web services provider of Condor capabilities– GTLAB tags for DAGMan use Birdbath service to create

client stubs• Grid tags integrate DAGMan with the following tags:

– <o:condorDagman/> and <o:condorSubmit/>• GTLAB executes and monitors DAGs by Condor• Composing DAGMan workflow is out of scope.

16

Page 17: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

Taverna workflows• DAGs provide advantages to pipe Grid invocations (dataflow)• But DAGs cannot support full-fledged workflow capabilities like

conditional branches and loops. – We studied Taverna as a test case.– We can investigate Kepler and BPEL implementations as future

extensions• Workflow systems have different features:

– Composing– Enacting– Monitoring

• GTLAB supports Taverna enactment and monitoring.• GTLAB imports well studied built-in workflows collected by the

community– Bioinformatics workflows and their metadata is available

• Workflow composition is out of scope of this work– There is much ongoing research in this area

17

Page 18: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

Taverna use case

• A user interacts with a workflow portlet (left of picture) to utilize Taverna enactor.

• User provides parameters by submitting a web form that start the sequence of events.

April 21, 2023 Mehmet Nacar 18

JSF XMLGrid tags Scufl document

load scufl document

JSF Action Taverna bean

Taverna enactor

execute

execute

submit

load input parameters

extract grid tags

provide input parametersHTTP

(1)

(3)

(2) (4)

(4)

(5)

Page 19: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

Advantages of GTLAB• GTLAB provides simplicity to develop science portals

– Rapid development– Easy deployment

• Grid tags provide rich selection of attributes to initialize Grid beans.

• Composite tasks can contain an unlimited number of subtasks

• GTLAB gives flexibility to developers to use their own Grid beans library or add more Grid beans to the existing ones.– Following the method name convention of GTLAB

• Grid beans also can be imported to any presentation logic– GCEShell is a command-line tool

• Grid bean methods are bound to tags with attributes to simplify the building of new Grid portlets

Page 20: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

• Total overhead = Tsubmit-Trequest = 156 msec (that includes JSF overhead)

• Average overhead of GTLAB is about few milliseconds• GTLAB does not add up significant delay on processing the requests. 20

GTLAB Processing JSF Processing Handler storing Submitting

Time (msec) 2 153 1 410

GTLAB Overhead

Page 21: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

GTLAB Related Work

• GridSphere’s Grid Portlets 1.3 – Grid Portlets 1.3 provide API and User Interface (UI) tags to

build Grid portlets– An effort called Vine (Portlet Vine) refactors Grid portlets and

decouples the portlets from GridSphere• Karajan

– XML based workflow language and engine for Grid computing– Built on Java CoG Kit – Requires additional effort to aggregate it into Grid portals

• MyGrid Portlet Interface (MPI)– Based on Taverna workflows (Scufl)– Workflow execution and monitoring portlets

Page 22: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

Conclusion and Future Work

• Described the process for creating Grid portlets using GTLAB.– We gained rapid development by using reusable

components• GTLAB can handle composite Grid operations• Extended our approach to support different workflow

engines.– Condor DAGMan and Taverna

• Investigate integrating AJAX and more generally gadget architectures (Web 2.0 portals) with our session management and monitoring.

22

Page 23: GTLAB: Grid Tag Libraries Supporting Workflows within Science Gateways

More Information

• GTLAB version 1.0 Beta release available at– http://grids.ucs.indiana.edu/users/manacar/GTLAB

-website

• See link from main OGCE web site– http://www.collab-ogce.org

• Contact OGCE: [email protected]• Contact Mehmet: [email protected]

23