1 p-grade portal tutorial at egee'09 gergely sipos mta sztaki egee training and induction

Post on 17-Jan-2018

222 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

DESCRIPTION

3 Workflow The automation of a business process, in whole or part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules to achieve, or contribute to, an overall business goal. Workflow management system (WFMS) is the software that does it Workflow Reference Model, 19/11/1998

TRANSCRIPT

1

P-GRADE Portal tutorial at P-GRADE Portal tutorial at EGEE'09EGEE'09

www.lpds.sztaki.hu/gasuc www.portal.p-grade.hu

Gergely SiposMTA SZTAKI

sipos@sztaki.hu

EGEE Training and InductionEGEE Application Porting Support

2

Agenda of the morningAgenda of the morning

• Introduction to workflow concept• Workflow hands-on

~ Break

• Parameter studies• Parameter study hands-on

• Further information and next steps

3

WorkflowWorkflow

The automation of a business process, in whole or part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules to achieve, or contribute to, an overall business goal.

• Workflow management system (WFMS) is the software that does it

www.wfmc.org

Workflow Reference Model, 19/11/1998

4

Why use workflowWhy use workflowss in Grid? in Grid?

• Build distributed applications through orchestration of multiple services

• A single job or a single service is good for nothing…

• Integration of multiple teams involved• Collaborative work

• Unit of reusage• (E-)science requires traceable, repetable analysis

• (Typically) ease of use grids• Graphical representation

9

Grid WFMSGrid WFMS

Source: Jia Yu and Rajkumar Buyya: A Taxonomy of Workflow Management Systems for Grid Computing, Journal of Grid Computing, Volume 3, Numbers 3-4 / September, 2005

15

(Some of the) available grid (Some of the) available grid workflow systemsworkflow systems

http://www.gridworkflow.org Categories for

– Composition tools – Description languages

• Scientific• Industrial• Formalism

– Engines

Some relevant tools for ARC, gLite, Globus, UNICORE grid users• Condor DAGMan

– Used as an enactor in P-GRADE Portal, Pegasus, …– Uses DAGMan WF language (DAG = Directed Acyclic Graph)

• MOTEUR– Interfaced with “pilot job” framework on EGEE (pull style job execution)– Uses SCUFL WF language

• gLite WMS– Describe workflows in JDL– Share Input-Output sandboxes with multiple jobs

• Taverna– Mainly for cluster computing– ARC interface is available by Lubeck University

• …

16

P-GRADE PortalP-GRADE Portal

A Grid WFMS

www.portal.p-grade.hu

17

Short History of P-GRADE portalShort History of P-GRADE portal

• Parallel Grid Application Development Environment

• Initial development started in the Hungarian SuperComputing Grid project in 2003

• It has been continuously developed since 2003• Around 30 manyear development + training + user support

• Detailed information: http://portal.p-grade.hu/ • Open Source community development since

January 2008: https://sourceforge.net/projects/pgportal/

• Current version: 2.8

18

Current Current P-GRADE P-GRADE Portal Portal related projectsrelated projects

• GGF GIN (Since 2006)– Providing the GIN Resource Testing portal

• EU EGEE-II, EGEE-III (2006-2010)– Tool recommended for application development– Intensively used in new users’ training

• EU SEE-GRID-SCI (2008-2010)– Interfacing to DSpace-based workflow storage– Infrastructure testing workflows

• EU CancerGrid (2007-2009)– Development of new generation P-GRADE (gUSE

and WS-PGRADE)– Integration with desktop grids

• EU EDGeS (2008-2009)– Transparent access to Desktop Grid systems

19

Portal installationsPortal installations

P-GRADE Portal services:– SEE-GRID infrastructure– Several VOs of EGEE:

• Biomed, Astronomy, Central European, NA4,...– GILDA: Training VO of EGEE– Many national Grids (UK National Grid Service,

HunGrid, Turkish Grid, etc.)– US Open Science Grid, TeraGrid– OGF Grid Interoperability Now (GIN) VO– …

Portal services and account request:http://portal.p-grade.hu/index.php?m=3&s=0 Account request form on portal login page

20

Multi-Grid portal installation:Multi-Grid portal installation:www.lpds.sztaki.hu/multi-gridwww.lpds.sztaki.hu/multi-grid

21

Design principlesDesign principles of P-GRADE portalof P-GRADE portal

• P-GRADE Portal is not only a user interface, it is a – General purpose– Workflow-level – Multi-Grid – Application Development and Execution Environment

• P-GRADE Portal includes a high-level middleware layer for orchestrating jobs on grid resources – inside a grid– among several different grids (and several VOs)

• P-GRADE Portal is grid-neutral:– Unlike many existing grid portals it is not tailored to any particular grid

type– Can be connected to various grids based on different grid middleware

• LCG-2, gLite, GT2, GT4, ARC, Unicore, etc.– Implements the high-level grid middleware services on top of the

existing grid middleware services– The workflow interface is the same no matter which type of grid is

connected to it

22

What is a P-GRADE Portal workflow?What is a P-GRADE Portal workflow?

• A directed acyclic graph where– Nodes represent jobs (batch

programs to be executed on a computing element)

– Ports represent input/output files the jobs expect/produce

– Arcs represent file transfer operations

• semantics of the workflow:– A job can be executed if all

of its input files are available

23

Three levels of parallelismThree levels of parallelism

– PS workflow level: Parameter study execution of the workflow

– Workflow level: Parallel execution among workflow nodes (WF branch parallelism)

Multiple jobs run parallel

Each job can be a parallel program

– Job level: Parallel execution inside a workflow node (MPI job as workflow component)

Multiple instances of the same workflow process

different data files

24

~100independent

jobs torun

Example: Computational ChemistryExample: Computational Chemistry

Department of Chemistry, University of Perugia

SOLUTION OF SCHRODINGER EQUATION FOR TRIATOMIC SYSTEMS USING TIME-DEPENDENT (RWAVEPR) OR TIME INDEPENDENT (ABC) METHOD

A single execution can be between 5 hours and 10 hours

SEQUENTIAL FORTRAN 90

Many simulations at the same time

Full story: EGEE Grid Application Porting Support - http://www.lpds.sztaki.hu/gasuc/index.php?m=7&s=3

25

Typical user scenarioTypical user scenarioJob compilation phaseJob compilation phase

Portalserver

Gridservices

DOWNLOAD BINARI(ES)

UPLOAD JOB SOURCE(S)

Client COMPILE – EDIT

26

Typical user scenarioTypical user scenarioWorkflow development phaseWorkflow development phase

Portalserver

Gridservices

START EDITOR

OPEN & EDIT WORKFLOW

ADD BINARIES

SAVE WORKFLOW

Client

DSpace WFrepository

IMPORT WORKFLOW

27

MyProxyCertificate servers

Portalserver

Gridservices

TRANSFER FILES, SUBMIT JOBS

DOWNLOAD (SMALL)

RESULTS

DOWNLOAD (SMALL) RESULTS

Typical user scenariosTypical user scenarios Workflow execution phaseWorkflow execution phase

VISUALIZE JOBS and

WORKFLOW PROGRESS

MONITOR JOBS

DOWNLOAD PROXY CERTIFICATES

Client

28

Accessing local and remote filesAccessing local and remote files

Portalserver

Gridservices

Computing elements

Storage elements and File catalogs

REMOTE INPUTFILES

REMOTE OUTPUT

FILES

LOCAL INPUT FILES

& EXECUTABLES

LOCAL OUTPUT

FILES

LOCAL INPUT FILES

& EXECUTABLES

LOCAL OUTPUT

FILES

Only the permanent

files!

Use legacy executables with Grid files without touching the code

29

Extended DAGMan

Java Webstartworkflow editorWeb browser

EGEE, Globus (and ARC) Grid services + MyProxy service (gLite WMS, LFC,…; Globus GRAM, …)

Globus and gLite command line clients + scripts

P-GRADE PortalP-GRADE Portal structural overviewstructural overview

Extended DAGMan WF specification

Globus GIISgLite BDII

DSpacerepository

30

Web interface - PortletsWeb interface - Portlets

31

Email notificationsEmail notifications

NOTIFY

32

Workflow portletWorkflow portlet

WORKFLOW EDITOR

33

Graphical workflow editingGraphical workflow editing

• To define a graph:1. Drag & drop components:

jobs and ports2. Define their properties3. Connect ports by

channels (no cycles, no loops)

System generates JDL for each job automatically

34

Workflow Workflow EditorEditorProperties of a jobProperties of a job

Properties of a job:• Executable file• Type of executable

(Sequential / Parallel)• Command line parameters• Which resource to use?

• Which VO?• Broker or Computing

element?

35

Workflow Workflow EditorEditorDefining input-output filesDefining input-output files

File propertiesType: input: the executable reads output: the executable generatesFile type: local: comes from my desktop remote: comes from an SEFile: location of the fileInternal file name: Executable uses this e.g. fopen(“file.in”, …)File storage type (output files only): Permanent: final result Volatile: temp. data channel

36

• Client side location:result.dat

• LFC logical file name(LFC file catalog is required – EGEE VOs) lfn:/grid/gilda/sipos/11-04_-_result.dat

• GridFTP address (in Globus Grids):gsiftp://somengshost.ac.uk/mydir/result.dat

Local fileLocal file

Remote fileRemote file

How to refer to an I/O file?How to refer to an I/O file?

• Client side location:c:\experiments\11-04.dat

• LFC logical file name(LFC file catalog is required – EGEE VOs) lfn:/grid/gilda/sipos/11-04.dat

• GridFTP address (in Globus Grids):gsiftp://somengshost.ac.uk/mydir/11-04.dat

Input file Output file

37

Upload a workflow from client side Upload a workflow from client side or from FTP serveror from FTP server

UPLOAD

STORED on FTP server

38

Importing an applicationImporting an application

INCOMPLETE WORKFLOW Open it in editor and save it again

39

Import a workflow from DSpace Import a workflow from DSpace repositoryrepository

40

External access to DSpaceExternal access to DSpacehttp://pgrade-dspace.sztaki.huhttp://pgrade-dspace.sztaki.hu

41

Certificate and proxy Certificate and proxy management Portletmanagement Portlet

42

OGF GIN interoperability portal by P-GRADEAcccessing Globus, gLite and ARC based grids/VOs simultaneously

P-GRADEGEMLCA

Portal

GEMLCA GEMLCA RepositoryRepository

P-GRADEportal

Proxy 1

Proxy 2

Proxy 5

Proxy 4

Proxy 3

Proxy 6

43

Application executionApplication execution

44

Fault-tolerant executionFault-tolerant execution

• Utilizing– Condor DAGMan’s rescue mechanism– EGEE job resubmission mechanism of WMS

• If the EGEE broker leaves a job stuck in a CEs’ queue, the portal automatically – kills the job on this site and – resubmits the job to the broker by prohibiting this site.

• As a result – the portal guarantees the correct submission of a job

as long as there exists at least one matching resource

– job submission is reliable even in an unreliable grid

45

Information system visualizationInformation system visualization

46

LFC-SELFC-SE file browser portlet file browser portlet

47

Compilation supportCompilation support

48

WORKFLOW HANDS-ONWORKFLOW HANDS-ON

49

From workflows to From workflows to parameter studiesparameter studies

Advanced execution patterns

50

Scaling up a workflow to a Scaling up a workflow to a parameter studyparameter study

Complete workflow

P-GRADE Portal:Files in the same LFC catalog

(e.g. /grid/gilda/sipos/myinputs)

P-GRADE Portal:Results produced in

the same catalog

51

Advanced parameter studiesAdvanced parameter studiesGenerator

component(s)Initial input data

Generate orcut input into smaller pieces

Collector component(s)

Aggregate result

Complete workflow

P-GRADE Portal:Files in the same LFC catalog

(e.g. /grid/gilda/sipos/myinputs)

P-GRADE Portal:Results produced in

the same catalog

52

Concept of parameter study Concept of parameter study workflowsworkflows

GEN

SEQ

COLL

SEQSEQSEQ

Parameter study part

Collector part evaluates and

integrates the results

Generator part generates the

input parameter space

53

Turning a WF into a parameter studyTurning a WF into a parameter study

By switching at least one of the open input ports

into a “PS Input port” the WF is turned into a Parameter Study

54

Input-output files are stored in SEsInput-output files are stored in SEs/grid/gilda/sipos/InputImages Image.0 Image.1

/grid/gilda/sipos/XCoordinates XCoordinate.0 XCoordinate.1

/grid/gilda/sipos/YCoordinates YCoordinate.0 YCoordinate.1

/grid/gilda/sipos/Output ImagePart.0 ImagePart.1 . . .

2 x 2 x 2 = 8 execution of the whole workflow

CROSS PRODUCT of data items

55

A B

Typical data-flow compositionsTypical data-flow compositions

A X B

MWF

A1

A2

A3

B1

B2

B3

{A1, A2, A3} {B1, B2, B3}

XWF

A1

A2

A3

B1

B2

B3

{A1, A2, A3} {B1, B2, B3}

dot iterator:one-to-one

cross iterator:all-to-all

WF

Ai Bj

{A1, A2, A3}

match iterator

If Ai and Bj have acommon ancestor

{B1, B2, B3}

A M B

CROSS ITERATOR DOT ITERATOR MATCH ITERATOR

Find these in e.g. TAVERNA, MOTEURP-GRADE Portalsupports this

56

PS Input PortPS Input Port

Grid Directory instead of

FILE reference

57

Parameter generatorParameter generator

Generator can be attached to any parameter input port

Generator can be• Auto generator: to generate text files• Custom generator: to generate any content

Generated files are moved into SE by the portal

58

Definition Window of Auto Generator JobDefinition Window of Auto Generator Job

User defines the template of the text file

User puts key(s) into the template

User defines values for the key(s)• Integer number• Real number• Custom set• …

59

PPlacement of resultlacement of result

60

Will contain one compressed file for each execution of the workflow.

Use the default value!

Choose a „reliable” Storage Element

PPlacement of resultlacement of result

61

Executing PS workflowsExecuting PS workflows

PS Details for parameter sweep

workflows applications

62

Detailed view of a PS workflowDetailed view of a PS workflow

Workflow instances

Overall statistics of workflow instances

Collector job(s)

Generator job(s)

63

PARAMETER STUDY PARAMETER STUDY HANDS-ONHANDS-ON

64

Thank you!Thank you!

www.portal.p-grade.hupgportal@lpds.sztaki.hu

Learn once, use everywhereDevelop once, execute anywhere

65

Backup slides to answer Backup slides to answer questionsquestions

66

Proxy delegations Proxy delegations MyProxy

server

P-GRADE Portalserver GILDA

services

Proxy VOMSserver

ProxyProxy

VOMS ext.

Proxy

VOMS ext.

usernamepassword

Proxy based authentication

Login & psw based

authentication

usernamepassword

67

SettingsSettings

Portal administrator can – connect the portal

to several grids– register default

resources of the connected grids

68

SettingsSettings

User can customize the connected grids by adding and removing resources

top related