Pegasus-a framework for planning for execution in grids Karan Vahi vahi@isi.edu USC Information Sciences Institute May 5th, 2004


Pegasus-a framework for planning for execution in grids

Karan Vahi
vahi@isi.edu

USC Information Sciences Institute

May 5th, 2004


People Involved

USC/ISI

Advanced Systems: Ewa Deelman, Carl Kesselman, Gaurang Mehta, Mei-Hui Su, Gurmeet Singh, Karan Vahi.


Outline

Introduction to Planning
DAX
Pegasus
Portal
Demonstration


Planning in Grids

One has various alternatives out on the grid in terms of data and compute resources.

Planning
– Select the best available resources and data sets, and schedule them on to the grid to get the best possible execution time.
– Plan for the data movements between the sites.


Recipe For Planning

Understand the request
– Figure out what data product the request refers to, and how to generate it from scratch.

Locations of data products
– Final data product
– Intermediate data products which can be used to generate the final data product.

Location of job executables

State of the Grid
– Available processors, physical memory available, job queue lengths, etc.


Constituents of Planning

Domain Knowledge

Resource Information

Location Information

Planner

Plan submitted to the grid


Terms (1)

Abstract Workflow (DAX)
– Expressed in terms of logical entities
– Specifies all logical files required to generate the desired data product from scratch
– Dependencies between the jobs
– Analogous to a build-style DAG

Concrete Workflow
– Expressed in terms of physical entities
– Specifies the location of the data and executables
– Analogous to a make-style DAG


Outline

Introduction to Planning
DAX
Pegasus
Portal
Demonstration


DAX

The format for specifying the abstract workflow, which identifies the recipe for creating the final data product at a logical level.

In the case of Montage, the IPAC web service creates the DAX for the user request.

Developed at the University of Chicago.


DAX Example

<?xml version="1.0" encoding="UTF-8"?>
<!-- generated: 2003-09-25T11:51:19-05:00 -->
<!-- generated by: vahi [??] -->
<adag xmlns="http://www.griphyn.org/chimera/DAX"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.griphyn.org/chimera/DAX http://www.griphyn.org/chimera/dax-1.6.xsd"
      count="1" index="0" name="black-diamond">

  <!-- part 1: list of all files used (may be empty) -->
  <filename file="f.a" link="input"/>
  <filename file="f.b" link="inout"/>
  <filename file="f.c" link="output"/>

  <!-- part 2: definition of all jobs (at least one) -->
  <job id="ID000001" namespace="montage" name="preprocess" version="1.0" level="2">
    <argument>-a top -T60 -i <filename file="f.a"/> -o <filename file="f.b"/></argument>
    <uses file="f.a" link="input" dontRegister="false" dontTransfer="false"/>
    <uses file="f.b" link="output" dontRegister="true" dontTransfer="true" temporaryHint="true"/>
  </job>
  <job id="ID000002" namespace="montage" name="analyze" version="1.0" level="1">
    <argument>-a bottom -T60 -i <filename file="f.b"/> -o <filename file="f.c"/></argument>
    <uses file="f.b" link="input" dontRegister="false" dontTransfer="false"/>
    <uses file="f.c" link="output" dontRegister="false" dontTransfer="false"/>
  </job>

  <!-- part 3: list of control-flow dependencies (empty for single jobs) -->
  <child ref="ID000002">
    <parent ref="ID000001"/>
  </child>
</adag>
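As a rough illustration of how a tool might read a DAX like this one, the Python sketch below parses a trimmed copy of the example with the standard library and lists each job's logical inputs and outputs plus the parent/child edges. The helper name `summarize_dax` is hypothetical and is not part of Pegasus or Chimera; only the element and attribute names are taken from the slide.

```python
import xml.etree.ElementTree as ET

# Namespace URI as it appears in the DAX example.
DAX_NS = {"dax": "http://www.griphyn.org/chimera/DAX"}

# Trimmed copy of the black-diamond example (arguments omitted for brevity).
SAMPLE_DAX = """<?xml version="1.0" encoding="UTF-8"?>
<adag xmlns="http://www.griphyn.org/chimera/DAX" count="1" index="0" name="black-diamond">
  <job id="ID000001" namespace="montage" name="preprocess" version="1.0">
    <uses file="f.a" link="input"/>
    <uses file="f.b" link="output"/>
  </job>
  <job id="ID000002" namespace="montage" name="analyze" version="1.0">
    <uses file="f.b" link="input"/>
    <uses file="f.c" link="output"/>
  </job>
  <child ref="ID000002">
    <parent ref="ID000001"/>
  </child>
</adag>
"""


def summarize_dax(text):
    """Return (jobs, edges): per-job logical files and (parent, child) pairs."""
    root = ET.fromstring(text.encode("utf-8"))
    jobs = {}
    for job in root.findall("dax:job", DAX_NS):
        uses = job.findall("dax:uses", DAX_NS)
        jobs[job.get("id")] = {
            "name": job.get("name"),
            "inputs": [u.get("file") for u in uses if u.get("link") == "input"],
            "outputs": [u.get("file") for u in uses if u.get("link") == "output"],
        }
    edges = [
        (parent.get("ref"), child.get("ref"))
        for child in root.findall("dax:child", DAX_NS)
        for parent in child.findall("dax:parent", DAX_NS)
    ]
    return jobs, edges


jobs, edges = summarize_dax(SAMPLE_DAX)
for job_id, info in jobs.items():
    print(job_id, info["name"], info["inputs"], info["outputs"])
print(edges)
```

Everything here is logical: the sketch sees file names such as f.a and job names such as preprocess, but no hosts or paths, which is what distinguishes the abstract workflow from the concrete one.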


Outline

Introduction to Planning
DAX
Pegasus
Portal
Demonstration


Pegasus

A configurable system to map and execute complex workflows on the grid.
– DAX Driven Configuration
– Metadata Driven Configuration

Can do full ahead planning or deferred planning to map the workflows.


Full Ahead Planning

At the time of submission of the workflow, you decide where you want to schedule the jobs in the workflow.

Allows you to perform certain optimizations by looking ahead for bottleneck jobs and then scheduling around them.

However, for large workflows the decision you make at submission time may no longer be valid or optimal at the point the job is actually run.


Deferred Planning

Delay the decision of mapping the job to the site as late as possible.

Involves partitioning the original DAX into smaller DAXes, each of which refers to a partition on which Pegasus is run.

Construct a Mega DAG that ends up running Pegasus automatically on the partition DAXes, as each partition becomes ready to run.
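The slides do not say how the DAX is partitioned. As one possible scheme, assumed here purely for illustration, the Python sketch below groups jobs by their depth in the DAG, so that each level forms a partition that could be planned only once the previous level has finished. The function name `partition_by_level` is hypothetical.

```python
from collections import defaultdict, deque


def partition_by_level(jobs, edges):
    """Group jobs into partitions by depth in the DAG (assumed scheme)."""
    children = defaultdict(list)
    indegree = {job: 0 for job in jobs}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1

    # Root jobs (no parents) sit at level 0.
    level = {job: 0 for job in jobs if indegree[job] == 0}
    queue = deque(level)
    while queue:
        job = queue.popleft()
        for child in children[job]:
            # A job's level is one more than its deepest parent.
            level[child] = max(level.get(child, 0), level[job] + 1)
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    partitions = defaultdict(list)
    for job in jobs:
        partitions[level[job]].append(job)
    return [partitions[k] for k in sorted(partitions)]


# A four-job diamond: a feeds b and c, which both feed d.
diamond_jobs = ["a", "b", "c", "d"]
diamond_edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(partition_by_level(diamond_jobs, diamond_edges))
```

With this scheme the mapping decision for d is deferred until b and c have run, so it can use the state of the grid at that point rather than at submission time.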


High Level Block Diagram

[Figure: high level block diagram. Components: Request Manager, Workflow Planning, Workflow Reduction, Replica and Resource Selector, Submission and Monitoring System, Data Management, Data Publication, IPAC/JPL Web Service, Globus Replica Location Service, Globus Monitoring and Discovery Service, workflow executor (DAGMan), Execution, Grid. Labels: Abstract Workflow, Concrete Workflow, tasks, raw data, detector, Replica Location, Available Resources, Monitoring information, Dynamic information, Information and Models, Application Models.]


Replica Discovery

Pegasus needs to know where the input files for the workflow reside.

In the Montage case, it needs to know where the FITS files required by the mProject jobs reside.

Hence Pegasus needs to discover the files that are required for executing a particular abstract workflow.


RLS

Figure (1) RLS Configuration for Pegasus: one RLI above three LRCs (LRC A, LRC B, LRC C), with Pegasus as the client.

Each LRC sends periodic updates to the RLI.

Each LRC is responsible for one pool.

1) Pegasus queries the RLI with the LFN.

2) The RLI returns the list of LRCs that contain the desired mappings.

3) Pegasus queries each LRC in the list to get the PFNs.

Interfacing to RLS done by Karan Vahi, Shishir
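The three-step lookup can be sketched with toy in-memory classes. This is only a model of the query pattern described on the slide; the class and method names are hypothetical and do not correspond to the Globus RLS client API, and the host names are placeholders.

```python
class LocalReplicaCatalog:
    """Toy LRC: holds LFN -> PFN mappings for one pool."""

    def __init__(self, pool, mappings):
        self.pool = pool
        self.mappings = mappings

    def lookup(self, lfn):
        return self.mappings.get(lfn, [])


class ReplicaLocationIndex:
    """Toy RLI: records which LRCs hold a mapping for each LFN."""

    def __init__(self):
        self.index = {}

    def update_from(self, lrc):
        # Stands in for the periodic update each LRC sends to the RLI.
        for lfn in lrc.mappings:
            holders = self.index.setdefault(lfn, [])
            if lrc not in holders:
                holders.append(lrc)

    def lrcs_for(self, lfn):
        return self.index.get(lfn, [])


def discover_replicas(rli, lfn):
    """Steps 1-3: ask the RLI for LRCs, then ask each LRC for its PFNs."""
    pfns = []
    for lrc in rli.lrcs_for(lfn):
        pfns.extend(lrc.lookup(lfn))
    return pfns


lrc_a = LocalReplicaCatalog("A", {"f.a": ["gsiftp://host-a.example.org/data/f.a"]})
lrc_b = LocalReplicaCatalog("B", {"f.a": ["gsiftp://host-b.example.org/data/f.a"],
                                  "f.b": ["gsiftp://host-b.example.org/data/f.b"]})
rli = ReplicaLocationIndex()
for lrc in (lrc_a, lrc_b):
    rli.update_from(lrc)

print(discover_replicas(rli, "f.a"))
```

The RLI holds only LFN-to-LRC pointers, so the physical file names themselves are fetched from the per-pool catalogs in the second round of queries.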


Alternate Replica Mechanisms

Replica Catalog
– Pegasus supports the LDAP-based Replica Catalog

User-defined mechanisms
– Pegasus provides the flexibility for the user to specify their own replica mechanism instead of RLS or Replica Catalog
– The user just has to implement the corresponding interface

Design and Implementation done by Karan Vahi
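The slides do not show the interface a user has to implement. The Python sketch below only illustrates the general pattern of a pluggable replica mechanism; `ReplicaMechanism`, its two methods, and `InMemoryReplicaMechanism` are hypothetical names and not the actual Pegasus interface.

```python
from abc import ABC, abstractmethod


class ReplicaMechanism(ABC):
    """Hypothetical interface; not the actual Pegasus interface."""

    @abstractmethod
    def lookup(self, lfn):
        """Return the list of PFNs known for a logical file name."""

    @abstractmethod
    def register(self, lfn, pfn):
        """Record a new LFN -> PFN mapping."""


class InMemoryReplicaMechanism(ReplicaMechanism):
    """Example user-defined mechanism backed by a plain dictionary."""

    def __init__(self):
        self._mappings = {}

    def lookup(self, lfn):
        return list(self._mappings.get(lfn, []))

    def register(self, lfn, pfn):
        self._mappings.setdefault(lfn, []).append(pfn)


def locate_inputs(mechanism, lfns):
    """Planner-side code depends only on the interface, not on the backend."""
    return {lfn: mechanism.lookup(lfn) for lfn in lfns}


mechanism = InMemoryReplicaMechanism()
mechanism.register("f.a", "gsiftp://host.example.org/data/f.a")
print(locate_inputs(mechanism, ["f.a", "f.b"]))
```

Because `locate_inputs` talks only to the abstract interface, an RLS-backed, LDAP-backed, or user-written implementation could be swapped in without changing the planner code.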


Transformation Catalog

Pegasus needs to access a catalog to determine the pools where it can run a particular piece of code.

If a site does not have the executable, one should be able to ship the executable to the remote site.

Generic TC API for users to implement their own transformation catalog.

Current Implementations
– File Based
– Database Based


File based Transformation Catalog

Consists of a simple text file.
– Contains mappings of logical transformations to physical transformations.

Format of the tc.data file
#poolname   logical tr   physical tr               env
isi         preprocess   /usr/vds/bin/preprocess   VDS_HOME=/usr/vds/;

All the physical transformations are absolute path names.

Environment string contains all the environment variables required in order for the transformation to run on the execution pool.
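A minimal sketch of reading such a file in Python follows. It assumes that columns are separated by whitespace and that the environment string is a semicolon-separated list of KEY=VALUE pairs, which matches the single sample row but is not confirmed by the slides; `parse_tc_data` and `TCEntry` are hypothetical names.

```python
from typing import NamedTuple


class TCEntry(NamedTuple):
    pool: str
    logical: str
    physical: str
    env: dict


def parse_tc_data(text):
    """Parse tc.data rows: pool, logical tr, physical tr, optional env string."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header/comment line
        fields = line.split(None, 3)
        if len(fields) < 3:
            raise ValueError(f"expected at least 3 columns: {line!r}")
        pool, logical, physical = fields[:3]
        if not physical.startswith("/"):
            # The slide states that physical transformations are absolute paths.
            raise ValueError(f"physical transformation must be absolute: {physical!r}")
        env = {}
        if len(fields) == 4:
            for pair in fields[3].split(";"):
                pair = pair.strip()
                if pair:
                    key, _, value = pair.partition("=")
                    env[key] = value
        entries.append(TCEntry(pool, logical, physical, env))
    return entries


SAMPLE_TC = """#poolname logical tr physical tr env
isi preprocess /usr/vds/bin/preprocess VDS_HOME=/usr/vds/;
"""
print(parse_tc_data(SAMPLE_TC))
```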


DB based Transformation Catalog

Presently ported to MySQL. Postgres is yet to be tested.

Adds support for transformations compiled for different combinations of architecture, OS, OS version and glibc, which would enable us to transfer a transformation to a remote site if the executable does not reside there.

Supports multiple profile namespaces. At present only the env namespace is used.

Supports multiple physical transformations for the same (logical transformation, pool, type) tuple.


Pool Configuration (1)

Pool Config is an XML file which contains information about various pools on which DAGs may execute.

Some of the information contained in the Pool Config file:
– Specifies the various job-managers which are available on the pool for the different types of Condor universes.
– Specifies the GridFTP storage servers associated with each pool.
– Specifies the Local Replica Catalogs where data residing in the pool has to be cataloged.
– Contains profiles like environment hints which are common site wide.
– Contains the working and storage directories to be used on the pool.


Pool Configuration (2)

Two ways to construct the Pool Config file
– Monitoring and Discovery Service
– Local Pool Config File (Text Based)

Client tool to generate the Pool Config file
– The tool genpoolconfig is used to query the MDS and/or the local pool config file/s to generate the XML Pool Config file.


Pool Configuration (3)

This file is read by the information provider and published into MDS.

Format
gvds.pool.id : <POOL ID>
gvds.pool.lrc : <LRC URL>
gvds.pool.gridftp : <GSIFTP URL>@<GLOBUS VERSION>
gvds.pool.gridftp : gsiftp://sukhna.isi.edu/nfs/asd2/[email protected] : <UNIVERSE>@<JOBMANAGER URL>@<GLOBUS VERSION>
gvds.pool.universe : [email protected]/jobmanager-[email protected] : <Path to Kickstart executable>
gvds.pool.workdir : <Path to Working Dir>
gvds.pool.profile : <namespace>@<key>@<value>
gvds.pool.profile : env@GLOBUS_LOCATION@/smarty/gt2.2.4
gvds.pool.profile : vds@VDS_HOME@/nfs/asd2/gmehta/vds
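To make the `key : value` layout concrete, here is a small Python sketch that reads such lines and splits a profile value into its three `@`-separated parts. It uses only the two profile lines that survive intact on the slide, plus a pool id of `isi` assumed for illustration; the function names are hypothetical and this is not the genpoolconfig implementation.

```python
def parse_pool_lines(text):
    """Collect 'key : value' lines; keys may repeat, so values are kept in lists."""
    config = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        # The key never contains a colon, so split at the first one only;
        # this keeps URLs such as gsiftp://... intact in the value.
        key, _, value = line.partition(":")
        config.setdefault(key.strip(), []).append(value.strip())
    return config


def parse_profile(value):
    """Split '<namespace>@<key>@<value>' into its three parts."""
    namespace, key, val = value.split("@", 2)
    return namespace, key, val


SAMPLE_POOL = """gvds.pool.id : isi
gvds.pool.profile : env@GLOBUS_LOCATION@/smarty/gt2.2.4
gvds.pool.profile : vds@VDS_HOME@/nfs/asd2/gmehta/vds
"""

config = parse_pool_lines(SAMPLE_POOL)
for raw in config["gvds.pool.profile"]:
    print(parse_profile(raw))
```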


DAX Driven Configuration(1)

Pegasus uses IPAC/JPL webservice as an abstract workflow generator

Pegasus takes in this abstract workflow and creates a concrete workflow by consulting the various grid services described before


DAX Driven Configuration(2)

[Figure: data flow between the IPAC/JPL Service, MCS, RLS, MDS, Transformation Catalog, Current State Generator, Request Manager, Abstract Dag Reduction, Concrete Planner (grouped as Abstract and Concrete Planner), VDL Generator, Submit File Generator, DAGMan Submission & Monitoring, and Condor-G/DAGMan. Numbered labels on the arrows:]

(1) Abstract Workflow (DAG)
(2) Abstract Dag
(3) Logical File Names (LFN's)
(4) Physical File Names (PFN's)
(5) Full Abstract Dag
(6) Reduced Abstract DAG
(7) Logical Transformations
(8) Physical Transformations and Execution Environment Information
(9) Concrete Dag
(10) Concrete Dag
(11) DAGMan files
(12) DAGMan files
(13) DAG
(14) Log files
(15) Monitoring
(16) Results


DAG Reduction

Abstract Dag Reduction
– Pegasus queries the RLS with the LFN's referred to in the Abstract Workflow
– If data products are found to be already materialized, Pegasus reuses them and thus reduces the complexity of the concrete workflow (CW)


Abstract Dag Reduction

[Figure: example DAG with jobs a through i. Key: the original node, pull transfer node, registration node, push transfer node.]

Pegasus queries the RLS and finds the data products of jobs d, e, f already materialized. Hence it deletes those jobs.

On applying the reduction algorithm, additional jobs a, b, c are deleted.

Implemented by Karan Vahi
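The slides describe the outcome of the reduction but not the algorithm itself. The Python sketch below is one plausible reading, stated as an assumption: first drop every job whose outputs are all already registered, then repeatedly drop any job whose children have all been dropped, since nothing remaining needs its output. The edges of the example DAG are hypothetical, because the figure's edges did not survive extraction; `reduce_dag` is not a Pegasus function.

```python
def reduce_dag(jobs, edges, outputs, available_lfns):
    """Return the jobs that still need to run (assumed reduction rule)."""
    children = {job: set() for job in jobs}
    for parent, child in edges:
        children[parent].add(child)

    # Step 1: a job is unnecessary if all of its outputs already exist.
    deleted = {
        job for job in jobs
        if outputs[job] and set(outputs[job]) <= available_lfns
    }

    # Step 2: cascade upwards. A job with children is unnecessary once
    # every child that would consume its output has been deleted.
    changed = True
    while changed:
        changed = False
        for job in jobs:
            if job not in deleted and children[job] and children[job] <= deleted:
                deleted.add(job)
                changed = True

    return [job for job in jobs if job not in deleted]


# Hypothetical edges for the nine-job example.
example_jobs = list("abcdefghi")
example_edges = [("a", "d"), ("b", "e"), ("c", "f"),
                 ("d", "g"), ("e", "g"), ("e", "h"), ("f", "h"),
                 ("g", "i"), ("h", "i")]
example_outputs = {job: [f"f.{job}"] for job in example_jobs}
materialized = {"f.d", "f.e", "f.f"}  # what the RLS query is assumed to find

print(reduce_dag(example_jobs, example_edges, example_outputs, materialized))
```

With these inputs the sketch removes d, e, f in the first step and a, b, c in the cascade, leaving g, h, i, which matches the outcome described on the slide.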


Concrete Planner (1)

[Figure: the example DAG with jobs a through i after reduction. Key: the original node, pull transfer node, registration node, push transfer node, node deleted by Reduction algo, inter-pool transfer node.]

Pegasus schedules jobs g, h on pool X and job i on pool Y, hence adding an inter-pool transfer node.

Pegasus adds transfer nodes for transferring the input files for the root nodes of the decomposed DAG (job g).

Pegasus adds replica nodes for each job that materializes data (g, h, i).

These three nodes are for transferring the output files of the leaf job (f) to the output pool, since job f has been deleted by the Reduction Algorithm.

Implemented by Karan Vahi
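The inter-pool transfer step can be sketched as a rewrite of the edge list: wherever a parent and child are mapped to different pools, a transfer node is placed between them. This is a simplified illustration, not the Pegasus implementation; the function name, the node naming scheme, and the edges between g, h and i are assumptions.

```python
def add_interpool_transfers(edges, site_of):
    """Insert a transfer node on every edge that crosses pools."""
    new_edges = []
    for parent, child in edges:
        if site_of[parent] != site_of[child]:
            # Hypothetical naming scheme for the inserted node.
            transfer = f"transfer_{parent}_{child}"
            new_edges.append((parent, transfer))
            new_edges.append((transfer, child))
        else:
            new_edges.append((parent, child))
    return new_edges


# Jobs g and h on pool X, job i on pool Y, as on the slide.
site_of = {"g": "X", "h": "X", "i": "Y"}
edges = [("g", "h"), ("h", "i")]  # hypothetical dependencies
print(add_interpool_transfers(edges, site_of))
```

Edges within a pool are left untouched, so only the h to i dependency gains a transfer node in this example.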


Transient Files

Selective Transfer of output files
– Data sets generated by intermediate nodes in the DAG are huge
– However, the user may be interested only in outputs of selected jobs
– Transfer of all the files could severely overload the jobmanagers on the compute sites

Need For Selective Transfer of Files
– For each file at the virtual data, the user can specify whether it is transient or not.
– Pegasus bases its decision on whether to transfer the file or not on this.

Implemented by Karan Vahi
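As a minimal sketch of the idea, the Python snippet below filters a job's output files down to those that should be staged out. It assumes that the `dontTransfer` attribute seen in the DAX example is the flag that marks a file as transient; the slides do not state this explicitly, and `files_to_stage_out` is a hypothetical helper.

```python
def files_to_stage_out(uses):
    """Return output files that are not marked as transient (assumed flag)."""
    return [
        entry["file"]
        for entry in uses
        if entry["link"] == "output" and not entry["dontTransfer"]
    ]


# Values mirror the <uses> elements of the DAX example.
USES = [
    {"file": "f.a", "link": "input", "dontTransfer": False},
    {"file": "f.b", "link": "output", "dontTransfer": True},
    {"file": "f.c", "link": "output", "dontTransfer": False},
]
print(files_to_stage_out(USES))
```

Here the intermediate file f.b stays on the execution pool, and only the final product f.c would receive a transfer node.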


Outline

Introduction to Planning
DAX
Pegasus
Portal
Demonstration


Portal Architecture


Portal Demonstration


Outline

Introduction to Planning
DAX
Pegasus
Portal
Demonstration


Demonstration

Run a small black diamond DAG using both full ahead planning and deferred planning on the ISI Condor pool.

Show the various configuration files (tc.data and pool.config) and how to generate them (pool.config).

Generate the Condor submit files.

Submit the Condor DAG to Condor DAGMan.


Software Required!!

Submit Host
– Condor DAGMan (to submit the workflows on the grid).
– Java 1.4 (to run Pegasus)
– Globus 2.4 or higher
– Globus RLS (the registration jobs run on the local host).
– Xerces, ant, cog etc. that come with the VDS distribution

Compute Sites (Machines in the pool)
– Globus 2.4 or higher (gridftp server, g-u-c, MDS)
– On one machine per pool, an LRC should be running.
– Condor daemon running.
– Various jobmanagers correctly configured.


TC File

Walk through the editing of TC file.

A command line client is also in the works that allows you to update, add and modify the entries in your transformation catalog regardless of the underlying implementation.


GenPoolConfig (Demo)

genpoolconfig is the client to generate the pool config file required by Pegasus.

It queries the MDS and/or a local pool config file (text based) and generates an XML file.

I am going to generate the pool config file from the text based configuration.

Usage:
genpoolconfig -Dvds.giis.host <MDS GIIS hostname> -Dvds.giis.dn <MDS GIIS DN> --poolconfig <comma separated local pool config files> --output <pool config output>


gencdag

The Concrete planner takes the DAX produced by Chimera and converts it into a set of Condor DAG and submit files.

Usage: gencdag --dax|--pdax <file> --p <list of execution pools> [--dir <dir for o/p files>] [--o <outputpool>] [--force]

You can specify more than one execution pool. Execution will take place on the pools on which the executable exists. If the executable exists on more than one pool, then the pool on which the executable will run is selected randomly.

Output pool is the pool where you want all the output products to be transferred to. If not specified, the materialized data stays on the execution pool.
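The site selection rule described above (restrict to pools that have the executable, then pick one at random) can be sketched in a few lines of Python. The function name, the set-of-pairs representation of the transformation catalog, and the pool name `uc` are assumptions made for the example; this is not the gencdag source.

```python
import random


def select_pool(logical_tr, execution_pools, tc_pairs, rng=random):
    """Pick a random pool among those where the transformation is installed."""
    candidates = [
        pool for pool in execution_pools
        if (pool, logical_tr) in tc_pairs
    ]
    if not candidates:
        raise LookupError(f"no execution pool has {logical_tr!r}")
    return rng.choice(candidates)


# (pool, logical transformation) pairs, as a transformation catalog might list them.
TC_PAIRS = {("isi", "preprocess"), ("uc", "preprocess"), ("isi", "analyze")}

print(select_pool("analyze", ["isi", "uc"], TC_PAIRS))     # only isi qualifies
print(select_pool("preprocess", ["isi", "uc"], TC_PAIRS))  # either pool may be chosen
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible, which can be convenient when comparing plans across runs.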


Mei’s Exploits

Mei has been running the Montage code for the past year, including some huge 6 and 10 degree DAGs (for the M16 cluster).

The 6 degree runs had about 13,000 compute jobs and the 10 degree run had about 40,000 compute jobs!!!

The final mosaic files can be downloaded from http://www.isi.edu/~griphyn/out_M16_10.fits

http://www.isi.edu/~griphyn/out_M16_6.fits


Questions?