
Pegasus: Planning for Execution in Grids

Ewa Deelman, Carl Kesselman, Gaurang Mehta, Gurmeet Singh, Karan Vahi

Information Sciences Institute, University of Southern California

Pegasus Acknowledgement

Ewa Deelman, Carl Kesselman, Saurabh Khurana, Gaurang Mehta, Sonal Patil, Gurmeet Singh, Mei-Hui Su, Karan Vahi (ISI)

James Blythe, Yolanda Gil (ISI)

http://pegasus.isi.edu

Research funded as part of the NSF GriPhyN, NVO, and SCEC projects.

Grid Applications

- Increasing in the level of complexity
  - Use of individual application components
  - Reuse of individual intermediate data products
  - Description of data products using metadata attributes
- Execution environment is complex and very dynamic
  - Resources come and go
  - Data is replicated
  - Components can be found at various locations or staged in on demand
- Separation between the application description and the actual execution description

[Figure: Application Development and Execution Process. A request in the application domain (FFT on some data) becomes an abstract workflow node (FFT filea), which is mapped to a concrete workflow (transfer filea from host1://home/filea to host2://home/file1, then run /usr/local/bin/fft /home/file1) that executes in the execution environment on host1 and host2.]
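A minimal sketch of that mapping in Python, reproducing the figure's FFT example. The two dictionaries stand in for the replica and transformation catalogs (in Pegasus these lookups go through Globus RLS and the Transformation Catalog); all names and the plan() helper are illustrative, not the actual Pegasus API.

```python
# Hypothetical catalog contents matching the figure.
replica_catalog = {"filea": "host1://home/filea"}
transformation_catalog = {("FFT", "host2"): "/usr/local/bin/fft"}

def plan(logical_tr, logical_file, exec_site, local_path):
    """Map one abstract node (FFT filea) to concrete jobs."""
    jobs = []
    src = replica_catalog[logical_file]        # where the data lives
    dst = f"{exec_site}:/{local_path}"         # figure's host://path form
    if not src.startswith(exec_site + ":"):    # stage in only if remote
        jobs.append(f"transfer {logical_file} from {src} to {dst}")
    exe = transformation_catalog[(logical_tr, exec_site)]
    jobs.append(f"{exe} {local_path}")         # the concrete invocation
    return jobs

for job in plan("FFT", "filea", "host2", "/home/file1"):
    print(job)
# transfer filea from host1://home/filea to host2://home/file1
# /usr/local/bin/fft /home/file1
```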

[Figure: Decision points between abstract workflow generation and concrete workflow generation: application component selection, transformation instance selection, resource selection, data replica selection, and data transfer. Failure recovery methods: retry, pick different resources, or specify a different workflow.]
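As a small illustration of the data replica selection step in the figure, the sketch below applies one simple policy: prefer a replica already at the execution site (so no transfer job is needed), otherwise pick any remote copy. The policy and names are assumptions for illustration, not the actual selector.

```python
import random

def select_replica(replicas, exec_site):
    """Data replica selection: prefer a copy co-located with the
    execution site; otherwise fall back to any remote replica."""
    local = [r for r in replicas if r.startswith(exec_site + ":")]
    return local[0] if local else random.choice(replicas)

# filea is replicated on two hosts; the job will run on host2
print(select_replica(["host1://home/filea", "host2://data/filea"], "host2"))
# host2://data/filea
```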

Pegasus: Planning for Execution in Grids

- Maps from abstract to concrete workflow
  - Algorithmic and AI-based techniques
- Automatically locates physical locations for both components (transformations) and data
  - Uses Globus RLS and the Transformation Catalog
- Finds appropriate resources to execute on via Globus MDS
- Reuses existing data products where applicable (see the reduction sketch below)
- Publishes newly derived data products
  - Chimera virtual data catalog
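A sketch of the data-reuse step, assuming a simple model where each job declares its output files and a set of registered files stands in for an RLS lookup. Jobs whose outputs all already exist are pruned from the workflow; everything here (the data model, the helper, the diamond example) is illustrative, not the Pegasus internals.

```python
def reduce_workflow(jobs, parents, registered):
    """Workflow reduction (data reuse): prune any job whose outputs
    are all already registered; retained children of a pruned job
    read those files from the catalog instead of recomputing them.
    jobs: {name: [output files]}, parents: {name: [parent names]}.
    A fuller reduction would also cascade, pruning ancestors whose
    only consumers were themselves pruned."""
    pruned = {j for j, outs in jobs.items()
              if outs and all(o in registered for o in outs)}
    return {j: [p for p in parents.get(j, []) if p not in pruned]
            for j in jobs if j not in pruned}

# Diamond workflow A -> (B, C) -> D; B's output is already cataloged
jobs = {"A": ["f.a"], "B": ["f.b"], "C": ["f.c"], "D": ["f.d"]}
parents = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
print(reduce_workflow(jobs, parents, {"f.b"}))
# {'A': [], 'C': ['A'], 'D': ['C']}
```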

[Figure: Pegasus within the overall system. Workflow planning is done by a Request Manager that drives Workflow Reduction, a Replica and Resource Selector, and Data Publication, backed by a Submission and Monitoring System. The abstract workflow arrives from the data-management layer (Virtual Data Language, Chimera); dynamic information (replica locations, available resources) comes from the Globus Replica Location Service and the Globus Monitoring and Discovery Service (information and models); the resulting concrete workflow is handed to the workflow executor (DAGMan) for execution on the Grid, where raw data originates at the detector.]

Chimera is developed at ANL by I. Foster, M. Wilde, and J. Voeckler.

Simplified View of SC 2003 Portal

[Figure: Users reach the Pegasus Portal through LIGO-specific and Montage-specific interfaces, authenticating via MyProxy. Metadata flows to Chimera, which returns VDL/abstract workflows; the portal draws information from the Metadata Catalog Service, the Transformation Catalog, Globus MDS, and Globus RLS, produces the concrete workflow, and hands it to DAGMan, which runs the jobs on the Grid and reports execution records back.]

LIGO Scientific Collaboration

- Continuous gravitational waves are expected to be produced by a variety of celestial objects
- Only a small fraction of potential sources are known
- Need to perform blind searches, scanning the regions of the sky where we have no a priori information on the presence of a source
  - Wide area, wide frequency searches
- Search is performed for potential sources of continuous periodic waves near the Galactic Center and the galactic core
- The search is very compute and data intensive
- LSC is using the occasion of SC2003 to initiate a month-long production run with science data collected during 8 weeks in the Spring of 2003
- Additional resources used: Grid3 iVDGL resources
- Thanks to everyone involved in standing up the testbed and contributing the resources!

LIGO Acknowledgements

- Bruce Allen, Scott Koranda, Brian Moe, Xavier Siemens, University of Wisconsin-Milwaukee, USA
- Stuart Anderson, Kent Blackburn, Albert Lazzarini, Dan Kozak, Hari Pulapaka, Peter Shawhan, Caltech, USA
- Steffen Grunewald, Yousuke Itoh, Maria Alessandra Papa, Albert Einstein Institute, Germany
- Many others involved in the testbed

www.ligo.caltech.edu
www.lsc-group.phys.uwm.edu/lscdatagrid/
http://pandora.aei.mpg.de/merlin/

LIGO Laboratory operates under NSF cooperative agreement PHY-0107417

Montage

- Montage (NASA and NVO): deliver science-grade custom mosaics on demand
- Produce mosaics from a wide range of data sources (possibly in different spectra)
- User-specified parameters of projection, coordinates, size, rotation, and spatial sampling

[Figure: Mosaic created by Pegasus-based Montage from a run of the M101 galaxy images on the TeraGrid.]

[Figure: Small Montage workflow.]

Montage Acknowledgments

- Bruce Berriman, John Good, Anastasia Laity, Caltech/IPAC
- Joseph C. Jacob, Daniel S. Katz, JPL
- http://montage.ipac.caltech.edu/
- Testbed for Montage: Condor pools at USC/ISI, UW Madison, and TeraGrid resources at NCSA, PSC, and SDSC
- Montage is funded by the National Aeronautics and Space Administration's Earth Science Technology Office, Computational Technologies Project, under Cooperative Agreement Number NCC5-626 between NASA and the California Institute of Technology.

Current System

[Figure: The original abstract workflow is fed to the current Pegasus as a whole: Pegasus(abstract workflow) produces the full concrete workflow, and DAGMan(CW) carries out the workflow execution.]
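The hand-off from Pegasus to DAGMan is a DAG description file plus per-job Condor submit files. Below is a hedged sketch that writes a minimal .dag file for a two-job concrete workflow; the JOB and PARENT ... CHILD keywords are standard Condor DAGMan syntax, but the job names and file names are invented for illustration.

```python
# Emit a minimal Condor DAGMan input file for a two-job concrete
# workflow: a stage-in transfer followed by the compute job.
dag = """\
JOB stagein stagein.sub
JOB fft fft.sub
PARENT stagein CHILD fft
"""

with open("workflow.dag", "w") as f:
    f.write(dag)

# DAGMan would then be started with:
#   condor_submit_dag workflow.dag
```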

Just-in-time Planning

- Partition the abstract workflow into partial workflows

[Figure: A particular partitioning turns the abstract workflow into a new abstract workflow over partial workflows PW A, PW B, and PW C. A Meta-DAGMan runs Pegasus(A) and then DAGMan(Su(A)); once A has executed, Pegasus(B) and Pegasus(C) run, followed by DAGMan(Su(B)) and DAGMan(Su(C)).]

- Pegasus(X): Pegasus generates the concrete workflow and the submit files for X, denoted Su(X)
- DAGMan(Su(X)): DAGMan executes the concrete workflow for X
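A sketch of how such a meta-DAG could be assembled, assuming the partitioning is already given. For each partial workflow X, a planning node Pegasus(X) precedes an execution node DAGMan(Su(X)), and a cross-partition dependency delays planning of X until its upstream partition has finished executing, which is what makes the planning just-in-time. The helper and its data model are illustrative.

```python
def build_meta_dag(partitions, deps):
    """Build meta-DAG edges for just-in-time planning.
    partitions: partial workflow names, e.g. ["A", "B", "C"]
    deps: {partition: [upstream partitions]}
    Returns (parent, child) edges over planning/execution nodes."""
    edges = []
    for x in partitions:
        # Planning for X must finish before executing X...
        edges.append((f"Pegasus({x})", f"DAGMan(Su({x}))"))
        for up in deps.get(x, []):
            # ...and X is not planned until its upstream partition
            # has executed, so the plan sees the grid's current state.
            edges.append((f"DAGMan(Su({up}))", f"Pegasus({x})"))
    return edges

# The deck's partitioning: B and C depend on A
for parent, child in build_meta_dag(["A", "B", "C"],
                                    {"B": ["A"], "C": ["A"]}):
    print(f"{parent} -> {child}")
```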

Other Applications Using Pegasus

- Other GriPhyN applications:
  - High-energy physics: ATLAS, CMS (many)
  - Astronomy: SDSS (Fermilab, ANL)
- Astronomy: Galaxy morphology (NCSA, JHU, Fermilab, many others; NVO-funded)
- Biology: BLAST (ANL, PDQ-funded)
- Neuroscience: Tomography (SDSC, NIH-funded)

http://pegasus.isi.edu

Funding by NSF GriPhyN, NSF NVO, NIH