CCGrid 2003, Tokyo, Japan
GridFlow: Workflow Management for Grid Computing
Junwei Cao (曹军威), C&C Research Labs, NEC Europe Ltd., Germany
Stephen A. Jarvis and Graham R. Nudd, Dept. of Computer Science, Univ. of Warwick, UK
Subhash Saini, NASA Ames Research Center, USA
Outline
Background – Grid Workflow
System Architecture
GridFlow User Portal
Global Grid Workflow Management
Local Grid Sub-workflow Scheduling
Fuzzy Timing Techniques
Summary
Ongoing and Future Work
Background – Grid Workflow
Workflow Definition: WPDL, BPEL4WS, GSFL, ASCI Grid, …
Workflow Systems: WebFlow, Symphony, GridAnt, BPWS4J, TENT, …
Component-based Systems: CCA/XCAT, SCIRun, CXML, …
Other Systems: Condor DAGMan, UNICORE, MyGrid, GEMSS, GridLab, BioOpera, USC Grid failure handling, …
Grid Resources
Grid Resource: A particular grid resource is a high-end computing or storage resource that can be accessed remotely.
Local Grid: A local grid consists of multiple grid resources that belong to one organization.
Global Grid: The global grid includes all grid resources that belong to different organizations within a virtual organization.
Grid Tasks
Task: Tasks are the smallest elements in a grid workflow, e.g. MPI and PVM programs.
Sub-workflow: A sub-workflow is a flow of closely related tasks that is to be executed in a predefined sequence on grid resources of a local grid (within one organization).
Workflow: A grid application can be represented as a flow of several different activities, each activity represented by a sub-workflow.
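The three levels defined on this slide (task, sub-workflow, workflow) can be pictured as nested data structures. The following is a minimal sketch only; the class names, field names and sample task names are illustrative assumptions and are not taken from GridFlow.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """Smallest element of a grid workflow, e.g. an MPI or PVM program."""
    name: str


@dataclass
class SubWorkflow:
    """Closely related tasks run in a predefined sequence on one local grid."""
    name: str
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Workflow:
    """A grid application: a flow of activities, each one a sub-workflow."""
    name: str
    sub_workflows: List[SubWorkflow] = field(default_factory=list)

    def all_tasks(self) -> List[Task]:
        # Flatten the hierarchy, keeping sub-workflow order and task order.
        return [t for s in self.sub_workflows for t in s.tasks]


# Hypothetical example application (names are made up for illustration).
app = Workflow("demo", [
    SubWorkflow("S1", [Task("preprocess"), Task("solve")]),
    SubWorkflow("S2", [Task("visualize")]),
])
print([t.name for t in app.all_tasks()])
```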
Grid Management
Mapping grid workflows to the global grid
Mapping grid sub-workflows to local grids
Mapping grid tasks to grid resources
System Architecture
[Figure: GridFlow system architecture. Labels: Grid Users; GridFlow User Portal; Global Grid: Workflow Management, Resource Management (ARMS); Local Grid: Sub-workflow Scheduling, Resource Scheduling (Titan); Grid Resources; Information Services (Globus MDS); Performance Services (PACE, …).]
PACE Performance Prediction
[Figure: PACE toolkit. Application Tools: Source Code Analysis, Object Editor, Object Library, PSL Compiler. Resource Tools: CPU, Network (MPI, PVM), Cache (L1, L2), HMCL Compiler. Evaluation Engine.]
Titan Resource Scheduling
Heuristic
Evolutionary
Near-optimal: Makespan, Idle time, Deadlines
ARMS Grid Management
Agent structure: Communication layer, Decision-making layer, Local management layer
Agent hierarchy
Service advertisement
Service discovery
Agent Capability Tables
[Figure: a user connected to a hierarchy of agents (A).]
GridFlow User Portal
Global Grid Workflow Management
[Figure: example workflow of six sub-workflows (S1 to S6) with their start, execution and end times.]
Sub-workflow  startT  exeT  endT
S1            0       0     0
S2            0       3     3
S3            0       5     5
S4            5       7     12
S5            5       4     9
S6            12      0     12
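The startT, exeT and endT values on this slide are consistent with a simple rule: a sub-workflow starts when its latest predecessor ends, and ends one execution time later. The sketch below reproduces the slide's numbers under that rule. The execution times come from the slide; the dependency edges in `preds` are an assumption, since the arrows of the original figure did not survive, and the function name `simulate` is hypothetical.

```python
from typing import Dict, List, Tuple

# Execution times as given on the slide.
exe_time: Dict[str, int] = {"S1": 0, "S2": 3, "S3": 5, "S4": 7, "S5": 4, "S6": 0}

# ASSUMPTION: these edges are not in the extracted text. They were chosen
# because they reproduce the start and end times shown on the slide.
preds: Dict[str, List[str]] = {
    "S1": [],
    "S2": ["S1"],
    "S3": ["S1"],
    "S4": ["S2", "S3"],
    "S5": ["S3"],
    "S6": ["S4", "S5"],
}


def simulate(exe: Dict[str, int],
             pre: Dict[str, List[str]]) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Return (start, end) times: start = latest predecessor end, end = start + exe."""
    start: Dict[str, int] = {}
    end: Dict[str, int] = {}

    def finish(s: str) -> int:
        # Memoized recursion over the (acyclic) dependency graph.
        if s not in end:
            start[s] = max((finish(p) for p in pre[s]), default=0)
            end[s] = start[s] + exe[s]
        return end[s]

    for s in exe:
        finish(s)
    return start, end


start, end = simulate(exe_time, preds)
for s in sorted(exe_time):
    print(s, start[s], exe_time[s], end[s])
```

With these assumed edges the overall end time is that of S6, which waits for the slower of S4 and S5.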
Local Grid Sub-workflow Scheduling
Scheduling a flow of tasks onto grid resources within a local grid is very similar to the process that schedules a workflow onto different local grids. There are two challenges:
Accurately predicting task and workflow start, execution and end times is difficult.
Multiple tasks from different sub-workflows may require the same grid resource at the same time.
Fuzzy Timing Techniques
Turning the “prediction accuracy” into a fuzzy concept that is represented using fuzzy numbers.
[Figure: two trapezoidal fuzzy numbers plotted against τ (horizontal axis 0 to 7, vertical axis 0 to 1): π1(τ) = 0.5(0, 2, 6, 7) and π2(τ) = (2, 4, 4, 6).]
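A fuzzy number written as h(a, b, c, d) can be read as a trapezoid: the value is 0 outside [a, d], rises linearly from a to b, stays at height h between b and c, and falls linearly from c to d. The sketch below evaluates the two fuzzy numbers from the slide under that reading. The class name `Trapezoid` and the method name `possibility` are illustrative assumptions, not GridFlow identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trapezoid:
    """Trapezoidal fuzzy number h(a, b, c, d) with a <= b <= c <= d."""
    a: float
    b: float
    c: float
    d: float
    h: float = 1.0  # peak height; 1.0 when no prefix is written

    def possibility(self, t: float) -> float:
        """Degree (0..h) to which time t is considered possible."""
        if t < self.a or t > self.d:
            return 0.0
        if t < self.b:  # rising edge (only reached when a < b)
            return self.h * (t - self.a) / (self.b - self.a)
        if t <= self.c:  # plateau (a single point when b == c)
            return self.h
        # falling edge (only reached when c < d)
        return self.h * (self.d - t) / (self.d - self.c)


pi1 = Trapezoid(0, 2, 6, 7, h=0.5)  # π1(τ) = 0.5(0, 2, 6, 7)
pi2 = Trapezoid(2, 4, 4, 6)         # π2(τ) = (2, 4, 4, 6)

for t in (1, 3, 4, 5, 6.5):
    print(t, pi1.possibility(t), pi2.possibility(t))
```

Note that π2 has b = c, so it is a triangle peaking at τ = 4, while π1 never exceeds 0.5.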
Fuzzy Number Operations
[Figure: panels illustrating the fuzzy number operations latest, earliest, min, max and sum; vertical axes 0 to 1, horizontal axes 0 to 7 (0 to 12 for the sum).]
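The slide names five operations (latest, earliest, min, max, sum), but their definitions did not survive in the text. As an illustration only, the sketch below implements two of them, sum and latest, using a component-wise trapezoidal approximation that is often used for such fuzzy numbers: corner points are added (sum) or maximized (latest), and the resulting height is the smaller of the two heights. Whether GridFlow uses exactly these definitions is an assumption, and the names `Fuzzy`, `fuzzy_sum` and `latest` are hypothetical.

```python
from typing import NamedTuple


class Fuzzy(NamedTuple):
    """Trapezoidal fuzzy number h(a, b, c, d)."""
    a: float
    b: float
    c: float
    d: float
    h: float = 1.0


def fuzzy_sum(x: Fuzzy, y: Fuzzy) -> Fuzzy:
    """ASSUMED definition: add corner points, keep the smaller height.

    Typical use: start time + execution time = end time.
    """
    return Fuzzy(x.a + y.a, x.b + y.b, x.c + y.c, x.d + y.d, min(x.h, y.h))


def latest(x: Fuzzy, y: Fuzzy) -> Fuzzy:
    """ASSUMED definition: component-wise maximum, keep the smaller height.

    Typical use: a task can start only after the latest of its pre-tasks ends.
    """
    return Fuzzy(max(x.a, y.a), max(x.b, y.b), max(x.c, y.c), max(x.d, y.d),
                 min(x.h, y.h))


pi1 = Fuzzy(0, 2, 6, 7, h=0.5)  # 0.5(0, 2, 6, 7) from the previous slide
pi2 = Fuzzy(2, 4, 4, 6)         # (2, 4, 4, 6) from the previous slide

print("sum:   ", fuzzy_sum(pi1, pi2))
print("latest:", latest(pi1, pi2))
```

Keeping results in the same four-corner form means every derived start or end time is again a trapezoid, so the operations can be chained along a workflow.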
Resource Conflict Solving I
The start time of a task cannot be set directly to the latest end time of its pre-tasks, since other tasks may exist that use the same resource at the same time.
A first-come possibly-first-serve policy is adopted. This does not order the conflicting tasks explicitly, but adds some information on the degrees of possibility of task start times.
Resource Conflict Solving II
All possible start sequences are considered and are combined to provide an estimation of the end time.
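The idea of considering every start sequence can be sketched with crisp (non-fuzzy) numbers: enumerate all orderings of the tasks that compete for one resource, compute the end time of a target task in each ordering, and combine the results. This is a simplification for illustration, not the GridFlow method. The task data are hypothetical, and weighting every ordering equally is an assumption; the slides instead attach degrees of possibility using fuzzy numbers.

```python
from collections import Counter
from itertools import permutations
from typing import Dict, Sequence, Tuple

# Hypothetical tasks competing for one resource: name -> (ready time, exe time).
tasks: Dict[str, Tuple[int, int]] = {"T1": (0, 3), "T2": (1, 2), "T3": (2, 4)}


def end_times(order: Sequence[str],
              tasks: Dict[str, Tuple[int, int]]) -> Dict[str, int]:
    """End time of each task when the resource serves them in the given order."""
    clock, ends = 0, {}
    for name in order:
        ready, exe = tasks[name]
        # A task starts once the resource is free and the task is ready.
        clock = max(clock, ready) + exe
        ends[name] = clock
    return ends


def possible_end_times(target: str,
                       tasks: Dict[str, Tuple[int, int]]) -> Dict[int, float]:
    """Map each possible end time of `target` to the share of orderings giving it.

    ASSUMPTION: all orderings count equally.
    """
    counts = Counter(end_times(o, tasks)[target] for o in permutations(tasks))
    total = sum(counts.values())
    return {t: n / total for t, n in sorted(counts.items())}


result = possible_end_times("T3", tasks)
print(result)
```

The number of orderings grows factorially with the number of conflicting tasks, so exhaustive enumeration like this is only practical for small conflict sets.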
Summary
GridFlow is a prototype grid workflow management system, focusing on grid workflow simulation and scheduling.
GridFlow is based on a specific grid resource management infrastructure implemented using agent-based methodologies and performance-driven scheduling technologies.
Making grid workflow management a reality also requires addressing general grid computing challenges: openness, standards, security and QoS support.
Ongoing Work – Applications
Developing grid enabled medical simulation services (GEMSS) using GT3
Developing grid performance services based on historical information analysis
Developing medical application workflows using BPEL4WS
[Figure: medical simulation workflow. Labels: Target Object; Scanning & Preprocessing; Numerical Modeling; Analysis, Diagnosis, Design; HPC Simulation.]
Future Work – Agile Computing
Workflow techniques – one of the keys to next-generation agile (grid) computing
Flexibility (Performance, Adaptation, QoS, Individualization)
Efficiency (Cheap, Large-scale, Pervasive, Continuous, Massive)
[Figure: computing paradigms. Labels: Cluster Computing; HPC; Supercomputing; Grid Computing; P2P Computing; Internet Computing; Agile (Grid) Computing.]
For More Information
http://www.dcs.warwick.ac.uk/~hpsg/
http://www.ccrl-nece.de/~cao/
http://www.ccrl-nece.de/gemss/
http://www.agilecomputing.org/
mailto:[email protected]