1
M. Tudruj, J. Borkowski, D. Kopanski
Inter-Application Control Through Global States Monitoring On a Grid
Polish-Japanese Institute of Information Technology, Koszykowa 86, 02-008 Warsaw, Poland
2
[Figure: processes P1, P2, P3, P4 and synchronizers Sa, Sb, linked by arrows labelled "state information" and "control"]
Arrows represent reliable, asynchronous communication channels.
Processes can communicate with a number of Synchronizers. Synchronizers learn state information from processes and send back control information.
Monitoring global states
3
•There is no global clock, no shared memory
•A synchronizer must be able to order incoming events properly to build Strongly Consistent Global States (SCGS)
•SCGS is a combination of process local states, one state from each process, such that the local states are pairwise concurrent. E.g. <s1,t1> is a SCGS, <s1,t2> is not.
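The pairwise-concurrency test can be sketched with vector timestamps. This is a minimal sketch, not the authors' implementation: it assumes each local state is tagged with a single vector timestamp, and the values given to `s1`, `t1` and `t2` are hypothetical, chosen only to mirror the example in the bullet above. The function names are likewise illustrative.

```python
from itertools import combinations


def happened_before(a, b):
    """True if vector timestamp a causally precedes b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def concurrent(a, b):
    """Two timestamps are concurrent if neither precedes the other."""
    return not happened_before(a, b) and not happened_before(b, a)


def pairwise_concurrent(local_states):
    """Check one timestamp per process for pairwise concurrency."""
    return all(concurrent(a, b) for a, b in combinations(local_states, 2))


# Hypothetical timestamps (assumption): components are (P1 clock, P2 clock).
s1 = (2, 0)  # local state of P1
t1 = (0, 1)  # local state of P2, causally unrelated to s1
t2 = (3, 2)  # local state of P2 that causally follows s1

print(pairwise_concurrent([s1, t1]))  # True
print(pairwise_concurrent([s1, t2]))  # False
```

A synchronizer working with states that span intervals would need a start and an end timestamp per state; the single-timestamp form is used here only to keep the comparison visible.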
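[Figure: space-time diagram of processes P1 and P2 and the synchronizer (sync), with events e1, e2, f1, f2, local states s1, t1, t2, and messages m1, m2]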
Monitoring consistent global states
4
•Events must carry timestamps so that messages can be ordered correctly. Logical vector clocks or real-time intervals based on roughly synchronized local clocks can be used
•If process local clocks are synchronized with a known accuracy, then real time interval timestamps can be used to identify SCGS
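One way to read the interval-based test is sketched below. It rests on assumptions not stated on the slide: every local clock is within `eps` of real time, each local state is described by the local timestamps of the events that start and end it, and a state is only certain to hold between `start + eps` and `end - eps`. Under those assumptions an SCGS exists while the certain intervals of all processes overlap. The names `certain_interval` and `scgs_interval` are hypothetical.

```python
def certain_interval(start, end, eps):
    """Real-time interval in which a local state certainly held, or None."""
    lo, hi = start + eps, end - eps
    return (lo, hi) if lo < hi else None


def scgs_interval(states, eps):
    """states: one (start, end) pair of local timestamps per process.

    Returns the real-time interval during which all local states certainly
    coexisted, or None if no such interval can be guaranteed.
    """
    certain = [certain_interval(s, e, eps) for s, e in states]
    if any(c is None for c in certain):
        return None
    lo = max(c[0] for c in certain)
    hi = min(c[1] for c in certain)
    return (lo, hi) if lo < hi else None


print(scgs_interval([(0.0, 10.0), (2.0, 8.0)], eps=0.5))  # (2.5, 7.5)
print(scgs_interval([(0.0, 3.0), (2.5, 6.0)], eps=0.5))   # None
```

In the second call the two states overlap according to the raw timestamps, but the overlap is shorter than the clock uncertainty, so it cannot be reported as strongly consistent.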
[Figure: timelines of process 1 (events e1 to e5) and process 2 (events e1 to e4), with global states S1, S2, S3. Legend: SCGS duration period; event occurrence interval (the event occurred at some point within this time interval); S1: a SCGS on a linear lattice]
Strongly Consistent Global States
5
[Figure: process timelines showing the effect of control signals. Labels: computations, control signal, activated procedure, continued computations, activity, control signal, cancellation handling, cancelled fragment, another activity, spared time]
Computation activation and cancellation caused by predicate evaluation
6
A complete graphical programming environment for developing message passing applications, designed at the Parallel and Distributed Systems Laboratory of the SZTAKI Institute of the Hungarian Academy of Sciences
•Application level specifies processes and their interconnections
•Process level defines the control flow diagram of a process
•Text level is used to enter sequential C code into elements of a flow diagram
[Figure: process flow diagram with elements: loop begin, receive from port 0, send through port 1, sequential code in C]
GRADE system
7
standard message passing channels
local state info transfer channels
signal transfer channels
GRADE extension – PS-GRADE: State information monitoring
8
[Flowchart: synchronizer control loop]
Start
Wait for a process/application state information message (input: state messages)
Update state records and make them available for control predicates
Is a new SCGS / OGS detected? No: wait for the next message. Yes: evaluate Condition 1, Condition 2, ..., Condition k
Condition i (i = 1, ..., k): Predicate i satisfied? Yes: send sync signals. No: no signal is sent
PS-GRADE - synchronizer
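The synchronizer flow diagram can be paraphrased as a short loop. This is a sketch under stated assumptions rather than PS-GRADE code: `run_synchronizer`, its parameters and the message format are hypothetical, and global state detection is passed in as a callable (`detect_scgs`) because the slide does not say how it is implemented.

```python
def run_synchronizer(messages, process_ids, predicates, detect_scgs, send_signal):
    """Handle state messages one at a time, following the flow diagram.

    messages    : iterable of (process_id, state) pairs
    predicates  : list of (predicate, targets); predicate takes the state records
    detect_scgs : hypothetical callable, True if the records form a new SCGS/OGS
    send_signal : callable(target, condition_number)
    """
    records = {}
    for pid, state in messages:        # wait for a state information message
        records[pid] = state           # update state records
        if len(records) < len(process_ids) or not detect_scgs(records):
            continue                   # no new global state: wait again
        for k, (predicate, targets) in enumerate(predicates, start=1):
            if predicate(records):     # predicate k satisfied?
                for target in targets:
                    send_signal(target, k)
    return records


sent = []
run_synchronizer(
    messages=[("P1", 3), ("P2", 4), ("P1", 9)],
    process_ids=["P1", "P2"],
    predicates=[(lambda r: r["P1"] + r["P2"] > 10, ["P1", "P2"])],
    detect_scgs=lambda r: True,        # placeholder: treat every update as an SCGS
    send_signal=lambda target, k: sent.append((target, k)),
)
print(sent)  # [('P1', 1), ('P2', 1)]
```

The predicate only fires on the third message, when the recorded states sum to 13; the first two messages update the records without producing any signal.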
9
[Screenshots: control flow window and condition window; callouts: reception of state variables, condition, send signal]
PS-GRADE – synchronizer
10
[Figure: process control flow diagram. Labels: start of signal-sensitive region ("watching-signal"); end of signal-sensitive region ("endwatching-signal"); resume interrupted computations; cancel computations; send state; start of signal-insensitive region; end of signal-insensitive region]
PS-GRADE – Process: control flow window
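The idea of a signal-sensitive region can be illustrated with a small sketch. It assumes, purely for illustration, that a computation is a list of steps and that the signal is polled between steps; the slide does not say how PS-GRADE delivers signals, and `sensitive_region` is a hypothetical name. Only the cancellation case is shown.

```python
import threading


def sensitive_region(steps, signal):
    """Run steps while watching `signal`; a set signal cancels what remains.

    Returns (results_so_far, cancelled).
    """
    done = []
    for step in steps:
        if signal.is_set():        # a signal arrived inside the sensitive region
            signal.clear()
            return done, True      # the caller performs cancellation handling
        done.append(step())
    return done, False


signal = threading.Event()
steps = [
    lambda: 1,
    lambda: (signal.set(), 2)[1],  # stands in for a synchronizer raising the signal
    lambda: 3,
]
done, cancelled = sensitive_region(steps, signal)
print(done, cancelled)  # [1, 2] True
```

Code placed outside such a region would simply not poll the signal, which corresponds to the signal-insensitive region in the diagram.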
12
The principles of Grid application control
Control of a Grid application by:
•Data control flow (similar to the P-GRADE Workflow implemented by SZTAKI), based on input and output files for cluster applications
•Grid Synchronizer:
  •Collects information (a vector of states) about the application state
  •Detects SCGS or OGS
  •Computes conditions
  •Sends signals to the application
13
A Grid-level synchronizer inserted into a workflow graph
[Figure: workflow graph with applications A1 to A6 and the synchronizer Synch1; the nodes carry numeric labels]
14
A Grid-level synchronizer and an application
[Figure: GRID Synchronizer with Applications A2, A3, A4 and A5]
16
Organizing Grid-level program execution control
The following co-operation schemes included in the proposed Grid environment will be discussed:
Simple workflow: A selected application starts executing after a set of selected applications has completed. Example: complicated scientific computations performed layer by layer in different computer networks.
[Figure: workflow graph with applications A1, A2, A3, A4 and A5; the nodes carry numeric labels]
17
Alternative workflow: One of several applications is selected for execution depending on the results (state) of earlier applications. Example: one of two available program packages is run depending on the results of the computations performed so far.
[Figure: workflow graph with application A1, synchronizer Synch1 and alternative applications A2.1 … A2.N]
Partial canceling of workflow: Applications that become superfluous with respect to the overall purpose of the computation are stopped. Example: An exhaustive parallel search on a Grid for the optimal solution in a solution space is stopped or restricted once the search has provided a satisfactory solution.
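The partial cancelling scheme can be sketched as a decision made by a synchronizer over incoming state reports. Everything here is illustrative: `partial_cancel`, the report format and the cost threshold are assumptions, and the application names merely follow the A2.1 … A2.N naming used in the workflow figures.

```python
def partial_cancel(reports, apps, threshold):
    """Decide which applications to stop once one reports a good enough result.

    reports   : iterable of (app_name, best_cost) state messages, in arrival order
    apps      : names of all search applications currently running
    threshold : cost at or below which a solution counts as satisfactory
    Returns (winner, apps_to_cancel), or (None, []) if nothing is satisfactory.
    """
    for app, best_cost in reports:
        if best_cost <= threshold:
            return app, [a for a in apps if a != app]
    return None, []


apps = ["A2.1", "A2.2", "A2.3"]
reports = [("A2.1", 120), ("A2.3", 95), ("A2.2", 90)]
print(partial_cancel(reports, apps, threshold=100))
# ('A2.3', ['A2.1', 'A2.2'])
```

The first satisfactory report wins, so the later and better report from A2.2 is never examined; a variant that restricts the other searches instead of stopping them would return a new bound rather than a cancellation list.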
18
Supporting workflow: A set of currently executing applications requires the activation of auxiliary applications, which will provide useful results. Example: In a coarse-grain simulation of a system of moving objects, a collision triggers a change in the Grid application configuration. An application that models the collision in detail (with a fine granularity of events) is activated. After the detailed simulation of the collision, the coarse-grain simulation is restored.
[Figure: workflow graph with application A1.1, synchronizer Synch1 and application A2.N]
Workflow coupling: A common (global) status of many applications is monitored, and control directives are distributed to particular applications if needed. Applications compute parameters that are subject to mutual exchange. Some parameters in meta-level applications are updated using the results of auxiliary computations.
[Figure: workflow graph with applications A1.1 … A1.N and A2.1 … A2.N coupled through the synchronizer Synch1]
19
Example – A Grid TSP application structure
[Figure: workflow graph with PreProc, Heur1, Heur2, B&B1, B&B2, B&B3, Synch1 and PostProc; the nodes carry numeric labels]
21
The TSP application
[Screenshots: B&B part: condition window; B&B part: communication diagram; heuristic search process; application structure]
22
•The paper has presented how synchronization-based parallel application control can be extended and ported to the GRID level. With the proposed method we can build advanced control of many applications running in the GRID environment. Inter-application coordination between programs executed on different GRID sites is supported.
•We identify five types of Grid-level program execution control
•The presented example shows that the new programming environment provides convenient means for designing complicated Grid application control. Being on the Grid, we can extend the time-consuming parts of the applications and run them on any available clusters during the middle stage of the algorithm. The best results from the heuristic part of the application, obtained in a shorter time than by the B&B computations, can help find the exact solution faster.
Conclusions