Parallel Programming with PVM
Prof. Sivarama Dandamudi
School of Computer Science
Carleton University
© S. Dandamudi
Parallel Algorithm Models

Five basic models:
- Data parallel model
- Task graph model
- Work pool model
- Master-slave model
- Pipeline model
In addition, hybrid models combine two or more of these.
Parallel Algorithm Models (cont’d)
Data parallel model
- One of the simplest of all the models
- Tasks are statically mapped onto processors
- Each task performs a similar operation on different data
  - Hence also called the data parallelism model
- Work may be done in phases; operations in different phases may differ
- Ex: Matrix multiplication
Parallel Algorithm Models (cont’d)
Data parallel model

    | A11 A12 |   | B11 B12 |   | C11 C12 |
    | A21 A22 | . | B21 B22 | = | C21 C22 |

A decomposition into four tasks:
Task 1: C11 = A11 B11 + A12 B21
Task 2: C12 = A11 B12 + A12 B22
Task 3: C21 = A21 B11 + A22 B21
Task 4: C22 = A21 B12 + A22 B22
Parallel Algorithm Models (cont’d)
Task graph model
- Parallel algorithm is viewed as a task-dependency graph
  - Hence also called the task parallelism model
- Typically used for tasks that operate on large amounts of data
- Static mapping is used to optimize the data-movement cost
  - Locality-based mapping is important
- Ex: Divide-and-conquer algorithms, parallel quicksort
Parallel Algorithm Models (cont’d)
Task parallelism
Parallel Algorithm Models (cont’d)
Work pool model
- Dynamic mapping of tasks onto processors
  - Important for load balancing
- Used on message-passing systems when the data associated with a task is relatively small
- Granularity of tasks
  - Too small: overhead in accessing tasks can increase
  - Too big: load imbalance
- Ex: Parallelization of loops by chunk scheduling
Parallel Algorithm Models (cont’d)
Master-slave model
- One or more master processes generate work and allocate it to worker processes
  - Also called the manager-worker model
- Suitable for both shared-memory and message-passing systems
- Master can potentially become a bottleneck
  - Granularity of tasks is important
Parallel Algorithm Models (cont’d)
Pipeline model
- A stream of data passes through a series of processes
  - Each process performs some task on the data
  - Also called the stream parallelism model
- Uses a producer-consumer relationship
  - Overlapped execution
- Useful in applications such as database query processing
- Potential problem: one slow process can delay the whole pipeline
Parallel Algorithm Models (cont’d)
Pipeline model

[Figure: a pipeline processing R1 through R5 stage by stage]

Pipelined processing can avoid writing temporary results to disk and reading them back.
Parallel Algorithm Models (cont’d)
Hybrid models
- Possible to use multiple models
- Hierarchical: different models at different levels
- Sequential: different models in different phases
- Ex: The major computation may use the task graph model, while each node of the graph uses data parallelism or the pipeline model
PVM

Parallel Virtual Machine
- Collaborative effort: Oak Ridge National Lab, University of Tennessee, Emory University, and Carnegie Mellon University
- Began in 1989
  - Version 1.0 was used internally
  - Version 2.0 released in March 1991
  - Version 3.0 in February 1993
PVM (cont’d)
Parallel virtual machine
- Targeted for heterogeneous network computing; hosts may differ in:
  - Architecture
  - Data format
  - Computational speed
  - Machine load
  - Network load
PVM Calls

Process control
- int tid = pvm_mytid(void)
  - Returns the tid of the calling process
  - Can be called multiple times
- int info = pvm_exit(void)
  - Does not kill the process
  - Tells the local pvmd that this process is leaving PVM
  - info < 0 indicates an error (e.g., pvmd not responding)
PVM Calls (cont’d)
Process control
- int numt = pvm_spawn(char *task, char **argv, int flag, char *where, int ntask, int *tids)
  - Starts ntask copies of the executable file task
  - argv: arguments to task (NULL-terminated)
  - where: a specific host, or a specific architecture (PVM_ARCH), depending on flag
PVM Calls (cont’d)
flag specifies options:

Value  Option          Meaning
0      PvmTaskDefault  PVM chooses where to spawn
1      PvmTaskHost     where specifies a host
2      PvmTaskArch     where specifies an architecture
4      PvmTaskDebug    Starts tasks under a debugger
PVM Calls (cont’d)
Process control
- int info = pvm_kill(int tid)
  - Kills the PVM task identified by tid
  - Does not kill the calling task; to kill the calling task, first call pvm_exit(), then exit()
  - Writes to the log file /tmp/pvml.<uid>
PVM Calls (cont’d)
Information
- int tid = pvm_parent(void)
  - Returns the tid of the process that spawned the calling task
  - Returns PvmNoParent if the task was not created by pvm_spawn()
PVM Calls (cont’d)
Information
- int info = pvm_config(int *nhost, int *narch, struct pvmhostinfo **hostp)
  - Returns nhost = number of hosts
  - Returns narch = number of different data formats
PVM Calls (cont’d)
Message sending involves three steps:
1. Initialize the send buffer: use pvm_initsend()
2. Pack the message: use pvm_pk*() (several pack routines are available)
3. Send the message: use pvm_send()
PVM Calls (cont’d)
Message sending
- int bufid = pvm_initsend(int encoding)
  - Called before packing a new message into the buffer
  - Clears the send buffer and creates a new one for packing a new message
  - bufid = new buffer id
PVM Calls (cont’d)
encoding can have three options:
- PvmDataDefault
  - XDR encoding is used by default
  - Useful for heterogeneous architectures
- PvmDataRaw
  - No encoding is done; messages are sent in their original form
- PvmDataInPlace
  - No buffer copying; the buffer should not be modified until sent
PVM Calls (cont’d)
Packing data
- Several routines are available (one for each data type); each takes three arguments
- int info = pvm_pkbyte(char *cp, int nitem, int stride)
  - nitem = # items to be packed
  - stride = stride in elements
PVM Calls (cont’d)
Packing data
- pvm_pkint, pvm_pklong, pvm_pkfloat, pvm_pkdouble, pvm_pkshort
- The pack-string routine requires only the NULL-terminated string pointer:
  - pvm_pkstr(char *cp)
PVM Calls (cont’d)
Sending data
- int info = pvm_send(int tid, int msgtag)
  - Sends the message in the packed buffer to task tid
  - Message is tagged with msgtag
  - Message tags are useful to distinguish different types of messages
PVM Calls (cont’d)
Sending data (multicast)
- int info = pvm_mcast(int *tids, int ntask, int msgtag)
  - Sends the message in the packed buffer to all tasks in the tids array (except itself)
  - tids array length is given by ntask
PVM Calls (cont’d)
Receiving data
- Two steps: receive the data, then unpack it
- Two versions:
  - Blocking: waits until the message arrives
  - Non-blocking: does not wait
PVM Calls (cont’d)
Receiving data
- Blocking receive: int info = pvm_recv(int tid, int msgtag)
  - Waits until a message with msgtag has arrived from task tid
  - Wildcard value (-1) is allowed for both msgtag and tid
PVM Calls (cont’d)
Receiving data
- Non-blocking receive: int info = pvm_nrecv(int tid, int msgtag)
  - If no message with msgtag has arrived from task tid, returns bufid = 0
  - Otherwise, behaves like the blocking receive
PVM Calls (cont’d)
Receiving data
- Probing for a message: int info = pvm_probe(int tid, int msgtag)
  - If no message with msgtag has arrived from task tid, returns bufid = 0
  - Otherwise, returns a bufid for the message but does not receive it
PVM Calls (cont’d)
Unpacking data (similar to the packing routines)
- pvm_upkint, pvm_upklong, pvm_upkfloat, pvm_upkdouble, pvm_upkshort, pvm_upkbyte
- The unpack-string routine requires only the NULL-terminated string pointer:
  - pvm_upkstr(char *cp)
PVM Calls (cont’d)
Buffer information
- Useful to find the size of the received message
- int info = pvm_bufinfo(int bufid, int *bytes, int *msgtag, int *tid)
  - Returns the msgtag, source tid, and size in bytes
Example

Finds the sum of the elements of a given vector
- Vector size is given as input
- The program can be run on a PVM with up to 10 nodes
  - Can be modified by changing a constant
- The vector size is assumed to be evenly divisible by the number of nodes in the PVM
  - Easy to remove this restriction
- Master (vecsum.c) and slave (vecsum_slave.c) programs
Example (cont’d)
vecsum.c

#include <stdio.h>
#include <sys/time.h>
#include "pvm3.h"

#define MAX_SIZE 250000  /* max. vector size */
#define NPROCS   10      /* max. number of PVM nodes */
Example (cont’d)
main()
{
  int cc, tid[NPROCS];
  long vector[MAX_SIZE];
  double sum = 0,
         partial_sum;  /* partial sum received from slaves */
  long i, vector_size;
Example (cont’d)
  int nhost,  /* actual # of hosts in PVM */
      size;   /* size of vector to be distributed */
  struct timeval start_time, finish_time;
  long sum_time;
Example (cont’d)
  printf("Vector size = ");
  scanf("%ld", &vector_size);

  for (i = 0; i < vector_size; i++)  /* initialize vector */
      vector[i] = i;

  gettimeofday(&start_time, (struct timezone *)0);  /* start time */
Example (cont’d)
  tid[0] = pvm_mytid();  /* establish my tid */

  /* get # of hosts using pvm_config() */
  pvm_config(&nhost, (int *)0, (struct pvmhostinfo **)0);

  size = vector_size/nhost;  /* size of vector to send to slaves */
Example (cont’d)
  if (nhost > 1)
      pvm_spawn("vecsum_slave", (char **)0, 0, "", nhost-1, &tid[1]);

  for (i = 1; i < nhost; i++) {  /* distribute data to slaves */
      pvm_initsend(PvmDataDefault);
      pvm_pklong(&vector[i*size], size, 1);
      pvm_send(tid[i], 1);
  }
Example (cont’d)
  for (i = 0; i < size; i++)  /* perform local sum */
      sum += vector[i];

  for (i = 1; i < nhost; i++) {  /* collect partial sums from slaves */
      pvm_recv(-1, 2);
      pvm_upkdouble(&partial_sum, 1, 1);
      sum += partial_sum;
  }
Example (cont’d)
  gettimeofday(&finish_time, (struct timezone *)0);  /* finish time */

  sum_time = (finish_time.tv_sec - start_time.tv_sec) * 1000000
             + finish_time.tv_usec - start_time.tv_usec;
             /* elapsed time in microseconds; printed in seconds below */
Example (cont’d)
  printf("Sum = %lf\n", sum);
  printf("Sum time on %d hosts = %lf sec\n",
         nhost, (double)sum_time/1000000);

  pvm_exit();
}
Example (cont’d)
vecsum_slave.c

#include "pvm3.h"

#define MAX_SIZE 250000

main()
{
  int ptid, bufid, vector_bytes;
  long vector[MAX_SIZE];
  double sum = 0;
  int i;
Example (cont’d)
  ptid = pvm_parent();        /* find parent tid */

  bufid = pvm_recv(ptid, 1);  /* receive data from master */

  /* use pvm_bufinfo() to find the number of bytes received */
  pvm_bufinfo(bufid, &vector_bytes, (int *)0, (int *)0);
Example (cont’d)
  pvm_upklong(vector, vector_bytes/sizeof(long), 1);  /* unpack */

  for (i = 0; i < vector_bytes/sizeof(long); i++)     /* local summation */
      sum += vector[i];
Example (cont’d)
  pvm_initsend(PvmDataDefault);  /* send sum to master */
  pvm_pkdouble(&sum, 1, 1);
  pvm_send(ptid, 2);             /* use msg type 2 for partial sum */

  pvm_exit();
}