High Performance Computing Course Notes 2007-2008: Message Passing Programming I
Post on 20-Dec-2015
High Performance Computing
Course Notes 2007-2008
Message Passing Programming I
Computer Science, University of Warwick
Message Passing Programming
Message Passing is the most widely used parallel programming model
Message passing works by creating a number of tasks, uniquely named, that interact by sending and receiving messages to and from one another (hence the message passing)
Generally, processes communicate by sending data from the address space of one process to that of another
Communication between processes (via files, pipes, sockets)
Communication between threads within a process (via a global data area)
Programs based on message passing can be based on standard sequential language programs (C/C++, Fortran), augmented with calls to library functions for sending and receiving messages
Message Passing Interface (MPI)
MPI is a specification, not a particular implementation
Does not specify process startup, error codes, amount of system buffer, etc
MPI is a library, not a language
The goals of MPI: functionality, portability and efficiency
Message passing model > MPI specification > MPI implementation
OpenMP vs MPI
In a nutshell
MPI is used on distributed-memory systems
OpenMP is used for code parallelisation on shared-memory systems
Both express parallelism explicitly
High-level control (OpenMP), lower-level control (MPI)
A little history
Message-passing libraries were developed for a number of early distributed-memory computers
By 1993 there were many vendor-specific implementations
By 1994 MPI-1 came into being
By 1997 MPI-2 was finalized
The MPI programming model
MPI standards -
MPI-1 (1.1, 1.2), MPI-2 (2.0)
Forwards compatibility preserved between versions
Standard bindings - for C, C++ and Fortran. Bindings also exist for Python, Java, etc. (all non-standard)
We will stick to the C binding for the lectures and coursework. More information on MPI: www.mpi-forum.org
Implementations - for your laptop, pick up MPICH, a free portable implementation of MPI (http://www-unix.mcs.anl.gov/mpi/mpich/index.htm)
Coursework will use MPICH
MPI
MPI is a complex system comprising 129 functions with numerous parameters and variants
Only six of them are indispensable, and a large number of useful programs can be written with those six alone
Other functions add flexibility (datatype), robustness (non-blocking send/receive), efficiency (ready-mode communication), modularity (communicators, groups) or convenience (collective operations, topology).
In the lectures, we are going to cover the most commonly encountered functions
The MPI programming model
A computation comprises one or more processes that communicate by calling library routines to send messages to, and receive messages from, other processes
(Generally) a fixed set of processes created at outset, one process per processor
Different from PVM
Intuitive interfaces for sending and receiving messages
Send(data, destination), Receive(data, source)
minimal interface
Not enough in some situations; we also need
Message matching – add a message_id to both the send and receive interfaces
They become Send(data, destination, msg_id), Receive(data, source, msg_id)
Message_id
• Is expressed as an integer, termed the message tag
• Allows the programmer to deal with the arrival of messages in an orderly fashion (queue and then deal with
How to express the data in the send/receive interfaces
Early stages: (address, length) for the send interface
(address, max_length) for the receive interface
These are not always adequate
The data to be sent may not be in contiguous memory locations
The storage format for data may not be the same, or known in advance, on heterogeneous platforms
Eventually, a triple (address, count, datatype) is used to express the data to be sent and (address, max_count, datatype) for the data to be received
This reflects the fact that a message has much more structure than just a string of bits, for example, (vector_A, 300, MPI_REAL)
Programmers can construct their own datatypes
Now, the interfaces become send(address, count, datatype, destination, msg_id) and receive(address, max_count, datatype, source, msg_id)
How to distinguish messages
Message tag is necessary, but not sufficient
So, the communicator is introduced …
Communicators
Messages are put into contexts
Contexts are allocated at run time by the system in response to programmer requests
The system can guarantee that each generated context is unique
The processes belong to groups
The notions of context and group are combined in a single object, which is called a communicator
A communicator identifies a group of processes and a communication context
The MPI library defines an initial communicator, MPI_COMM_WORLD, which contains all the processes running in the system
The messages from different process groups can have the same tag
So the send interface becomes send(address, count, datatype, destination, tag, comm)
Status of the received messages
The structure of the message status is added to the receive interface
Status holds the information about source, tag and actual message size
In the C language, source can be retrieved by accessing status.MPI_SOURCE,
tag can be retrieved by status.MPI_TAG and
actual message size can be retrieved by calling the function MPI_Get_count(&status, datatype, &count)
The receive interface becomes receive(address, maxcount, datatype, source, tag, communicator, status)
How to express source and destination
The processes in a communicator (group) are identified by ranks
If a communicator contains n processes, process ranks are integers from 0 to n-1
Source and destination processes in the send/receive interface are the ranks
Some other issues
In the receive interface, tag can be a wildcard (MPI_ANY_TAG), which matches a message with any tag
In the receive interface, source can also be a wildcard (MPI_ANY_SOURCE), which matches any source
MPI basics
First six functions (C bindings)
MPI_Send (buf, count, datatype, dest, tag, comm)
Send a message
buf address of send buffer
count no. of elements to send (>=0)
datatype of elements
dest process id of destination
tag message tag
comm communicator (handle)
Calculating the size of the data to be sent …
buf address of send buffer
count * sizeof (datatype) bytes of data
MPI_Recv (buf, count, datatype, source, tag, comm, status)
Receive a message
buf address of receive buffer (var param)
count max no. of elements in receive buffer (>=0)
datatype of receive buffer elements
source process id of source process, or MPI_ANY_SOURCE
tag message tag, or MPI_ANY_TAG
comm communicator
status status object
MPI_Init (int *argc, char ***argv)
Initiate a computation
argc (number of arguments) and argv (argument vector) are main program’s arguments
Must be called first, and once per process
MPI_Finalize ( )
Shut down a computation
The last thing that happens
MPI_Comm_size (MPI_Comm comm, int *size)
Determine number of processes in comm
comm is communicator handle, MPI_COMM_WORLD is the default (including all MPI processes)
size holds number of processes in group
MPI_Comm_rank (MPI_Comm comm, int *pid)
Determine id of current (or calling) process
pid holds id of current process
MPI basics – a basic example
#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, nprocs;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("Hello, world. I am %d of %d\n", rank, nprocs);
    MPI_Finalize();
}
mpirun -np 4 myprog
Hello, world. I am 1 of 4
Hello, world. I am 3 of 4
Hello, world. I am 0 of 4
Hello, world. I am 2 of 4
MPI basics – send and recv example
#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, i;
    int buffer[10];
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (size < 2) {
        printf("Please run with two processes.\n");
        MPI_Finalize();
        return 0;
    }
    if (rank == 0) {
        for (i=0; i<10; i++)
            buffer[i] = i;
        MPI_Send(buffer, 10, MPI_INT, 1, 123, MPI_COMM_WORLD);
    }
    if (rank == 1) {
        for (i=0; i<10; i++)
            buffer[i] = -1;
        MPI_Recv(buffer, 10, MPI_INT, 0, 123, MPI_COMM_WORLD, &status);
        for (i=0; i<10; i++) {
            if (buffer[i] != i)
                printf("Error: buffer[%d] = %d but is expected to be %d\n", i, buffer[i], i);
        }
    }
    MPI_Finalize();
}
MPI language bindings
Standard (accepted) bindings for Fortran, C and C++
Java bindings are a work in progress
JavaMPI: Java wrapper to native calls
mpiJava: JNI wrappers
jmpi: pure Java implementation of the MPI library
MPIJ: same idea
Java Grande Forum is trying to sort it all out
We will use the C bindings
High Performance Computing
Course Notes 2007-2008
Message Passing Programming II
Modularity
MPI supports modular programming via communicators
Provides information hiding by encapsulating local communications and having local namespaces for processes
All MPI communication operations specify a communicator (process group that is engaged in the communication)
Forming new communicators – one approach
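MPI_Comm world, workers;
MPI_Group world_group, worker_group;
int ranks[1];
int numprocs, myid, server;

MPI_Init(&argc, &argv);
world = MPI_COMM_WORLD;
MPI_Comm_size(world, &numprocs);
MPI_Comm_rank(world, &myid);
server = numprocs - 1;   /* the last process acts as the server */
MPI_Comm_group(world, &world_group);
ranks[0] = server;
/* worker_group = every process of world_group except the server */
MPI_Group_excl(world_group, 1, ranks, &worker_group);
MPI_Comm_create(world, worker_group, &workers);
MPI_Group_free(&world_group);
MPI_Group_free(&worker_group);
/* the server is not in worker_group, so it gets MPI_COMM_NULL, which must not be freed */
if (workers != MPI_COMM_NULL)
    MPI_Comm_free(&workers);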
Forming new communicators - functions
int MPI_Comm_group(MPI_Comm comm, MPI_Group *group)
int MPI_Group_excl(MPI_Group group, int n, int *ranks, MPI_Group *newgroup)
int MPI_Group_incl(MPI_Group group, int n, int *ranks, MPI_Group *newgroup)
int MPI_Comm_create(MPI_Comm comm, MPI_Group group, MPI_Comm *newcomm)
int MPI_Group_free(MPI_Group *group)
int MPI_Comm_free(MPI_Comm *comm)
Forming new communicators – another approach (1)
MPI_Comm_split (comm, colour, key, newcomm)
Creates one or more new communicators from the original comm
comm communicator (handle)
colour control of subset assignment (processes with the same colour are in the same new communicator)
key control of rank assignment
newcomm new communicator
Is a collective communication operation (must be executed by all processes in the process group comm)
Is used to (re-) allocate processes to communicator (groups)
Forming new communicators – another approach (2)
MPI_Comm_split (comm, colour, key, newcomm)
MPI_Comm comm, newcomm; int myid, colour;
MPI_Comm_rank(comm, &myid); // id of current process
colour = myid%3;
MPI_Comm_split(comm, colour, myid, &newcomm);
With 8 processes (ranks 0 1 2 3 4 5 6 7 in comm), three new communicators are formed:
colour 0: old ranks 0, 3, 6 get new ranks 0, 1, 2
colour 1: old ranks 1, 4, 7 get new ranks 0, 1, 2
colour 2: old ranks 2, 5 get new ranks 0, 1
Forming new communicators – another approach (3)
MPI_Comm_split (comm, colour, key, newcomm)
New communicator created for each new value of colour
Each new communicator (sub-group) comprises those processes that specify its value in colour
These processes are assigned new identifiers (ranks, starting at zero) with the order determined by the value of key (or by their ranks in the old communicator in event of ties)
Communications
Point-to-point communications: involving exactly two processes, one sender and one receiver
For example, MPI_Send() and MPI_Recv()
Collective communications: involving a group of processes
Collective operations
i.e. coordinated communication operations involving multiple processes
The programmer could do this by hand (tedious); MPI provides specialized collective communication operations
barrier – synchronize all processes
broadcast – sends data from one to all processes
gather – gathers data from all processes to one process
scatter – scatters data from one process to all processes
reduction operations – sums, multiplies etc. distributed data
all executed collectively (on all processes in the group, at the same time, with the same parameters)
MPI_Barrier (comm)
Global synchronization
comm is the communicator handle
No processes return from function until all processes have called it
Good way of separating one phase from another
Barrier synchronizations
You are only as quick as your slowest process
MPI_Bcast (buf, count, type, root, comm)
Broadcast data from root to all processes
buf address of input buffer (at root) or output buffer (at all other processes)
count no. of entries in buffer (>=0)
type datatype of buffer elements
root process id of root process
comm communicator
[Figure: one-to-all broadcast (MPI_BCAST): data item A0 held by the root is copied to every process]
Example of MPI_Bcast
Broadcast 100 ints from process 0 to every process in the group
MPI_Comm comm;
int array[100];
int root = 0;
…
MPI_Bcast (array, 100, MPI_INT, root, comm);
MPI_Gather (inbuf, incount, intype, outbuf, outcount, outtype, root, comm)
Collective data movement function
inbuf address of input buffer
incount no. of elements sent from each (>=0)
intype datatype of input buffer elements
outbuf address of output buffer (var param)
outcount no. of elements received from each
outtype datatype of output buffer elements
root process id of root process
comm communicator
inbuf, incount and intype describe the input to the gather
outbuf, outcount and outtype describe the output of the gather
root is the receiving process
[Figure: all-to-one gather (MPI_GATHER): processes 0 to 3 hold A0, A1, A2 and A3; after the gather the root holds A0 A1 A2 A3]
MPI_Gather example
Gather 100 ints from every process in group to root
MPI_Comm comm;
int gsize, sendarray[100];
int root, myrank, *rbuf = NULL;
...
MPI_Comm_rank( comm, &myrank); // find proc. id
if (myrank == root) {
    MPI_Comm_size( comm, &gsize); // find group size
    rbuf = (int *) malloc(gsize*100*sizeof(int)); // allocate receive buffer (root only)
}
MPI_Gather( sendarray, 100, MPI_INT, rbuf, 100, MPI_INT, root, comm);
MPI_Scatter (inbuf, incount, intype, outbuf, outcount, outtype, root, comm)
Collective data movement function
inbuf address of input buffer
incount no. of elements sent to each (>=0)
intype datatype of input buffer elements
outbuf address of output buffer
outcount no. of elements received by each
outtype datatype of output buffer elements
root process id of root process
comm communicator
[Figure: one-to-all scatter (MPI_SCATTER): the root holds A0 A1 A2 A3; after the scatter processes 0 to 3 hold A0, A1, A2 and A3 respectively]
Example of MPI_Scatter
MPI_Scatter is the reverse of MPI_Gather
It is as if the root sends to each process i using
MPI_Send(inbuf + i*incount*sizeof(intype), incount, intype, i, …), with the address offset counted in bytes
MPI_Comm comm;
int gsize, *sendbuf;
int root, rbuf[100];
…
MPI_Comm_size (comm, &gsize);
sendbuf = (int *) malloc (gsize*100*sizeof(int));
…
MPI_Scatter (sendbuf, 100, MPI_INT, rbuf, 100, MPI_INT, root, comm);
![Page 49: High Performance Computing Course Notes 2007-2008 Message Passing Programming I](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d535503460f94a2fa45/html5/thumbnails/49.jpg)
49Computer Science, University of WarwickComputer Science, University of Warwick
MPI_Reduce (inbuf, outbuf, count, type, op, root, comm)
Collective reduction function
inbuf: address of input buffer
outbuf: address of output buffer
count: no. of elements in input buffer (>=0)
type: datatype of input buffer elements
op: operation
root: process id of root process
comm: communicator
Collective operations
[Figure: MPI_REDUCE using MPI_MIN with root = 0. Four processes hold the pairs (2, 4), (5, 7), (0, 3), (6, 2); after the call, process 0 holds the element-wise minimum (0, 2).]
![Page 50: High Performance Computing Course Notes 2007-2008 Message Passing Programming I](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d535503460f94a2fa45/html5/thumbnails/50.jpg)
MPI_Reduce (inbuf, outbuf, count, type, op, root, comm)
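[Figure: MPI_REDUCE using MPI_SUM with root = 1. Four processes hold the pairs (2, 4), (5, 7), (0, 3), (6, 2); after the call, process 1 holds the element-wise sum (13, 16).]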
![Page 51: High Performance Computing Course Notes 2007-2008 Message Passing Programming I](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d535503460f94a2fa45/html5/thumbnails/51.jpg)
MPI_Allreduce (inbuf, outbuf, count, type, op, comm)
Collective reduction function
inbuf: address of input buffer
outbuf: address of output buffer (var param)
count: no. of elements in input buffer (>=0)
type: datatype of input buffer elements
op: operation
comm: communicator
Collective operations
[Figure: MPI_ALLREDUCE using MPI_MIN. Four processes hold the pairs (2, 4), (5, 7), (0, 3), (6, 2); after the call, every process holds the element-wise minimum (0, 2).]
![Page 52: High Performance Computing Course Notes 2007-2008 Message Passing Programming I](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d535503460f94a2fa45/html5/thumbnails/52.jpg)
Buffering in MPI communications
Application buffer: specified by the first parameter in MPI_Send/Recv functions
System buffer:
Hidden from the programmer and managed by the MPI library
Is limited in size and can easily be exhausted
![Page 53: High Performance Computing Course Notes 2007-2008 Message Passing Programming I](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d535503460f94a2fa45/html5/thumbnails/53.jpg)
Blocking and non-blocking communications
Blocking send: the call does not return until the application buffer can be reused (which often means that the data have been copied from the application buffer to the system buffer). Returning does not mean that the data have been received.
MPI_Send (buf, count, datatype, dest, tag, comm)
Blocking receive: the call does not return until the data are ready for the receiver to use (which often means that the data have been copied from the system buffer to the application buffer)
Non-blocking send/receive: the call returns immediately
The call only requests that the MPI library perform the operation; the user cannot predict when that will happen
Unsafe to modify the application buffer until you can make sure the requested operation has been performed (MPI provides routines to test this)
Can be used to overlap computation with communication, which may improve performance
MPI_Isend (buf, count, datatype, dest, tag, comm, request)
![Page 54: High Performance Computing Course Notes 2007-2008 Message Passing Programming I](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d535503460f94a2fa45/html5/thumbnails/54.jpg)
Testing non-blocking communications for completion
Completion tests come in two types:
WAIT type
TEST type
WAIT type: the WAIT type testing routines block until the communication has completed.
A non-blocking communication immediately followed by a WAIT-type test is equivalent to the corresponding blocking communication
TEST type: these routines return immediately with a TRUE or FALSE value, depending on whether the communication has completed
The process can perform other tasks while the communication has not yet completed
![Page 55: High Performance Computing Course Notes 2007-2008 Message Passing Programming I](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d535503460f94a2fa45/html5/thumbnails/55.jpg)
Testing non-blocking communications for completion
The WAIT-type test is:
MPI_Wait (request, status)
This routine blocks until the communication specified by the handle request has completed. The request handle will have been returned by an earlier call to a non-blocking communication routine.
The TEST-type test is:
MPI_Test (request, flag, status)
In this case the communication specified by the handle request is simply queried to see if the communication has completed and the result of the query (TRUE or FALSE) is returned immediately in flag.
![Page 56: High Performance Computing Course Notes 2007-2008 Message Passing Programming I](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d535503460f94a2fa45/html5/thumbnails/56.jpg)
Testing multiple non-blocking communications for completion
Wait for all communications to complete
MPI_Waitall (count, array_of_requests, array_of_statuses)
This routine blocks until all the communications specified by the request handles, array_of_requests, have completed. The statuses of the communications are returned in the array array_of_statuses and each can be queried in the usual way for the source and tag if required
Test if all communications have completed
MPI_Testall (count, array_of_requests, flag, array_of_statuses)
If all the communications have completed, flag is set to TRUE, and information about each of the communications is returned in array_of_statuses. Otherwise flag is set to FALSE and array_of_statuses is undefined.
![Page 57: High Performance Computing Course Notes 2007-2008 Message Passing Programming I](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d535503460f94a2fa45/html5/thumbnails/57.jpg)
Testing multiple non-blocking communications for completion
Query a number of communications at a time to find out if any of them have completed
Wait: MPI_Waitany (count, array_of_requests, index, status)
MPI_WAITANY blocks until one or more of the communications associated with the array of request handles, array_of_requests, has completed.
The index of the completed communication in the array_of_requests handles is returned in index, and its status is returned in status.
Should more than one communication have completed, the choice of which is returned is arbitrary.
Test: MPI_Testany (count, array_of_requests, index, flag, status)
The result of the test (TRUE or FALSE) is returned immediately in flag.