![Page 1: Introduction to MPI Programming (Part III) Michael Griffiths, Deniz Savas & Alan Real January 2006](https://reader035.vdocument.in/reader035/viewer/2022062409/56649cb95503460f94980ed0/html5/thumbnails/1.jpg)
Introduction to MPI Programming
(Part III)
Michael Griffiths, Deniz Savas & Alan Real
January 2006
Overview
• Review blocking and non-blocking communications
• Collective Communication
  • Broadcast, Scatter & Gather of data
  • Reduction Operations
  • Barrier Synchronisation
• Processor topologies
• Patterns for Parallel Programming
• Exercises
Blocking operations
• Relate to when the operation has completed
• Only return from the subroutine call when the operation has completed
Non-blocking communication
Separate communication into three phases:
• Initiate non-blocking communication
• Do some work:
  • Perhaps involving other communications
• Wait for non-blocking communication to complete.
Collective Communications (one for all, all for one!!!)
Collective communication is defined as that which involves all the processes in a group. Collective communication routines can be divided into the following broad categories:
• Barrier synchronisation
• Broadcast from one to all.
• Scatter from one to all
• Gather from all to one.
• Scatter/Gather. From all to all.
• Global reduction (distribute elementary operations)
IMPORTANT NOTE: Collective Communication operations and point-to-point operations we have seen earlier are invisible to each other and hence do not interfere with each other. This is important to avoid dead-locks due to interference.
Timers
• Double precision MPI functions
  • Fortran, DOUBLE PRECISION t1:
    t1 = MPI_WTIME()
  • C, double t1:
    t1 = MPI_Wtime();
  • C++, double t1:
    t1 = MPI::Wtime();
• Time is measured in seconds.
• Time to perform a task is measured by consulting the timer before and after.
Practice Session 4: diffusion example
• Arrange processes to communicate round a ring.
• Each process stores a copy of its rank in an integer variable.
• Each process communicates this value to its right neighbour and receives a value from its left neighbour.
• Each process computes the sum of all the values received.
• Repeat for the number of processes involved and print out the sum stored at each process.
Generating Cartesian Topologies
• MPI_Cart_create
  • Makes a new communicator to which topology information has been attached
• MPI_Cart_coords
  • Determines process coords in cartesian topology given rank in group
• MPI_Cart_shift
  • Returns the shifted source and destination ranks, given a shift direction and amount
MPI_Cart_create syntax
Fortran:
INTEGER comm_old, ndims, dims(*), comm_cart, ierror
LOGICAL periods(*), reorder
CALL MPI_CART_CREATE(comm_old, ndims, dims, periods, reorder, comm_cart, ierror)
C:
int MPI_Cart_create(MPI_Comm comm_old, int ndims, int *dims, int *periods, int reorder, MPI_Comm *comm_cart);
C++:
MPI::Cartcomm MPI::Intracomm::Create_cart(int ndims, const int dims[], const bool periods[], bool reorder) const;
MPI_Comm_rank Syntax
MPI_Comm_rank - Determines the rank of the calling process in the communicator.
• int MPI_Comm_rank(MPI_Comm comm, int *rank)
• MPI_COMM_RANK(COMM, RANK, IERROR)
• int Comm::Get_rank() const
Transform Rank to Coordinates
MPI_Cart_coords syntax
Fortran:
CALL MPI_CART_COORDS(INTEGER COMM, INTEGER RANK, INTEGER MAXDIMS, INTEGER COORDS(*), INTEGER IERROR)
C:
int MPI_Cart_coords(MPI_Comm comm, int rank, int maxdims, int *coords);
C++:
void MPI::Cartcomm::Get_coords(int rank, int maxdims, int coords[]) const;
Transform Coordinates to Rank
MPI_Cart_rank syntax
Fortran:
CALL MPI_CART_RANK(INTEGER COMM, INTEGER COORDS(*), INTEGER RANK, INTEGER IERROR)
C:
int MPI_Cart_rank(MPI_Comm comm, int *coords, int *rank);
C++:
int MPI::Cartcomm::Get_cart_rank(const int coords[]) const;
MPI_Cart_shift syntax
Fortran:
CALL MPI_CART_SHIFT(INTEGER COMM, INTEGER DIRECTION, INTEGER DISP, INTEGER RANK_SOURCE, INTEGER RANK_DEST, INTEGER IERROR)
C:
int MPI_Cart_shift(MPI_Comm comm, int direction, int disp, int *rank_source, int *rank_dest);
C++:
void MPI::Cartcomm::Shift(int direction, int disp, int &rank_source, int &rank_dest) const;
Mapping 4x4 Cartesian Topology Onto Processor Ranks
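The figure for this slide has not survived in the transcript. MPI numbers the processes of a Cartesian communicator in row-major order, so the mapping is presumably the one computed by this plain C sketch; the function names are hypothetical, and in a real program MPI_Cart_rank and MPI_Cart_coords do this work.

```c
/* Row-major mapping between grid coordinates and rank, the ordering MPI
   uses for Cartesian topologies. (Hypothetical helpers, for
   illustration only.) */

/* (row, col) -> rank in a grid with ncols columns */
int grid_rank(int row, int col, int ncols)
{
    return row * ncols + col;
}

/* rank -> (row, col) in a grid with ncols columns */
void grid_coords(int rank, int ncols, int *row, int *col)
{
    *row = rank / ncols;
    *col = rank % ncols;
}
```

For a 4x4 grid, ranks 0 to 3 fill the first row, 4 to 7 the second, and so on up to rank 15 at coordinates (3, 3).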
Topologies: Examples
• See Diffusion example
• See cartesian example
Examples for Parallel Programming
• Master slave
  • E.g. share work example
  • Example ising model
• Communicating Sequential Elements Pattern
  • Poisson equation
• Highly coupled processes
  • Systolic loop algorithm
  • E.g. md example
Poisson Solver Using Jacobi Iteration
• Communicating Sequential Elements Pattern
  • Operations in each component depend on partial results in neighbour components.
[Figure: Slave Thread boxes linked by Data Exchange]
Layered Decomposition of 2d Array
• Distribute 2d array across processors
  • Processors store all columns
  • Rows allocated amongst processors
• Each proc has left proc and right proc
• Each proc has max and min vertex that it stores
• $U_{i,j}^{new} = (U_{i+1,j} + U_{i-1,j} + U_{i,j+1} + U_{i,j-1})/4$
• Each proc has a “ghost” layer
  • Used in calculation of update (see above)
  • Obtained from neighbouring left and right processors
• Pass top and bottom layers to neighbouring processors
  • Become neighbours' ghost layers
• Distribute rows over processors: N/nproc rows per proc
  • Every processor stores all N columns
[Figure: layered decomposition of the array across Processor 1 to Processor 4, with axis labels 1, N and N+1, layer bounds p1min, p2min, p2max and p3max, and arrows labelled Send top layer, Send bottom layer, Receive top layer and Receive bottom layer]
Master Slave
A computation is required where independent computations are performed, perhaps repeatedly, on all elements of some ordered data.
Example
• Image processing: perform computation on different sets of pixels within an image
[Figure: a Master and three Slave Threads linked by Data Exchange]
Highly Coupled Efficient Element Exchange
Highly Coupled Efficient Element Exchange using Systolic loop techniques
Extreme example of Communicating Sequential Elements Pattern
Systolic Loop
• Distribute Elements Over Processors
• Three buffers
  • Local elements
  • Travelling Elements (local elements at start)
  • Send buffer
• Loop over number of processors
  • Transfer travelling elements
    • Interleave send/receive to prevent deadlock
    • Send contents of send buffer to next proc
    • Receive buffer from previous proc to travelling elements
    • Point travelling elements to send buffer
  • Allow local elements to interact with travelling elements
• Accumulate reduced computations over processors
Systolic Loop Element Pump
Proc 1: Local Elements, Moving Elements (from 4)
Proc 2: Local Elements, Moving Elements (from 1)
Proc 3: Local Elements, Moving Elements (from 2)
Proc 4: Local Elements, Moving Elements (from 3)
First cycle of 3 for 4 processor systolic loop
Practice Sessions 5 and 6
• Defining and Using Processor Topologies
• Patterns for parallel computing
Further Information
• All MPI routines have a UNIX man page:
  • Use C-style definition for Fortran/C/C++:
  • E.g. “man MPI_Finalize” will give correct syntax and information for Fortran, C and C++ calls.
• Designing and building parallel programs (Ian Foster)
  • http://www-unix.mcs.anl.gov/dbpp/
• Standard documents:
  • http://www.mpi-forum.org/
• Many books and information on web.
• EPCC documents.