
1

Tuesday, October 10, 2006

To err is human, and to blame it on a computer is

even more so.

- Robert Orben

2

MPI (Message Passing Interface)

MPI is a specification for message passing libraries.

Standardized (replaced the earlier, incompatible message passing libraries)

Practical

Portable (vendor independent)

Efficient

Industry standard for writing message passing programs.

Implementations are available from both vendors and the public domain.

3

MPI (Message Passing Interface)

1980s - early 1990s: a number of incompatible software tools existed for writing message passing programs for distributed memory systems.

The need for a standard arose, leading to the MPI Forum:

• 175 individuals from 40 organizations
• Parallel computer vendors, software programmers, academia, and application scientists.

4

MPI (Message Passing Interface)

Originally, MPI was targeted for distributed memory systems.

Popularity of shared memory systems (SMP / NUMA architectures) resulted in appearance of MPI implementations for these platforms.

MPI is now used on just about any common parallel architecture including massively parallel machines, SMP clusters, workstation clusters and heterogeneous networks.

5

MPI (Message Passing Interface)

All parallelism is explicit: the programmer is responsible for correctly identifying parallelism and implementing it using MPI routines.

6

MPI (Message Passing Interface)

Format of MPI calls:

ret = MPI_Xxxx(parameter, ...)

ret is MPI_SUCCESS if the call completed successfully.
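As an illustration, here is a minimal sketch of this calling convention with an explicit error-code check. (By default most MPI implementations abort on error, so checks like this are often omitted in practice; this is just to show the return-value pattern.)

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int ret, size;

    MPI_Init(&argc, &argv);

    /* Every MPI call returns an error code; MPI_SUCCESS means it worked. */
    ret = MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (ret != MPI_SUCCESS)
        fprintf(stderr, "MPI_Comm_size failed with code %d\n", ret);
    else
        printf("Running with %d processes\n", size);

    MPI_Finalize();
    return 0;
}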

7

MPI (Message Passing Interface)

8

MPI (Message Passing Interface)

Communicators define which collection of processes may communicate with each other.

MPI_COMM_WORLD is the predefined communicator that includes all MPI processes.

9

MPI (Message Passing Interface)

Rank

A unique integer identifier assigned to each process within a communicator (ranks begin at zero and are contiguous).

Often used in conditionals to control program execution, as in the sketch below.
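For instance, a minimal sketch of rank-based conditional control (the coordinator/worker split here is just an illustrative choice):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Rank-based branching: process 0 takes one role, the rest another. */
    if (rank == 0)
        printf("Rank %d: acting as the coordinator\n", rank);
    else
        printf("Rank %d: acting as a worker\n", rank);

    MPI_Finalize();
    return 0;
}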

10

MPI: the Message Passing Interface

The minimal set of MPI routines.

MPI_Init        Initializes MPI.
MPI_Finalize    Terminates MPI.
MPI_Comm_size   Determines the number of processes.
MPI_Comm_rank   Determines the label (rank) of the calling process.
MPI_Send        Sends a message.
MPI_Recv        Receives a message.

MPI: Rich set of routines

100+ routines

11

MPI_Init

Initializes the MPI execution environment.

Must be called in every MPI program, before any other MPI function, and only once.

May be used to pass the command line arguments to all processes.

12

MPI_Finalize

Terminates the MPI execution environment.

Must be the last MPI routine called in every MPI program.

13

Hello World MPI Program

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello, world! I am %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

The processes can be viewed as arranged in one dimension (ranks 0 through size - 1).
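With a typical MPI installation, the program is compiled with the MPI compiler wrapper and started with the MPI launcher; the exact launcher name and flags (mpirun vs. mpiexec) vary by implementation, so treat the following as an illustrative sketch:

mpicc hello.c -o hello
mpirun -np 4 ./hello

Each of the four processes prints one "Hello, world! I am ... of 4" line; the order of the lines is not guaranteed.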

14

Point-to-Point Communication

Message passing between two different MPI tasks.

15

Different types of send and receive routines.

16

Synchronous and asynchronous communication.

17

The Building Blocks: Send and Receive Operations

The prototypes of these operations are as follows:

send(void *sendbuf, int nelems, int dest)
receive(void *recvbuf, int nelems, int source)

Consider the following code segments:

Process P0:
  a = 100;
  send(&a, 1, 1);
  a = 0;

Process P1:
  receive(&b, 1, 0);
  printf("%d\n", b);

The semantics of the send operation require that the value received by process P1 must be 100 as opposed to 0.

This motivates the design of the send and receive protocols.
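For concreteness, here is a hedged sketch of the same example written with the MPI point-to-point calls introduced later in these slides (the tag value 0 and the use of MPI_COMM_WORLD are arbitrary choices made for the sketch):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, a, b;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        a = 100;
        /* Blocking send: returns once it is safe to modify 'a' again,
           so the assignment below cannot corrupt the message. */
        MPI_Send(&a, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        a = 0;
    } else if (rank == 1) {
        MPI_Recv(&b, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("%d\n", b);   /* prints 100, not 0 */
    }

    MPI_Finalize();
    return 0;
}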

18

Non-Buffered Blocking Message Passing Operations

Handshake for a blocking non-buffered send/receive operation.

It is easy to see that in cases where the sender and receiver do not reach the communication point at similar times, there can be considerable idling overheads.

Synchronous communication overhead.

19

Non-Buffered Blocking Message Passing Operations

Handshake for a blocking non-buffered send/receive operation.

It is easy to see that in cases where the sender and receiver do not reach the communication point at similar times, there can be considerable idling overheads.

Idling Overhead

20

Deadlocks in blocking non-buffered operations

21

In a perfect world, every send operation would be perfectly timed with a matching receive operation.

22

Suppose …

A send operation occurs 5 seconds before the receive is ready - where is the message while the receive is pending?

Multiple sends arrive at the same receiving task, which can only accept one send at a time.

23

Buffered Blocking Message Passing Operations

A simple solution to the idling and deadlocking problems outlined above is to rely on buffers at the sending and receiving ends.

The sender simply copies the data into the designated buffer and returns after the copy operation has been completed.

The data must be buffered at the receiving end as well.

Buffering trades off idling overhead for buffer copying overhead.
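MPI exposes an explicit buffered send mode (MPI_Bsend with a user-attached buffer). Below is a minimal sketch of that mode; the buffer sizing and tag are chosen arbitrarily for illustration, not taken from these slides:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, a = 42, b, bufsize;
    void *buffer;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Reserve user-level buffer space for buffered sends. */
        MPI_Pack_size(1, MPI_INT, MPI_COMM_WORLD, &bufsize);
        bufsize += MPI_BSEND_OVERHEAD;
        buffer = malloc(bufsize);
        MPI_Buffer_attach(buffer, bufsize);

        /* MPI_Bsend copies the data into the attached buffer and returns,
           whether or not the matching receive has been posted yet. */
        MPI_Bsend(&a, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);

        MPI_Buffer_detach(&buffer, &bufsize);
        free(buffer);
    } else if (rank == 1) {
        MPI_Recv(&b, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("received %d\n", b);
    }

    MPI_Finalize();
    return 0;
}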

24

The MPI implementation (not the MPI standard) decides what happens to data in these types of cases.

Typically, a system buffer area is reserved to hold data in transit.

25

26

Blocking

Most of the MPI point-to-point routines can be used in either blocking or non-blocking mode.

Blocking: a blocking send routine will only "return" after it is safe to modify the application buffer (your send data) for reuse.

Safe means that modifications will not affect the data intended for the receive task. Safe does not imply that the data was actually received - it may very well be sitting in a system buffer.

27

Blocking

A blocking send can be synchronous which means there is handshaking occurring with the receive task to confirm a safe send.

A blocking send can be asynchronous if a system buffer is used to hold the data for eventual delivery to the receive.

A blocking receive only "returns" after the data has arrived and is ready for use by the program.

28

Buffered Blocking Message Passing Operations

Bounded buffer sizes can have a significant impact on performance.

Process P0:
  for (i = 0; i < 1000; i++) {
      produce_data(&a);
      send(&a, 1, 1);
  }

Process P1:
  for (i = 0; i < 1000; i++) {
      receive(&a, 1, 0);
      consume_data(&a);
  }

What if the consumer were much slower than the producer?
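Rendered in MPI, the same producer-consumer pattern might look like the sketch below. produce_data and consume_data are hypothetical stand-ins, and how far the producer can run ahead of the consumer depends on how much the implementation buffers standard-mode sends:

#include <stdio.h>
#include <mpi.h>

/* Hypothetical stand-ins for the produce/consume steps in the slide. */
static void produce_data(int *a, int i) { *a = i; }
static void consume_data(const int *a)  { (void)a; /* use *a here */ }

int main(int argc, char *argv[])
{
    int rank, a, i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {                       /* producer */
        for (i = 0; i < 1000; i++) {
            produce_data(&a, i);
            /* Standard-mode send: whether this returns immediately or
               waits for the receiver depends on implementation buffering. */
            MPI_Send(&a, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        }
    } else if (rank == 1) {                /* consumer */
        for (i = 0; i < 1000; i++) {
            MPI_Recv(&a, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            consume_data(&a);
        }
    }

    MPI_Finalize();
    return 0;
}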

29

Buffered Blocking Message Passing Operations

Process P0:
  receive(&a, 1, 1);
  send(&b, 1, 1);

Process P1:
  receive(&a, 1, 0);
  send(&b, 1, 0);

30

Buffered Blocking Message Passing Operations

Deadlocks are still possible with buffering, since receive operations block.

Process P0:
  receive(&a, 1, 1);
  send(&b, 1, 1);

Process P1:
  receive(&a, 1, 0);
  send(&b, 1, 0);
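One common way to avoid this kind of cyclic wait in MPI is to let the library pair the two operations with MPI_Sendrecv. The following is a hedged sketch of the two-process exchange above, with arbitrary tag and data values:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, partner, a, b;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank < 2) {
        partner = 1 - rank;      /* rank 0 exchanges with rank 1 */
        b = rank;                /* value this process sends */

        /* MPI_Sendrecv posts the send and the receive together, so neither
           process blocks forever waiting for the other to send first. */
        MPI_Sendrecv(&b, 1, MPI_INT, partner, 0,
                     &a, 1, MPI_INT, partner, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        printf("Rank %d received %d\n", rank, a);
    }

    MPI_Finalize();
    return 0;
}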

31

MPI_Send(void *buf,
         int count,
         MPI_Datatype datatype,
         int dest,
         int tag,
         MPI_Comm comm)

32

MPI_Recv(void *buf,
         int count,
         MPI_Datatype datatype,
         int source,
         int tag,
         MPI_Comm comm,
         MPI_Status *status)
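The status argument is what lets a receiver find out where a message actually came from, which tag it carried, and how many elements arrived; this matters when wildcards such as MPI_ANY_SOURCE and MPI_ANY_TAG are used. A minimal sketch (the specific values sent are arbitrary):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, value, count;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 1) {
        value = 10;
        MPI_Send(&value, 1, MPI_INT, 0, 7, MPI_COMM_WORLD);
    } else if (rank == 0) {
        /* Receive from any sender with any tag, then inspect the status. */
        MPI_Recv(&value, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                 MPI_COMM_WORLD, &status);
        MPI_Get_count(&status, MPI_INT, &count);
        printf("Got %d element(s), value %d, from rank %d with tag %d\n",
               count, value, status.MPI_SOURCE, status.MPI_TAG);
    }

    MPI_Finalize();
    return 0;
}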

33

MPI Datatypes

MPI Datatype          C Datatype

MPI_CHAR signed char

MPI_SHORT signed short int

MPI_INT signed int

MPI_LONG signed long int

MPI_UNSIGNED_CHAR unsigned char

MPI_UNSIGNED_SHORT unsigned short int

MPI_UNSIGNED unsigned int

MPI_UNSIGNED_LONG unsigned long int

MPI_FLOAT float

MPI_DOUBLE double

MPI_LONG_DOUBLE long double

MPI_BYTE

MPI_PACKED
