1 Tuesday, October 10, 2006 To err is human, and to blame it on a computer is even more so. - Robert Orben


Tuesday, October 10, 2006

To err is human, and to blame it on a computer is even more so.

- Robert Orben


MPI (Message Passing Interface)

MPI is a specification for message passing libraries.

Standardized (replaced all previous message passing libraries)

Practical

Portable (vendor independent)

Efficient

Industry standard for writing message passing programs.

Both vendor and public-domain implementations are available.


MPI (Message Passing Interface)

1980s to early 1990s: A number of incompatible software tools existed for writing message passing programs for distributed memory systems.

The need for a standard arose.

MPI Forum

• 175 individuals from 40 organizations

• Parallel computer vendors, software programmers, academia and application scientists.


MPI (Message Passing Interface)

Originally, MPI was targeted for distributed memory systems.

The popularity of shared memory systems (SMP / NUMA architectures) led to MPI implementations for these platforms as well.

MPI is now used on just about any common parallel architecture including massively parallel machines, SMP clusters, workstation clusters and heterogeneous networks.


MPI (Message Passing Interface)

All parallelism is explicit.

The programmer is responsible for correctly identifying parallelism and using MPI routines.


MPI (Message Passing Interface)

Format of MPI calls

    ret = MPI_Xxxx(parameter, ...)

ret is MPI_SUCCESS if successful.


MPI (Message Passing Interface)

Communicators define which collection of processes may communicate with each other.

MPI_COMM_WORLD is the predefined communicator that includes all of the program's processes.


MPI (Message Passing Interface)

Rank

Unique integer identifier of a process within a communicator (ranks begin at zero and are contiguous).

Often used conditionally to control program execution.


MPI: the Message Passing Interface

The minimal set of MPI routines.

MPI_Init         Initializes MPI.

MPI_Finalize     Terminates MPI.

MPI_Comm_size    Determines the number of processes.

MPI_Comm_rank    Determines the label of calling process.

MPI_Send         Sends a message.

MPI_Recv         Receives a message.

MPI: Rich set of routines

100+ routines


MPI_Init

Initializes the MPI execution environment.

Must be called in every MPI program, before any other MPI functions.

Called only once in an MPI program.

May be used to pass the command line arguments to all processes.


MPI_Finalize

Terminates the MPI execution environment.

Last MPI routine called in every MPI program.


Hello World MPI Program

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size;

        MPI_Init(&argc, &argv);                  /* start MPI */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* rank of this process */
        MPI_Comm_size(MPI_COMM_WORLD, &size);    /* number of processes */
        printf("Hello, world! I am %d of %d\n", rank, size);
        MPI_Finalize();                          /* shut MPI down */
        return 0;
    }

Processes are viewed as arranged in one dimension.


Point-to-Point Communication

Message passing between two different MPI tasks.


Different types of send and receive routines.


Synchronous and asynchronous communication.


The Building Blocks: Send and Receive Operations

The prototypes of these operations are as follows:

    send(void *sendbuf, int nelems, int dest)
    receive(void *recvbuf, int nelems, int source)

Consider the following code segments:

    P0                      P1
    a = 100;                receive(&b, 1, 0);
    send(&a, 1, 1);         printf("%d\n", b);
    a = 0;

The semantics of the send operation require that the value received by process P1 must be 100 as opposed to 0.

This motivates the design of the send and receive protocols.
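The P0/P1 segments can be tried out with real MPI calls. The sketch below is one possible translation, assuming MPI_Send and MPI_Recv (introduced later in these slides) stand in for the generic send and receive, with rank 0 as P0 and rank 1 as P1. The message tag 0 and the use of MPI_STATUS_IGNORE are choices made for this example.

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 2) {
        if (rank == 0)
            printf("run with at least 2 processes\n");
    } else if (rank == 0) {                 /* P0 */
        int a = 100;
        MPI_Send(&a, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        a = 0;   /* safe: MPI_Send has returned, so the message holds 100 */
    } else if (rank == 1) {                 /* P1 */
        int b = -1;
        MPI_Recv(&b, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("%d\n", b);                  /* prints 100 */
    }

    MPI_Finalize();
    return 0;
}
```

Whether MPI_Send achieves this by waiting for the receiver or by copying the data into a buffer is left to the implementation; either way, P1 sees 100.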


Non-Buffered Blocking Message Passing Operations

Handshake for a blocking non-buffered send/receive operation.

It is easy to see that in cases where sender and receiver do not reach communication point at similar times, there can be considerable idling overheads.

Synchronous communication overhead.


Idling Overhead


Deadlocks in blocking non-buffered operations


In a perfect world, every send operation would be perfectly synchronized with its matching receive operation.


Suppose …

A send operation occurs 5 seconds before the receive is ready - where is the message while the receive is pending?

Multiple sends arrive at the same receiving task which can only accept one send at a time.


Buffered Blocking Message Passing Operations

A simple solution to the idling and deadlocking problem outlined above is to rely on buffers at the sending and receiving ends.

The sender simply copies the data into the designated buffer and returns after the copy operation has been completed.

The data must be buffered at the receiving end as well.

Buffering trades off idling overhead for buffer copying overhead.


The MPI implementation (not the MPI standard) decides what happens to data in these types of cases.

Typically, a system buffer area is reserved to hold data in transit.


Blocking

Most of the MPI point-to-point routines can be used in either blocking or non-blocking mode.

Blocking: A blocking send routine will only "return" after it is safe to modify the application buffer (your send data) for reuse.

Safe means that modifications will not affect the data intended for the receive task. Safe does not imply that the data was actually received - it may very well be sitting in a system buffer.


Blocking

A blocking send can be synchronous, which means a handshake with the receive task confirms a safe send.

A blocking send can be asynchronous if a system buffer is used to hold the data for eventual delivery to the receive.

A blocking receive only "returns" after the data has arrived and is ready for use by the program.
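The two flavours of blocking send can be contrasted in a short sketch. It assumes MPI_Ssend for the synchronous case and MPI_Bsend for the buffered case; note that MPI_Bsend uses a buffer the program attaches itself, which here stands in for the system buffer described above. The tags 0 and 1 and the variable names are illustrative.

```c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 2) {
        if (rank == 0)
            printf("run with at least 2 processes\n");
    } else if (rank == 0) {
        int x = 1, y = 2;

        /* Synchronous blocking send: returns only once the matching
           receive has started, i.e. after a handshake with rank 1. */
        MPI_Ssend(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);

        /* Buffered blocking send: the data is copied into a buffer we
           attach, so the call can return before rank 1 has received it. */
        int bytes;
        MPI_Pack_size(1, MPI_INT, MPI_COMM_WORLD, &bytes);
        bytes += MPI_BSEND_OVERHEAD;
        void *buffer = malloc(bytes);
        if (buffer == NULL)
            MPI_Abort(MPI_COMM_WORLD, 1);
        MPI_Buffer_attach(buffer, bytes);
        MPI_Bsend(&y, 1, MPI_INT, 1, 1, MPI_COMM_WORLD);
        y = 0;   /* safe: the message was already copied out of y */

        /* Detach blocks until the buffered message has been transmitted. */
        MPI_Buffer_detach(&buffer, &bytes);
        free(buffer);
    } else if (rank == 1) {
        int x = 0, y = 0;

        /* Blocking receives: each returns only when its data is usable. */
        MPI_Recv(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(&y, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("received %d and %d\n", x, y);   /* received 1 and 2 */
    }

    MPI_Finalize();
    return 0;
}
```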


Buffered Blocking Message Passing Operations

Bounded buffer sizes can have significant impact on performance.

    P0                              P1
    for (i = 0; i < 1000; i++){     for (i = 0; i < 1000; i++){
        produce_data(&a);               receive(&a, 1, 0);
        send(&a, 1, 1);                 consume_data(&a);
    }                               }

What if consumer was much slower than producer?



Buffered Blocking Message Passing Operations

Deadlocks are still possible with buffering since receive operations block.

    P0                      P1
    receive(&a, 1, 1);      receive(&a, 1, 0);
    send(&b, 1, 1);         send(&b, 1, 0);


    MPI_Send(void* buf,
             int count,
             MPI_Datatype datatype,
             int dest,
             int tag,
             MPI_Comm comm)


    MPI_Recv(void* buf,
             int count,
             MPI_Datatype datatype,
             int source,
             int tag,
             MPI_Comm comm,
             MPI_Status *status)


MPI Datatypes

MPI Datatype          C Datatype

MPI_CHAR              signed char
MPI_SHORT             signed short int
MPI_INT               signed int
MPI_LONG              signed long int
MPI_UNSIGNED_CHAR     unsigned char
MPI_UNSIGNED_SHORT    unsigned short int
MPI_UNSIGNED          unsigned int
MPI_UNSIGNED_LONG     unsigned long int
MPI_FLOAT             float
MPI_DOUBLE            double
MPI_LONG_DOUBLE       long double
MPI_BYTE
MPI_PACKED