

1

MPI Primer

Lesson 10


2

What is MPI

• MPI is the standard for multi-computer and cluster message passing introduced by the Message-Passing Interface Forum in April 1994. The goal of MPI is to develop a widely used standard for writing message-passing programs.


3

Historical Perspective


4

Major MPI Issues

1. Process Creation and Management: discusses the extension of MPI to remove the static process model of MPI. It defines routines that allow for the creation of processes.

2. One-Sided Communications: defines communication routines that can be completed by a single process. These include shared-memory-style operations (put/get) and remote accumulate operations.

3. Extended Collective Operations: extends the semantics of MPI-1 collective operations to include intercommunicators. It also adds more convenient methods of constructing intercommunicators and two new collective operations.

4. External Interfaces: defines routines designed to allow developers to layer libraries on top of MPI. This includes generalized requests, routines that decode MPI opaque objects, and threads.

5. I/O: defines MPI-2 support for parallel I/O.

6. Language Bindings: describes the C and C++ bindings and discusses Fortran-90 issues.


5

Message Passing

• Most popular model for distributed-memory systems
– Three steps when a message is passed:
  (1) Data is copied out of the sender's buffer and the message is assembled
  (2) The message is passed to the receiver
  (3) The message is disassembled and the data is copied into the receiver's buffer

• Communicator: specifies a domain in which communication takes place
– Two types of message passing:
  1. Intra-communicator message passing
  2. Inter-communicator message passing
– Remarks:
  1. A process may belong to several communicators at the same time
  2. A communicator is usually the entire collection of processes you get for your application


6

MPI_COMM_WORLD

• Specifies all processes available at initialization

– Rank
– Every message must have two attributes:
  1. The Envelope
  2. The Data
– Message Tag
– MPI datatype


7

Rank and two attributes

• Rank:
  1. An integer that uniquely identifies each process in your communicator.
  2. Rank goes from 0 through n-1 (n = number of processes).
  3. Rank can be retrieved with MPI_Comm_rank() (see the sketch below).

• Every message must have two attributes:
  1. The Envelope
     a. Rank of destination
     b. Message tag
     c. Communicator
  2. The Data
     a. Initial address of send buffer
     b. Number of entries to send
     c. Datatype of each entry
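A minimal sketch (not part of the original slides) of how a program obtains its rank; MPI_Comm_size is included here only to show where n, the number of processes, comes from.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start MPI */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* my rank: 0 .. n-1 */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* n = number of processes */

    printf("Process %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}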


8

Message Tag & MPI datatype

• Message tag:
  (1) An ID for this particular message, matched by both sender and receiver.
  (2) It is like sending multiple gifts to a friend: you need a way to tell them apart.
  (3) MPI_TAG_UB >= 32767
  (4) Similar in functionality to "comm" for grouping messages.
  (5) "comm" is safer than "tag", but "tag" is more convenient.

• MPI datatype: used to achieve portability among different architectures
  – Fortran types: MPI_INTEGER, MPI_REAL, MPI_DOUBLE_PRECISION, MPI_COMPLEX, MPI_LOGICAL
  – C types: MPI_INT, MPI_CHAR, MPI_FLOAT, MPI_DOUBLE
  – MPI_BYTE (either language)


9

Main Message Passing Functions

• Blocking message send:

  MPI_Send(
    a. initial address of send buffer
    b. number of entries to send
    c. datatype of each entry
    d. rank of destination
    e. message tag
    f. communicator );

• Blocking message receive:

  MPI_Recv(
    a. initial address of receive buffer
    b. maximum number of entries to receive
    c. datatype of each entry
    d. rank of source
    e. message tag
    f. communicator
    g. return status );
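A minimal sketch of a matching blocking send/receive pair using the arguments listed above; the buffer size, the tag value 99, and ranks 0 and 1 are illustrative choices, and the program assumes it is run with at least two processes.

#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    int rank;
    char msg[64];
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        strcpy(msg, "hello from rank 0");
        /* data: buffer, count, datatype; envelope: destination, tag, communicator */
        MPI_Send(msg, (int)strlen(msg) + 1, MPI_CHAR, 1, 99, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* data: buffer, max count, datatype; envelope: source, tag, communicator; plus status */
        MPI_Recv(msg, 64, MPI_CHAR, 0, 99, MPI_COMM_WORLD, &status);
        printf("rank 1 received: \"%s\"\n", msg);
    }

    MPI_Finalize();
    return 0;
}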


10

Message Selection (Pulling a Message)

• A receiver selects a message by its envelope information:
  (1) Source rank
  (2) Message tag

• It can also receive all messages using wildcards:
  (1) MPI_ANY_TAG
  (2) MPI_ANY_SOURCE

• You must always specify a "comm".

• MPI_Get_count(
    a. return status of the receive operation
    b. datatype of each receive-buffer entry
    c. number of received entries );
  This function decodes the "status" returned from MPI_Recv() (see the sketch below).

• Remarks:
  1. Message transfer is initiated by the sender (pushing), not by the receiver (pulling).
  2. Sending a message to yourself is allowed; it may produce deadlock if blocking calls are used.
  3. Passing messages containing multiple datatypes (a struct) is difficult:
     a. use packing/unpacking, or
     b. use a two-phase protocol (first the message description, then the message itself).
  4. Avoid wildcards as much as possible.
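A sketch of a wildcard receive followed by MPI_Get_count; the buffer size, payload, and tagging scheme are illustrative, and the program assumes at least two processes.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, data[4] = {1, 2, 3, 4}, buf[100], count;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank != 0) {
        /* every other rank sends a few ints to rank 0, tagged with its own rank */
        MPI_Send(data, 4, MPI_INT, 0, rank, MPI_COMM_WORLD);
    } else {
        /* rank 0 accepts one message from any source with any tag ... */
        MPI_Recv(buf, 100, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                 MPI_COMM_WORLD, &status);
        /* ... then decodes the status: how many entries arrived, and from whom */
        MPI_Get_count(&status, MPI_INT, &count);
        printf("got %d ints from rank %d with tag %d\n",
               count, status.MPI_SOURCE, status.MPI_TAG);
    }

    MPI_Finalize();
    return 0;
}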


11

MPI_Sendrecv()

A round trip of a message: send a message out and then receive another message.

• Useful for performing remote procedure calls (RPC): send the input parameters to the destination and then get the output back.

• Use it whenever you need to send AND receive a message.

  MPI_Sendrecv(
    sendbuf, sendcount, sendtype, dst-rank, send-tag,
    recvbuf, recvcount, recvtype, src-rank, recv-tag,
    comm, status );

• Remarks:
  1. It matches ordinary send and receive operations:
     a. a sendrecv can be received by a regular recv (src: sendrecv, dst: recv)
     b. a sendrecv can receive a message from a regular send (src: send, dst: sendrecv)
  2. Same "comm".
  3. The send and receive tags may differ.
  4. The send and receive buffers must be different (disjoint).
  5. Send-recv is a concurrent double call (send and receive at the same time).
  6. It avoids deadlock (makes a round-trip message exchange possible; see the sketch below).


12

MPI_Sendrecv_replace()

Same as above, except that a single buffer is used: after the call, the send buffer has been overwritten by the received message.

  MPI_Sendrecv_replace(
    buf, count, datatype,
    dst-rank, send-tag,
    src-rank, recv-tag,
    comm, status );
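The same ring shift, sketched with MPI_Sendrecv_replace: only one buffer is supplied, and its contents are replaced by the incoming message.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, val;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    val = rank;                       /* outgoing value; replaced by the received one */
    MPI_Sendrecv_replace(&val, 1, MPI_INT,
                         (rank + 1) % size, 0,         /* dst-rank, send-tag */
                         (rank + size - 1) % size, 0,  /* src-rank, recv-tag */
                         MPI_COMM_WORLD, &status);

    printf("rank %d now holds %d\n", rank, val);

    MPI_Finalize();
    return 0;
}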


13

Dummy Source or Destination: Null Processes

• MPI_PROC_NULL

• A send or recv with src = MPI_PROC_NULL or dst = MPI_PROC_NULL completes immediately and has no effect.

• Remarks:
  1. Convenient for keeping code balanced and symmetric (see the sketch below).
  2. Use it with care.
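A sketch of the symmetric-code use: a non-periodic shift in which the end processes use MPI_PROC_NULL, so the same MPI_Sendrecv line works for every rank. An operation addressed to MPI_PROC_NULL completes immediately and leaves the receive buffer unchanged.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, left, right, out, in = -1;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* the first and last ranks have no neighbor on one side */
    left  = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
    right = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;

    out = rank;
    MPI_Sendrecv(&out, 1, MPI_INT, right, 0,   /* send to the right (or to no one) */
                 &in,  1, MPI_INT, left,  0,   /* receive from the left (or from no one) */
                 MPI_COMM_WORLD, &status);

    printf("rank %d received %d\n", rank, in);

    MPI_Finalize();
    return 0;
}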


14

Two Message Protocols (Short and Long)

• Short: the sender sends the message directly; the receiver returns an acknowledgement (msg -> , <- ack).

• Long: the sender first sends a request-to-send; the receiver signals ready; the sender then sends the data; the receiver returns an acknowledgement.


15

Blocking & Non-Blocking Communication

• Blocking
  – Suppose we reverse the order of arrival at the communication point: Proc 1 executes the receive, but Proc 0 has not yet executed the send. We say that the MPI_Recv function is blocking: Proc 1 calls the receive function, but no message is available, so Proc 1 remains idle until one arrives. This is different from synchronous communication. In blocking communication, Proc 0 may have already buffered the message by the time Proc 1 is ready to receive, but the communication line joining the processes might be busy.

• Non-blocking
  – Most systems provide an alternative receive operation, MPI_Irecv (immediate receive). It has one more parameter than MPI_Recv: the request. With it, the process gets a return "immediately" from the call. For example, when Proc 1 calls MPI_Irecv, the call notifies the system that Proc 1 intends to receive a message from Proc 0 with the properties indicated by the arguments, and the system initializes the request argument. Proc 1 can then perform other useful work and later check back with the system (independently of what Proc 0 is doing), using the request argument, to see whether the message has arrived.

• The use of non-blocking communication can dramatically improve the performance of message-passing programs.

• If each node has a communication coprocessor, we can start a non-blocking communication and perform the computations that don't depend on the result of the communication (see the sketch below).
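A sketch of this overlap; do_independent_work() is a hypothetical placeholder for computation that does not depend on the incoming message, and the tag, buffer size, and partner ranks are illustrative.

#include <mpi.h>
#include <stdio.h>

/* placeholder for computation that does not depend on the incoming message */
static void do_independent_work(void) { /* ... */ }

int main(int argc, char *argv[])
{
    int rank, out = 7, buf[1];
    MPI_Request request;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Send(&out, 1, MPI_INT, 1, 42, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* phase 1: start the receive; the call returns immediately */
        MPI_Irecv(buf, 1, MPI_INT, 0, 42, MPI_COMM_WORLD, &request);

        do_independent_work();          /* overlap: work while the message is in flight */

        /* phase 2: complete the receive before using buf */
        MPI_Wait(&request, &status);
        printf("rank 1 received %d\n", buf[0]);
    }

    MPI_Finalize();
    return 0;
}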


16

Message Passing Functions

• Message passing functions can be either blocking or nonblocking.

• In blocking message passing, a call to a communication function won’t return until the operation is complete.

• Non-blocking communication consists of two phases:
  – First phase: a function is called that starts the communication.
  – Second phase: another function is called that completes the communication.
  – If the system has the capability to compute and communicate simultaneously, we can do useful computation between the two phases.


17

Non-blocking communication

• Three ways to order a message exchange between a sender and a receiver (a sketch of the recv-before-send ordering follows the list):

              Sender    Receiver
  – Method 1
       T1:    send      recv
       T2:    recv      send

  – Method 2
       T1:    send      send
       T2:    recv      recv

  – Method 3
       T1:    recv      send
       T2:    send      recv
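A sketch (assuming exactly two processes) of a recv-before-send exchange: each side posts a non-blocking receive first, then sends, then waits, so neither side can deadlock regardless of message size or buffering.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, partner, sendbuf, recvbuf;
    MPI_Request req;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    partner = 1 - rank;               /* assumes exactly two processes */
    sendbuf = rank;

    /* post the receive first, then send, then wait for the receive to finish */
    MPI_Irecv(&recvbuf, 1, MPI_INT, partner, 0, MPI_COMM_WORLD, &req);
    MPI_Send(&sendbuf, 1, MPI_INT, partner, 0, MPI_COMM_WORLD);
    MPI_Wait(&req, &status);

    printf("rank %d got %d from rank %d\n", rank, recvbuf, partner);

    MPI_Finalize();
    return 0;
}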


18

Completion operations

• MPI_Wait(
    request,
    status )

  This call returns only when the "request" is complete.

• MPI_Test(
    request,
    flag,
    status )

  (1) flag = TRUE if the request is complete; otherwise it is FALSE.
  (2) MPI_Wait() returns when the flag in MPI_Test() would be TRUE.

• Remarks:
  These calls allow an easy change of blocking code to non-blocking code (see the sketch below).
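A sketch of polling with MPI_Test instead of blocking in MPI_Wait; do_a_little_more_work() is a hypothetical placeholder, and the program assumes rank 1 is expecting one integer from rank 0.

#include <mpi.h>
#include <stdio.h>

/* placeholder for other work the receiver can do while waiting */
static void do_a_little_more_work(void) { /* ... */ }

int main(int argc, char *argv[])
{
    int rank, out = 3, in, flag = 0;
    MPI_Request request;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Send(&out, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Irecv(&in, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);
        while (!flag) {
            MPI_Test(&request, &flag, &status);   /* flag becomes TRUE when complete */
            if (!flag)
                do_a_little_more_work();
        }
        printf("rank 1 received %d\n", in);
    }

    MPI_Finalize();
    return 0;
}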


19

More Completion operations

• MPI_Request_free() removes the request handle, but will not cancel the message.

• MPI_Cancel() does cancel it.

• MPI_Waitany(
    list length,
    array of request handles,
    index of completed request handle,
    status object )

• MPI_Testany()

• Remarks:
  1. The additional Test and Wait tools allow easy migration of blocking code to non-blocking code (see the sketch below).
  2. MPI_Request_free() removes the request handle, but will not cancel the message.
  3. MPI_Cancel() will cancel the message.
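A sketch of MPI_Waitany: rank 0 posts one non-blocking receive per other rank and handles whichever completes first. The payload values are illustrative, and the program assumes at least two processes.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int rank, size, i, index, val;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        int *bufs = malloc((size - 1) * sizeof(int));
        MPI_Request *reqs = malloc((size - 1) * sizeof(MPI_Request));

        /* one pending receive per other rank */
        for (i = 1; i < size; i++)
            MPI_Irecv(&bufs[i - 1], 1, MPI_INT, i, 0, MPI_COMM_WORLD, &reqs[i - 1]);

        /* list length, array of request handles, index of completed request, status */
        MPI_Waitany(size - 1, reqs, &index, &status);
        printf("request %d (from rank %d) completed first, value %d\n",
               index, status.MPI_SOURCE, bufs[index]);

        /* complete the remaining receives before finalizing */
        MPI_Waitall(size - 1, reqs, MPI_STATUSES_IGNORE);
        free(bufs);
        free(reqs);
    } else {
        val = rank * 10;
        MPI_Send(&val, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}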


20

MPI Message

• The actual message passing in a program is carried out by the MPI functions MPI_Send (sends a message to a designated process) and MPI_Recv (receives a message from a process).

• Issues involved in message passing:
  1. A message must be composed and put in a buffer.
  2. The message must be "dropped in a mailbox"; in order to know where to deliver the message, we must "enclose the message in an envelope" carrying the destination address of the message.
  3. But the address alone isn't enough. Since the physical message is a sequence of electrical signals, the system needs to know where the message ends, i.e., the size of the message.
  4. To take appropriate action on the message, the receiver needs the return address, i.e., the address of the source process.
  5. A message type or tag also helps the receiver take the proper action on the message.
  6. The receiver needs to know which communicator the message comes from.

• Therefore, the message envelope contains:
  1. The rank of the receiver
  2. The rank of the sender
  3. A tag (message type)
  4. A communicator

• The actual message is stored in a block of memory. The system needs the count and datatype to determine how much storage is needed for the message:
  1. the count value
  2. the MPI datatype

• The message also needs a pointer so the system knows where to find the data:
  1. message pointer


21

Sending Message

• The parameters for MPI_Send and MPI_Recv are:

  int MPI_Send(
      void*         message   /* in  */,
      int           count     /* in  */,
      MPI_Datatype  datatype  /* in  */,
      int           dest      /* in  */,
      int           tag       /* in  */,
      MPI_Comm      comm      /* in  */)

  int MPI_Recv(
      void*         message   /* out */,
      int           count     /* in  */,
      MPI_Datatype  datatype  /* in  */,
      int           source    /* in  */,
      int           tag       /* in  */,
      MPI_Comm      comm      /* in  */,
      MPI_Status*   status    /* out */)


22

Send and Receive pair

• The status returns information about the data that was actually received. It references a struct with at least three members:

  status -> MPI_SOURCE   /* the rank of the process that sent the message */
  status -> MPI_TAG      /* the tag of the message */
  status -> MPI_ERROR    /* the error code for the receive */

• A send with tag A is matched by a receive with tag B only when the envelopes are identical; the receive may instead use the wildcards MPI_ANY_TAG and MPI_ANY_SOURCE to accept any tag or any sender.


23

In Summary

• The count and datatype determine the size of the message.
• The tag and comm are used to make sure that messages don't get mixed up.
• Each message consists of two parts: the data being transmitted and the envelope of information.

  Message
    Data:                    Envelope:
      pointer                  1. the rank of the receiver
      count                    2. the rank of the sender
      datatype                 3. a tag
                               4. a communicator
                               5. status (for receive)
