Specialized Sending and Receiving David Monismith CS599 Based upon notes from Chapter 3 of the MPI 3.0 Standard www.mpi-forum.org/docs/mpi-3.0/mpi30-report.pdf



MPI Message Passing

• Recall that messages are passed in MPI using MPI_Send and MPI_Recv

• MPI_Send - sends a message of a given size with a given type to a process with a specific rank.

• MPI_Recv - receives a message of a maximum size with a given type from a process with a specific rank.

• MPI_COMM_WORLD - the "world" in which the processes exist, i.e., the predefined communicator containing all of the processes. This is a constant.

Sending and Receiving Messages

• MPI_Send and MPI_Recv have the following parameters:

MPI_Send( pointer to message, message size, message type, process rank to send to, message tag or id, MPI_COMM_WORLD)

MPI_Recv( pointer to variable used to receive, maximum recv size, message type, process rank to receive from, message tag or id, MPI_COMM_WORLD, MPI_STATUS_IGNORE)
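As a minimal sketch of these parameters in use (not from the slides; assumes the program is launched with at least two processes), rank 0 might send one integer to rank 1 like this:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        /* Send one int to rank 1, with message tag 0. */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Receive at most one int from rank 0, with message tag 0. */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```

Run with, e.g., mpirun -np 2 ./a.out.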

MPI_Send and MPI_Recv

• Recall that we discussed that MPI_Send and MPI_Recv are blocking send and receive functions.

• These functions do block; however, MPI_Send may only block until the message to be sent has been transferred to a temporary buffer.

• It is up to the MPI implementation to decide whether to buffer the outgoing message.

• If MPI does not buffer the outgoing message, the send will block until a matching receive has been posted.

• It is possible (and currently true) that MPI_Send may continue after transferring control of its message to the MPI environment (i.e., the Open Runtime Environment).

• So, we refer to MPI_Send as a buffered send.

MPI_Sendrecv

• Recall that when discussing Odd-Even sort, we used a function that would send and receive messages in one operation.

• Syntax: MPI_Sendrecv(data_to_send, send_size, send_type, destination, send_tag, data_to_recv, recv_size, recv_type, source, recv_tag, communicator, status)

• Recall that this function allows for data to be both sent and received between processes without the need to worry about the order of the send and receive.

• Recall that this also avoids the problem of two blocking sends being issued at the same time (i.e. it helps to avoid deadlock).

MPI_Sendrecv Parameters

• data_to_send – reference to the data to be sent (input)
• send_size – amount of data to send (input)
• send_type – MPI type of the data to send (input)
• destination – rank of the process to which the data will be sent (input)
• send_tag – message identifier for the data to be sent (input)
• data_to_recv – reference to the location where data will be received (output)
• recv_size – maximum amount of data to receive (input)
• recv_type – MPI type of data to receive (input)
• source – rank of process from which data will be received (input)
• recv_tag – message identifier for message to be received (input)
• communicator – the world in which communication will occur (input, typically MPI_COMM_WORLD)
• status – status of the message receipt (output, either an MPI_Status variable or MPI_STATUS_IGNORE)
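A small sketch of MPI_Sendrecv in the ring pattern used by Odd-Even sort (an illustration, not the course's exact program): each rank sends its own rank to the next process and receives from the previous one in a single call, so neither side needs to order its send before its receive.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, size, recvd;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int next = (rank + 1) % size;        /* destination rank */
    int prev = (rank + size - 1) % size; /* source rank */

    /* Send to next and receive from prev in one operation. */
    MPI_Sendrecv(&rank,  1, MPI_INT, next, 0,
                 &recvd, 1, MPI_INT, prev, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    printf("Rank %d received %d from rank %d\n", rank, recvd, prev);
    MPI_Finalize();
    return 0;
}
```

Because the send and receive are combined, no pair of processes can deadlock on matching blocking sends.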

Other Methods to Send and Receive Messages

• Since programmers often need control of how sends and receives occur, MPI implements true blocking and true non-blocking send and receive functions.

• MPI_Ssend – MPI’s synchronous send.
– This function, when used with MPI_Recv, always blocks until it receives an acknowledgement that the message has indeed been received.

• MPI_Bsend – MPI’s buffered send.
– This function, when used with MPI_Recv, blocks until the message has been transferred to a buffer that the MPI framework can make use of.

• MPI_Isend and MPI_Irecv – MPI’s non-blocking send and receive functions.
– These functions return immediately after being called.
– MPI_Isend starts sending the message and does not wait for an acknowledgement.
– MPI_Irecv tells the MPI environment that a message will be received, but does not wait for the message to arrive.

• Non-blocking send and receive functions must be used with the MPI_Test and/or MPI_Wait functions in order to complete the sends and receives that were started.

Synchronous Send

• This function, when used with MPI_Recv, always blocks until it receives an acknowledgement that the message has indeed been received.

• Syntax: MPI_Ssend(buffer, count, type, destination, tag, communicator)

• buffer – the data to be sent
• count – the number of elements being sent
• type – the MPI type of the data being sent
• destination – the rank of the process to which the data will be sent
• tag – an identifier for the message
• communicator – currently MPI_COMM_WORLD
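A minimal synchronous-send sketch (assumes two processes): MPI_Ssend is a drop-in replacement for MPI_Send here, but it will not return on rank 0 until rank 1's matching receive has started.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 10;
        /* Blocks until the matching receive on rank 1 has begun. */
        MPI_Ssend(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```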

Buffered Send

• This function, when used with MPI_Recv, blocks until the message has been transferred to a buffer that the MPI framework can make use of.

• Syntax: MPI_Bsend(buffer, count, type, destination, tag, communicator)

• buffer – the data to be sent
• count – the number of elements being sent
• type – the MPI type of the data being sent
• destination – the rank of the process to which the data will be sent
• tag – an identifier for the message
• communicator – currently MPI_COMM_WORLD
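One detail to be aware of (standard MPI behavior, not covered on this slide): the buffer that MPI_Bsend copies into must be supplied by the programmer with MPI_Buffer_attach before the send. A sketch, assuming two processes:

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* MPI_Bsend copies into a user-attached buffer; it must be large
           enough for the data plus MPI_BSEND_OVERHEAD per message. */
        int bufsize = sizeof(int) + MPI_BSEND_OVERHEAD;
        char *buf = malloc(bufsize);
        MPI_Buffer_attach(buf, bufsize);

        value = 7;
        MPI_Bsend(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);

        /* Detach blocks until all buffered messages have been transmitted. */
        MPI_Buffer_detach(&buf, &bufsize);
        free(buf);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```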

Examples

• We will investigate using MPI_Bsend and MPI_Ssend with in-class examples.

• First, we will re-implement the ring program from worksheet 6, problem 2.

• Next we will modify the Odd-Even Sort program to make use of these.

Return Status of a Message

• It is possible to specify wildcard values in place of the message source and message tag:
– MPI_ANY_TAG – allow for any tag
– MPI_ANY_SOURCE – allow for any source

• In this case, it is possible to determine the status of the received message using the MPI_Status structure.

MPI_Status

• The MPI_Status struct contains three values in C.

• So, given a structure of type MPI_Status identified by status, these values can be retrieved as:
– status.MPI_SOURCE – the source of the message
– status.MPI_TAG – the tag of the message
– status.MPI_ERROR – the error code of the message

MPI_Get_count

• Question: What if we don’t know exactly how much data we will receive?

• Assume for now that our buffer is big enough to hold all the data that we need.
– This could be a problem if we are transferring gigabytes of data at a time, though.

• We could first send a message with the size.

• Or we could use MPI_Get_count to determine the size using the status of a message that was just received.

MPI_Get_count

• Syntax – MPI_Get_count(status, data_type, count)

• status – reference to the status variable (input)
• data_type – MPI type of the receive buffer elements (input)
• count – reference to the variable where the count will be stored (output)
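One common pattern (a sketch; it pairs MPI_Get_count with MPI_Probe, a standard MPI call not yet introduced on these slides) lets the receiver learn the message size before allocating a buffer:

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        int data[5] = {1, 2, 3, 4, 5};
        MPI_Send(data, 5, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Status status;
        int count;

        /* Probe blocks until a matching message arrives, without receiving it. */
        MPI_Probe(0, 0, MPI_COMM_WORLD, &status);
        /* Ask how many MPI_INT elements the pending message contains. */
        MPI_Get_count(&status, MPI_INT, &count);

        int *buf = malloc(count * sizeof(int));
        MPI_Recv(buf, count, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 received %d ints\n", count);
        free(buf);
    }

    MPI_Finalize();
    return 0;
}
```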

Example

• Let’s try to write a simple MPI example with two processes that send an unknown amount of data from one process to another.

Non-Blocking Sending and Receiving

• Performance can be improved by allowing computations to be performed while waiting for a message to be sent or while waiting to receive a message.

• The non-blocking send operation starts a send but does not complete it.

• Similarly, a non-blocking receive operation may start a receive, but not complete it.

• It may be necessary to issue separate send-completion and/or receive-completion operations, though.

• Be aware that non-blocking operations make use of MPI_Request objects to identify a communication operation and match that communication with the operation that will terminate it.

Non-blocking Send

• MPI_Isend (and similarly MPI_Ibsend and MPI_Issend) is used to send a message in a non-blocking fashion.

• Syntax – MPI_Isend(buffer, count, type, destination, tag, communicator, request)

• buffer – the data to be sent (input)
• count – the number of elements being sent (input)
• type – the MPI type of the data being sent (input)
• destination – the rank of the process to which the data will be sent (input)
• tag – an identifier for the message (input)
• communicator – the communication world, currently MPI_COMM_WORLD (input)
• request – the communication request result (output)

Non-Blocking Receive

• MPI_Irecv is used to request receipt of a message in a non-blocking fashion.

• Syntax – MPI_Irecv(buffer, count, type, source, tag, communicator, request)

• buffer – the data to be received (output)
• count – the maximum number of elements to receive (input)
• type – the MPI type of the data being received (input)
• source – the rank of the process from which the data will be received (input)
• tag – an identifier for the message (input)
• communicator – the communication world, currently MPI_COMM_WORLD (input)
• request – the communication request result (output)

But wait…

• If MPI_Irecv is non-blocking, how do we know if our message has been received?

• Non-blocking operations simply continue after the operation has been posted and leave it up to MPI to complete the operation.

• So, to truly know if the message has been received, we need to test and/or wait for its receipt.

MPI_Wait

• MPI_Wait is used to wait for completion of a communication, e.g., delivery of a message.

• It requires the request from MPI_Isend or MPI_Irecv to be passed in as a parameter.

• Syntax: MPI_Wait(request, status)

• request – a reference to the request structure from the communication to be completed (e.g., from MPI_Irecv) (input and output)

• status – the status of the message that was received (output)
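Putting MPI_Isend, MPI_Irecv, and MPI_Wait together (a sketch assuming two processes): each side starts its operation, is free to do other work, and then waits for completion.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, value;
    MPI_Request request;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 99;
        MPI_Isend(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &request);
        /* ... useful work could be done here ... */
        MPI_Wait(&request, MPI_STATUS_IGNORE); /* safe to reuse 'value' now */
    } else if (rank == 1) {
        MPI_Irecv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);
        /* ... useful work could be done here ... */
        MPI_Wait(&request, &status); /* message guaranteed received after this */
        printf("Rank 1 received %d from rank %d\n", value, status.MPI_SOURCE);
    }

    MPI_Finalize();
    return 0;
}
```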

MPI_Wait

• Calling MPI_Wait forces a process to block until the operation identified by the request completes, e.g., until the message is received.

• It is possible for a communication to be cancelled or to cause an error.

• These issues can be handled by examining the result stored in the status variable.

MPI_Test

• Generally, it is possible to complete work while waiting for a message to be delivered.

• As previously mentioned in class, messages are sent from point to point with send and receive.

• MPI acts in a fashion similar to a postal service.

• The sender puts the message in an envelope and sends it by handing the message off to MPI.

• The receiver waits for delivery of the message in its mailbox from the postal service (MPI).

• Instead of “standing by the mailbox,” it is possible to go “check the mailbox every once in a while,” and to do something else for a while if the message was not delivered.

• MPI_Test can be used to check for delivery of a message.

MPI_Test

• MPI_Test is used to check whether a message has been delivered.

• Syntax: MPI_Test(request, flag, status)

• request – reference to the requested message; note that this identifies the message (input)
• flag – reference to an integer (output); note that this is set to one if the message was delivered
• status – reference to the status object associated with the message (output); note that this may be used to determine whether delivery of the message was cancelled or resulted in an error

Example

• There are several non-blocking MPI examples on the course website.

• We will look at these examples and make modifications to them, to use MPI_Wait and MPI_Test.

• It is recommended that you try to implement a while loop that tests for delivery of a message, does work if the message has not been delivered, and exits the while loop after delivery has occurred.
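The recommended while-loop pattern might be sketched as follows (an illustration assuming two processes; the "other work" is left as a placeholder):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, value, flag = 0;
    MPI_Request request;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 123;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Irecv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);
        /* Test for delivery; do other work until the message arrives. */
        while (!flag) {
            MPI_Test(&request, &flag, &status);
            if (!flag) {
                /* ... do other useful work here while waiting ... */
            }
        }
        printf("Rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```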

Advanced Testing and Waiting

• It is possible to test for many messages and to wait for many messages.

• MPI provides the following functions to test and wait for any, some, and/or all of your messages.– MPI_Testany– MPI_Testsome– MPI_Testall– MPI_Waitany– MPI_Waitsome– MPI_Waitall

• You should read about each of these functions in the MPI 3.0 Standard.

Reading Assignment

• Read Chapter 3 of the MPI 3.0 Standard.

• Pay particular attention to sections 3.2, 3.4, 3.5, and 3.7 – 3.10.