106. Reduce-Scatter


MPI includes a variant of the reduce operations where the result is scattered to all processes in a group on return.

MPI_REDUCE_SCATTER(sendbuf, recvbuf, recvcounts, datatype, op, comm)
IN sendbuf starting address of send buffer (choice)
OUT recvbuf starting address of receive buffer (choice)
IN recvcounts non-negative integer array specifying the number of elements in result distributed to each process. Array must be identical on all calling processes.
IN datatype data type of elements of input buffer (handle)
IN op operation (handle)
IN comm communicator (handle)

int MPI_Reduce_scatter(void* sendbuf, void* recvbuf, int *recvcounts, MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)

MPI_REDUCE_SCATTER(SENDBUF, RECVBUF, RECVCOUNTS, DATATYPE, OP, COMM, IERROR)
    <type> SENDBUF(*), RECVBUF(*)
    INTEGER RECVCOUNTS(*), DATATYPE, OP, COMM, IERROR
void MPI::Comm::Reduce_scatter(const void* sendbuf, void* recvbuf, int recvcounts[], const MPI::Datatype& datatype, const MPI::Op& op) const = 0

If comm is an intracommunicator, MPI_REDUCE_SCATTER first performs an element-wise reduction on the vector of count = recvcounts[0] + ... + recvcounts[n-1] elements in the send buffer defined by sendbuf, count and datatype. Next, the resulting vector of results is split into n disjoint segments, where n is the number of members in the group. Segment i contains recvcounts[i] elements. The i-th segment is sent to process i and stored in the receive buffer defined by recvbuf, recvcounts[i] and datatype.
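For example, the following minimal C program (an illustrative sketch, not part of the standard; the data values and the choice of one result element per process are arbitrary) sums size-element vectors with MPI_SUM so that process i receives element i of the reduced vector:

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size, i, result;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int *sendbuf    = malloc(size * sizeof(int));
    int *recvcounts = malloc(size * sizeof(int));

    for (i = 0; i < size; i++) {
        sendbuf[i]    = rank + i;   /* arbitrary local contribution */
        recvcounts[i] = 1;          /* segment i holds one element  */
    }

    /* Element-wise sum of the size-element vectors across all
       processes; each process receives its own segment, here
       recvcounts[rank] = 1 element of the result. */
    MPI_Reduce_scatter(sendbuf, &result, recvcounts, MPI_INT,
                       MPI_SUM, MPI_COMM_WORLD);

    free(sendbuf);
    free(recvcounts);
    MPI_Finalize();
    return 0;
}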

Advice to implementors.

The MPI_REDUCE_SCATTER routine is functionally equivalent to an MPI_REDUCE collective operation with count equal to the sum of the recvcounts entries, followed by an MPI_SCATTERV with sendcounts equal to recvcounts. However, a direct implementation may run faster. (End of advice to implementors.)
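As an illustration of this equivalence (a sketch, not part of the standard), the following C routine fixes op and datatype to MPI_SUM and MPI_INT for brevity; tmpbuf and displs are illustrative names:

#include <mpi.h>
#include <stdlib.h>

void reduce_then_scatterv(int *sendbuf, int *recvbuf,
                          int *recvcounts, MPI_Comm comm)
{
    int rank, size, i, count = 0;
    int *tmpbuf = NULL, *displs = NULL;

    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    for (i = 0; i < size; i++)
        count += recvcounts[i];       /* total reduction length */

    if (rank == 0) {                  /* root holds the full result */
        tmpbuf = malloc(count * sizeof(int));
        displs = malloc(size * sizeof(int));
        displs[0] = 0;
        for (i = 1; i < size; i++)    /* segment i follows segment i-1 */
            displs[i] = displs[i-1] + recvcounts[i-1];
    }

    /* Step 1: reduce the full count-element vector to the root. */
    MPI_Reduce(sendbuf, tmpbuf, count, MPI_INT, MPI_SUM, 0, comm);

    /* Step 2: scatter segment i (recvcounts[i] elements) to process i. */
    MPI_Scatterv(tmpbuf, recvcounts, displs, MPI_INT,
                 recvbuf, recvcounts[rank], MPI_INT, 0, comm);

    if (rank == 0) {
        free(tmpbuf);
        free(displs);
    }
}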
The ``in place'' option for intracommunicators is specified by passing MPI_IN_PLACE in the sendbuf argument. In this case, the input data is taken from the top of the receive buffer.

If comm is an intercommunicator, then the result of the reduction of the data provided by processes in group A is scattered among processes in group B, and vice versa. Within each group, all processes provide the same recvcounts argument, and the sum of the recvcounts entries should be the same for the two groups.


Rationale.

The last restriction is needed so that the length of the send buffer can be determined by the sum of the local recvcounts entries. Otherwise, a communication would be needed to determine how many elements are to be reduced. (End of rationale.)
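Returning to the ``in place'' option described above, a minimal C sketch (illustrative, not part of the standard; the data values are arbitrary):

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size, i;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int *recvcounts = malloc(size * sizeof(int));
    int *recvbuf    = malloc(size * sizeof(int)); /* holds the full input vector */

    for (i = 0; i < size; i++) {
        recvcounts[i] = 1;                 /* one result element per process */
        recvbuf[i]    = rank * size + i;   /* input is read from recvbuf     */
    }

    /* With MPI_IN_PLACE the input is taken from the receive buffer;
       on return the first recvcounts[rank] elements of recvbuf hold
       this process's segment of the reduced result. */
    MPI_Reduce_scatter(MPI_IN_PLACE, recvbuf, recvcounts, MPI_INT,
                       MPI_SUM, MPI_COMM_WORLD);

    free(recvcounts);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}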

