Neighborhood Collective Communication on Process Topologies

MPI process topologies specify a communication graph, but they implement no communication function themselves. Many applications require sparse nearest neighbor communications that can be expressed as graph topologies. We now describe several collective operations that perform communication along the edges of a process topology. All of these functions are collective; i.e., they must be called by all processes in the specified communicator. See the section on Collective Communication for an overview of the other, dense (global) collective communication operations and for the semantics of collective operations.

If the graph was created with MPI_DIST_GRAPH_CREATE_ADJACENT
with sources and destinations containing
0, ..., n-1, where n is the number of processes in the group
of comm_old
(i.e., the graph is fully connected and also includes
an edge from each node to itself),
then the sparse neighborhood communication routine performs the same data exchange
as the corresponding dense (fully-connected) collective operation.
In the case of a Cartesian communicator, only nearest neighbor communication
is provided, corresponding to rank_source and rank_dest
in MPI_CART_SHIFT with input disp=1.
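
For illustration, the following C sketch (not part of the standard; identifiers and counts are chosen for the example) constructs such a fully connected graph, including self-loops, with MPI_DIST_GRAPH_CREATE_ADJACENT. On this topology, MPI_NEIGHBOR_ALLGATHER performs the same data exchange as MPI_ALLGATHER on the underlying communicator.

    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int n, rank;
        MPI_Comm_size(MPI_COMM_WORLD, &n);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Every process lists all n ranks (0..n-1) as both sources and
           destinations, which includes an edge from each node to itself. */
        int *ranks = malloc(n * sizeof(int));
        for (int i = 0; i < n; i++) ranks[i] = i;

        MPI_Comm graph_comm;
        MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD,
                                       n, ranks, MPI_UNWEIGHTED,  /* sources */
                                       n, ranks, MPI_UNWEIGHTED,  /* destinations */
                                       MPI_INFO_NULL, 0 /* no reorder */,
                                       &graph_comm);

        int sendval = rank;
        int *recvbuf = malloc(n * sizeof(int));
        /* Performs the same exchange as MPI_Allgather(&sendval, 1, MPI_INT,
           recvbuf, 1, MPI_INT, MPI_COMM_WORLD) on this topology. */
        MPI_Neighbor_allgather(&sendval, 1, MPI_INT,
                               recvbuf, 1, MPI_INT, graph_comm);

        free(ranks); free(recvbuf);
        MPI_Comm_free(&graph_comm);
        MPI_Finalize();
        return 0;
    }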

Rationale.

Neighborhood collective communications enable communication on a process
topology. This high-level specification of data exchange among
neighboring processes enables optimizations in the MPI library because
the communication pattern is known statically (the topology).
Thus, the implementation can compute optimized message schedules during
creation of the topology [35]. This
functionality can significantly simplify the implementation of neighbor
exchanges [31].
(End of rationale.)

For a distributed graph topology, created with
MPI_DIST_GRAPH_CREATE, the sequence of neighbors in the
send and receive buffers at each process is defined as the sequence
returned by MPI_DIST_GRAPH_NEIGHBORS for destinations and
sources, respectively. For a general graph topology, created with
MPI_GRAPH_CREATE, the use of neighborhood collective
communication is restricted to adjacency matrices, where the number of
edges between any two processes is defined to be the same for both
processes (i.e., with a symmetric adjacency matrix). In this case,
the order of neighbors in the send and
receive buffers is defined as the sequence of neighbors as returned by
MPI_GRAPH_NEIGHBORS. Note that general graph topologies
should, where possible, be replaced by distributed graph topologies.
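
A possible usage pattern is sketched below in C, under the assumption that graph_comm was created by MPI_DIST_GRAPH_CREATE with MPI_UNWEIGHTED edge weights; the function name, BLOCK, and the payload are illustrative. It queries the neighbor sequence and lays out the send and receive buffers in exactly that order.

    #include <mpi.h>
    #include <stdlib.h>

    #define BLOCK 4  /* doubles exchanged per neighbor (illustrative) */

    /* graph_comm is assumed to be a communicator created with
       MPI_Dist_graph_create using MPI_UNWEIGHTED edge weights. */
    void exchange_with_neighbors(MPI_Comm graph_comm)
    {
        int indegree, outdegree, weighted;
        MPI_Dist_graph_neighbors_count(graph_comm, &indegree, &outdegree,
                                       &weighted);

        int *sources      = malloc(indegree  * sizeof(int));
        int *destinations = malloc(outdegree * sizeof(int));
        MPI_Dist_graph_neighbors(graph_comm,
                                 indegree, sources, MPI_UNWEIGHTED,
                                 outdegree, destinations, MPI_UNWEIGHTED);

        /* Send buffer: one block per destination, in the order returned
           above.  Receive buffer: one block per source, in that same
           reported order. */
        double *sendbuf = malloc((size_t)outdegree * BLOCK * sizeof(double));
        double *recvbuf = malloc((size_t)indegree  * BLOCK * sizeof(double));
        for (int i = 0; i < outdegree; i++)
            for (int k = 0; k < BLOCK; k++)
                sendbuf[i * BLOCK + k] = (double)destinations[i]; /* payload */

        MPI_Neighbor_alltoall(sendbuf, BLOCK, MPI_DOUBLE,
                              recvbuf, BLOCK, MPI_DOUBLE, graph_comm);

        free(sources); free(destinations); free(sendbuf); free(recvbuf);
    }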

For a Cartesian topology, created with MPI_CART_CREATE, the
sequence of neighbors in the send and receive buffers at each process is
defined by order of the dimensions, first the neighbor in the negative
direction and then in the positive direction with displacement 1. The
numbers of sources and destinations in the communication routines are
2*ndims with ndims defined in
MPI_CART_CREATE. If a neighbor does not exist, i.e., at the
border of a Cartesian topology in the case of a non-periodic virtual
grid dimension (i.e., periods[...]==false), then this neighbor is
defined to be MPI_PROC_NULL.
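
The following C sketch (grid shape and payload are illustrative) shows this ordering on a two-dimensional non-periodic Cartesian topology: the receive buffer holds 2*ndims = 4 blocks, ordered negative then positive direction for each dimension, matching rank_source and rank_dest from MPI_CART_SHIFT with disp=1.

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int nprocs;
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        int dims[2] = {0, 0}, periods[2] = {0, 0};  /* non-periodic */
        MPI_Dims_create(nprocs, 2, dims);

        MPI_Comm cart;
        MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 0, &cart);

        int rank;
        MPI_Comm_rank(cart, &rank);

        /* Receive order: -dim0, +dim0, -dim1, +dim1 (2*ndims = 4 blocks),
           the same pairs as MPI_Cart_shift(cart, d, 1, ...) for d = 0, 1. */
        int sendval = rank;
        int recvbuf[4];
        MPI_Neighbor_allgather(&sendval, 1, MPI_INT,
                               recvbuf, 1, MPI_INT, cart);

        MPI_Comm_free(&cart);
        MPI_Finalize();
        return 0;
    }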

If a neighbor in any of the functions is MPI_PROC_NULL, then the neighborhood collective communication behaves like a point-to-point communication with MPI_PROC_NULL in this direction. That is, the buffer is still part of the sequence of neighbors but it is neither communicated nor updated.
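
This behavior can be observed with a small C sketch on a one-dimensional non-periodic grid (the sentinel values are illustrative): border processes have MPI_PROC_NULL as one neighbor, and the corresponding receive-buffer slot retains its pre-call value.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int nprocs;
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
        int dims[1], periods[1] = {0};              /* non-periodic */
        dims[0] = nprocs;

        MPI_Comm cart;
        MPI_Cart_create(MPI_COMM_WORLD, 1, dims, periods, 0, &cart);

        int rank, left, right;
        MPI_Comm_rank(cart, &rank);
        MPI_Cart_shift(cart, 0, 1, &left, &right);  /* MPI_PROC_NULL at borders */

        int sendval = rank;
        int recvbuf[2] = {-1, -1};                  /* sentinel values */
        MPI_Neighbor_allgather(&sendval, 1, MPI_INT,
                               recvbuf, 1, MPI_INT, cart);

        /* On rank 0, left == MPI_PROC_NULL, so recvbuf[0] is still -1:
           the slot stays in the neighbor sequence but is never updated. */
        printf("rank %d: recvbuf = {%d, %d}\n", rank, recvbuf[0], recvbuf[1]);

        MPI_Comm_free(&cart);
        MPI_Finalize();
        return 0;
    }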
