RE: information for profiling

Lewins, Lloyd J (llewins@msmail4.hac.com)
21 Nov 1995 08:39:53 -0800

The following are hooks into the MPI implementation which I have used to
implement profiling:

For MPI_Start, MPI_Startall:
The following fields from the Request: request-type, tag, comm,
message length (bytes_as_contig), destination rank (either global or local)

For MPI_Wait*, MPI_Test*:
The following fields from the Request: request-type, comm

For MPI_Send, MPI_Isend, MPI_Ssend, MPI_Bsend, MPI_Recv:
comm-context

Several functions also need an efficient way to translate local ranks to
global ranks. Note: MPI_Group_translate_ranks uses a linear search in MPICH,
and therefore doesn't meet the efficiency requirements!!
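
As a rough illustration of what an efficient translation could look like,
here is a minimal sketch in C. It is not MPICH code: rank_map and its
functions are hypothetical names, and the table of global ranks is assumed to
have been filled once per communicator (for example by a single
MPI_Group_translate_ranks call against the group of MPI_COMM_WORLD), so that
the logging path itself is only an array index. Passing NULL stands in for
the MPI_COMM_WORLD case, where local and global ranks coincide.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-communicator cache: local rank -> global rank. */
typedef struct {
    int  size;       /* number of ranks in the communicator */
    int  is_world;   /* nonzero when local rank == global rank */
    int *to_global;  /* `size` entries; NULL when is_world is set */
} rank_map;

/* Build the map once, e.g. when the communicator is first seen.
 * `globals[i]` is the global rank of local rank i (assumed to have been
 * obtained elsewhere). A NULL `globals` marks the world communicator.
 * Returns 0 on success, -1 if memory could not be allocated. */
int rank_map_init(rank_map *m, int size, const int *globals)
{
    assert(m != NULL && size > 0);
    m->size = size;
    m->is_world = (globals == NULL);
    m->to_global = NULL;
    if (!m->is_world) {
        m->to_global = malloc((size_t)size * sizeof *m->to_global);
        if (m->to_global == NULL)
            return -1;
        for (int i = 0; i < size; i++)
            m->to_global[i] = globals[i];
    }
    return 0;
}

/* Constant-time lookup for the logging path; -1 if out of range. */
int rank_map_global(const rank_map *m, int local)
{
    if (local < 0 || local >= m->size)
        return -1;
    return m->is_world ? local : m->to_global[local];
}

/* Release the table (safe to call on a world map). */
void rank_map_free(rank_map *m)
{
    free(m->to_global);
    m->to_global = NULL;
}
```

The cost of the linear search is then paid at most once per communicator,
rather than on every logged call.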

The context logged is the integer context generated in the usual "non-global
unique" way. It is disambiguated by the fact that global ranks are logged.
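
To illustrate why logging global ranks is enough to disambiguate, here is a
small sketch in C. msg_event and events_match are hypothetical names, not
part of MPICH, and the sketch assumes that a context integer may be reused by
communicators with disjoint membership but never by two communicators that
share a process.

```c
#include <assert.h>

/* Hypothetical trace record for one point-to-point operation. */
typedef struct {
    int context;     /* integer context, not globally unique */
    int src_global;  /* sender's rank in MPI_COMM_WORLD */
    int dst_global;  /* receiver's rank in MPI_COMM_WORLD */
    int tag;         /* message tag */
} msg_event;

/* Nonzero when a logged send and a logged receive describe the same
 * message. The context alone is ambiguous; combined with the global
 * ranks it identifies the communicator the message travelled on. */
int events_match(const msg_event *send, const msg_event *recv)
{
    return send->context    == recv->context
        && send->src_global == recv->src_global
        && send->dst_global == recv->dst_global
        && send->tag        == recv->tag;
}
```

Two records that carry the same context and tag but different global ranks
are therefore treated as belonging to different communicators.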

Currently, I "peek" into the MPICH opaque objects to implement my profiling.
This is obviously not portable to other implementations, but it meets my
immediate needs.

While it might not be possible to completely standardize the above functions
(some implementations may take exception to integer contexts, for example),
we should at least be able to get those implementations which can support the
above features to provide a common interface.

Efficient request caching would meet most of the above needs. An optimized
version of MPI_Group_translate_ranks which treats MPI_COMM_WORLD as a special
case would eliminate the need for a fast way to translate ranks. This leaves
context as the major sticking point. However, I remain concerned about
efficiency. Profiling must always try to minimize the perturbation to the
timeline, and inefficiency cannot be tolerated.

Lloyd Lewins
Hughes Aircraft Co.
llewins@msmail4.hac.com