Re: MPI and threads

Marc Snir (snir@watson.ibm.com)
Mon, 17 Feb 1997 09:41:25 -0400




Dear sirs:
I read the section about threads in MPI-2 (7.8 MPI and Threads) and
found several possible inconsistencies. Since I have not actively
participated in the MPI-2 discussion so far, I just want to draw the
attention of interested parties to the points listed below, in the hope
that they will be considered (and that I will be corrected if I am wrong),
in order to improve the quality of the MPI-2 specification.

1. Thread-compliant and thread-safe implementations.

It is mentioned in section 7.8 that MPI-2-compliant implementations
do not necessarily have to be "thread-compliant". But those that are
"thread-compliant" should be "thread-safe" and should block only the thread
that calls a blocking MPI call. As far as I am concerned, the second
requirement is already included in the notion of "thread-safety".

***
OK
***

2. Threads scheduling and MPI.

It is emphasized that the specification (MPI-2) deals only
with "preemptive" threads (i.e. those that are scheduled by the OS
rather than by the thread library). Right after that, on page 162,
in the paragraph "Advice to implementors", it is said that "the MPI library
has to interact with thread scheduler" and unblock threads blocked in MPI
calls (under what circumstances???). To me it sounds like the MPI library
should interact with the OS scheduler (since this is the "threads
scheduler" that schedules preemptive threads). I do not understand what
could be the reason to do so.

***
The MPI library has to make a thread that was blocked on an MPI receive
runnable again when the receive is satisfied. In some direct or indirect
manner, this implies interaction between the MPI library and the scheduler.
This does not imply that MPI requires kernel changes.
****
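
A minimal sketch of the indirect interaction described above, using a
POSIX condition variable. This is not MPI code: toy_mailbox, toy_recv,
toy_deliver and toy_exchange are hypothetical names invented for the
illustration, and a real MPI library may use an entirely different
mechanism. The point is only that the library asks the thread package to
wake the blocked thread, with no kernel changes of its own.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical one-slot mailbox standing in for a receive queue. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  arrived;   /* signalled when a message is deposited */
    int             has_msg;
    int             payload;
} toy_mailbox;

static toy_mailbox toy_box = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0
};

/* Blocking "receive": only the calling thread sleeps. */
int toy_recv(toy_mailbox *mb)
{
    int value;
    pthread_mutex_lock(&mb->lock);
    while (!mb->has_msg)                 /* loop guards against spurious wakeups */
        pthread_cond_wait(&mb->arrived, &mb->lock);
    value = mb->payload;
    mb->has_msg = 0;
    pthread_mutex_unlock(&mb->lock);
    return value;
}

/* "Delivery": the receive is satisfied, so make the waiter runnable. */
void toy_deliver(toy_mailbox *mb, int value)
{
    pthread_mutex_lock(&mb->lock);
    assert(!mb->has_msg);                /* one slot: previous message consumed */
    mb->payload = value;
    mb->has_msg = 1;
    pthread_cond_signal(&mb->arrived);   /* the library-to-scheduler interaction */
    pthread_mutex_unlock(&mb->lock);
}

static void *toy_receiver_main(void *arg)
{
    int *out = arg;
    *out = toy_recv(&toy_box);
    return NULL;
}

/* Starts one receiver thread, delivers one message, returns what it got. */
int toy_exchange(int value)
{
    pthread_t receiver;
    int received = -1;
    if (pthread_create(&receiver, NULL, toy_receiver_main, &received) != 0)
        return -1;
    toy_deliver(&toy_box, value);
    pthread_join(receiver, NULL);
    return received;
}
```

Calling toy_exchange(42) from main returns 42 whether the receiver
blocks before or after the delivery, because the has_msg flag is checked
under the lock.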

Also, what is so wrong with non-preemptive threads? I agree, it is more
difficult to ensure the desired program behavior with non-preemptive
threads, but the non-preemptive threads are more efficient. I do not know
of any "pathological" features that prevent using non-preemptive threads in
MPI applications.

****
Nothing wrong with nonpreemptive threads, except that the progress model of
MPI is a better match for preemptive threads. One would have to provide a
somewhat complex description of the correct progress rule in an
environment with nonpreemptive threads.
****

3. Different implementation options.

Later in the section, we can see that there are different degrees of
"thread-safety" that can be provided in a "thread-compliant" MPI
implementation:
MPI_THREADS_NOT_SUPPORTED,
MPI_SINGLE_MPI_THREADS, and
MPI_MULTIPLE_(MPI_?)THREADS. I think that, as far as the implementation is
concerned, MPI_SINGLE_MPI_THREADS is not different from
MPI_THREADS_NOT_SUPPORTED at all, since there is no hazard of concurrent MPI
calls in the program. I do not see how this case differs from the case of
having single-threaded MPI processes.

****
There is only one degree of thread-safety, namely "thread-compliant". The
user may provide hints about the usage pattern, to facilitate optimizations.
Even if there are no concurrent MPI calls, the MPI library may invoke other
library calls (e.g. malloc) concurrently with the user. Even if the user
does not call MPI explicitly on several threads, the MPI library may be
invoked by a signal handler on different threads. Thus, there is a
difference between a single-threaded execution and a multi-threaded
execution, even when only one thread calls MPI.
****
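
The distinction can be sketched with a toy library, again using POSIX
threads. Everything here is hypothetical (the toy_hint levels only
loosely mirror the draft's names, and toy_alloc stands in for a shared
facility such as malloc). Under the "single MPI thread" hint the library
may skip the lock on its own state, but it still has to use the
thread-safe path for the resource it shares with the user's other
threads.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical usage hints, loosely modelled on the draft's levels. */
typedef enum {
    TOY_NOT_THREADED,
    TOY_SINGLE_MPI_THREAD,
    TOY_MULTIPLE_THREADS
} toy_hint;

enum { TOY_ITERATIONS = 100000 };

static toy_hint        lib_hint  = TOY_NOT_THREADED;
static pthread_mutex_t lib_lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t heap_lock = PTHREAD_MUTEX_INITIALIZER;
static long lib_calls   = 0;   /* state touched only by library calls  */
static long heap_allocs = 0;   /* state shared with the user's threads */

/* Stand-in for malloc: used by the library and by user threads. */
void toy_alloc(void)
{
    int threaded = (lib_hint != TOY_NOT_THREADED);
    if (threaded) pthread_mutex_lock(&heap_lock);
    heap_allocs++;
    if (threaded) pthread_mutex_unlock(&heap_lock);
}

/* Stand-in for a library call that allocates internally. */
void toy_call(void)
{
    int concurrent = (lib_hint == TOY_MULTIPLE_THREADS);
    if (concurrent) pthread_mutex_lock(&lib_lock);
    lib_calls++;               /* unlocked when only one thread calls in */
    if (concurrent) pthread_mutex_unlock(&lib_lock);
    toy_alloc();               /* must stay thread-safe under both threaded hints */
}

/* A user thread that never calls the library but does allocate. */
static void *toy_user_thread(void *arg)
{
    int i;
    (void)arg;
    for (i = 0; i < TOY_ITERATIONS; i++)
        toy_alloc();
    return NULL;
}

/* One thread calls the library while another allocates concurrently. */
long toy_single_mpi_thread_demo(void)
{
    pthread_t user;
    int i;
    lib_hint = TOY_SINGLE_MPI_THREAD;
    lib_calls = 0;
    heap_allocs = 0;
    if (pthread_create(&user, NULL, toy_user_thread, NULL) != 0)
        return -1;
    for (i = 0; i < TOY_ITERATIONS; i++)
        toy_call();
    pthread_join(user, NULL);
    return heap_allocs;        /* 2 * TOY_ITERATIONS if no update was lost */
}
```

With the hint set to TOY_NOT_THREADED instead, the same two-thread run
would race on heap_allocs, which is the difference between a
single-threaded process and a multi-threaded one in which only one
thread calls the library.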

4. Threads and MPI_Finalize call.

I think it is necessary to specify how the MPI_Finalize call affects MPI
operations in progress in threads other than the one that calls
MPI_Finalize (if the MPI implementation in question is of the type
MPI_MULTIPLE_THREADS). The case MPI_SINGLE_MPI_THREADS is trivial.

The obvious options would be 1) to abort all MPI activities in progress
(rationale - if users did not bother to synchronize completion/waits of the
operations with the call to MPI_Finalize, it is their problem); 2) complete
all pending MPI activities and do not accept requests for any new
activities. The second choice is more "user-friendly", but might cause
deadlocks in (actually erroneous) codes where not all asynchronous
communication operations are waited upon. Other options might be
considered, but the point is that the MPI-2 document should say what
happens in a multi-threaded application when MPI_Finalize is called.

****
We currently say that it is erroneous to invoke MPI_FINALIZE while there
are pending MPI activities, from the user's viewpoint. The same rule should
apply here. The call should complete all pending internal activities.
Yes, additional language is needed.
*****
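
A sketch of what this rule asks of a multi-threaded user program: every
thread completes its own operations, and the finalizing thread joins the
others before it finalizes. The toy_* functions and TOY_* codes are
hypothetical stand-ins, not MPI calls. Returning an error code for the
erroneous case is an assumption made so the sketch can be checked; the
text above only says such a call is erroneous, not how it is reported.

```c
#include <pthread.h>
#include <stddef.h>

#define TOY_SUCCESS     0
#define TOY_ERR_PENDING 1
#define TOY_MAX_WORKERS 8

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static int user_pending     = 0;  /* started by the user, not yet completed */
static int internal_pending = 0;  /* library-internal work, invisible to user */

/* Start an operation: it is pending until the user completes it. */
void toy_start(void)
{
    pthread_mutex_lock(&state_lock);
    user_pending++;
    internal_pending++;           /* e.g. buffered data still to be flushed */
    pthread_mutex_unlock(&state_lock);
}

/* Complete an operation from the user's point of view (like a wait). */
void toy_complete(void)
{
    pthread_mutex_lock(&state_lock);
    user_pending--;
    pthread_mutex_unlock(&state_lock);
}

/* Finalize: erroneous if user operations are pending; otherwise the
 * call itself finishes whatever internal work is left. */
int toy_finalize(void)
{
    int rc;
    pthread_mutex_lock(&state_lock);
    if (user_pending > 0) {
        rc = TOY_ERR_PENDING;     /* assumption: report instead of undefined */
    } else {
        internal_pending = 0;     /* complete all pending internal activities */
        rc = TOY_SUCCESS;
    }
    pthread_mutex_unlock(&state_lock);
    return rc;
}

static void *toy_worker(void *arg)
{
    (void)arg;
    toy_start();
    toy_complete();               /* each thread completes what it started */
    return NULL;
}

/* Join every worker before finalizing, so nothing is pending. */
int toy_run_workers_then_finalize(int nworkers)
{
    pthread_t tids[TOY_MAX_WORKERS];
    int started = 0, i;
    if (nworkers > TOY_MAX_WORKERS)
        nworkers = TOY_MAX_WORKERS;
    for (i = 0; i < nworkers; i++) {
        if (pthread_create(&tids[started], NULL, toy_worker, NULL) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return toy_finalize();
}
```

Calling toy_finalize() between toy_start() and toy_complete() is the
erroneous case; in this sketch it returns TOY_ERR_PENDING.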

***
Marc
***