RE: MPI and threads

Boris V Protopopov (boris@ERC.MsState.Edu)
Mon, 17 Feb 1997 11:40:20 -0600

Thanks for the clarifications, Marc;

Concerning paragraph 2 of my letter, I think that one can use
standard POSIX condition variables or semaphores to wake up threads
blocked in blocking calls. Yes, one can say that the MPI library
interacts with the thread scheduler in this way, but the current
formulation of this requirement in the document sounds as if some
special means, other than those provided by any POSIX-compliant
thread package, were required.
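
To make the point about condition variables concrete, here is a minimal
sketch in C of how a blocking call could park only the calling thread and
be woken when a message arrives. The mailbox_t type and the functions
mailbox_init, blocking_recv, deliver, and demo_wakeup are hypothetical
names invented for this illustration; they are not part of MPI or of any
MPI implementation, and a real library would presumably do much more.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical one-slot mailbox standing in for an MPI receive queue. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  arrived;   /* signalled when a message is stored */
    bool            has_msg;
    int             payload;
} mailbox_t;

void mailbox_init(mailbox_t *mb) {
    pthread_mutex_init(&mb->lock, NULL);
    pthread_cond_init(&mb->arrived, NULL);
    mb->has_msg = false;
    mb->payload = 0;
}

/* Blocks only the calling thread until a message is available. */
int blocking_recv(mailbox_t *mb) {
    pthread_mutex_lock(&mb->lock);
    while (!mb->has_msg)                      /* guard against spurious wakeups */
        pthread_cond_wait(&mb->arrived, &mb->lock);
    int value = mb->payload;
    mb->has_msg = false;
    pthread_mutex_unlock(&mb->lock);
    return value;
}

/* Stores a message and wakes one thread blocked in blocking_recv. */
void deliver(mailbox_t *mb, int value) {
    pthread_mutex_lock(&mb->lock);
    mb->payload = value;
    mb->has_msg = true;
    pthread_cond_signal(&mb->arrived);
    pthread_mutex_unlock(&mb->lock);
}

typedef struct { mailbox_t *mb; int received; } receiver_arg_t;

static void *receiver_thread(void *p) {
    receiver_arg_t *arg = p;
    arg->received = blocking_recv(arg->mb);
    return NULL;
}

/* Starts a receiver thread, delivers one value, returns what it received. */
int demo_wakeup(int value) {
    mailbox_t mb;
    mailbox_init(&mb);
    receiver_arg_t arg = { &mb, -1 };
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, receiver_thread, &arg);
    assert(rc == 0);
    deliver(&mb, value);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    pthread_cond_destroy(&mb.arrived);
    pthread_mutex_destroy(&mb.lock);
    return arg.received;
}
```

The only interaction with the scheduler here goes through
pthread_cond_wait and pthread_cond_signal, which is the sense in which
no special means beyond a POSIX thread package are needed.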

In paragraph 3, I was under the impression that we were discussing
multi-threaded MPI on platforms that provide a thread-safe libc (and
other platform-specific libraries). Variations in the semantics of
similar services on different systems (UNIX signals in a multi-threaded
environment come to mind as an example) are quite wide, so I was
thinking that it would be convenient simply to say that MPI-2 requires
thread-safe system libraries. Otherwise, I am afraid, there are too many
things to take care of: it will be quite difficult for any practical MPI
implementation to keep track of all cases of thread usage. In fact, it
will probably force the implementor to provide several versions of most
MPI routines in order to avoid complex and voluminous coding and
performance penalties.

Thanks again for the response,
Boris Protopopov.
===============================
From: Marc Snir
Sent: 17 February 1997, 7:41
To: boris@erc.msstate.edu
Cc: Mpi-External@mcs.anl.gov
Subject: Re: MPI and threads

Dear sirs:
I read the section about threads in MPI-2 (7.8 MPI and Threads) and
found several possible inconsistencies. Since I have not actively
participated in the MPI-2 discussion so far, I just want to draw the
attention of interested parties to the points listed below, in the hope
that they will be considered, and that I will be corrected if I am
wrong, in order to increase the quality of the MPI-2 specification.

1. Thread-compliant and thread-safe implementations.

Section 7.8 mentions that MPI-2-compliant implementations do not
necessarily have to be "thread-compliant". But those that are
"thread-compliant" should be "thread-safe" and should block only the
thread that calls a blocking MPI call. As far as I am concerned, the
second requirement is already included in the notion of "thread-safety".

***
OK
***

2. Threads scheduling and MPI.

It is emphasized that the specification (MPI-2) deals only
with "preemptive" threads (i.e., those that are scheduled by the OS
rather than by the thread library). Right after that, on page 162, in
the paragraph "Advice to implementors", it is said that "the MPI library
has to interact with thread scheduler" and unblock threads blocked in
MPI calls (under what circumstances???). To me it sounds as if the MPI
library should interact with the OS scheduler (since this is the
"threads scheduler" that schedules preemptive threads). I do not
understand what the reason to do so could be.

***
The MPI library has to make a thread that was blocked on an MPI receive
runnable when the receive is satisfied. In some direct or indirect
manner, this implies interaction between the MPI library and the
scheduler. This does not imply that MPI requires kernel changes.
****

Also, what is so wrong with non-preemptive threads? I agree that it is
more difficult to ensure the desired program behavior with
non-preemptive threads, but non-preemptive threads are more efficient.
I do not know of any "pathological" features that prevent using
non-preemptive threads in MPI applications.

****
Nothing is wrong with nonpreemptive threads, except that the progress
model of MPI better matches preemptive threads. One would have to
provide a somewhat complex description of the correct progress rule in
an environment with nonpreemptive threads.
****

3. Different implementation options.

Later in the section, we can see that there are different degrees of
"thread-safety" that can be provided in a "thread-compliant" MPI
implementation:
MPI_THREADS_NOT_SUPPORTED,
MPI_SINGLE_MPI_THREADS, and
MPI_MULTIPLE_(MPI_?)THREADS. I think that, as far as the implementation
is concerned, MPI_SINGLE_MPI_THREADS is not different from
MPI_THREADS_NOT_SUPPORTED at all, since there is no hazard of concurrent
MPI calls in the program. I do not see how this case differs from the
case of having single-threaded MPI processes.

****
There is only one degree of thread-safety, namely "thread-compliant".
The user may provide hints about the usage pattern, to facilitate
optimizations. Even if there are no concurrent MPI calls, the MPI
library may invoke other library calls (e.g., malloc) concurrently with
the user. Even if the user does not call MPI explicitly on several
threads, the MPI library may be invoked by a signal handler on different
threads. Thus, there is a difference between a single-threaded execution
and a multi-threaded execution, even when only one thread calls MPI.
****

4. Threads and MPI_Finalize call.

I think it is necessary to specify how the MPI_Finalize call affects MPI
operations in progress in threads other than the one that calls
MPI_Finalize (if the MPI implementation in question is of the type
MPI_MULTIPLE_THREADS). The case MPI_SINGLE_MPI_THREADS is trivial.

The obvious options would be 1) to abort all MPI activities in progress
(rationale: if users did not bother to synchronize the completion of
the operations with the call to MPI_Finalize, it is their problem), or
2) to complete all pending MPI activities and not accept requests for
any new activities. The second choice is more "user-friendly", but might
cause deadlocks in (actually erroneous) codes where not all asynchronous
communication operations are waited upon. Other options might be
considered, but the point is that the MPI-2 document should say what
happens in a multi-threaded application when MPI_Finalize is called.

****
We currently say that it is erroneous to invoke MPI_FINALIZE while there
are pending MPI activities, from the user's viewpoint. The same rule
should apply here. The call should complete all pending internal
activities. Yes, additional language is needed.
*****
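
As an illustration of the rule just stated, here is a toy model in C;
it is not MPI code. Every thread completes its own operations, the main
thread joins all of them, and only then is the finalize step invoked.
The names op_start, op_complete, toy_finalize, and run_then_finalize are
hypothetical stand-ins, and the -1 return value is an assumption of this
sketch: the MPI document only calls the situation erroneous and does not
promise that it is detected.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_WORKERS 4

/* Hypothetical bookkeeping: operations started but not yet completed. */
static pthread_mutex_t pending_lock = PTHREAD_MUTEX_INITIALIZER;
static int pending_ops = 0;

void op_start(void) {
    pthread_mutex_lock(&pending_lock);
    pending_ops++;
    pthread_mutex_unlock(&pending_lock);
}

void op_complete(void) {
    pthread_mutex_lock(&pending_lock);
    pending_ops--;
    pthread_mutex_unlock(&pending_lock);
}

/* Toy stand-in for a finalize call: 0 if nothing is pending, -1 otherwise. */
int toy_finalize(void) {
    pthread_mutex_lock(&pending_lock);
    int rc = (pending_ops == 0) ? 0 : -1;
    pthread_mutex_unlock(&pending_lock);
    return rc;
}

/* Each worker starts one operation and completes it before returning. */
static void *worker(void *unused) {
    (void)unused;
    op_start();
    /* ... the communication itself would happen here ... */
    op_complete();
    return NULL;
}

/* Joins every communicating thread first, then finalizes. */
int run_then_finalize(void) {
    pthread_t tids[NUM_WORKERS];
    for (int i = 0; i < NUM_WORKERS; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NUM_WORKERS; i++) {
        int rc = pthread_join(tids[i], NULL);
        assert(rc == 0);
    }
    return toy_finalize();
}
```

In this model, calling toy_finalize between op_start and op_complete
reports the pending operation, which corresponds to the erroneous case
discussed above.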

***
Marc
***