Re: Completion of nonblocking operations

William Gropp (gropp@mcs.anl.gov)
Thu, 08 Feb 1996 09:13:26 -0600

I agree that it is reasonable to require, for MPI generalized requests, that
the MPI_REQUEST_MARK_COMPLETED occur in a handler. What I don't believe is
that this is realistic for other, non-MPI events. For example, in the IO
setting, I might want to use the common system call "aioread" to
asynchronously read data. I do not see how I could use this routine within an
MPI generalized request, even though I might want to use it in an MPI-IO
request.

Attaching handlers to all POSIX signals opens its own problems. For example,
what if the MPI library itself has had to reserve some signals for its own
use? In the case above, some implementations may deliver SIGIO on completion
of the aioread operation. What if the MPI implementation needs SIGIO itself
(to detect that ITS device has completed an operation)? Now, all of this can
be handled, but it requires defining some sort of chaining mechanism for
calling multiple signal handlers, and becomes more sensitive to user abuse (if
the user changes a signal, or uses a library that does so, without using the
approved calls, the code may fail). We can contend that this is an erroneous
program, but it seems to put too much burden on the programmer to avoid using
otherwise legal system calls.
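
To make the chaining issue concrete, here is a minimal sketch of what a
cooperative handler might look like with POSIX sigaction. All of the names
(install_handler, chaining_handler, library_handler, the two counters) are
hypothetical and are not part of MPI or any proposal; library_handler merely
stands in for a handler the MPI library might have installed first. The sketch
assumes the earlier handler was installed without SA_SIGINFO.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical counters standing in for the library's and the user's work. */
static volatile sig_atomic_t library_events = 0;
static volatile sig_atomic_t user_events = 0;

/* Previous disposition, saved so the new handler can forward to it. */
static struct sigaction previous_action;

/* Stand-in for a handler that the MPI library installed first. */
void library_handler(int signo)
{
    (void)signo;
    library_events++;
}

/* User handler: do our own work, then chain to whoever was there before. */
void chaining_handler(int signo)
{
    user_events++;
    /* Assumption: the old handler is a plain sa_handler, not SA_SIGINFO. */
    if (previous_action.sa_handler != SIG_DFL &&
        previous_action.sa_handler != SIG_IGN) {
        previous_action.sa_handler(signo);
    }
}

/* Install handler for signo; if old is non-NULL, the prior disposition
 * is stored there. Returns 0 on success, -1 on failure. */
int install_handler(int signo, void (*handler)(int), struct sigaction *old)
{
    struct sigaction sa;

    assert(handler != NULL);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, old);
}
```

The user would call install_handler(signo, chaining_handler, &previous_action)
after the library has installed its own handler. The fragility is visible
here: if any other code later calls sigaction on the same signal without
saving and forwarding to the old handler, the chain is silently broken.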

I believe that we need to consider more carefully what the interaction model
is. In basic Unix, all interaction is (essentially) either a signal or
activity on a file descriptor; if signals interrupt system calls, then a "wait
for anything" can be implemented with select/sysv-"poll". In a multithreaded,
service-oriented system, you just use a separate thread for each wait. Shared
memory systems may have a combination of process and system-managed
semaphores. I understand completely how the current model, with the signal
handling changes, can be made to work in a single-threaded, Unix-like
environment. I don't see how to implement them efficiently in a shared-memory
environment (I don't see how to avoid otherwise unnecessary interaction with
the operating system), and in a multi-threaded system, I also don't know for
certain that they can be implemented efficiently without imposing additional
constraints on the user's use of threads (if, for example, the user's
operations do explicit scheduling and prioritizing of threads).
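
For the basic Unix case, the "wait for anything" I have in mind can be
sketched with poll as follows. This is only an illustration under stated
assumptions: wait_for_anything, demo_pipe_wakeup, the result codes, and the
16-descriptor limit are all hypothetical, and the signal case relies on the
relevant handlers having been installed without SA_RESTART so that poll
returns with EINTR.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
enum { WAIT_INTERRUPTED = -1, WAIT_FAILED = -2, WAIT_TIMED_OUT = -3 };

#define MAX_WAIT_FDS 16

/*
 * Block until one of the nfds descriptors is readable, a signal interrupts
 * the call, or timeout_ms elapses (a negative timeout waits indefinitely).
 * Returns the index of the first ready descriptor, or one of the codes above.
 */
int wait_for_anything(const int *fds, size_t nfds, int timeout_ms)
{
    struct pollfd pfds[MAX_WAIT_FDS];
    size_t i;
    int rc;

    assert(nfds <= MAX_WAIT_FDS);
    for (i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    rc = poll(pfds, (nfds_t)nfds, timeout_ms);
    if (rc < 0)
        return errno == EINTR ? WAIT_INTERRUPTED : WAIT_FAILED;
    if (rc == 0)
        return WAIT_TIMED_OUT;

    for (i = 0; i < nfds; i++) {
        if (pfds[i].revents & (POLLIN | POLLHUP | POLLERR))
            return (int)i;
    }
    return WAIT_FAILED;
}

/* Usage sketch: two pipes, with data arriving only on the second one.
 * Returns whatever wait_for_anything reports. */
int demo_pipe_wakeup(void)
{
    int a[2], b[2];
    int result = WAIT_FAILED;

    if (pipe(a) != 0)
        return WAIT_FAILED;
    if (pipe(b) != 0) {
        close(a[0]);
        close(a[1]);
        return WAIT_FAILED;
    }
    if (write(b[1], "x", 1) == 1) {
        int fds[2] = { a[0], b[0] };
        result = wait_for_anything(fds, 2, 1000);
    }
    close(a[0]); close(a[1]);
    close(b[0]); close(b[1]);
    return result;
}
```

Note that every wait here is a system call, which is exactly the cost I do
not see how to avoid in a shared-memory environment, where completion may be
nothing more than a change to a memory location.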

I fully agree that having a single, non-spinning wait mechanism is important.
I'm just not convinced that we have found it yet.

Bill