generalized requests -- cancel

Marc Snir (snir@cs.nyu.edu)
Wed, 29 Jan 1997 10:56:32 -0500

I omitted to deal with some issues that relate to GR cancelation and
error handling.

1. We need to specify that progress_fn and cancel_fn are mutually
exclusive: one callback is not called while the other is running.

2. We need to specify how progress occurs after a cancelation has been
initiated and, in particular, how the generalized request is marked
complete. There are two choices:
a) The progress_fn function continues to be called.
b) A new progress function is substituted for the old one.

I prefer the second choice. This way, the regular progress function
does not need to check each time whether a cancelation was initiated.
The exception handling code is in a separate callback function.

The simplest way of supporting the second option is for the cancel_fn
function itself to become the "progress function" after MPI_CANCEL is
called. Thus, its syntax is changed to

cancel_fn(extra_state, count, array_of_requests)
          INOUT        OUT    OUT

or

cancel_fn(extra_state, request)
          INOUT        OUT

or

cancel_fn(extra_state, flag)
          INOUT        OUT

according to which version of the GR proposal we adopt.

It also means that cancel_fn becomes (after the initial call) an
asynchronous progress function, and all the rules stated about
progress_fn (restrictions on calls that may occur in it, how to achieve
mutual exclusion) apply to cancel_fn as well.

We can also have two callbacks supplied, if there is any reason to do
so: cancel_fn and cancel_progress_fn. But I see no reason to do so.
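
To make the second choice concrete, here is a minimal sketch in C of how
a library might switch from progress_fn to cancel_fn once a cancelation
is initiated. It assumes the cancel_fn(extra_state, flag) variant; the
names gen_request, gr_callback, gr_cancel, gr_poll and the demo_*
callbacks are hypothetical and are not part of any MPI binding. The real
library would also have to provide mutual exclusion when progress is
asynchronous; this sketch is single threaded and sidesteps that.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: sets *flag to 1 once the request is done. */
typedef void (*gr_callback)(void *extra_state, int *flag);

/* Hypothetical stand-in for the library's view of a generalized request. */
typedef struct {
    void       *extra_state;
    gr_callback progress_fn;      /* driven until a cancelation is initiated */
    gr_callback cancel_fn;        /* driven after a cancelation is initiated */
    int         cancel_initiated;
    int         complete;
} gen_request;

/* Stand-in for MPI_CANCEL: makes the initial cancel_fn call and switches
 * the request over so that later progress steps use cancel_fn. */
void gr_cancel(gen_request *req)
{
    assert(req != NULL);
    if (req->complete || req->cancel_initiated)
        return;
    req->cancel_initiated = 1;
    req->cancel_fn(req->extra_state, &req->complete);
}

/* One progress step. Exactly one callback is selected per step, so
 * progress_fn never has to test whether a cancelation was initiated. */
void gr_poll(gen_request *req)
{
    assert(req != NULL);
    if (req->complete)
        return;
    gr_callback fn = req->cancel_initiated ? req->cancel_fn
                                           : req->progress_fn;
    fn(req->extra_state, &req->complete);
}

/* Demo user code: progress never finishes on its own; the cancelation
 * path needs two calls (the initial one plus one progress step). */
typedef struct { int progress_calls; int cancel_calls; } demo_state;

void demo_progress(void *extra_state, int *flag)
{
    demo_state *s = extra_state;
    s->progress_calls++;
    *flag = 0;
}

void demo_cancel(void *extra_state, int *flag)
{
    demo_state *s = extra_state;
    s->cancel_calls++;
    *flag = (s->cancel_calls >= 2);
}
```

With this arrangement the exception handling lives entirely in
demo_cancel, and demo_progress is never called again after gr_cancel.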

3. As part of the handling of generalized requests, we have to set the
error code that will be returned by wait, test, etc. Note that the
callback function does not know whether the error code should be
returned within the status object (e.g., when a WAITALL is used), or
whether the error code should be returned as the error code of the
WAIT/TEST/... call itself (for WAIT, TEST, WAITANY, etc.). Additional
text is needed.

The complete_fn call will return an error code. This error code will be
returned by a WAIT/TEST/TESTANY/WAITANY call that completes the
execution of the generalized request. {WAIT|TEST}{ALL|SOME} calls will
return MPI_ERR_IN_STATUS, if some complete_fn callback returned an
error code other than MPI_SUCCESS, and will store the error code
returned by the complete_fn callback in the status associated with the
corresponding generalized request. (This assumes that no other error
was raised by the WAIT/TEST/... call; otherwise, the outcome is
undefined.)
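
The rule above can be sketched in C as follows. This is only an
illustration under stated assumptions: GR_SUCCESS and GR_ERR_IN_STATUS
stand in for MPI_SUCCESS and the "error in status" code (their values
here are arbitrary), gr_status stands in for the error field of a status
object, and gr_wait, gr_waitall and demo_complete are hypothetical
names. The sketch assumes every request has already completed, and that
no other error is raised by the call itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for MPI_SUCCESS and the error-in-status code. */
enum { GR_SUCCESS = 0, GR_ERR_IN_STATUS = 17 };

/* Stand-in for the error field of a status object. */
typedef struct { int error; } gr_status;

/* Hypothetical complete_fn shape: returns the error code of the request. */
typedef int (*gr_complete_fn)(void *extra_state);

typedef struct {
    void          *extra_state;
    gr_complete_fn complete_fn;
} gen_request;

/* Single-request completion (WAIT, TEST, WAITANY, TESTANY): the code
 * returned by complete_fn becomes the return code of the call itself. */
int gr_wait(gen_request *req)
{
    assert(req != NULL);
    return req->complete_fn(req->extra_state);
}

/* Multi-request completion (WAITALL and friends): each callback's code is
 * stored in the matching status, and the call reports GR_ERR_IN_STATUS
 * if any callback returned something other than GR_SUCCESS. */
int gr_waitall(size_t count, gen_request reqs[], gr_status statuses[])
{
    int result = GR_SUCCESS;
    for (size_t i = 0; i < count; i++) {
        int code = reqs[i].complete_fn(reqs[i].extra_state);
        statuses[i].error = code;
        if (code != GR_SUCCESS)
            result = GR_ERR_IN_STATUS;
    }
    return result;
}

/* Demo callback: extra_state points at the error code to report. */
int demo_complete(void *extra_state)
{
    return *(int *)extra_state;
}
```

Note that complete_fn has the same shape in both paths; it is the
completion call, not the callback, that decides where the code ends up.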

Another possible design is that complete_fn always stuffs an error code
into the status object, and that MPI extracts it from there to determine
the error code returned by WAIT/TEST/...

-----

On a separate issue, what behaviour do we expect from MPI_GET_STATUS if
the request did not complete successfully? Do we expect MPI_GET_STATUS
to return the error code that would be returned by MPI_WAIT on the same
request? Or do we expect MPI_GET_STATUS to succeed, and the error code
to be returned in status?
-----------