error handling

Marc Snir (snir@watson.ibm.com)
Wed, 21 May 1997 14:33:30 -0400

I am not sure we have closed on the design for error handlers. Let me
outline what I believe to be the logical choices and ask for opinions.

There are two entities to consider: the opaque error handler object, and
the error handler callback function. The first is an MPI opaque object,
accessed by a handle (of type MPI_Errhandler, in MPI1). The second is a
function provided by the user. This function has, in C, a typedef of
errhandler_fn(MPI_xxx handle, void* extra_state,...), where xxx is one of
Comm, File, or Win.

There are three possible designs:

A) 3/3: 3 different types of opaque error handler objects and 3 types of
callback functions. We introduce new (C) types MPI_Win_errhandler,
MPI_Comm_errhandler (same as MPI1 MPI_Errhandler) and MPI_File_errhandler.
We introduce new constants MPI_WIN_ERRORS_RETURN, MPI_COMM_ERRORS_RETURN,
etc. We triplicate all error handler functions. In C++ each type of error
handler, and each callback typedef is in the namespace of the corresponding
class (Comm, Win or File). One advantage of this proposal is layerability
and extensibility: implementors of MPI-IO can deal with error handling on
their own; libraries that add new opaque objects can add their own
error handlers, too.
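
To make the compile-time matching of (A) concrete, here is a minimal C
sketch. It uses mock types rather than a real MPI library: every name
prefixed with sketch_ is hypothetical, the callback signature is simplified
(no variadic tail), and only the Comm and Win variants are shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for MPI_Comm and MPI_Win; not real MPI types. */
typedef struct sketch_comm sketch_comm;
typedef struct sketch_win sketch_win;

/* One callback typedef per object type (variadic tail omitted). */
typedef void sketch_comm_errhandler_fn(sketch_comm *comm, void *extra_state);
typedef void sketch_win_errhandler_fn(sketch_win *win, void *extra_state);

/* One opaque error handler type per object type. */
typedef struct { sketch_comm_errhandler_fn *fn; } sketch_comm_errhandler;
typedef struct { sketch_win_errhandler_fn *fn; } sketch_win_errhandler;

struct sketch_comm { sketch_comm_errhandler *eh; };
struct sketch_win { sketch_win_errhandler *eh; };

/* The "set" function exists once per object type. */
void sketch_comm_errhandler_set(sketch_comm *comm, sketch_comm_errhandler *eh)
{
    comm->eh = eh;
}

void sketch_win_errhandler_set(sketch_win *win, sketch_win_errhandler *eh)
{
    win->eh = eh;
}

/* User callback for communicators: counts errors in *extra_state. */
void count_comm_error(sketch_comm *comm, void *extra_state)
{
    (void)comm;
    *(int *)extra_state += 1;
}

int sketch_demo_a(void)
{
    sketch_comm comm = { NULL };
    sketch_comm_errhandler eh = { count_comm_error };
    int errors = 0;

    sketch_comm_errhandler_set(&comm, &eh);
    assert(comm.eh == &eh);

    /* Passing &eh to sketch_win_errhandler_set() would be diagnosed by the
       compiler, because the two handler types are incompatible. */

    comm.eh->fn(&comm, &errors);  /* what the library would do on an error */
    return errors;
}
```

The point of the sketch is only that a mismatch between object and handler
is caught before the program runs, at the cost of one handler type and one
set of functions per object type.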

B) 3/1: 3 different types of callback functions, but only one type of
opaque error handler object. In C++ the three typedefs for the callback
functions are in the namespace of the corresponding class. We triplicate
all functions that deal with callbacks, or with communicators, windows or
files; namely all error handler functions, except MPI_Errhandler_free.
[[There is no point in triplicating this last function, if there is only
one MPI_Errhandler object type -- there is no gain in layerability.]] In
terms of implementation, this means that the opaque error handler object
registers which kind of callback function is attached to it. We have a
run-time type matching, rather than compile-time matching: MPI can check,
when MPI_XXX_ERRHANDLER_SET(handle, errhandler) is invoked, that the
callback function attached to errhandler is one that takes handles to XXX.
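
The run-time check described in (B) could look roughly like the following
C sketch. It is not taken from any MPI implementation: the sketch_ names,
the kind tag, and the SKETCH_ERR_KIND return code are assumptions made for
illustration, and the File variant is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag recording which kind of callback a handler carries. */
typedef enum { SKETCH_KIND_COMM, SKETCH_KIND_WIN } sketch_kind;

/* Hypothetical return codes. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_KIND = 1 };

typedef struct sketch_errhandler sketch_errhandler;

/* Mock object types; each holds a pointer to the single handler type. */
typedef struct { sketch_errhandler *eh; } sketch_comm;
typedef struct { sketch_errhandler *eh; } sketch_win;

/* Still one callback typedef per object type. */
typedef void sketch_comm_errhandler_fn(sketch_comm *comm, void *extra_state);
typedef void sketch_win_errhandler_fn(sketch_win *win, void *extra_state);

/* One opaque handler type that registers the kind of its callback. */
struct sketch_errhandler {
    sketch_kind kind;
    union {
        sketch_comm_errhandler_fn *comm_fn;
        sketch_win_errhandler_fn *win_fn;
    } fn;
};

sketch_errhandler sketch_comm_errhandler_create(sketch_comm_errhandler_fn *fn)
{
    sketch_errhandler eh;
    assert(fn != NULL);
    eh.kind = SKETCH_KIND_COMM;
    eh.fn.comm_fn = fn;
    return eh;
}

/* Each "set" function rejects handlers created for another object type. */
int sketch_comm_errhandler_set(sketch_comm *comm, sketch_errhandler *eh)
{
    if (eh->kind != SKETCH_KIND_COMM)
        return SKETCH_ERR_KIND;
    comm->eh = eh;
    return SKETCH_SUCCESS;
}

int sketch_win_errhandler_set(sketch_win *win, sketch_errhandler *eh)
{
    if (eh->kind != SKETCH_KIND_WIN)
        return SKETCH_ERR_KIND;
    win->eh = eh;
    return SKETCH_SUCCESS;
}

/* User callback for communicators: counts errors in *extra_state. */
void note_comm_error(sketch_comm *comm, void *extra_state)
{
    (void)comm;
    *(int *)extra_state += 1;
}
```

In this sketch, attaching a handler created with
sketch_comm_errhandler_create to a window compiles without complaint but
returns SKETCH_ERR_KIND at run time and leaves the window unchanged.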

C) 1/1: one type of callback function, one type of opaque object. The
typedef of the callback function becomes, in C,
errhandler_fn(void * handleptr, void * extra_state,...), where handleptr
can be, in one version, the MPI handle coerced to void* or, in another
version, a pointer to the MPI handle coerced to void*. We still triplicate
the functions that associate error handlers with objects
(MPI_XXX_ERRHANDLER_SET, MPI_XXX_ERRHANDLER_GET), but we do not need to
replicate MPI_Errhandler_create and MPI_Errhandler_free. We get, with this
proposal, a lower function count. We get layerability and extensibility:
if an implementer adds a new FOO opaque object (e.g., FILE), then it
provides MPI_FOO_ERRHANDLER_GET and MPI_FOO_ERRHANDLER_SET. There is no
way for MPI to do type matching. It's up to the user to invoke
MPI_XXX_ERRHANDLER_SET(handle, errhandler) with an errhandler object with
an attached callback function that takes arguments of type XXX.
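
As an illustration of (C), the following C sketch uses the "pointer to the
handle coerced to void*" version. The types and the sketch_ names are
hypothetical mock-ups, not real MPI declarations; extra_state is passed in
when the error is raised, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* The single callback type: the handle arrives as an untyped pointer. */
typedef void sketch_errhandler_fn(void *handleptr, void *extra_state);

/* The single opaque handler type; it carries no kind information. */
typedef struct { sketch_errhandler_fn *fn; } sketch_errhandler;

/* Mock object types with deliberately different layouts. */
typedef struct { int rank; sketch_errhandler *eh; } sketch_comm;
typedef struct { double base; sketch_errhandler *eh; } sketch_win;

/* Only the set (and get) functions are replicated per object type. */
void sketch_comm_errhandler_set(sketch_comm *comm, sketch_errhandler *eh)
{
    comm->eh = eh;
}

void sketch_win_errhandler_set(sketch_win *win, sketch_errhandler *eh)
{
    win->eh = eh;
}

/* Callback written for communicators: it casts handleptr back itself. */
void record_comm_rank(void *handleptr, void *extra_state)
{
    sketch_comm *comm = (sketch_comm *)handleptr;
    *(int *)extra_state = comm->rank;
}

/* Invoke the attached handler the way the library would on an error. */
void sketch_comm_raise(sketch_comm *comm, void *extra_state)
{
    assert(comm->eh != NULL);
    comm->eh->fn(comm, extra_state);
}
```

Nothing here stops a caller from passing the same handler to
sketch_win_errhandler_set: the call is accepted, and record_comm_rank would
then misinterpret a sketch_win as a sketch_comm if it were ever invoked.
That is the missing type matching described above.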

I think we agreed on (B), but there are still some inconsistencies among
chapters (e.g., do we triplicate MPI_Errhandler_free?), and there is still a
proposal from Squires to go to (C). I am not sure that we discussed the
issue with a clear view of the alternatives and their pros and cons.
Hence, this message.