Error handlers for files

John M May (johnmay@coral.llnl.gov)
Wed, 23 Oct 1996 12:09:49 -0700 (PDT)

I'm trying to understand the intended use of MPI error
handlers for I/O. I'm concerned both about how layered
implementations will use these handlers and how the
handlers will meet users' needs.

The handler will be associated with the communicator
used to open the file, not the file handle itself. A
layered I/O implementation can retrieve the handler object
but not the handler function. So how can it call this
function? (Perhaps this was addressed at the October
meeting. I couldn't find anything about it in the new
draft.) Moreover, how can the I/O library communicate to
the handler which file suffered the error? It seems likely
that many files will be opened with MPI_COMM_WORLD or
MPI_COMM_SELF, so raising an error just on the communicator
might not be specific enough to tell the user where the
problem is, especially for nonblocking calls. Error handler
functions can have variable numbers of arguments. Assuming
we can find a way for the I/O system to call the function,
should the I/O chapter define one or more of these extra
parameters to have a specific meaning, such as the value of
the file handle and possibly the operation (read, write,
seek, etc.) that caused the error? If so, will users
have to write all their handlers to accept variable
argument lists, so they'll work for both communication and
I/O errors? I'm not really clear on the intended use of
the variable arguments for MPI error handlers, so I would
appreciate some clarification.
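
To make the variable-argument question concrete, here is a minimal
sketch in plain C. It does not use MPI: every name in it (mock_comm,
mock_file, mock_io_op, raise_io_error, and both handlers) is
hypothetical. The only thing borrowed from MPI-1 is the shape of the C
handler type, void f(MPI_Comm *, int *, ...). The convention that the
first extra argument is the file handle and the second is the operation
is an assumption made for illustration, not something any draft
defines.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not MPI types. */
typedef int mock_comm;
typedef int mock_file;
typedef enum { OP_NONE = 0, OP_READ, OP_WRITE, OP_SEEK } mock_io_op;

/* Same shape as the MPI-1 C handler: (communicator, error code, ...). */
typedef void mock_handler_fn(mock_comm *comm, int *errcode, ...);

/* Records what the last I/O-aware handler call saw. */
static mock_file last_file = -1;
static mock_io_op last_op = OP_NONE;

/* Handler written for I/O errors. ASSUMED convention: the first extra
   argument is the file handle, the second is the failing operation. */
static void io_aware_handler(mock_comm *comm, int *errcode, ...)
{
    va_list ap;
    assert(comm != NULL && errcode != NULL);
    va_start(ap, errcode);
    last_file = va_arg(ap, mock_file);
    last_op = (mock_io_op)va_arg(ap, int); /* enums travel as int here */
    va_end(ap);
}

/* Handler written only with communication errors in mind. It never
   reads the extra arguments, so it is unaffected by them. */
static int comm_only_calls = 0;
static void comm_only_handler(mock_comm *comm, int *errcode, ...)
{
    (void)comm;
    (void)errcode;
    comm_only_calls++;
}

/* What an I/O layer might do on an error, IF it could obtain the
   function pointer behind the handler object. */
static void raise_io_error(mock_handler_fn *handler, mock_comm *comm,
                           int errcode, mock_file fh, mock_io_op op)
{
    handler(comm, &errcode, fh, (int)op);
}
```

Under these assumptions, a handler that ignores the extra arguments
still works when the I/O layer passes them, so only handlers that want
the file handle or operation would need to touch the argument list.
The sketch says nothing about how a layered library would obtain the
function pointer in the first place.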

The likely use of MPI_COMM_WORLD and MPI_COMM_SELF as
communicators for files raises a second problem: the
default handler for these communicators is currently
MPI_ERRORS_ARE_FATAL. This means that the first time
a program tries to read past the end of a file that
was opened with one of these communicators, it will
crash. This seems quite unfriendly. It would be
better if the default handler that users got for files
was MPI_CONT_ERRORS_RETURN. I suppose we could define
a separate set of built-in communicators with this
property, but that seems awkward.

I would still prefer a separate error handler mechanism
for I/O, so we could associate error handlers with file
handles. This would solve both problems, since we could
define the default handler for all files to be
MPI_CONT_ERRORS_RETURN, and the handler would automatically
know which file caused the problem.
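
A rough sketch of what per-file handlers could look like follows. It is
a plain-C mock-up, not MPI code: mock_open, mock_set_handler, mock_read,
the error codes, and the handler type are all hypothetical names
invented for this illustration. The assumed default (no handler
attached) is that the error code is simply returned to the caller.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_SUCCESS   0
#define MOCK_ERR_EOF   1
#define MOCK_MAX_FILES 8

/* Hypothetical per-file handler: receives the file handle directly. */
typedef void mock_file_handler_fn(int fh, int *errcode);

typedef struct {
    int in_use;
    long size;                      /* file length in bytes */
    long pos;                       /* current read position */
    mock_file_handler_fn *handler;  /* NULL means "errors return" */
} mock_file_slot;

static mock_file_slot files[MOCK_MAX_FILES];

/* Open a mock file of the given size; returns a handle or -1. */
static int mock_open(long size)
{
    int i;
    for (i = 0; i < MOCK_MAX_FILES; i++) {
        if (!files[i].in_use) {
            files[i].in_use = 1;
            files[i].size = size;
            files[i].pos = 0;
            files[i].handler = NULL;  /* assumed default: just return */
            return i;
        }
    }
    return -1;
}

/* Attach a handler to one file handle, not to a communicator. */
static void mock_set_handler(int fh, mock_file_handler_fn *handler)
{
    assert(fh >= 0 && fh < MOCK_MAX_FILES && files[fh].in_use);
    files[fh].handler = handler;
}

/* Read nbytes; reading past the end reports an error, not a crash. */
static int mock_read(int fh, long nbytes)
{
    mock_file_slot *f;
    assert(fh >= 0 && fh < MOCK_MAX_FILES && files[fh].in_use);
    f = &files[fh];
    if (f->pos + nbytes > f->size) {
        int err = MOCK_ERR_EOF;
        if (f->handler != NULL)
            f->handler(fh, &err);   /* handler is told which file */
        return err;
    }
    f->pos += nbytes;
    return MOCK_SUCCESS;
}

/* Example user handler: remembers which file failed. */
static int last_failed_file = -1;
static void remember_file(int fh, int *errcode)
{
    (void)errcode;
    last_failed_file = fh;
}
```

In this mock-up, reading past the end of a file with no handler
attached just yields MOCK_ERR_EOF, and a handler attached to one file
is told which handle failed without any extra argument convention.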

One remaining open issue is errors on MPI_Open. There is
no file handle until the function returns successfully,
so the handler would have to be attached to the
communicator. In that case, it would be nice to define
an argument to the error handling function as being the
name of the file that the user tried to open.

John