continuable errors in the dynamic chapter

W. Saphir (wcs@nersc.gov)
Wed, 26 Feb 1997 08:54:56 -0800

Several people have noticed that the dynamic chapter still makes
a number of references to continuable and soft errors. Given the
current state of continuable errors, I think we cannot rely on
them, and in any case we should probably not specify in the
text which errors are continuable.

The cases I find are:

- attempt to publish a name when name publishing is not supported
- connect() times out or port does not exist
- attempt to delete an info key that does not exist
- service name has already been published

For the last three, I think we can simply delete the text that
says the error is continuable.

For the first, I'm not sure. I think there should be some
way to tell if name publishing is supported without crashing
the program. Originally there was a compile-time constant
MPI_NAMES_ARE_PUBLISHED that was designed to allow conditional
compilation. That was voted out, I believe because support for
name publishing might only be something you can determine at runtime.

The reason this is treated differently from other things
in MPI is that we felt we could not require implementations
to publish names. Publishing needs some third-party mechanism
(daemons, name service, filesystem, whatever), so we couldn't
guarantee that name publishing would always be possible.

Some alternatives (to soft errors) are:
- an attribute on MPI_COMM_WORLD
- a function to query whether name publishing is supported
- a flag argument to publish/get
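
To make the three alternatives concrete, here is a rough sketch in C
of how each might look to a user. None of these names exist in MPI:
everything prefixed sketch_ or SKETCH_ is a hypothetical stand-in,
and the attribute lookup only imitates the MPI_Attr_get calling
convention. It is meant as an illustration, not a proposal for
exact syntax.

```c
/* Sketch only: stand-in names, not the real MPI interface. */
#include <assert.h>
#include <stdio.h>

/* Hypothetical: set by the library at startup if a name service
   (daemon, filesystem, ...) is reachable. */
static int sketch_name_service_up = 0;

/* Hypothetical attribute key on the world communicator. */
enum { SKETCH_NAMES_ARE_PUBLISHED = 1 };

/* Alternative 1: attribute lookup, imitating the MPI_Attr_get
   convention: *flag says whether the key is set, and attr_val
   receives a pointer to the integer value. */
void sketch_attr_get(int key, void *attr_val, int *flag)
{
    *flag = (key == SKETCH_NAMES_ARE_PUBLISHED);
    if (*flag)
        *(int **)attr_val = &sketch_name_service_up;
}

/* Alternative 2: a dedicated query function. */
int sketch_query_name_publishing(void)
{
    return sketch_name_service_up;
}

/* Alternative 3: publish takes a flag argument and reports
   success through it instead of raising an error. */
void sketch_publish_name(const char *service, const char *port, int *flag)
{
    *flag = sketch_name_service_up;
    if (*flag)
        printf("published %s at %s\n", service, port);
}

/* A caller using alternative 1: check the attribute first and
   publish only if the capability is present. Returns 1 if the
   name was published, 0 otherwise. */
int sketch_try_publish(const char *service, const char *port)
{
    int *supported = NULL;
    int flag = 0;
    int published = 0;

    assert(service != NULL && port != NULL);
    sketch_attr_get(SKETCH_NAMES_ARE_PUBLISHED, &supported, &flag);
    if (flag && *supported)
        sketch_publish_name(service, port, &published);
    return published;
}
```

In all three cases the program can find out at runtime that
publishing is unavailable and carry on, which is the property the
soft error was supposed to give us.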

Although it is awkward, I think I'd prefer the MPI_COMM_WORLD
attribute. Are there any opinions on this?

Bill