ATTENTION Chapter Authors

Dick Treumann (treumann@kgn.ibm.com)
Fri, 30 Aug 1996 15:21:47 -0400

This note identifies a number of errors (at least in my view) in
the current version of the MPI document. It covers several chapters
so I hope it will be surveyed by at least the owners of the chapters
on:

Terms and Conventions
MPI-1.2
Process Creation and Management
One Sided
External Interfaces
Language Bindings
Miscellany

I did not attempt to review MPI-IO or Collective Communication.

The page and line numbers are from the version offered by Steve prior
to the Sept Meeting. I hope they still fit the doc handed out on Tuesday

Dick Treumann

****************************************************************************

Pages 12,30,48,69 have lines which run off the page. At least
with the printer I am using this results in the whole page printing
blank. I found an alternate way to print these pages so I have been
able to see that this is what they all have in common.

Pg 11(19,35) References to mapping C names to C++ names use the
Fortran spellings. Though the Fortran spellings are generic in
most contexts, here I think it makes better sense to say MPI_Datatype
and MPI_Comm_compare become MPI::Datatype ....

Pg15 When did it become the intent of MPI to be signal safe in
the sense that it can run on a signal handler? I think any suggestion
that MPI is signal safe is very dangerous. On AIX, malloc is
not signal safe and trying to do MPI without malloc would be impractical.
Is malloc signal safe on other Unix like systems? At best, any
code which tries to do MPI on signal handlers is non-portable
with timing related failures likely on systems that it does not
port to. Any implied support in MPI for something nonportable
but with no way for a non supporting system to detect the danger
should be avoided.
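
To make the concern concrete, here is a minimal sketch in C of the only
handler style that is portable: the handler sets a flag and does nothing
else, and the real work (which may call malloc, and so could not safely
run on the handler) is done later from the main line. The helper names
install_flag_handler and drain_pending are made up for illustration and
are not from the MPI document.

```c
#include <signal.h>

/* Set by the handler. volatile sig_atomic_t is the object type that
   ISO C guarantees can be written safely from a signal handler. */
static volatile sig_atomic_t pending = 0;

/* The handler does no allocation, no I/O and no library calls. */
static void on_signal(int signo)
{
    (void)signo;
    pending = 1;
}

/* Hypothetical helper: install the handler for SIGINT.
   Returns 0 on success, -1 on failure. */
int install_flag_handler(void)
{
    return signal(SIGINT, on_signal) == SIG_ERR ? -1 : 0;
}

/* Hypothetical helper, called from the main line of the program.
   Returns 1 if a signal arrived since the last call. This is the
   place where communication calls, which may use malloc, would go. */
int drain_pending(void)
{
    if (!pending)
        return 0;
    pending = 0;
    return 1;
}
```

A program written this way never needs MPI to be callable from the
handler itself, which is why a promise of signal safety seems both
unnecessary and risky.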

pg25(line30) Does this mean that any non-blocking receive which
has been posted must be either successfully completed or be canceled
else MPI_FINALIZE gives a fatal error? If I am using handlers
it may not be easy to know about outstanding Irecvs which are
posted to kick handlers. Does anyone have codes which post Irecvs
expecting only a subset to complete in any given run? Given the
ability to do MPI_Testsome on a set of requests, it seems like
a reasonable thing to do. Why not just guarantee that any MPI
message which HAS completed in the user's perception (a wait succeeded)
will in fact complete under the covers too? A user who does
MPI_Finalize with uncompleted messages can reasonably have decided
these are DONTCARE cases. Any strengthening of MPI_Finalize should
probably say how MPI_Isend/MPI_Request_free and early arrivals
(small eager-protocol messages) with no matching receive fit in.

pg25(line38) My understanding of the MPI 1.1 doc is that EVERY
task of an MPI job must call MPI_Init. The text here implies that there
can be a correct MPI MPMD job in which only 1 program calls MPI_Init.
Is this really intended and if so, on what basis in the MPI 1.1
doc?

pg27 The section on "New Datatypes" is really about "New Datatype
Constructors"

pg28(22) Add "MPI_Type_simple_struct should only be used for complete
structures, never for portions of larger structures."

pg30 "Example" This indicates that in some cases the wire protocol
for 2 floats followed by a double will contain padding and in
others it will not. If a message containing 2 floats followed
by a double is received as MPI_PACKED and then digested by calls
to MPI_Unpack, first for the pair of floats and then for the double,
how does MPI know if there is a 4 byte pad? OR I receive the
[cc]xx[ffff][ffff]xxxx[dddddddd] message into a halfword aligned
buffer using MPI_PACKED then call MPI_Unpack for the [cc] first
and next for [ffff][ffff]. Which call to MPI_Unpack decides it
must discard two bytes and how does it decide this?
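
The padding in question can be seen in plain C. The sketch below is only
an illustration: the struct and function names are invented here, and the
amount of padding depends on the compiler and platform. It computes how
many bytes of real data the layout holds and how many pad bytes the
compiler placed before the floats and before the double.

```c
#include <stddef.h>

/* In-memory layout matching the [cc]xx[ffff][ffff]xxxx[dddddddd]
   picture; the struct and member names are illustrative only. */
struct sample {
    char   c[2];
    float  f[2];
    double d;
};

/* Bytes of real data, with no padding at all. */
size_t data_bytes(void)
{
    return sizeof(char[2]) + sizeof(float[2]) + sizeof(double);
}

/* Padding the compiler inserted between c and f. */
size_t pad_before_floats(void)
{
    return offsetof(struct sample, f) - sizeof(char[2]);
}

/* Padding the compiler inserted between f and d. */
size_t pad_before_double(void)
{
    return offsetof(struct sample, d)
         - (offsetof(struct sample, f) + sizeof(float[2]));
}
```

The compiler knows these offsets because it sees the whole struct at once.
A sequence of MPI_Unpack calls sees only one piece at a time, which is why
it is unclear how any one call could know that pad bytes are present.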

pg46(40) Shouldn't MPI_PARENT be MPI_COMM_PARENT?

pg47(12) The PVM rule is $HOME/pvm3/bin/$PVM_ARCH

pg47(14) The "SP2" series is now simply called SP.


pg58(1) A bit clearer if the word "spawn" were added. "...collective
spawn routine..."

pg58(44) add "MPI will spawn the largest number of processes it
can, consistent with some number in the set. The order in which
triplets are given is not significant." Also, should say if 16:8:-2
is a valid form since it may seem logical to a user who overlooks
the point that order is not significant.
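
A minimal sketch in C of the rule proposed above, under stated
assumptions: a triplet a:b:c is read as the values a, a+c, a+2c, ...
that do not pass b, a negative stride counts downward (one possible
reading of 16:8:-2), and a zero stride is treated as the single value a.
The struct and function names are invented for illustration and are not
MPI routines.

```c
#include <stddef.h>

/* One a:b:c triplet; the struct is illustrative, not an MPI type. */
struct triplet { int first, last, stride; };

/* Returns 1 if n is one of first, first+stride, first+2*stride, ...
   without passing last. A zero stride is treated as the single
   value first, an assumption made to avoid dividing by zero. */
static int in_triplet(const struct triplet *t, int n)
{
    if (t->stride == 0)
        return n == t->first;
    if (t->stride > 0 && (n < t->first || n > t->last))
        return 0;
    if (t->stride < 0 && (n > t->first || n < t->last))
        return 0;
    return (n - t->first) % t->stride == 0;
}

/* Largest allowed process count not exceeding `available`,
   or 0 if none fits. Triplet order does not affect the result. */
int largest_allowed(const struct triplet *set, size_t count, int available)
{
    for (int n = available; n > 0; --n)
        for (size_t i = 0; i < count; ++i)
            if (in_triplet(&set[i], n))
                return n;
    return 0;
}
```

With the set 1:4:1 and 8:16:4 and ten processors available, this picks 8
whichever triplet is listed first.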

pg62(29) info is an opaque object >>with a handle << of type MPI_Info.
i.e. add the marked phrase because it is not the object which
has this type.

62(38) Make it explicit that it is the format of "value" which
may be unrecognizable.

62(40-42) Does the value length include the terminating \0 in
C? I assume yes but it would be better to say.
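
The off-by-one at stake can be shown with a small C sketch. The helper
copy_value is hypothetical, not an MPI routine. It takes the total size
of the destination buffer, terminator included, and reports whether the
value had to be truncated.

```c
#include <string.h>

/* Copies src into dst, which holds dst_size bytes in total
   (terminator included). Returns 1 if the value was truncated.
   This is a hypothetical helper, not an MPI routine. */
int copy_value(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);          /* excludes the '\0' */
    if (dst_size == 0)
        return 1;
    if (len >= dst_size) {
        memcpy(dst, src, dst_size - 1);
        dst[dst_size - 1] = '\0';
        return 1;
    }
    memcpy(dst, src, len + 1);         /* copies the '\0' too */
    return 0;
}
```

A four-character value fits a five-byte buffer but loses its last
character in a four-byte one, so a user who guesses wrong about whether
the length counts the terminator gets silent truncation.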

63(34),64(12),64(35) How could a Fortran implementation recognize
an overlong value?

65 Given the ability to dup INFO objects, the ability to replace
key values and NO way to remove a key once defined -- It seems
we need to define a NULL value that will apply to every key.

65 Maybe I just overlooked something but - Is there a "No info
provided" predefined handle which is different from MPI_INFO_NULL?
If not, how does a routine tell if it is correctly passed a "No
info" or mistakenly passed a handle to an info which has been
freed. Do we care?

66(1-3) Given that info is a modifiable object, I think we must
state that it is parsed before return from the call which uses
it, that MPI in a threaded environment guarantees that it is
managed such that parsing by a thread is atomic re the modification
calls and lastly that Free is immediate. The user will need to
do the managing of the handle on his own if he uses an info on
one thread while it might be freed on another. This is not unique
to info though.

67(42) Macro mpifunc shows in text

67(44-45) Does MPI_Request_free on a monitor request pop any attached
handler? Probably no but should say.

68(7) say UNIX wait() rather than just wait(). I first read this
line as a shortcut way of saying MPI_Wait.

70(24-29) Make it clear what "this information" is. Does it mean
how many instances of ProgA, how many of ProgB etc? It is clearly
still possible to find out how many processes in total even when
MPMD.

72(48),73(1) The IBM SP2 is now simply the IBM SP (will not need
SP3, SP4 ...)

72-73 MPI_PORT_OPEN - This still discusses INFO as a string rather
than an MPI_Info handle. Is the arg really a string? Then it
should get a new name. Otherwise the text needs fixing and so
do the examples.

78(37) MPI_Recv is an even stronger example. A receive against
a dead client could hang the server forever.

79(37-41) The semantic in which disconnect completes requests
will leave dangling references. If I do an MPI_IRECV(........,req,..)
then disconnect I will still have what seems to be a valid handle
in the "req" variable.
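
The dangling reference can be sketched with a toy handle table in C.
Everything here is a stand-in invented for illustration: a handle is
just an index, post_request plays the part of MPI_IRECV and
disconnect_all plays the part of a disconnect that completes requests
internally. No real MPI implementation is being described.

```c
#define MAX_REQ 8
#define REQ_NULL (-1)

/* Toy request table: a handle is just an index into it. */
static int in_use[MAX_REQ];

/* Hypothetical stand-in for posting a nonblocking receive. */
int post_request(void)
{
    for (int i = 0; i < MAX_REQ; ++i)
        if (!in_use[i]) { in_use[i] = 1; return i; }
    return REQ_NULL;
}

/* Hypothetical stand-in for a disconnect that completes every
   outstanding request internally. It cannot reach the caller's
   handle variables, so they keep their old values. */
void disconnect_all(void)
{
    for (int i = 0; i < MAX_REQ; ++i)
        in_use[i] = 0;
}

/* Returns 1 if the handle still refers to a live request. */
int request_is_live(int req)
{
    return req >= 0 && req < MAX_REQ && in_use[req];
}
```

After disconnect_all the caller's variable is unchanged and still looks
like a handle. In this toy table the next post_request reuses the same
slot, so the stale handle then refers to an unrelated request.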

ONE SIDED

84 In the introduction it says "needs to compute the inverse mapping"
then goes on to mention the option of polling for requests. Should
probably say "needs to determine the ..."

85(11) clean up double negative

85(19) add "after MPI_Init" to sentence ending "any MPI call."

86(advice to users) Consider pointing out that "void *base" in
MPI_Mem_alloc and MPI_Mem_free are not the same parameter (in
lieu of making Mem_alloc use "void **base")

87(9,10,27-29) Should rule out 0 as displacement unit or explicitly
allow it with some hint as to why it is legit.
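
A minimal sketch in C of why a zero displacement unit is odd, assuming
the byte offset into the target window is the displacement multiplied
by the displacement unit. Both function names are invented for
illustration, and overflow of the multiplication is ignored.

```c
#include <stddef.h>

/* Byte offset into the target window for a given displacement,
   assuming offset = displacement * unit. The function name is
   illustrative, not an MPI routine. Overflow is not checked. */
size_t target_offset(size_t disp_unit, size_t displacement)
{
    return disp_unit * displacement;
}

/* Returns 1 if nbytes starting at that offset lie inside a window
   of win_size bytes, 0 otherwise. */
int access_in_window(size_t win_size, size_t disp_unit,
                     size_t displacement, size_t nbytes)
{
    size_t off = target_offset(disp_unit, displacement);
    return off <= win_size && nbytes <= win_size - off;
}
```

With a unit of 0 every displacement lands on the start of the window, so
the displacement argument carries no information and any value passes
the bounds check. That may be legitimate, but the document should say so.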

87(29,30) An RMA_INIT does not modify an existing communicator,
it creates a new one. The first sentence should be more like:
"Only a communicator created by an MPI_RMA_INIT call can be used
for RMA commands." I did not add "or one duped from ..." because
I slightly lean toward alternative 1 below which says dup does
not carry this along.

89(Rationale) Note that while RMA to own memory may be a plus on
some versions of MPI it will probably be a minus on others.

91(Advice to impl) What does MPI do when it must truncate an MPI_PUT?
Where and when is an error reported?

92(MPI_Get) Does the bounds checking indicated for MPI_PUT apply
to the target window on a get as well (i.e. Do not read outside
the target window)? I would expect the MPI-1 rules for a receive
to apply to the origin window in this case. (no overwrite)

Put, Get, Accumulate - target displacement should be nonnegative
integer

96(48) replace "other" with non-RMA. Clearer that way.

97(32,41),102(21),103(6-12,22,23),108(19) - Uses of MPI_LOCK,
MPI_UNLOCK, MPI_WINDOW_LOCK and MPI_WINDOW_UNLOCK should all be
RMA_LOCK/UNLOCK.

99(24,25) Should it be explicit that two puts from the same location
cannot occur in an epoch?

100(19-20) Should say "Each MPI_RMA_POST must be matched by an
MPI_RMA_WAIT and each MPI_RMA_WAIT followed by a fresh MPI_RMA_POST
if the window is to be used again."

100(Alternative 1) Mentions info on MPI_RMA_INIT. This parameter
no longer exists.

102 Is there an idea that there may be impl defined values beyond
MPI_STRONG, _WEAK, _NOCHECK? If not then this might better be
something other than an info. If it is to be an info then these
are probably best defined as values for some as yet unnamed key.

106(27) add "after MPI_Init" to last sentence.

108(1-14) Is the following legal: a window at task three with
tasks 0,1,2,3 on the communicator; task 0 does put; all barrier;
task 1 put; all barrier; task 2 put; all barrier; task 3 does local load?
The description of barrier as a toggle does not seem to allow
this. Also, toggle semantic without a way to check a window's
current mode is hard for a user to track.
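
The toggle can be written down as a two-state machine in C. This is a
sketch under assumptions that are mine, not the document's: there are
exactly two modes, remote puts are legal only in the first, the owner's
local access is legal only in the second, and every barrier flips the
mode. The enum and function names are invented for illustration.

```c
/* Two window modes, assumed for illustration: in RMA mode remote
   puts are allowed; in LOCAL mode only the owner may touch the
   window. Names are not taken from the MPI draft. */
enum win_mode { WIN_RMA, WIN_LOCAL };

/* A barrier flips the mode, as in the toggle description. */
enum win_mode after_barrier(enum win_mode m)
{
    return m == WIN_RMA ? WIN_LOCAL : WIN_RMA;
}

/* Mode after a number of barriers, starting from the given mode. */
enum win_mode mode_after(enum win_mode start, int barriers)
{
    enum win_mode m = start;
    for (int i = 0; i < barriers; ++i)
        m = after_barrier(m);
    return m;
}
```

If the window starts in RMA mode, task 1's put comes after one barrier,
when the window is in LOCAL mode, so the sequence is not allowed under
this reading. A user has to count barriers in the same way to know the
current mode, since there is no call to ask for it.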

110(9-10) To say a "high quality Fortran" will solve the problems
introduced by MPI violations of the Fortran language rules is
unwarranted. The only way I see for a Fortran compiler to do
this would be to forgo optimizations that are proper by language
rules. This is hardly a "quality" behavior for a Fortran compiler.

110(32-35) I presume this comment is based on the idea that "frombuf"
was originally part of the MPI_RMA_INIT. This is not evident
in the code fragment.

110(48) should "update_core(comm)" be "update_core(A)" and shifted
to align with "for"?

117(23) left should be "right"

117(39) Suggest "Some Fortran implementations may provide compiler
options to solve this problem" Again, it is not a matter of quality
since it involves forgoing legal optimizations.

117(46,47) Suggest "A C compiler understands the implications.
Some compilers do offer optional, aggressive optimization levels
which may not be safe."

EXTERNAL INTERFACES

164(14-17) The whole issue of macros for MPI functions needs
broad consideration. It loses name shift profiling and if MPI
allows macros for some functions it should probably be specific
about which ones.

164(47-48) I could not decide what the last sentence means. Can
it be clarified?

166(32-36) In a homogeneous MPI you only need 2 internal representations
for datatypes, one covers contiguous, vector and hvector. The
other covers indexed, hindexed and struct (incl new versions of
struct). If I could return either hvector or struct based on
internal form then I would not need to worry about keeping track
of the combiner. Since this is ruled out (properly), I see little
benefit in the flexibility that remains.

177 (Datatype transfer) - I realize this may be too late but I
want to point out that the overloading of MPI_Pack and MPI_UNPACK
to handle datatypes is a REALLY UGLY KLUDGE. I would like to
see new functions, one which encapsulates an opaque object and
one which reconstitutes it. We are adding so many new functions
to MPI that this does not seem like the place to get by on the
cheap.

181-182 Is the "handle" arg to ...PUT, ...GET and ...DELETE really
intended to be a reference parameter or is the "*" a typo?

186(6,7) Suggest "There is no requirement that an MPI program
provide a name for any handle."

188(1-5)(6-10) These two paragraphs are not consistent. An MPI
which used a global lock to serialize MPI calls could not allow
MPI calls on one thread while another was blocked on some MPI
call.


191(first Alt) This is not consistent with the stronger rules
for MPI_FINALIZE unless it is accepted that all handler use will
involve some policy in which the client for which a handler is
posted takes the responsibility of telling the handler to run
one last time in a cleanup mode which does not involve a repost.

192-193 I believe MPI_MUTEX_CREATE and MPI_MUTEX_FREE must both
be collective. There must be some instantiation of the mutex
which all members of the communicator deal with, and each must
know where it is and how it is locked and unlocked.

194-195 MPI_COND_CREATE and MPI_COND_FREE must be collective.

195(25-26) I do not understand the comment that OS must contain
the cond variable state. If so, please explain more fully. Also,
if so, does this not rule out this function for MPI, based on MPI
not defining special OS or environment interfaces?

195 MPI_COND_WAIT - In a thread compliant MPI, can more than one
COND_WAIT exist on a single MPI cond (2 or more threads)? I would
advocate that it be disallowed but either way it should be explicit.

FORTRAN 90

209 I would prefer to see a different include file name, for
example mpif90.h. Since F90 is a superset of F77 there is a probability
that the F90 include file will not be tolerated by an F77 compiler.
An environment which has both F77 and F90 compilers in use should
be able to go after suitable include files for each compiler.

212(10-18) This advice brings in the problem of MPI calls with
hidden arguments that is discussed in the section about compiler
opt and Fortran.

MISCELLANY

231 (MPI_TRYRECV) This is still risky in the case of a threaded
MPI unless the user is careful with tags and can guarantee that
the QUERY will not find a message 100 words long and then will
have that taken by another thread leaving the tryrecv to match
a 200 word message. At least it will not deadlock though. ALSO
- The "status" argument is missing from the IN/OUT list.

231 (continuable errors) What does it mean to say one of these
errors is continuable when it occurs in a collective communication?