2(5:6) Clarification of last sentence needed. Is it stressing
that an MPI user may RMA_INIT any memory, not just RMA_MALLOC
memory, or that an implementation need not make memory that is
allocated this way "special"? (Or both?)
2(32) This -> MPI_RMA_MALLOC
3(12) An MPI implementation must keep the list only if RMA_MALLOC
memory is truly special. If it is a wrapper for malloc and no
more, RMA_INIT has no need for a list to check.
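
To make the two cases concrete, here is a minimal sketch in C. All of the names (special_rma_malloc, plain_rma_malloc, is_special, MAX_TRACKED) are hypothetical and are not taken from the draft. It only shows that a lookup list has a purpose when the allocator hands out memory that must later be recognized, and that a bare malloc wrapper leaves nothing to look up.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_TRACKED 64   /* arbitrary capacity chosen for this sketch */

/* List of pointers handed out by the "special" allocator. */
static void *tracked[MAX_TRACKED];
static size_t n_tracked = 0;

/* Case 1 (assumed): RMA_MALLOC memory is special, so each allocation
   is recorded and can be recognized later. */
void *special_rma_malloc(size_t bytes)
{
    void *p;
    if (n_tracked >= MAX_TRACKED)
        return NULL;                 /* tracking list is full */
    p = malloc(bytes);
    if (p != NULL)
        tracked[n_tracked++] = p;    /* remember the allocation */
    return p;
}

/* Returns 1 if p came from special_rma_malloc, 0 otherwise. */
int is_special(const void *p)
{
    size_t i;
    for (i = 0; i < n_tracked; i++)
        if (tracked[i] == p)
            return 1;
    return 0;
}

/* Case 2 (assumed): RMA_MALLOC is a wrapper for malloc and no more,
   so there is nothing to record and nothing to check. */
void *plain_rma_malloc(size_t bytes)
{
    return malloc(bytes);
}
```

In the second case an RMA_INIT-style routine would treat every address alike, so the list and the is_special check could be dropped entirely.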
6(32:33) Does erroneous require an error return code?
10(1:2) Do we really want to require that a get not complete at
the origin until the window counter is updated at the target?
If the target side response is provided by a thread, the response
agent may wish to start an MPI_Isend of the data then yield, deferring
wait and counter update. This would not be a good idea if origin
MPI_GET is blocked until the counter update.
Example 4.2 I see no problem with the part of the example directly
related to the point it is making. I do see it exhibiting some
other poor practice. The subroutine does an RMA_INIT, creating
a new communicator, and then creates 2*p datatypes. It does not
free these before it returns and, since the handles are held in
local variables, the objects are left dangling.
14(45) To me the term SWAP implies a movement in two directions,
as in RMW. For accumulate it does not seem like the best term.
Perhaps this should be MPI_REPLACE. I am reluctant to add new
definitions unless needed so if MPI_SWAP was chosen to keep the
operation count down then I can accept it.
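
The naming point can be illustrated with a small sketch in plain C, with no MPI calls. The function names accumulate_replace and swap_values are hypothetical. The first models what an accumulate with a replace-style operation would do, moving data one way only. The second models what the word "swap" usually suggests, moving data both ways.

```c
#include <assert.h>
#include <stddef.h>

/* One-directional: the origin data overwrites the target and the old
   target contents are discarded. The origin buffer is unchanged. */
void accumulate_replace(int *target, const int *origin, size_t n)
{
    size_t i;
    assert(target != NULL && origin != NULL);
    for (i = 0; i < n; i++)
        target[i] = origin[i];
}

/* Two-directional: the target receives the origin data and the origin
   receives the old target contents. */
void swap_values(int *target, int *origin, size_t n)
{
    size_t i;
    assert(target != NULL && origin != NULL);
    for (i = 0; i < n; i++) {
        int old = target[i];
        target[i] = origin[i];
        origin[i] = old;
    }
}
```

Only the second routine returns anything to the origin, which is why REPLACE seems the more natural name for the accumulate case.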
16(4:5) invar and outvar are now integer, not choice.
    outvar = targetvar[disp];
    targetvar[disp] = targetvar[disp] op invar;
This isn't quite right either but it is better. The write-up
should probably either say that disp-unit at target window must
be sizeof(MPI_INT) or discuss the meaning if it is something else.
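
A rough sketch of why the displacement unit matters, in plain C. The window_t struct and rmw_fetch_add are hypothetical names, addition stands in for "op", and no atomicity is modeled. The target location is computed as base + disp * disp_unit, so the array form targetvar[disp] matches it only when disp_unit equals the size of an integer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a target window: a byte buffer plus the
   displacement unit given when the window was set up. */
typedef struct {
    unsigned char *base;   /* start of the window */
    size_t size;           /* window length in bytes */
    size_t disp_unit;      /* bytes per displacement step */
} window_t;

/* Read-modify-write on a single int with "op" taken to be addition.
   Returns the old value (outvar) and stores old + invar in its place. */
int rmw_fetch_add(window_t *win, size_t disp, int invar)
{
    size_t offset = disp * win->disp_unit;   /* byte offset in window */
    int outvar, updated;

    assert(offset + sizeof(int) <= win->size);
    memcpy(&outvar, win->base + offset, sizeof(int));
    updated = outvar + invar;
    memcpy(win->base + offset, &updated, sizeof(int));
    return outvar;
}
```

With disp_unit set to sizeof(int), a displacement of 2 addresses the third integer in the window. With disp_unit set to 1, the same integer is reached only with a displacement of 2 * sizeof(int), and a displacement that is not a multiple of sizeof(int) would straddle two elements. That is the case the write-up would need to either exclude or define.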
22(32:37) MPI_RMW has been restricted to a single integer element.
These lines talk about multi element RMW.
23(43:46) When a window is in a local epoch, remote read gives
undefined results and when in a remote epoch, local read is undefined.
This is because local epoch writes may or may not be reflected
in the underlying storage between subroutine calls. If there
are not enough registers to keep all interesting variables, even
code without subroutine calls will spill regs to storage. Likewise,
for local reads in a remote epoch, even without intervening subroutine
calls, regs may be reloaded from storage. The window copies are neither
cleanly distinct nor cleanly identical across any recognizable
24(41) MPI_WINDOW_IN, not MPI_COPY_IN
25(25) Rather than saying the window copies are in effect identical,
say they are in effect synchronized by any subroutine call. A
pointer to the discussion about Fortran and variables not in parm
list might be worthwhile here too.
32(1) Add the word Fortran before users.
33(43) MPI_Type_recv, not MPI_Recv
Dick Treumann POWER Parallel Systems
(Internet) email@example.com IBM -- Poughkeepsie, NY
(VNET) TREUMANN at KGNVMC Tel: (914) 433-7846
(internal) firstname.lastname@example.org Fax: (914) 433-8363