size in MPI_MEM_ALLOC
disp_unit in MPI_WIN_INIT
2) As in the collective chapter, I think we should mandate that the data
copied by MPI_GET and MPI_PUT fit EXACTLY within the destination buffer.
Allowing the user to specify a destination buffer of a different size
than the source buffer seems sloppy at best.
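To make the proposed rule in 2) concrete, here is a minimal sketch of the
check an implementation could apply before a get or put. The helper name and
its parameters are hypothetical and not taken from the draft, and comparing
raw byte counts is a simplification (it ignores type signatures and
arithmetic overflow).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of the draft: returns true only when the
 * bytes described at the origin exactly fill the bytes described at the
 * target. Byte counts stand in for (count, datatype) pairs. */
bool transfer_fits_exactly(size_t origin_count, size_t origin_elem_bytes,
                           size_t target_count, size_t target_elem_bytes)
{
    return origin_count * origin_elem_bytes ==
           target_count * target_elem_bytes;
}
```

Under this rule, 4 elements of 8 bytes into 8 elements of 4 bytes would be
accepted, while 4 elements of 8 bytes into 10 elements of 4 bytes would be
an error.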
3) I suggest that we define MPI_WIN_INIT to be equivalent to MPI_WIN_BARRIER
from the point of view of memory synchronization semantics. This will allow
the removal of an otherwise necessary MPI_WIN_BARRIER after init (e.g., see
Example 5.3).
4) The restriction of non-overlapping target datatypes for MPI_Accumulate
(page 13) should presumably also apply to the target of put and origin of
get.
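As an illustration of the restriction discussed in 4), the following sketch
tests whether the contiguous pieces of a target datatype overlap. The
block_t type and the function name are assumptions made for this example;
the draft does not define them.

```c
#include <stdbool.h>
#include <stddef.h>

/* One contiguous piece of a target datatype (hypothetical representation):
 * byte displacement into the window and length in bytes. */
typedef struct {
    size_t disp;
    size_t len;
} block_t;

/* Returns true if any two non-empty blocks share at least one byte.
 * Pairwise O(n^2) comparison, which is adequate for a sketch. */
bool blocks_overlap(const block_t *blocks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (blocks[i].len == 0)
            continue;                 /* empty blocks cannot overlap */
        for (size_t j = i + 1; j < n; j++) {
            if (blocks[j].len == 0)
                continue;
            size_t i_end = blocks[i].disp + blocks[i].len;
            size_t j_end = blocks[j].disp + blocks[j].len;
            /* Half-open intervals [disp, end) intersect. */
            if (blocks[i].disp < j_end && blocks[j].disp < i_end)
                return true;
        }
    }
    return false;
}
```

The same test would apply unchanged to the target of a put and to the origin
of a get if the restriction were extended as suggested.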
5) MPI_WIN_BARRIER should also be a barrier operation, i.e., it has barrier
synchronization semantics. Doing otherwise will lead to user confusion,
and may require the addition of MPI_BARRIER in some codes. Note, the
current semantics do not require MPI_WIN_BARRIER to be synchronized.
Consider two processes in an MPI_WIN, process A with an open window,
process B with no window. If process B has performed no RMA operations
since the last MPI_WIN_BARRIER, a new call to MPI_WIN_BARRIER can be
implemented as a local operation so long as any subsequent RMA operations
are delayed until process A has called MPI_WIN_BARRIER.
6) Page 19 lines 1-8 appear to imply that at least one of the operations:
WIN_START, PUT or WIN_COMPLETE will block. That doesn't appear to be a
necessary condition. For example, an eager put may be locally complete
and buffered remotely. There appears to be no requirement that any call
in the sequence block in this case.
7) Similarly, for lock/unlock operations there is no requirement for either
the lock or the unlock to block. Presumably an implementation could even
collect a sequence of locked transfers and send them as one "atomic"
message when the unlock is called. Such a message may be buffered on the
target side, allowing the origin to continue without blocking.
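The batching strategy described in 7) can be modeled in a few lines. This is
only a sketch under stated assumptions: the names lock_epoch_t, epoch_lock,
epoch_put and epoch_unlock are invented for illustration, the window is a
plain int array, and "sending the message" is modeled as applying the queued
puts in issue order.

```c
#include <stddef.h>

#define MAX_PENDING 16

/* A put that has been issued but not yet delivered (hypothetical type). */
typedef struct {
    size_t disp;   /* index into the target window */
    int value;     /* value to store there */
} pending_put_t;

/* Origin-side state for one lock/unlock epoch (hypothetical type). */
typedef struct {
    pending_put_t queue[MAX_PENDING];
    size_t count;
} lock_epoch_t;

/* "Lock": purely local, just starts an empty batch. */
void epoch_lock(lock_epoch_t *e)
{
    e->count = 0;
}

/* "Put": queue the transfer locally. Returns 0 on success, -1 if full. */
int epoch_put(lock_epoch_t *e, size_t disp, int value)
{
    if (e->count == MAX_PENDING)
        return -1;
    e->queue[e->count].disp = disp;
    e->queue[e->count].value = value;
    e->count++;
    return 0;
}

/* "Unlock": deliver the whole batch as one unit. Here delivery is modeled
 * by applying the puts in issue order; out-of-range puts are skipped.
 * Returns the number of puts applied. */
size_t epoch_unlock(lock_epoch_t *e, int *window, size_t window_len)
{
    size_t applied = 0;
    for (size_t i = 0; i < e->count; i++) {
        if (e->queue[i].disp < window_len) {
            window[e->queue[i].disp] = e->queue[i].value;
            applied++;
        }
    }
    e->count = 0;
    return applied;
}
```

None of the three calls waits on the target, and the target window is
unchanged until the batch is applied, which is the behavior the comment
argues the text should not rule out.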
8) Users will presumably wish to use locked get/put operations to implement
traditional shared memory synchronization using flags. It is not clear to
me what ordering model users can assume in such a case. Page 28 line 20
says that the operations are completed at the target by a "subsequent"
call to lock by any other process. What is the definition of "subsequent"?
Is it temporal, or must some other synchronization be used to force an
ordering? Assuming 7) above is correct, a locked put operation may be
"on the wire" at the instant when another process makes the lock call. Must
this operation be committed (since it occurred temporally before the lock)?
If not, when must it be committed, and which lock is guaranteed to occur
after it?
Consider a two-party case: process 0 uses a locked put to set flag A
in process 1's memory, and then uses a locked put to set flag B in process
1's memory. If process 1 sees that flag B is set (using a sequence of
locked gets), is it necessarily the case that it will see flag A as set?
Also consider a three-party case: process 0 uses a locked put to set
flag A in process 1's window, and then uses a locked put to set flag B
in process 2's window. When process 2 sees flag B set (using a sequence
of locked get), it uses a locked put to set flag C in process 1's window.
If process 1 sees flag C is set (again using a sequence of locked gets),
is it necessarily the case that it will see flag B set?
In other words, if locked put/get operations are not sequentially
consistent, we need to make this very clear.
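The two-party question in 8) can be phrased as a small litmus test. The
sketch below is a model only: it assumes the two puts may be committed at
the target in an arbitrary order, and asks whether process 1 could then
observe flag B set while flag A is still clear. The function and enum names
are invented for this example, and the model does not settle which behavior
the draft actually permits.

```c
#include <stdbool.h>
#include <stddef.h>

enum { FLAG_A = 0, FLAG_B = 1, NUM_FLAGS = 2 };

/* commit_order[i] names the flag whose put is committed at the target at
 * step i. Process 1 is modeled as reading both flags after every commit.
 * Returns true if some read sees B set while A is not yet set. */
bool can_see_b_without_a(const int *commit_order, size_t n)
{
    bool window[NUM_FLAGS] = { false, false };
    for (size_t i = 0; i < n; i++) {
        int f = commit_order[i];
        if (f < 0 || f >= NUM_FLAGS)
            continue;                 /* ignore unknown flags */
        window[f] = true;             /* the put is committed */
        if (window[FLAG_B] && !window[FLAG_A])
            return true;              /* B visible before A */
    }
    return false;
}
```

If commits always follow issue order (A then B) the function returns false;
if the order B then A is permitted it returns true, which is exactly the
case users writing flag-based synchronization would need to be warned about.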
Lloyd Lewins
Hughes Aircraft Co.,
llewins@msmail4.hac.com