I think your post/start text now has another bug.
You may remember your answer:
> Rolf, thanks for pointing out that my definitions
> for 2-party synchronization are incomplete. I changed
> the text to make clear that calls must be interleaved:
> a window is available to a "start" call if it is posted, but the same post
> cannot satisfy two starts at the same process. I.e., the
> post must come after
> the wait that matched any previous complete by the same caller. The
> implementation must enforce the interleaving, either using a hand-shake
> protocol, or using generation counters. I hope that the text to be put out by
> Dave next week makes this clear. Please look at it and let me know if it
> still has problems.
Post/Start must solve two major problems:
(A) Loops with several post/start/RMA/complete/wait/local_access
cycles must be possible.
(B) In a scenario with changing communication partners, it must be
possible to start such cycles after an initiating message
from the target to the origin RMA process.
Your new text -- the official one for the meeting -- is now worse:
Draft 08/16/96:
MPI_START:
The succeeding RMA accesses to this window will be delayed,
if necessary, until the target window is posted by a call to
MPI_RMA_POST.
MPI_WAIT:
----
Criticism: 1) In MPI_WAIT the following sentence must be added:
"The call also marks the target as not posted."
Then (B) is solved.
2) Does not solve (A)
Draft 08/26/96:
MPI_START:
The succeeding RMA accesses to this window will be delayed,
if necessary, until the target window is available.
The target window is available if a call to MPI_RMA_POST
has occurred at the target, subsequent to the wait that
matched the previous complete call at the origin, for the same
window (if there is such a complete call).
MPI_WAIT:
The call also marks the target as not posted.
Criticism: 1) MPI_START already defines exactly what
"window is available" means.
The sentence in MPI_WAIT therefore makes no sense.
2) The definition in MPI_START solves (A),
but (B) is not solved; see the example below.
There are two possible solutions:
I) Combine the solutions of both drafts:
MPI_START:
The succeeding RMA accesses to this window will be delayed,
if necessary, until the target window _is__marked__as__posted_
and, if the MPI_RMA_START is called after an MPI_RMA_COMPLETE
for the same window, until the target window _is__available_.
The target window _is__marked__as__posted_ by a call to
MPI_RMA_POST in the target process (MPI_RMA_INIT and MPI_RMA_WAIT
mark the window as not posted).
The target window _is__available_ when an MPI_RMA_POST is called
at the target process after the MPI_RMA_WAIT that matches the
previous MPI_RMA_COMPLETE at the origin.
MPI_WAIT:
The call also marks the target as not posted.
II) Say explicitly that (B) is not solved, and delete the sentence
in MPI_WAIT, because the rule is then a pure matching rule and not
a state-based rule.
Example showing that Draft 08/26/96 does not solve (B):
Origin 1                    Target        Origin 2
                                          COMPLETE
                            WAIT
                            send to 1
recv
                            POST
start(MPI_WEAK or STRONG)
put
complete
                            wait
                            send to 2
                                          receive
                            load
                            post
                                          START(MPI_WEAK or STRONG)
                                          put
                            ...           ...
This example does not work with Draft 08/26/96 because
the last start is satisfied by the first post. (Upper case marks
the sequence of calls that matches the MPI_RMA_START rule of
Draft 08/26/96.) I.e., the last load and the last put are running
at the same time!
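
The failure can be replayed with a small toy model. This is only a
sketch under simplifying assumptions, not MPI code: the class
TargetWindow, its methods, and the origin names are hypothetical, a
counter of target-side events stands in for real time, and each check
returns True when the start need not be delayed. It compares the
matching rule of Draft 08/26/96 with the combined rule of solution I)
at the moment Origin 2 calls START, i.e. after the target's second
wait but before its second post.

```python
class TargetWindow:
    """Toy model of the synchronization state of one target window."""

    def __init__(self):
        self.clock = 0              # counts POST/WAIT events at the target
        self.marked_posted = False  # state flag: set by POST, cleared by WAIT
        self.last_post = None       # clock value of the most recent POST
        self.matching_wait = {}     # origin -> clock of the WAIT that matched
                                    # that origin's previous COMPLETE

    def post(self):
        self.clock += 1
        self.last_post = self.clock
        self.marked_posted = True

    def wait(self, origin):
        """WAIT at the target that matches a COMPLETE issued by `origin`."""
        self.clock += 1
        self.matching_wait[origin] = self.clock
        self.marked_posted = False

    def available(self, origin):
        """Matching rule of Draft 08/26/96: a POST has occurred after the
        WAIT that matched the origin's previous COMPLETE (if any)."""
        if self.last_post is None:
            return False
        prev_wait = self.matching_wait.get(origin)
        return prev_wait is None or self.last_post > prev_wait

    def start_ok_draft_0826(self, origin):
        return self.available(origin)

    def start_ok_solution_1(self, origin):
        # Solution I): marked as posted AND available.
        return self.marked_posted and self.available(origin)


def run_example():
    w = TargetWindow()
    w.wait("origin2")   # WAIT matching Origin 2's earlier COMPLETE
    w.post()            # POST, intended for Origin 1
    w.wait("origin1")   # wait matching Origin 1's complete
    # Target sends to Origin 2 and is now busy with its load.
    draft = w.start_ok_draft_0826("origin2")
    sol1_before_post = w.start_ok_solution_1("origin2")
    w.post()            # second post, after the load
    sol1_after_post = w.start_ok_solution_1("origin2")
    return draft, sol1_before_post, sol1_after_post


print(run_example())  # (True, False, True)
```

In this model the Draft 08/26/96 rule lets Origin 2 start at once,
because the first POST is later than the WAIT that matched its previous
COMPLETE, whereas the combined rule delays the start until the second
post has been issued.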
With solution II) we must say that, after changing communication partners,
it makes no sense to use MPI_RMA_START(MPI_STRONG) or
MPI_RMA_START(MPI_WEAK), because they normally match previous
POSTs. The example can be fixed by rewriting it so that
MPI_RMA_START(MPI_NOCHECK) can be used, e.g.
...                         ...           ...
                            wait
                            load <<<---!!!
                            send to 2
                                          receive
                            post
                                          START(MPI_NOCHECK)
                                          put
                            ...           ...
I also want to remind you that this way of synchronization is not good
on virtual shared memory machines like the CRAY T3E, because it cannot
be implemented efficiently there.
And please remember that in these examples the ugly send-receive
synchronization with an empty message is only necessary because
your post-start synchronization cannot meet the application's needs.
Rolf
Rolf Rabenseifner (Computer Center )
Rechenzentrum Universitaet Stuttgart (University of Stuttgart)
Allmandring 30 Phone: ++49 711 6855530
D-70550 Stuttgart 80 FAX: ++49 711 6787626
Germany rabenseifner@rus.uni-stuttgart.de