Extended proposal: point-to-point 1-sided

David C. DiNucci (dinucci@nas.nasa.gov)
Thu, 27 Jun 1996 12:11:53 -0700

In previous messages, I proposed and explained some collective PUT and GET
operations. In this message, I make some small improvements to that model,
and extend PUT and GET to point-to-point as well. As has been mentioned
before, by myself and others, the primary difference between 1-sided and
message-passing operations is not the number of participants in the
communication -- it is in which side provides information about where data is
to be obtained or deposited in the target process.

.............................Recap of Collective 1-sided

First, to avoid confusion, I will recap the collective operations (with some
very minor modifications and renamings):

  PUT_ARG    Has the same arguments as PUT in chap4 (except increment), and
             effectively just stores these arguments for a later PUT_ALL.
  PUT_ALL    Arguments are (base, size, disp_unit, comm), with the same
             meanings as the same arguments in RMA_INIT in chap4. This is a
             collective operation, which uses the arguments collected by
             previous PUT_ARGs to deposit data into buffers in the target
             processes. Each target's buffer is the one described by the
             base, size, and disp_unit arguments in that process.
  IPUT_ALL   Non-blocking version of PUT_ALL.

  GET_ARG    Has the same arguments as GET in chap4 (except increment), and
             effectively just stores these arguments for a later GET_ALL.
  GET_ALL    Arguments are (base, size, disp_unit, comm). Collective
             operation, which uses the arguments collected by previous
             GET_ARGs to obtain data from buffers in the specified processes.
  IGET_ALL   Non-blocking version of GET_ALL.

[In a previous post, I made an assumption that all of the PUT_ARG and
PUT_ALL (or GET_ARG and GET_ALL) calls in all processes for a particular
collective execution (i.e. using the same communicator) would be required to
use the same "target_rank" argument, and that the base, size, and disp_unit
arguments would only be significant in that process. I now believe that this
was a bad idea. I cannot see why implementation would be much more difficult
if each PUT_ARG (or GET_ARG) is allowed to refer to a different process, and
therefore multiple processes are allowed to offer buffers for the receipt or
distribution of this data. In other words, these collective operations would
be just as flexible as the barrier-based semantics discussed earlier, and
would look much the same. For this reason, PUT_ALL and GET_ALL no longer take
a "target_rank" argument, and the "base, size, disp_unit" arguments
are significant in all processes.
]
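
To make the recap concrete, here is a minimal single-process Python model of
the PUT_ARG / PUT_ALL pairing. It is only a sketch of the semantics described
above, not an MPI binding: the names Rank, put_arg, and put_all are
hypothetical, buffers are plain lists, and the rule that a PUT lands at
element target_disp * disp_unit of the target's buffer is an assumption about
how disp_unit is used in chap4.

```python
from dataclasses import dataclass, field


@dataclass
class Rank:
    """One simulated process: the PUT_ARG requests it has queued so far."""
    pending: list = field(default_factory=list)  # (data, target_rank, target_disp)


def put_arg(rank, data, target_rank, target_disp):
    """Model of PUT_ARG: record the arguments only; no data moves yet."""
    rank.pending.append((list(data), target_rank, target_disp))


def put_all(ranks, buffers, disp_units):
    """Model of PUT_ALL as executed by every rank of the communicator.

    buffers[r] and disp_units[r] stand in for the (base, size, disp_unit)
    arguments passed by rank r; size is simply len(buffers[r]).
    """
    for src in ranks:
        for data, target, disp in src.pending:
            buf = buffers[target]
            start = disp * disp_units[target]  # assumed displacement scaling
            if start < 0 or start + len(data) > len(buf):
                raise IndexError("PUT falls outside the buffer offered by the target")
            buf[start:start + len(data)] = data
        src.pending.clear()


# Three ranks; each PUT_ARG may name a different target, as argued above.
ranks = [Rank() for _ in range(3)]
buffers = [[0] * 4, [0] * 4, [0] * 8]
disp_units = [1, 1, 2]

put_arg(ranks[0], [7, 8], target_rank=1, target_disp=2)
put_arg(ranks[0], [9], target_rank=2, target_disp=3)
put_arg(ranks[1], [5], target_rank=0, target_disp=0)
put_all(ranks, buffers, disp_units)
```

In this model every rank offers a buffer and any rank may be a target, which
is the reason PUT_ALL needs no "target_rank" argument of its own.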

Additional convenience functions could be offered to satisfy those cases where
only one PUT was desired -- i.e. PUT1_ALL, IPUT1_ALL, GET1_ALL, and IGET1_ALL,
where PUT1_ALL would be exactly the same as one PUT_ARG followed by a PUT_ALL,
and likewise for the rest. I personally think that the added convenience would
be small compared to the added weight to the standard.

............................New Proposal

Now for the new proposal. I propose two new point-to-point operations, each
requiring a pair of routines, one to be executed in each process involved in
the operation:
    PUT/ACCEPT
and
    GET/OFFER

PUT/ACCEPT would work as follows: One process would execute a PUT, with
exactly the same arguments as the PUT in chap4 (except increment). The target
process would execute an ACCEPT, with the arguments
(OUT base, IN size, IN disp_unit, IN comm, OUT status).
Each would complete only when the other executed its part, and the result
would be a transfer of data from the PUT process to the ACCEPT process, into
the buffer, at the location dictated by the arguments to PUT. The user would
be able to query the status argument to determine the rank of the PUT process.

GET/OFFER would work the same way, only the transfer would be from the OFFER
process to the GET process.

All of these would have non-blocking versions -- i.e. IPUT, IGET, IACCEPT, and
IOFFER.
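
The blocking PUT/ACCEPT handshake can be modeled with two Python threads.
This is a sketch under stated assumptions rather than a proposed
implementation: Endpoint, put, and accept are hypothetical names, the value
returned by accept stands in for the status argument, and displacements are
scaled by disp_unit as assumed for chap4. GET/OFFER would mirror it with the
data flowing the other way.

```python
import queue
import threading


class Endpoint:
    """Mailbox of one simulated process for PUT requests addressed to it."""

    def __init__(self):
        self.puts = queue.Queue()


def put(endpoint, source_rank, data, target_disp):
    """Blocking PUT: returns only after the matching ACCEPT has run."""
    done = threading.Event()
    endpoint.puts.put((source_rank, list(data), target_disp, done))
    done.wait()


def accept(endpoint, buf, disp_unit):
    """Blocking ACCEPT: offer buf, wait for one PUT, return the sender's rank."""
    source_rank, data, disp, done = endpoint.puts.get()
    try:
        start = disp * disp_unit  # location is dictated by the PUT arguments
        if start < 0 or start + len(data) > len(buf):
            raise IndexError("PUT falls outside the accepted buffer")
        buf[start:start + len(data)] = data
    finally:
        done.set()  # release the PUT side even if the transfer was rejected
    return source_rank


receiver = Endpoint()
buf = [0] * 6

# The sender plays the PUT process (rank 3); the main thread plays ACCEPT.
sender = threading.Thread(target=put, args=(receiver, 3, [1, 2], 2))
sender.start()
status_source = accept(receiver, buf, disp_unit=1)
sender.join()
```

Note that the ACCEPT side supplies only the buffer and its extent; where the
data lands inside it is chosen entirely by the PUT side.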

I also propose a PROBE-like routine (perhaps the existing MPI_PROBE) to
determine whether a PUT or GET had been issued on a particular communicator,
so that third-party communication could be implemented -- e.g.

Process A

      MPI_PUT(comm, ..to C..)

Process B

      MPI_GET(comm, ..from C..)

Process C -- something like...

 10   CONTINUE
      IF (MPI_PROBE_1SIDED(comm) .EQ. MPI_GETWAITING) THEN
         CALL MPI_OFFER(buffer, size, disp_unit, comm)
      ELSE
         CALL MPI_ACCEPT(buffer, size, disp_unit, comm)
      ENDIF
      GOTO 10

I do not have a specific proposal for the form of this PROBE-like routine.
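
One possible shape for the third-party pattern is sketched below as a Python
thread model. Everything in it is an assumption made for illustration: the
Server class and its probe, accept, and offer methods are hypothetical, probe
is assumed to block until some request is pending and to report its kind
without consuming it, and displacements are counted directly in buffer
elements (a disp_unit of 1).

```python
import queue
import threading

GET_WAITING, PUT_WAITING = "get", "put"


class Server:
    """Process C: holds the buffer-side view of pending one-sided requests."""

    def __init__(self):
        self.requests = queue.Queue()
        self.head = None  # request seen by probe but not yet served

    def probe(self):
        """Block until a request is pending; return its kind, leave it queued."""
        if self.head is None:
            self.head = self.requests.get()
        return self.head[0]

    def _take(self, kind):
        self.probe()
        req, self.head = self.head, None
        assert req[0] == kind, "probe result and served request must match"
        return req

    def accept(self, buf):
        _, disp, data, done = self._take(PUT_WAITING)
        buf[disp:disp + len(data)] = data
        done.set()

    def offer(self, buf):
        _, disp, out, done = self._take(GET_WAITING)
        out[:] = buf[disp:disp + len(out)]
        done.set()


def put(server, data, disp):
    done = threading.Event()
    server.requests.put((PUT_WAITING, disp, list(data), done))
    done.wait()


def get(server, count, disp):
    done = threading.Event()
    out = [None] * count
    server.requests.put((GET_WAITING, disp, out, done))
    done.wait()
    return out


def serve(server, buf, n_requests):
    """Process C's loop, bounded here so that the example terminates."""
    for _ in range(n_requests):
        if server.probe() == GET_WAITING:
            server.offer(buf)
        else:
            server.accept(buf)


server = Server()
buf_c = [0] * 4
proc_c = threading.Thread(target=serve, args=(server, buf_c, 2))
proc_c.start()
put(server, [4, 5], 1)       # plays Process A
fetched = get(server, 2, 1)  # plays Process B
proc_c.join()
```

Here the data put by A reaches B through C's buffer, with C supplying nothing
but the buffer and the dispatch loop.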

Notes:
It may even be possible to add tags to these point-to-point operations,
though that may well be going too far for now.

I have carried over the "disp_unit" argument in all of these calls, with
the assumption that it serves the same purpose as it does in the RMA_INIT
call in chapter 4.

As described by Marc in the chapter, these may require sending the data
type to the target in a heterogeneous network environment.

.......................... Summary

Even though it may seem that I have been proposing somewhat major changes,
I have really proposed very little that differs from what is already in
chapter 4, other than moving arguments around and renaming. PUT and GET,
the primary functions, have barely changed at all. BARRIER has been renamed
to either PUT_ALL or GET_ALL, depending upon its use, and combined with
RMA_INIT. This allows the elimination of the WINDOW_OUT and WINDOW_IN calls
in barrier cases. Where WINDOW_IN and WINDOW_OUT were still required -- i.e. in
the non-collective cases -- I have effectively combined them with RMA_INIT and
synchronization, and renamed them to ACCEPT and OFFER. With these changes,
window counters no longer have a function, so I have dropped them. I
expect (though I have not yet worked through it) that the RMW operations can
be worked into this framework nicely. About all that I have proposed that
might actually be new is the PROBE-like routine for one-sided operations and
the status argument.

It is conceivable that RMA_INIT is an expensive operation, and that combining
it with these other operations is a mistake. I will wait for the experts to
tell me if this is so.

I should also make a correction. In a previous message, I requested an
axiomatic semantics for one-sided communication, suggesting that one was
not present in chapter 4. Well, it is. Page 24 of the latest draft has
almost exactly what I was requesting. I hope that this extremely important
information can find a more prominent location, but I don't want to criticize
the good work that has been done on the chapter.

-Dave
===============================================================================
David C. DiNucci | MRJ, Inc., Rsrch Scntst |USMail: NASA Ames Rsrch Ctr
dinucci@nas.nasa.gov| NAS (Num. Aerospace Sim.)| M/S T27A-2
(415)604-4430 | Parallel Tools Group | Moffett Field, CA 94035