Re: a (radical ?) alternative to current chapter 4

David C. DiNucci (dinucci@nas.nasa.gov)
Wed, 10 Jul 1996 11:56:36 -0700

Rolf and Marc,

Thank you for the clarifications. I am now straight on all but a couple of
parts, and I will review the chapter again before asking.

Marc Snir writes:
>There is no direct way for the target to know when the originator has
>completed a FENCE. For communication between originator and target, one
>uses counters or barriers.
I don't believe that this is stated so bluntly in the draft. Thank you.

Marc writes that a PUT becomes visible to loads *only* after a call to
WINDOW_IN. I believe that I understand that it is only *guaranteed* to
become visible after a call to WINDOW_IN, but may become visible before
that. (If this is not the case, please correct me.)
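
To make the distinction concrete, here is a toy Python model of that reading
(a sketch only, not MPI and not text from the draft). The class ToyWindow and
its "eager" flag are hypothetical names invented for illustration: a PUT may
be applied at once or held back, but after window_in() it is visible either way.

```python
class ToyWindow:
    """Toy model of a target window; an illustration, not an MPI API."""

    def __init__(self, size, eager=False):
        self.memory = [0] * size   # what local loads at the target see
        self.pending = []          # PUTs that have arrived but are not yet applied
        self.eager = eager         # hypothetical: implementation applies PUTs at once

    def put(self, offset, value):
        if self.eager:
            # Early visibility is permitted under this reading, but not guaranteed.
            self.memory[offset] = value
        else:
            self.pending.append((offset, value))

    def window_in(self):
        # After this call every earlier PUT is guaranteed visible to loads.
        for offset, value in self.pending:
            self.memory[offset] = value
        self.pending.clear()

    def load(self, offset):
        return self.memory[offset]
```

A portable program may therefore rely only on the value loaded after
window_in(), never on the value loaded before it.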

From: Rabenseifner@RUS.Uni-Stuttgart.DE (Rolf Rabenseifner)
>FENCE is defined for access of multiple processes to one location
>in a "third" process. Then normally none of the processes uses local
>load/store and therefore WINDOW_IN/OUT is not used.

So this suggests that third-party communication *is* supported by having
one process perform a PUT followed by a FENCE, and another perform a
GET? Doesn't this mandate the use of an agent at the third party on
some systems?

>Therefore I don't see any objective errors or lacks in the
>official draft.

If the draft agrees with the answers that you and Marc have given, I
am leaning in that direction myself.

>> Basically, my proposal works as follows:
>> MPI_PUT and MPI_GET are always non-blocking.
>> MPI_PUT is matched by an MPI_ACCEPT on the target
>> MPI_GET is matched by an MPI_OFFER on the target
>> MPI_ACCEPT (and MPI_OFFER) take the number of MPI_PUTs (MPI_GETs) to
>> match as an argument.
>>
>> MPI_ACCEPT and MPI_OFFER block until (1) they have matched with the
>> stated number of PUTs (or GETs) on other processes, and (2) all of
>> the PUTs (or GETs) preceding the ACCEPT or OFFER on the same processor
>> have matched (with ACCEPTs or OFFERs in their target processes).
>
>David, mainly MPI_ACCEPT is the same as MPI_CONSUME_COUNTER if
>the counter increment in MPI_PUT is 1.

Close, but not quite. First, unlike the MPI_CONSUME_COUNTER approach, a PUT
has no effect at the target until the MPI_ACCEPT begins, so it builds in
synchronization that the user will probably need anyway (to prevent the PUT
from taking effect too soon). Second, MPI_ACCEPT essentially performs a
WINDOW_IN.

Actually, I think of PUT & ACCEPT (or GET & OFFER) like SEND & RECV.
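
The SEND/RECV analogy can be sketched with a toy Python model (threads stand
in for processes; ToyTarget, put, and accept are hypothetical names, not
proposed bindings). It shows only condition (1) of the blocking rule: accept
waits for the stated number of PUTs, and target memory is untouched until
accept runs. Condition (2), about the acceptor's own outstanding PUTs, is
left out.

```python
import queue
import threading


class ToyTarget:
    """Toy model of a PUT/ACCEPT target; an illustration, not an MPI API."""

    def __init__(self, size):
        self.memory = [0] * size
        self._incoming = queue.Queue()   # PUTs waiting to be matched

    def put(self, offset, value):
        # Non-blocking: the PUT is queued and target memory is not modified.
        self._incoming.put((offset, value))

    def accept(self, count):
        # Block until `count` PUTs have matched, applying each one to memory.
        for _ in range(count):
            offset, value = self._incoming.get()   # waits while nothing is queued
            self.memory[offset] = value


def demo():
    target = ToyTarget(4)
    target.put(0, 11)                    # first PUT arrives before the ACCEPT
    before = list(target.memory)         # snapshot: the PUT has had no effect yet
    sender = threading.Thread(target=target.put, args=(1, 22))
    sender.start()                       # second PUT comes from another thread
    target.accept(2)                     # returns only after both PUTs matched
    sender.join()
    return before, target.memory
```

Calling demo() returns the memory as it was before the accept and as it is
afterwards, so the delayed effect of the first PUT can be checked directly.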

>You force the use of the counter model; therefore you lose the
>FENCE model and the BARRIER model.

I do not lose the BARRIER model -- the collective operations (which you
seem to have deleted from my description above) offer exactly the
functionality of the BARRIER model, except that they integrate the WINDOW_IN
or WINDOW_OUT into the BARRIER, and rename the BARRIER either OFFER_ALL or
ACCEPT_ALL.
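
As a rough picture of "barrier plus WINDOW_IN in one call", here is a toy
Python sketch with threads standing in for processes. Everything in it is an
assumption for illustration: run_ring, accept_all, and the ring pattern are
invented names, and appending to a neighbour's pending list merely stands in
for a collective PUT.

```python
import threading


def run_ring(nprocs):
    """Each rank PUTs its rank number to its right neighbour, then accepts."""
    pending = [[] for _ in range(nprocs)]   # per-rank PUTs not yet applied
    memory = [[0] for _ in range(nprocs)]   # per-rank one-word window
    barrier = threading.Barrier(nprocs)

    def accept_all(rank):
        barrier.wait()                       # the BARRIER part: all PUTs are issued
        for offset, value in pending[rank]:  # the WINDOW_IN part: make them visible
            memory[rank][offset] = value
        pending[rank].clear()

    def worker(rank):
        right = (rank + 1) % nprocs
        pending[right].append((0, rank))     # stand-in for a collective PUT
        accept_all(rank)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(nprocs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [m[0] for m in memory]
```

With four ranks, each rank ends up holding the rank number of its left
neighbour, and no separate barrier or WINDOW_IN call appears in the worker.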

As for the FENCE model -- if I understand correctly, this
is communication through a third party without its involvement. Yes, I
lose that functionality, because I do not believe that it is practical to
require the involvement of an agent unless third-party communication is
actually needed. If third-party communication *is* required, then the user
can build their own agent using the constructs I have already described, or
more likely, they can have the third party execute an RMA_PROBE every once
in a while to determine whether it is being used as a third party.
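
The polling alternative might look like the following toy Python sketch. The
semantics of RMA_PROBE are assumed here, not specified: rma_probe is taken to
report, without blocking, how many requests are waiting, and service stands in
for the matching ACCEPT or OFFER. All names are hypothetical.

```python
from collections import deque


class ToyThirdParty:
    """Toy model of a process used as a third party; not an MPI API."""

    def __init__(self, size):
        self.memory = [0] * size
        # Waiting requests: ("put", offset, value) or ("get", offset, reply_list)
        self.requests = deque()

    def rma_probe(self):
        # Assumed behaviour: non-blocking count of waiting requests.
        return len(self.requests)

    def service(self, count):
        # Stand-in for ACCEPT (for PUTs) and OFFER (for GETs).
        for _ in range(count):
            kind, offset, payload = self.requests.popleft()
            if kind == "put":
                self.memory[offset] = payload
            else:
                payload.append(self.memory[offset])


def compute_with_polling(node, steps, probe_every):
    for step in range(steps):
        # ... one unit of the third party's own work would go here ...
        if step % probe_every == 0:
            waiting = node.rma_probe()
            if waiting:
                node.service(waiting)
```

The cost of third-party use is then paid only by programs that choose to
poll, at a frequency the user controls through probe_every.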

>And the cache coherence operations must be inlined into MPI_ACCEPT
>and MPI_OFFER.

Yes, they are included in MPI_ACCEPT and MPI_OFFER.

This whole proposal, then, consists of the following routines:

PUT/ACCEPT/IACCEPT              point-to-point PUTs
PUT_ALL/ACCEPT_ALL/IACCEPT_ALL  collective PUTs
GET/OFFER/IOFFER                point-to-point GETs
GET_ALL/OFFER_ALL/IOFFER_ALL    collective GETs
RMA_PROBE                       for unplanned and 3rd-party comm
RMA_MALLOC                      to make Salo happy :-)

No WINDOW_IN, WINDOW_OUT, counters, or FENCEs.
(And someday I will have the proposal converted to LaTeX so I can post it.)

-Dave
===============================================================================
David C. DiNucci     | MRJ, Inc., Rsrch Scntst   | USMail: NASA Ames Rsrch Ctr
dinucci@nas.nasa.gov | NAS (Num. Aerospace Sim.) |         M/S T27A-2
(415)604-4430        | Parallel Tools Group      |         Moffett Field, CA 94035