This is "complete" in the sense that it lists all of the operations which
are being proposed. It describes a "relaxed" semantics, which is not
expected to require an agent for any architecture. It is incomplete in the
sense that, as always, there is still work to do.
LaTeX will follow when I have time to finish the conversion.
-Dave
==========================================================================
Syntax (Bindings)
=================
MPI_RMA_MALLOC(base, size)
    OUT    base             Address of allocated buffer
    IN     size             Size of buffer to allocate, in bytes

MPI_PUT(origin_addr, origin_count, origin_datatype, target_rank,
        target_disp, target_count, target_datatype, tag, comm)
MPI_PUTC(origin_addr, origin_count, origin_datatype, target_rank,
         target_disp, target_count, target_datatype, tag, comm)
    IN     origin_addr      Data to be put
    IN     origin_count     Number of data elements at origin_addr
    IN     origin_datatype  Datatype of each element at origin_addr
    IN     target_rank      Rank of target
    IN     target_disp      Relative displacement in target
    IN     target_count     Number of elements in target
    IN     target_datatype  Datatype of elements in target
    IN     tag              Tag
    IN     comm             Communicator

MPI_GET(origin_addr, origin_count, origin_datatype, target_rank,
        target_disp, target_count, target_datatype, tag, comm)
MPI_GETC(origin_addr, origin_count, origin_datatype, target_rank,
         target_disp, target_count, target_datatype, tag, comm)
    Same arguments as MPI_PUT, except
    OUT    origin_addr      Data to be retrieved

MPI_ACCEPT(base, size, disp_unit, tag, comm, count)
MPI_IACCEPT(base, size, disp_unit, tag, comm, count)
MPI_ACCEPTC(base, size, disp_unit, tag, comm)
MPI_IACCEPTC(base, size, disp_unit, tag, comm)
    INOUT  base             Buffer made available to PUT requests
    IN     size             Size of base in bytes
    IN     disp_unit        Scale factor for target_disp in requests
    IN     tag              Tag
    IN     comm             Communicator
    IN     count            Number of requests to service

MPI_OFFER(base, size, disp_unit, tag, comm, count)
MPI_IOFFER(base, size, disp_unit, tag, comm, count)
MPI_OFFERC(base, size, disp_unit, tag, comm)
MPI_IOFFERC(base, size, disp_unit, tag, comm)
    Same arguments as MPI_ACCEPT, except
    INOUT  base             Buffer made available to GET requests

MPI_RMA_PROBE(source, tag, comm, status)
MPI_RMA_IPROBE(source, tag, comm, status)
    IN     source           Source rank
    IN     tag              Tag
    IN     comm             Communicator
    OUT    status           Status object

Operational Semantics
=====================
(This first paragraph belongs with Preliminary Material -- it just
describes the relationship between operations and their non-blocking
counterparts.)
MPI Non-Blocking Rule: The operation Iop will always complete.
If a WAIT operation is performed on the request returned from an
Iop operation, or a TEST operation is performed on the request
and the TEST returns a "completed" status, then the combination
of the Iop and the WAIT or TEST will have identical semantics to
an op operation with the same arguments as the Iop operation,
except that other operations in the same thread may execute
after inception and before the completion of the combined
operation. The behavior of an Iop operation without a matching
WAIT, or a matching TEST which returns a "completed" status, is
undefined.
PUT, ACCEPT, and IACCEPT
Requesting: A PUT operation will always complete, and results in
the issuance of a PUT request. For brevity of description, the
PUT request will be described as possessing the arguments of the
PUT operation which issued it.
Matching: Each PUT request will be serviced at most once, and
only by an ACCEPT (or IACCEPT) operation with matching comm and
tag arguments, and executing in the process designated by the
comm and target_rank arguments of the PUT request. Multiple
PUT requests issued from the same process (thread) and with
identical comm, tag, and target_rank arguments will be serviced
in the order in which they are issued.
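The matching rule can be sketched as a small queue model. This is only an
illustration under simplifying assumptions: PendingPuts, issue, and match
are hypothetical names, communicators are represented by plain strings,
and all requests live in one shared structure rather than being
distributed among processes.

```python
from collections import deque, namedtuple

PutRequest = namedtuple("PutRequest", "origin comm tag target_rank seq")

class PendingPuts:
    """Pending PUT requests, kept in issue order per (origin, comm, tag, target)."""

    def __init__(self):
        self._queues = {}
        self._seq = 0

    def issue(self, origin, comm, tag, target_rank):
        """Record a PUT request; issuing always completes immediately."""
        self._seq += 1
        req = PutRequest(origin, comm, tag, target_rank, self._seq)
        key = (origin, comm, tag, target_rank)
        self._queues.setdefault(key, deque()).append(req)
        return req

    def match(self, rank, comm, tag):
        """Return the next request an ACCEPT on `rank` may service, or None."""
        for (origin, c, t, target), queue in self._queues.items():
            if queue and c == comm and t == tag and target == rank:
                # popleft: each request is serviced at most once, and
                # requests from one origin are taken in issue order.
                return queue.popleft()
        return None
```

The model fixes one particular choice among origins (the first origin that
issued a request is tried first); the text above only constrains the order
among requests from the same origin.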
Servicing: When a given PUT request is serviced by a given ACCEPT
operation, "origin_count" data items of datatype "origin_datatype",
starting at location "origin_addr" in the process on which the PUT
was issued, will be transferred to the process on which the ACCEPT
operation executes. The target location is
base + target_disp * disp_unit
where "target_disp" comes from the PUT request and "base" and
"disp_unit" come from the ACCEPT operation. That location is
interpreted as the beginning of "target_count" data items of
datatype "target_datatype".
The servicing of a request is not necessarily an atomic action. A
PUT request will be called "fully serviced" if the entire data
transfer is complete. [Note: In the end, this should work the
same as in Marc's original proposal.]
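The address arithmetic in the Servicing paragraph can be sketched as
follows. This is a toy model, not the proposed implementation: the buffer
exposed by ACCEPT is a Python bytearray, datatypes are reduced to raw
bytes, and service_put is a hypothetical name. The bounds check against
the buffer size is an assumption; the text above does not say what happens
when a request falls outside the range given by base and size.

```python
import struct

def service_put(target_buf, disp_unit, target_disp, payload):
    """Copy payload into target_buf at byte offset target_disp * disp_unit.

    target_buf stands for the range [base, base + size) given to ACCEPT,
    so the returned offset is relative to base.
    """
    offset = target_disp * disp_unit
    if offset < 0 or offset + len(payload) > len(target_buf):
        # Assumed behaviour; the proposal leaves this case unspecified.
        raise ValueError("PUT falls outside the buffer exposed by ACCEPT")
    target_buf[offset:offset + len(payload)] = payload
    return offset

# ACCEPT exposes four 32-bit integers with disp_unit = 4 bytes.
buf = bytearray(16)
# A PUT of two integers with target_disp = 2 lands at byte offset 8.
service_put(buf, disp_unit=4, target_disp=2, payload=struct.pack("<2i", 7, 9))
print(struct.unpack("<4i", buf))   # (0, 0, 7, 9)
```

For GET and OFFER the same offset calculation applies, with the copy going
in the opposite direction.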
Completion: An ACCEPT operation will complete if and only if all
of the PUT requests issued by the process executing the ACCEPT,
and having the same tag and comm, have been fully serviced and
(*) the ACCEPT has serviced exactly count PUT requests.
Non-conformance: Local references to addresses in the range
specified by the base and size arguments of an ACCEPT operation
are not permitted during the execution of that operation. Local
references to addresses in the range specified by the
origin_addr, origin_datatype, and origin_count arguments of a
PUT operation are not permitted between that operation and the
following ACCEPT operation having the same tag and comm
arguments. [Is this too restrictive?]
IACCEPT satisfies the MPI Non-Blocking Rule.
PUTC, ACCEPTC, and IACCEPTC
The semantics for PUTC, ACCEPTC, and IACCEPTC are identical to
those for PUT, ACCEPT, and IACCEPT, after replacing all PUT with
PUTC, ACCEPT with ACCEPTC, and IACCEPT with IACCEPTC, and replacing
the last clause in the "Completion" paragraph, marked with a (*),
with the following:
if m-1 total ACCEPTCs and IACCEPTCs with this tag and
communicator have executed in this process prior to this one,
then m total ACCEPTCs and IACCEPTCs with the same tag and
communicator have completed (or will complete) in each of the
other processes belonging to the communicator.
GET, OFFER, and IOFFER:
The semantics for GET, OFFER, and IOFFER are identical to those
for PUT, ACCEPT, and IACCEPT, after replacing all PUT with GET,
ACCEPT with OFFER, and IACCEPT with IOFFER, and replacing the
phrase "transferred to the process" in the "Servicing" paragraph
with the phrase "transferred from the process".
GETC, OFFERC, and IOFFERC:
The semantics for GETC, OFFERC, and IOFFERC are identical to those
for PUTC, ACCEPTC, and IACCEPTC, after replacing all PUTC with
GETC, ACCEPTC with OFFERC, and IACCEPTC with IOFFERC, and
replacing the phrase "transferred to the process" in the
"Servicing" paragraph with the phrase "transferred from the
process".
RMA_PROBE
RMA_PROBE will complete when there is a pending (i.e. issued but
not serviced) PUT, GET, PUTC, or GETC request with the given tag
and comm arguments, from the source process, and with a target_rank
argument matching the current process. Upon completion, RMA_PROBE
returns a value equal to MPI_PUTREQ, MPI_GETREQ, MPI_PUTCREQ, or
MPI_GETCREQ, according to whether the pending request is a PUT, GET,
PUTC, or GETC request, respectively.
RMA_IPROBE
RMA_IPROBE has identical semantics to RMA_PROBE except that it
will always complete, and if its completion is not due to one of
the reasons designated for RMA_PROBE, it will return a value equal
to MPI_NOREQ. (Note: RMA_IPROBE does not obey the MPI Non-
Blocking Rule, just as IPROBE does not.)
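A minimal sketch of the RMA_IPROBE return values, assuming the pending
requests are visible as a simple list. The constant names come from the
text above, but their integer values, the Pending record, and the function
rma_iprobe are hypothetical. When several requests match, this model
reports the first one in the list; the text does not specify which one is
reported.

```python
from collections import namedtuple

# Return codes named in the text; the integer values are placeholders.
MPI_NOREQ, MPI_PUTREQ, MPI_GETREQ, MPI_PUTCREQ, MPI_GETCREQ = range(5)

Pending = namedtuple("Pending", "kind source tag comm target_rank")

def rma_iprobe(pending, my_rank, source, tag, comm):
    """Return the kind of the first matching pending request, else MPI_NOREQ."""
    wanted = (source, tag, comm, my_rank)
    for req in pending:
        if (req.source, req.tag, req.comm, req.target_rank) == wanted:
            return req.kind
    return MPI_NOREQ

pending = [
    Pending(MPI_GETREQ, source=2, tag=5, comm="world", target_rank=0),
    Pending(MPI_PUTCREQ, source=3, tag=5, comm="world", target_rank=0),
]
assert rma_iprobe(pending, 0, 2, 5, "world") == MPI_GETREQ
assert rma_iprobe(pending, 0, 1, 5, "world") == MPI_NOREQ
```

RMA_PROBE would correspond to repeating this check until the result is
something other than MPI_NOREQ.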
Discussion
==========
Because of the odd completion condition for an ACCEPTC and OFFERC,
TEST is free to always return false ("not completed") for any request
returned by an IACCEPTC or IOFFERC. To do otherwise may require
TEST to engage in expensive communication.
(I)ACCEPT(C) and (I)OFFER(C) could be given origin_rank and status
arguments with the same purpose and general meaning as the same
arguments in MPI_RECV. However, since these operations can
service multiple requests, the status argument, at least, seems
less useful than for MPI_RECV.
It is tempting to believe that GETC and PUTC can be eliminated,
since GET and PUT have virtually identical syntax and semantics.
However, it is important from both a programming standpoint and an
implementation standpoint that the GET or PUT operations specify
whether they can be satisfied by a collective ACCEPTC or OFFERC
operation, or by a non-collective ACCEPT or OFFER operation. This
is justified below on two counts:
1. Ease of programming: If a few GETs or PUTs, intended for one
type of operation (i.e. collective or non-collective) are instead
matched by the other type of operation, the programmer will
perhaps never know -- all of the GET or PUT requests will be
serviced, though by an unintended servicer.
2. Ease of implementation: The collective operations
are implemented most efficiently by having each process count the
number of requests destined for each target, and then at
completion of the ACCEPTC or OFFERC, totalling these counts across
all processors (log n time) and broadcasting the resultant vector
to all processors (log n time), allowing each process to know how
many requests it needs to service before completing. If some of
the requests are serviced by non-collective operations, then
either each process will be required to keep a complete history of
the number of requests it has ever serviced for every
tag/communicator, or else a much less efficient method of handling
collective operations must be employed.
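The counting scheme described in point 2 can be sketched as an
element-wise sum of per-process count vectors. This is a serial model
under stated assumptions: requests_to_service is a hypothetical name, and
the sum is computed directly rather than by the log n reduction and
broadcast that an implementation would use.

```python
def requests_to_service(sent_counts):
    """Total the per-target request counts across all processes.

    sent_counts[p][t] is the number of PUTC (or GETC) requests that
    process p issued to target t since its previous ACCEPTC (or OFFERC).
    The result is the vector every process would hold after the sum and
    broadcast: entry t is the number of requests process t must service
    before its collective operation can complete.
    """
    nprocs = len(sent_counts)
    return [sum(row[t] for row in sent_counts) for t in range(nprocs)]

# Three processes; row p lists what process p sent to targets 0, 1, 2.
sent = [
    [0, 2, 1],
    [1, 0, 0],
    [3, 1, 0],
]
print(requests_to_service(sent))   # [4, 3, 1]
```

If some of these requests could instead be serviced by a non-collective
ACCEPT or OFFER, the totals would no longer tell a process how many
requests remain for the collective operation, which is the difficulty the
paragraph above describes.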
Similarly, the number of operations added to MPI could be
(artificially) decreased by removing ACCEPTC and OFFERC, and
stating that some special count field in ACCEPT or OFFER would
make these become collective operations. This would allow non-
collective operations to become collective through simple program
errors causing an incorrect calculation of the count, and would
sometimes make it more difficult to determine the intended
servicer of a GET or PUT operation syntactically. (It would not
affect ease of implementation in any way.)
It is also tempting to combine ACCEPT and OFFER (and/or ACCEPTC
and OFFERC) into a combined operation, called perhaps AVAIL
(and/or AVAILC), which would match either PUTs or GETs (PUTCs or
GETCs). In fact, a standard BARRIER could serve the function of
AVAILC, as has been proposed by others. However, if both PUTs and
GETs were satisfied during the same AVAIL(C) operation, it could
tempt users to believe that they could pass data through a third
party during a single AVAIL(C) operation in the third party, and
in fact, they could do so on some architectures unless a
significant amount of memory and time overhead were added to
preclude it. Labeling such programs "non-conforming" would only
partially solve the problem, because to make such a program
conform, the user would be required not only to split the AVAIL
operation into two separate AVAIL operations (one for GETs and one
for PUTs), but also to ensure that the proper operations were
serviced by the proper AVAIL, which would probably require a
separate tag for each. Thus, more work may be required of the user
if these operations are merged.
On non-cache-coherent architectures, where remote operations
access main memory but not cache, an OFFER or ACCEPT operation can
begin by performing a cache flush of the memory range (to ensure
that remote processes will see or update the freshest values).
Conforming programs will not bring any of these addresses into
cache by referencing them for the duration of the operation.
[What about a PUT or GET which targets one's own process?]
However, the implementor may need to perform additional work in
the cases where false sharing exists -- i.e. where cache lines
may contain data from both inside and outside the public region.
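To make the false-sharing case concrete, the cache lines that straddle the
boundary of the public region can be computed from base, size, and the
line size. This is only a sketch: partially_covered_lines is a
hypothetical helper, addresses are plain integers, and the line size used
in the example is an arbitrary choice.

```python
def partially_covered_lines(base, size, line_size):
    """Return start addresses of cache lines only partly inside the region.

    The region is [base, base + size). Lines that lie entirely inside it
    can be flushed safely; the lines returned here also hold data from
    outside the region and may need extra care.
    """
    if size <= 0:
        return []
    end = base + size
    first = base - base % line_size               # line holding the first byte
    last = (end - 1) - (end - 1) % line_size      # line holding the last byte
    lines = []
    if base != first:
        lines.append(first)                       # region starts mid-line
    if end != last + line_size and last not in lines:
        lines.append(last)                        # region ends mid-line
    return lines

# A 100-byte region starting at address 96, with 64-byte cache lines.
print(partially_covered_lines(96, 100, 64))    # [64, 192]
# A region aligned at both ends shares no lines with outside data.
print(partially_covered_lines(128, 128, 64))   # []
```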
Axiomatic Semantics (Derived from Operational Semantics)
========================================================
[Still lots of work to do here.]
If the statement before a PUT, GET, PUTC, or GETC statement
completes, then the statement after that statement will begin.
If an ACCEPTC completes, and m-1 ACCEPTCs have completed before
this ACCEPTC, then m ACCEPTCs with the same tag and communicator
will complete (or have completed) on all processes in the
communicator.
If an ACCEPT completes, then all of the PUTs in the same process
which executed prior to the ACCEPT and which have the same tag and
communicator as the ACCEPT will have been matched by an ACCEPT in
their target.
(Same as above, but with ACCEPTCs and PUTCs.)
If a PUT with comm, tag, and target arguments matches, then any
preceding PUT in the same process with the same comm, tag, and target
arguments will have matched first.
If the mth ACCEPTC completes in process p, and there is a PUTC in
some process within the communicator specified by the comm
argument which collectively satisfies all of the following
conditions:
(a) the PUTC has executed after the (m-1)th ACCEPTC on its process
(b) the PUTC has the same tag and comm arguments as the ACCEPTC
(c) the target_rank and comm arguments of the PUTC designate
process p
(d) the PUTC effectively stores value v into location l of
process p
THEN the PUTC and ACCEPTC will be said to satisfy the condition
POSSIBLE(PUTC, m, v, l, p)
===============================================================================
David C. DiNucci | MRJ, Inc., Rsrch Scntst |USMail: NASA Ames Rsrch Ctr
dinucci@nas.nasa.gov| NAS (Num. Aerospace Sim.)| M/S T27A-2
(415)604-4430 | Parallel Tools Group | Moffett Field, CA 94035