I've been told that there are shared memory vector multiprocessor systems
from other vendors which are not cache-coherent with respect to memory
updates from other processors.
I see two options for non-coherent SMPs:
1) A signal handler is invoked at the target task whenever
MPI_PUT is called. After the data for the MPI_PUT has been
delivered, the handler executes an instruction that makes
the cache coherent (by invalidating it).
2) Another alternative is to add an MPI_DELIVER function:
CALL MPI_DELIVER(comm)
Every task would be required to call this function before
directly accessing (with loads and stores) data updated by
an RMA operation. Note that tasks may still access the memory
using subsequent RMA operations without intervening calls
to MPI_DELIVER.
I think this approach would also provide a convenient
implementation avenue for a networked implementation
of 1-sided MPI. The MPI_DELIVER function might serve
most (all?) of the purposes of an RMA agent, making the
RMA agent unnecessary. On the other side of the coin,
cache-coherent SMPs can simply make MPI_DELIVER a no-op.
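To make the intended semantics concrete, here is a toy C model of a single task on a non-coherent SMP. It is only a sketch: the type task_t and the functions local_load, remote_put, remote_get and deliver are hypothetical names invented for illustration, deliver stands in for the proposed MPI_DELIVER, and none of this is real MPI or SHMEM code. The cache is modeled as a private copy of memory with a valid flag per word.

```c
#include <assert.h>
#include <string.h>

#define WORDS 4

/* Toy model of one task on a non-coherent SMP (all names hypothetical). */
typedef struct {
    long memory[WORDS];  /* shared memory, written by remote puts        */
    long cache[WORDS];   /* this task's private cached copy of memory    */
    int  valid[WORDS];   /* 1 if cache[i] may satisfy a local load       */
} task_t;

/* Start with zeroed memory and an empty (all-invalid) cache. */
void task_init(task_t *t) { memset(t, 0, sizeof *t); }

/* Direct load by the owning task: served from the cache when the
 * line is valid, otherwise filled from memory first. */
long local_load(task_t *t, int i)
{
    if (!t->valid[i]) {
        t->cache[i] = t->memory[i];
        t->valid[i] = 1;
    }
    return t->cache[i];
}

/* Remote put from another processor: updates memory only. The
 * target's cache is not told, which is the non-coherence problem. */
void remote_put(task_t *target, int i, long value)
{
    target->memory[i] = value;
}

/* Remote get: reads memory directly and bypasses the target's cache,
 * so RMA-only access needs no intervening deliver call. */
long remote_get(const task_t *target, int i)
{
    return target->memory[i];
}

/* Stand-in for the proposed MPI_DELIVER: invalidate the whole cache so
 * later direct loads see data written by earlier RMA operations.
 * On a cache-coherent system this would be a no-op. */
void deliver(task_t *t)
{
    memset(t->valid, 0, sizeof t->valid);
}
```

In this model a task that loads a word, receives a remote_put to that word, and loads it again still sees the old value; only after calling deliver does the load return the new one. A remote_get sees the new value immediately, which matches the note that RMA-only access needs no MPI_DELIVER.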
In Cray Research's SHMEM 1-sided communication library, the
SHMEM_USCFLUSH function works the way MPI_DELIVER would work.
On cache-coherent systems it is a no-op, and on CRAY T3D and
CRAY T90 systems it invalidates the cache.
I am interested in your views on this issue.
Karl Feind
+-----------------------------------+----------------------------------+
| Karl Feind | E-Mail: kaf@cray.com |
| Cray Research, Inc. | Phone: 612/683-5673 |
| 655F Lone Oak Drive | Fax: 612/683-5276 |
| Eagan, MN 55121 | |
+-----------------------------------+----------------------------------+