Lloyd wrote me the answer below.
I do not understand exactly why MPI_DELIVER is not enough
if it throws all the caches away.
But we must define clearly when the application on the
target node may access the memory with loads and stores.
The exact definition should be:
a) After calling MPI_RMA_WINDOW_READY on the target node,
the application on the target node may perform local loads
while the application on the origin node accesses the
memory in parallel with MPI_GET.
After all MPI_GETs are done, the application on the target
node may again access the memory with loads and stores.
b) After calling MPI_RMA_WINDOW_READY and before calling
MPI_DELIVER on the target node, the application on the
target node must not access the memory, and the application
on the origin node may access the memory with MPI_PUT.
After MPI_DELIVER, the application on the target node
may again access the memory with loads and stores,
but the application on the origin node must not access
the memory.
Therefore, before calling MPI_DELIVER, the application
on the target node must be sure that all outstanding
MPI_PUTs have been executed on the target system
(e.g. by checking the counter).
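The rules in a) and b) can be written down as a small table of
permitted accesses. The following C fragment is only a sketch,
assuming that each case lists the only accesses that are allowed in
that phase. The names window_phase, access_kind, access_allowed and
require_access are hypothetical and exist only for this illustration;
they are not part of any MPI binding.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical phases of the target window (illustration only). */
typedef enum {
    PHASE_LOCAL,      /* no RMA in progress: after all gets, or after MPI_DELIVER */
    PHASE_GET_EPOCH,  /* case a): after MPI_RMA_WINDOW_READY, gets outstanding */
    PHASE_PUT_EPOCH   /* case b): after MPI_RMA_WINDOW_READY, before MPI_DELIVER */
} window_phase;

/* Hypothetical kinds of access to the window memory. */
typedef enum { TARGET_LOAD, TARGET_STORE, ORIGIN_GET, ORIGIN_PUT } access_kind;

/* Return true if the sketched rules permit this access in this phase. */
bool access_allowed(window_phase phase, access_kind kind)
{
    switch (phase) {
    case PHASE_LOCAL:      /* only the target may touch the memory */
        return kind == TARGET_LOAD || kind == TARGET_STORE;
    case PHASE_GET_EPOCH:  /* target loads and origin gets may run in parallel */
        return kind == TARGET_LOAD || kind == ORIGIN_GET;
    case PHASE_PUT_EPOCH:  /* only origin puts; the target must stay away */
        return kind == ORIGIN_PUT;
    }
    return false;
}

/* Debug helper: abort if the sketched rules forbid this access. */
void require_access(window_phase phase, access_kind kind)
{
    assert(access_allowed(phase, kind));
}
```

For example, access_allowed(PHASE_PUT_EPOCH, TARGET_LOAD) is false in
this sketch, which corresponds to the rule in b) that the target must
not access the memory before MPI_DELIVER.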
Is it possible to implement MPI_RMA_WINDOW_READY and
MPI_DELIVER in the following way on every cache system?
MPI_RMA_WINDOW_READY = write all caches of the target node
back to memory
MPI_DELIVER = throw all caches of the target node
away
I do not understand the problem with partial cache lines.
Karl and Lloyd, can you please discuss the problem and tell me
what is correct?
At the moment I am writing the text for the section
"Problems with register optimization and caching"
and I should send it to Rusty tomorrow.
Thank you
Rolf
Forwarded message:
> Date: 17 Apr 1996 13:13:18 -0800
> From: "Lewins, Lloyd J" <llewins@msmail4.hac.com>
>
> Rolf wrote:
>
> > Possible solution: (to CACHE PROBLEM 1)
> > The application on the memory node must issue a call like
> > CALL MPI_RMA_WINDOW_READY(newcomm)
>
> This is not a solution due to the false sharing of cache lines. Even if dirty
> lines in the cache are flushed before the sequence of Puts, subsequent writes
> to a variable unrelated to the put location, but sharing a cache line with it,
> could bring a dirty line into the cache. When this line is eventually returned
> to main memory it will overwrite the new data from the put.
>
> In our experience with software managed cache-coherence, the following are all
> necessary:
>
> 1) Flush the cache before a sequence of remote gets/puts. Necessary for CACHE
> PROBLEM 1, and also to ensure that the memory is up to date so that a get
> does not read stale data.
> 2) Flush the cache after a sequence of remote puts. Necessary for CACHE
> PROBLEM 2. But see below.
> 3) Put to partial cache lines using a remote agent. Necessary to avoid the
> false sharing problem.
> 4) Disallow writes by the local processor to buffers which overlap with remote
> puts. Otherwise, if a local write to part of a cache line occurs before the
> put, and the replacement occurs after the put, the buffer will contain old
> data plus the local write data. This is illegal since no possible sequence
> of byte writes and byte puts could achieve this.
>
> If we also disallow reads by the local processor to buffers which overlap with
> remote puts, we remove the need for 2) above. Note: These are exactly the
> existing rules for MPI send/recv, for exactly the same reasons.
>
> Note: this precludes using shared memory locations for synchronization, since
> it would be erroneous to read a location which may be updated by a put. Thus,
> counters become much more important.
>
> A new function, i.e., Rolf's MPI_RMA_WINDOW_READY(newcomm), is necessary to
> perform 1).
>
> MPI_DELIVER is not sufficient to perform 3) since it will not support the
> progress rule. Thus, the remote agent MUST use a signal or interrupt. Note
> however that the remote agent is only necessary when the put is to partial
> cache lines. This is unfortunate since small reads/writes are exactly the kind
> of accesses one would like to use (efficiently) with shared memory.
>
> Lloyd Lewins
> Hughes Aircraft Co.,
> llewins@msmail4.hac.com
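
The false sharing that Lloyd describes can be reproduced in a toy
model. The C fragment below is only a sketch under assumptions that
are not stated in the mails: a single 8-byte cache line, a write-back
and write-allocate cache on the target, and puts that go straight to
memory without touching that cache. All names (toy_line, local_store,
remote_put, cache_flush, cache_invalidate and the two scenario
functions) are hypothetical. cache_flush stands for "write all caches
to the memory" and cache_invalidate for "throw all caches away".

```c
#include <assert.h>
#include <string.h>

#define LINE_SIZE 8  /* bytes per cache line in this toy model */

typedef struct {
    unsigned char memory[LINE_SIZE]; /* main memory behind one cache line */
    unsigned char cached[LINE_SIZE]; /* the target processor's copy */
    int valid;                       /* cache holds a copy of the line */
    int dirty;                       /* copy was modified locally */
} toy_line;

void toy_init(toy_line *t) { memset(t, 0, sizeof *t); }

/* Local store by the target processor (write-allocate, write-back). */
void local_store(toy_line *t, int offset, unsigned char value)
{
    assert(offset >= 0 && offset < LINE_SIZE);
    if (!t->valid) {                 /* miss: fetch the whole line first */
        memcpy(t->cached, t->memory, LINE_SIZE);
        t->valid = 1;
    }
    t->cached[offset] = value;
    t->dirty = 1;
}

/* Remote put: written to memory, bypassing the target's cache. */
void remote_put(toy_line *t, int offset, unsigned char value)
{
    assert(offset >= 0 && offset < LINE_SIZE);
    t->memory[offset] = value;
}

/* Flush: write a dirty line back to memory; the copy stays valid. */
void cache_flush(toy_line *t)
{
    if (t->valid && t->dirty) {
        memcpy(t->memory, t->cached, LINE_SIZE);
        t->dirty = 0;
    }
}

/* Invalidate: discard the cached copy without writing it back. */
void cache_invalidate(toy_line *t) { t->valid = 0; t->dirty = 0; }

/* Flush, then a local store to byte 0 and a put to byte 4 of the same
   line; the line is written back later. Returns 1 if the put survives. */
int put_survives_writeback(void)
{
    toy_line t;
    toy_init(&t);
    cache_flush(&t);             /* flush before the puts */
    local_store(&t, 0, 'L');     /* unrelated variable, same cache line */
    remote_put(&t, 4, 'P');
    cache_flush(&t);             /* eventual write-back of the dirty line */
    return t.memory[4] == 'P';
}

/* Same sequence, but the cache is thrown away after the put.
   Returns 1 if the local store survives. */
int store_survives_invalidate(void)
{
    toy_line t;
    toy_init(&t);
    cache_flush(&t);
    local_store(&t, 0, 'L');
    remote_put(&t, 4, 'P');
    cache_invalidate(&t);        /* dirty line is discarded */
    return t.memory[0] == 'L';
}
```

In this model both functions return 0. With a write-back the stale
line overwrites the put, and with an invalidate the unrelated local
store is lost. So neither flushing nor throwing the cache away is
enough when a put covers only part of a cache line.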
Rolf Rabenseifner (Computer Center )
Rechenzentrum Universitaet Stuttgart (University of Stuttgart)
Allmandring 30 Phone: ++49 711 6855530
D-70550 Stuttgart 80 FAX: ++49 711 6787626
Germany rabenseifner@rus.uni-stuttgart.de