> An alternative is to specify that a put becomes visible to any other access,
> either local load, or get, only after the call to WINDOW_IN. We get a "delay
> consistency" model for all window updates.
> We cannot use anymore windows for "shared memory" style third-party
> communication, where process A and B communicate (using put and get) thru
> memory at process C. Process C must be involved. I.e., we restrict put/get
> for more message-passing style communication, where put/get is used to move
> data produced by the origin (target) process for consumption by the target
> (origin) process, while keeping the advantage of put/get that one side provides
> all the information on addresses, and the other side need only be involved in
> the synchronization.
> We can have now a (low quality -:) MPI implementation with no agents: the
> transfer is always effected when the call to WINDOW_IN occurs.
> We don't need MPI_FENCE anymore. This, because we gave up on third party
> communication, where fence is needed. If the communication always requires
> some synchronization with the window owner, then counters or MPI_BARRIER can
> be used to make sure that data delivery is complete before it is used.
This model might be easy to implement, but it is useless to applications. If
it is adopted, the entire so-called "1-sided" chapter could then be deleted
as a joke.
In fact, we don't need to relax the 1-sided model to make it implementable on
shared-memory systems. The shared-memory vendors have a straightforward way to
implement the current model: use the mirrored buffer technique for window
memory that was not allocated with RMA_MALLOC. This requires duplicating the
window in shared memory and then using WINDOW_IN and WINDOW_OUT to copy data to
and from the shared-memory buffer in order to enforce consistency. Of course,
the copy operation degrades performance and extra memory is required, but it
should work.
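
To make the mirrored buffer idea concrete, here is a rough single-process
sketch in C. It is only an illustration under stated assumptions, not a real
implementation: the mirrored_window type and all mw_* names are hypothetical,
plain malloc stands in for a real shared-memory segment, and I am assuming
that WINDOW_IN copies the mirror into the user's buffer while WINDOW_OUT
copies the user's buffer into the mirror.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mirrored window (all names here are illustrative only).
 * `local` is the user's ordinary buffer, i.e. window memory that was not
 * allocated with RMA_MALLOC. `mirror` is the duplicate that would live in
 * shared memory; plain malloc stands in for that segment in this sketch. */
typedef struct {
    char  *local;
    char  *mirror;
    size_t size;
} mirrored_window;

/* Create the mirror and seed it with the current window contents. */
int mw_init(mirrored_window *w, void *local, size_t size)
{
    w->local  = local;
    w->size   = size;
    w->mirror = malloc(size);
    if (w->mirror == NULL)
        return -1;
    memcpy(w->mirror, local, size);
    return 0;
}

void mw_free(mirrored_window *w)
{
    free(w->mirror);
    w->mirror = NULL;
}

/* A remote put touches only the mirror, never the user's buffer. */
void mw_put(mirrored_window *w, size_t off, const void *src, size_t n)
{
    assert(off <= w->size && n <= w->size - off);
    memcpy(w->mirror + off, src, n);
}

/* A remote get reads only the mirror. */
void mw_get(const mirrored_window *w, size_t off, void *dst, size_t n)
{
    assert(off <= w->size && n <= w->size - off);
    memcpy(dst, w->mirror + off, n);
}

/* Assumed WINDOW_IN: make earlier puts visible to local loads. */
void mw_window_in(mirrored_window *w)
{
    memcpy(w->local, w->mirror, w->size);
}

/* Assumed WINDOW_OUT: make earlier local stores visible to gets. */
void mw_window_out(mirrored_window *w)
{
    memcpy(w->mirror, w->local, w->size);
}
```

The two memcpy calls in mw_window_in and mw_window_out are exactly the extra
copy cost and extra memory mentioned above. The sketch copies the whole window
each time, so a local store and a remote put made between the same pair of
calls would overwrite one another; a real implementation would have to track
or restrict that.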