Re: to lock or not to lock

(no name) (Marc Snir/Watson/IBM Research@nas.nasa.gov)
12 Jul 96 17:13:24

We want to have synchronization points where window coherence and activities
occur. So, if A writes to a window and B accesses the window, A "did
something" after the write, and B "did something" before its access. This is
necessary when one of the accesses is load/store and the other is
put/get/accumulate. Now we can start arguing about which calls we want to use
to "do something". Everybody seems to agree that using a barrier to synchronize
put/get with load/store is pretty frequent, so we agree that a barrier is one
of these "somethings". Now, we argue about the case where communication really
involves only two (or a few) processes, but it still makes sense to use put/get,
perhaps because A knows where the data is in the memory of B, but B does not
know what data A wants to access; recoding this with send/receive requires an
additional hand-shake. So, with lock/unlock, this will look like

A                              B
window_excl_lock(Wdata)        repeat
put data                         window_shar_lock(Wdata)
put flag                         if (a=flag)
                                   consume(data)
window_unlock(Wdata)             window_unlock(Wdata)
                               until(a)

The overhead is that, on a noncoherent architecture, B needlessly flushes its
cache with each unlock. This overhead can be avoided if the flag is in a
separate, small window.
So now it looks like this:


A                              B
window_excl_lock(Wdata)        repeat
put data
                                 window_shar_lock(Wflag)
window_unlock(Wdata)             a = flag
window_excl_lock(Wflag)          window_unlock(Wflag)
put flag = 1                   until(a)
window_unlock(Wflag)           window_shar_lock(Wdata)
                               consume(data)

                               window_unlock(Wdata)
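
The two-window polling pattern above can be mimicked with threads as a rough
sketch. Everything here is an assumption made for illustration: the Window
class, process_a, process_b and run are hypothetical names, threads stand in
for processes, and a plain mutex stands in for both the exclusive and the
shared window lock. It shows only the control flow, not any coherence action.

```python
import threading
import time


class Window:
    """Hypothetical stand-in for a window: named cells plus one lock.

    A plain mutex models both window_excl_lock and window_shar_lock,
    which is stricter than a real shared lock but keeps the sketch short.
    """

    def __init__(self, **cells):
        self.cells = dict(cells)
        self.lock = threading.Lock()


def process_a(wdata, wflag, payload):
    with wdata.lock:                    # window_excl_lock(Wdata) .. window_unlock(Wdata)
        wdata.cells["data"] = payload   # put data
    with wflag.lock:                    # window_excl_lock(Wflag) .. window_unlock(Wflag)
        wflag.cells["flag"] = 1         # put flag = 1


def process_b(wdata, wflag, polls):
    while True:                         # repeat .. until(a)
        with wflag.lock:                # window_shar_lock(Wflag) .. window_unlock(Wflag)
            a = wflag.cells["flag"]     # a = flag
        polls.append(a)                 # record each poll so the loop can be inspected
        if a:
            break
        time.sleep(0.001)               # back off briefly between polls
    with wdata.lock:                    # window_shar_lock(Wdata) .. window_unlock(Wdata)
        return wdata.cells["data"]      # consume(data)


def run(payload):
    wdata = Window(data=None)
    wflag = Window(flag=0)
    polls, result = [], []
    b = threading.Thread(
        target=lambda: result.append(process_b(wdata, wflag, polls)))
    a = threading.Thread(target=process_a, args=(wdata, wflag, payload))
    b.start()
    a.start()
    a.join()
    b.join()
    return result[0], polls
```

Only the small flag window is touched inside the polling loop; the data window
is locked once, after the flag has been seen.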

If this usage is prevalent, then one can support an additional synchronization
operation, similar to the cond_signal/cond_wait pair of pthreads, or the
EVPOST/EVWAIT/EVCLEAR of CRAFT. Then the code looks like this:

A                              B
put data                       window_wait(Wdata)
window_post(Wdata)             consume(data)

                               window_clear(Wdata)

We introduce a new synchronization operation:
window_wait(comm, rank) blocks until the event associated with the window is
set; window_post(comm, rank) sets the event, and window_clear(comm, data)
clears the event.
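
As a rough illustration of the intended semantics, the post/wait/clear triple
can be imitated with a thread-level event. This is only a sketch under stated
assumptions: WindowEvent and exchange are hypothetical names, threads stand in
for processes A and B, and the (comm, rank) arguments are dropped because a
single in-process object plays the role of the window.

```python
import threading


class WindowEvent:
    """Hypothetical window with one associated event (not a real MPI object)."""

    def __init__(self):
        self.data = None
        self._event = threading.Event()

    def post(self):          # window_post: set the event
        self._event.set()

    def wait(self):          # window_wait: block until the event is set
        self._event.wait()

    def clear(self):         # window_clear: clear the event
        self._event.clear()

    def is_posted(self):
        return self._event.is_set()


def exchange(payload):
    w = WindowEvent()
    consumed = []

    def process_b():
        w.wait()                    # window_wait(Wdata)
        consumed.append(w.data)     # consume(data)
        w.clear()                   # window_clear(Wdata)

    def process_a():
        w.data = payload            # put data
        w.post()                    # window_post(Wdata)

    b = threading.Thread(target=process_b)
    a = threading.Thread(target=process_a)
    b.start()
    a.start()
    a.join()
    b.join()
    return consumed[0], w.is_posted()
```

B no longer polls: it blocks once in wait, and the clear leaves the event ready
for the next round.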

And we add to the specification that conflicting accesses to windows can
also be synchronized by a post/wait pair. The general approach is

(a) we introduce synchronization calls patterned after those currently used in
similar environments, and associate the synchronization object with a window; and
(b) we specify that window accesses need to be synchronized with these
constructs.

The implementation associates coherence operations with these synchronization
calls.

We can add more synchronization operations if the ones I suggested do not cover
prevalent usage well. But this is OK: the new calls provide new functionality
and will be used by users that want them, and ignored by users that do not want
them; they are not auxiliary coherence operations. We can start with a small
set of synchronization operations, and enrich them if there is demand (I can
already see somebody that wants a post that increments a counter, and a wait
that waits until the counter reaches a given value).
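
The counting variant mentioned above could look roughly like the following
thread-level sketch. It is an assumption-laden illustration, not a proposed
interface: CountingEvent and gather are hypothetical names, threads stand in
for processes, and "reaches a given value" is read here as "is at least the
given value".

```python
import threading


class CountingEvent:
    """Hypothetical counting event: post increments, wait blocks on a target."""

    def __init__(self):
        self._count = 0
        self._cond = threading.Condition()

    def post(self):
        # Increment the counter and wake any waiters so they can re-check it.
        with self._cond:
            self._count += 1
            self._cond.notify_all()

    def wait(self, target, timeout=None):
        # Block until the counter is at least `target`; returns False on timeout.
        with self._cond:
            return self._cond.wait_for(lambda: self._count >= target, timeout)


def gather(n_writers):
    ev = CountingEvent()
    slots = [None] * n_writers

    def writer(i):
        slots[i] = i * i    # "put data" into this writer's own slot
        ev.post()           # post that increments the counter

    threads = [threading.Thread(target=writer, args=(i,))
               for i in range(n_writers)]
    for t in threads:
        t.start()
    ok = ev.wait(n_writers, timeout=5.0)   # wait until every writer has posted
    for t in threads:
        t.join()
    return ok, slots
```

A single wait then covers several writers, where the plain post/wait pair would
need one event per writer.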