Eric Salo wrote:
>1) Let's say that the shared buffer is very large. It looks to me like the
>entire buffer will always be sent to the receiver, even if we only modify a
>single word. On shared memory systems this is of course not an issue, but I
>don't think this is likely to be acceptable for NOWs.
The objective of communication in CDS1 is to get data which is accessible by
one process to be accessible by another. If one of those processes wants that
data in a particular place, it is up to the process to move it to or from that
spot. There are CDS1 primitives ("send" and "recv") which integrate the
movement with the transfer to allow for optimization, but there is *no*
capability in CDS1 for the producer to specify where the data should physically
appear in the consumer, or for a consumer to specify where the data should
come from in the producer, nor do I believe there should be. Such semantics
are directly contrary to information-hiding principles.
So, to answer your specific question, the buffer (i.e. "region" in CDS1) *is*
the unit of transfer. If you want to transfer parts of a buffer, then you
make each part a separate buffer. Each CDS1 comm cell can hold one element of
an array, or an entire array, so the unit of transfer is up to the user.
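To make the trade-off concrete, here is a rough Python sketch. It is a toy model, not the CDS1 API: `Cell`, `put`, and the 8-byte word size are names and values assumed purely for illustration. It only counts how many bytes move when the region is the unit of transfer.

```python
WORD = 8  # assumed element size in bytes, for illustration only


class Cell:
    """Toy stand-in for a comm cell: holds at most one region."""

    def __init__(self):
        self.region = None


def put(cell, region):
    """Place a whole region in a cell and return the number of bytes moved.

    The region is the unit of transfer, so the full region always moves.
    """
    cell.region = bytes(region)
    return len(cell.region)


array = [i.to_bytes(WORD, "little") for i in range(1024)]

# Option A: one cell holds the entire array as a single region.
whole = Cell()
moved_whole = put(whole, b"".join(array))

# Option B: one cell per element, so each element is its own region.
cells = [Cell() for _ in array]
for cell, word in zip(cells, array):
    put(cell, word)

# Modify a single word; only that element's region has to be sent again.
array[5] = (999).to_bytes(WORD, "little")
moved_one = put(cells[5], array[5])

print(moved_whole, moved_one)
```

With one region per element, changing a single word moves 8 bytes instead of 8192, at the cost of managing 1024 cells instead of one.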
>2) Now let's say that the shared buffer is very small. Will the handshaking
>that is required between partners to obtain/free the buffer lock be a source of
>significant latency, and in general can it be masked by careful coding? I
>confess that I don't quite understand your sample implementation, but it seems
>logical that a consumer (for example) would need to receive first a message
>from the producer stating that the buffer was available, and then potentially a
>second message containing the new buffer contents. This might or might not be a
>significant source of delay, depending on how things are "reset" for the next
>transaction.
I don't know why you believe that more overhead would be needed than for
standard message passing. In fact, in CDS1, the buffer transfer is essentially
a ready send, because it is the user's responsibility to ensure that there is
room in the comm heap for any incoming region. (It sounds like the same could
be said of the NOAA system, since there is always a buffer there waiting for
the message.) In CDS1 (though perhaps not in the NOAA system), there is some
overhead required to allocate a region in the comm heap for the incoming
message.
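The ready-send analogy can be sketched roughly as follows. Again this is a toy model and not CDS1 itself: `CommHeap`, `deliver`, and `free` are hypothetical names. The point is that delivery is a single step with no availability handshake, that the only receiver-side cost is allocating the region, and that it is an error if the user has not left room.

```python
class CommHeap:
    """Toy stand-in for a receiver's comm heap (hypothetical, not the CDS1 API)."""

    def __init__(self, capacity):
        self.capacity = capacity  # total bytes available for incoming regions
        self.regions = {}         # region id -> payload bytes
        self.next_id = 0

    def used(self):
        """Bytes currently occupied by regions in the heap."""
        return sum(len(r) for r in self.regions.values())

    def deliver(self, payload):
        """Accept an incoming region in one step, with no prior handshake.

        As with a ready send, it is an error if the receiver has not
        left enough room for the region.
        """
        if self.used() + len(payload) > self.capacity:
            raise MemoryError("no room in comm heap for incoming region")
        rid = self.next_id
        self.next_id += 1
        self.regions[rid] = bytes(payload)  # the allocation overhead is paid here
        return rid

    def free(self, rid):
        """Release a region so its space can be reused."""
        del self.regions[rid]


heap = CommHeap(capacity=16)
first = heap.deliver(b"12345678")
second = heap.deliver(b"abcdefgh")

try:
    heap.deliver(b"x")  # heap is full: the user failed to ensure room
    overflowed = False
except MemoryError:
    overflowed = True

heap.free(first)            # user makes room again
third = heap.deliver(b"x")  # now the same delivery succeeds
print(overflowed, heap.used())
```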
>3) Limiting each buffer to a single sender and a single receiver seems very
>painful, although I can see how it simplifies implementations nicely. If we're
>going to try to simulate a true shared buffer it would be nice to be able to do
>it for an arbitrary set of readers and writers. (Just a small matter of design
>and implementation, right?)
CDS1 allows an arbitrary set of readers and writers.
>I can't make up my mind yet whether David's CDS1 does this - I'm still trying
>to wrap my brain around it - but at first glance it appears to be more of a
>"pull" design than a "push" design, which again makes me worry about round-trip
>latency.
CDS1 allows both "push" and "pull". If a process puts a region into a cell
in another process, and that process gets the region from the cell, it is a
"push". If a process puts a region into a cell in its own process, and another
process gets the region from that cell, it is a "pull". Even on pulls, CDS1
offers a very simple and portable mechanism, similar to an asynchronous
receive, that lets the pulling process perform other work after requesting a
region from a distant cell -- i.e. while the request and the resulting data
are in transit.
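The push/pull distinction and the overlap of work with a pull can be sketched in Python as below. This is only an illustration under assumed names (`Process`, `put`, `get`) and does not show the actual CDS1 calls; a thread and a `Future` stand in for the asynchronous request.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


class Process:
    """Toy stand-in for a process that owns named cells (hypothetical)."""

    def __init__(self, name, cell_names):
        self.name = name
        self.cells = {key: queue.Queue() for key in cell_names}


def put(owner, key, region):
    """Put a region into a cell owned by `owner`."""
    owner.cells[key].put(region)


def get(owner, key):
    """Get a region from a cell owned by `owner`, blocking until one is there."""
    return owner.cells[key].get()


producer = Process("producer", ["outbox"])
consumer = Process("consumer", ["inbox"])

# Push: the producer puts into a cell in the consumer's process,
# and the consumer gets it from its own cell.
put(consumer, "inbox", b"pushed")
pushed = get(consumer, "inbox")

# Pull: the producer puts into a cell in its own process,
# and the consumer gets it from that distant cell.
put(producer, "outbox", b"pulled")
with ThreadPoolExecutor(max_workers=1) as pool:
    pending = pool.submit(get, producer, "outbox")  # request issued, not yet awaited
    other_work = sum(range(1000))                   # overlaps with the request latency
    pulled = pending.result()                       # collect the region when needed

print(pushed, pulled, other_work)
```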
-Dave
--
===============================================================================
David C. DiNucci     | MRJ, Inc., Rsrch Scntst  |USMail: NASA Ames Rsrch Ctr
dinucci@nas.nasa.gov | NAS (Num. Aerospace Sim.)|   M/S T27A-2
(415)604-4430        | Parallel Tools Group     |   Moffett Field, CA 94035