You wrote:
>
> 2. Put/Get operations should not be allowed on data items like single
> bytes. The scenario in which two processors A and B each store single
> (but not-identical) bytes into the same word at processor C would
> have serious consequences on performance. An implementation which
> allows this scenario would be slower also in the easy case where
> processors only access 8-byte aligned data items.

The current 1-sided draft fixes this. Marc has added text that
disallows concurrent "partially overlapping" RMA requests. That means
byte PUTs and word PUTs to the same memory word can be assumed not to
occur concurrently. The (fast) word-oriented PUTs therefore do NOT need
to lock, and supporting byte PUTs does NOT slow them down.
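
To make the restriction concrete, here is a minimal sketch in C of a
predicate for "partially overlapping". It assumes the phrase means two
target ranges that share at least one byte without being identical; the
draft's actual wording may differ. The struct and function names are
hypothetical and are not taken from the draft.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of the target byte range of one RMA request. */
struct rma_range {
    size_t offset;  /* byte offset within the target memory */
    size_t length;  /* number of bytes written (assumed nonzero) */
};

/* True if the two ranges share at least one byte. */
static bool ranges_intersect(struct rma_range a, struct rma_range b)
{
    return a.offset < b.offset + b.length &&
           b.offset < a.offset + a.length;
}

/* Assumed reading of "partially overlapping": the ranges intersect
   but are not the same range. */
static bool partially_overlapping(struct rma_range a, struct rma_range b)
{
    bool identical = a.offset == b.offset && a.length == b.length;
    return ranges_intersect(a, b) && !identical;
}
```

Under this reading, a byte PUT into a word and a word PUT to that same
word partially overlap, while two byte PUTs to different bytes of the
word do not overlap at all.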

Byte PUTs are potentially slowed down, because concurrent PUTs from
multiple origin processes to two different bytes in the same word ARE
allowed. This requires a lock inside the byte PUT operation on most
implementations. I consider that slowdown acceptable on platforms without
byte-store instructions, as long as word PUTs are still fast.
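
As an illustration of why the byte path needs a lock and the word path
does not, here is a minimal sketch in C of target-side handlers on a
platform that can only store whole 8-byte words. Everything in it is an
assumption for illustration: the function names, the single global
pthread mutex, and the choice to number bytes by shift position within
the word value rather than by memory address.

```c
#include <stdint.h>
#include <pthread.h>

/* Hypothetical lock taken only by the byte PUT path. */
static pthread_mutex_t byte_put_lock = PTHREAD_MUTEX_INITIALIZER;

/* Word PUT: one aligned store, no lock. This relies on the rule that
   no partially overlapping request (such as a byte PUT into this
   word) is in flight at the same time. */
static void word_put(uint64_t *target_word, uint64_t value)
{
    *target_word = value;
}

/* Byte PUT emulated with a word read-modify-write. Two origins may
   legally update different bytes of the same word concurrently, so
   the load, merge, and store must not interleave. */
static void byte_put(uint64_t *target_word, unsigned byte_index,
                     uint8_t value)
{
    unsigned shift = 8u * (byte_index & 7u);
    uint64_t mask = (uint64_t)0xFF << shift;

    pthread_mutex_lock(&byte_put_lock);
    uint64_t w = *target_word;                    /* load whole word */
    w = (w & ~mask) | ((uint64_t)value << shift); /* merge one byte  */
    *target_word = w;                             /* store it back   */
    pthread_mutex_unlock(&byte_put_lock);
}
```

Without the lock, two concurrent byte_put calls on the same word could
each load the old word and one update would be lost. word_put never
touches the lock, so its cost does not depend on byte support.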

When I started a discussion with Marc last week, I too was concerned
about slowing down word-oriented RMA just because byte support was in the
standard. I find that concern successfully addressed by the strategic
restriction disallowing partially overlapping RMA requests.

Regards,
Karl

Karl Feind                    E-Mail: kaf@cray.com
Cray Research, Inc.           Phone:  612/683-5673
655F Lone Oak Drive           Fax:    612/683-5276
Eagan, MN 55121