Re: we should wait for 1sided implementations

James Cownie (jcownie@bbn.com)
Thu, 30 May 1996 13:15:32 +0100

> Well, unless your DMA engine is only 8 bits wide, you are going to
> have to assume some sort of data alignment. If you have 32-bit wide
> DMA, then you'll need 32-bit alignment, and so on. The specific
> requirements will vary from system to system, but fundamentally the
> problem of alignment is always gonna be there in these
> high-performance configurations. So it ain't just my machine that
> cares about this, it's potentially a whole bunch of 'em.

This assumes a simplistic hardware interface.

There's absolutely no reason why the hardware should not be capable of
handling DMAs to arbitrarily byte-aligned buffers. (It just needs a
few shift registers and a bit of intelligence in the design to handle
the partial write protocols on the system bus.) If MPI can encourage
the design of such hardware, so much the better. (As a proof by example,
the Meiko CS-2 has had remote DMA to/from arbitrarily byte-aligned
buffers for the last 3 or 4 years, so it can't be that hard.)
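
As a rough illustration of what "handling the partial writes" amounts
to, here is a minimal C sketch that splits a byte-aligned transfer into
a leading partial access, a run of full-width aligned accesses, and a
trailing partial access. The struct and function names are hypothetical,
the width is assumed to be a power of two, and this is a software model
of the bookkeeping only, not a description of how the CS-2 or any other
DMA engine is actually built.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical breakdown of one transfer on a bus that moves `width`
 * bytes per aligned access (width is assumed to be a power of two). */
struct xfer_split {
    size_t head;  /* leading bytes, moved as a partial write */
    size_t body;  /* bytes moved as full-width aligned accesses */
    size_t tail;  /* trailing bytes, moved as a partial write */
};

static struct xfer_split split_transfer(uintptr_t addr, size_t len,
                                        size_t width)
{
    struct xfer_split s;
    size_t misalign = (size_t)(addr & (width - 1));

    /* Bytes needed to reach the next aligned boundary, if any. */
    s.head = misalign ? width - misalign : 0;
    if (s.head > len)
        s.head = len;  /* short transfer that never reaches a boundary */

    /* Largest multiple of width that fits in what remains. */
    s.body = ((len - s.head) / width) * width;
    s.tail = len - s.head - s.body;
    return s;
}
```

For a fully aligned buffer, head and tail are both zero and only the
full-width accesses remain, so the aligned case pays nothing extra in
this model.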

Hardware designers are the same as the rest of us: if they can't see
the requirement, they'll leave a feature out to make their lives
easier.

> Now we move from technical to religious, I suppose. We already have
> a fully general send/recv interface and are working on an equally
> general handler interface.

I thought we voted out the fully general handlers at the last meeting,
except in environments with full thread support, so it seems hard to
rely on them.

> So the functionality is already there for users who want it. If we
> make the get/put interface equally general, then it will greatly
> complicate implementations which could otherwise be quite
> simple. All that we're doing now is making it easy for users to get
> poor performance.

The same argument could have been applied to the non-contiguous
datatypes. However, as we expected, implementations have been able to
exploit the additional information that they provide and achieve much
higher performance than would have been possible had they been omitted.

You are arguing for similar restrictions (contiguous, aligned data),
but

1) the restrictions are hard for users to comply with. (How can I
specify the alignment of a variable in standard Fortran?)

2) many of the extended features only cost if you use them, so
it is indeed more work for you as an implementer, but it isn't a
performance issue for your users. (You are entirely free to publish
guidelines explaining how to achieve the fast path in your
implementation, though, of course, you may not want to do this since
by implication it also points out the slow path...)

3) remote store access isn't there only for performance, it's also
there because it's a useful programming model which is
fundamentally different from message passing in its semantics.
It isn't in general trivial to change a remote store access code
into a message passing code.
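
Point 2 above can be sketched in C: a general put that checks once
whether the transfer happens to be contiguous and word aligned, and
only falls back to a slower path when it is not. Everything here is
hypothetical: `sketch_put`, the `window` buffer standing in for remote
memory, and the 4-byte word size are assumptions made for illustration,
not part of any MPI proposal or implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum put_path { PUT_FAST, PUT_SLOW };

#define SKETCH_WORD 4u  /* assumed DMA width in bytes */

/* Hypothetical put into `window`, which stands in for remote memory.
 * Returns which path was taken so the cost model is visible. */
static enum put_path sketch_put(unsigned char *window, size_t offset,
                                const unsigned char *src, size_t len)
{
    uintptr_t dst = (uintptr_t)(window + offset);
    uintptr_t from = (uintptr_t)src;

    /* One test covers source address, target address, and length. */
    if (((dst | from | len) & (SKETCH_WORD - 1)) == 0) {
        /* Fast path: stands in for a simple aligned DMA. */
        memcpy(window + offset, src, len);
        return PUT_FAST;
    }

    /* Slow path: byte-at-a-time copy, stands in for the general case. */
    for (size_t i = 0; i < len; i++)
        window[offset + i] = src[i];
    return PUT_SLOW;
}
```

The aligned caller pays for one mask-and-compare and nothing else; only
callers who actually pass unaligned data see the slow path.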

-- Jim

James Cownie
BBN UK Ltd
Phone : +44 117 9071438
E-Mail: jcownie@bbn.com