You give a set of arguments for why even the "hard case" in-place
collective operations may be worth supporting. For completeness, you
might also give at least one argument against:
If the MPI implementation cannot re-order the sends and receives to
ensure that data to be sent is sent before the same buffer locations
need to be used for received data (and this may not always be
possible, or may have unpleasant sequentialising effects on the whole
collective algorithm), then the MPI library will have to allocate
buffer space. Unless these operations are persistent, this buffer
allocation/release will have to occur on every call. In effect, this
moves into the loop a storage allocation that could have been cheap
(since the user may well know the required buffer size at compile
time, while the library cannot) and that would otherwise have occurred
outside the loop. The use of in-place functions may therefore turn out
to be a pessimisation compared with the existing code, which uses
separate in and out buffers.
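
To make the cost concrete, here is a minimal sketch in plain C. It is
not MPI code: rotate_separate, rotate_in_place, loop_separate,
loop_in_place and library_allocs are hypothetical names invented for
this illustration, and a simple one-place rotation stands in for the
data movement of a collective. It assumes the in-place variant cannot
re-order its copies and so has to stage the data in scratch space on
every call, while the caller of the separate-buffer variant swaps two
buffers it already owns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Counts scratch-buffer allocations made on the "library" side. */
static size_t library_allocs = 0;

/* Stand-in for a collective with separate buffers: element i of `in`
 * ends up at position (i + 1) % n of `out`. No scratch space needed. */
static void rotate_separate(const int *in, int *out, size_t n)
{
    assert(in != out); /* the two buffers must not alias */
    for (size_t i = 0; i < n; i++)
        out[(i + 1) % n] = in[i];
}

/* Hypothetical in-place variant that cannot re-order its copies, so it
 * stages the data in scratch space allocated and freed on every call.
 * Returns 0 on success, -1 if the allocation fails. */
static int rotate_in_place(int *buf, size_t n)
{
    int *scratch = malloc(n * sizeof *scratch);
    if (scratch == NULL)
        return -1;
    library_allocs++;
    memcpy(scratch, buf, n * sizeof *buf);
    rotate_separate(scratch, buf, n);
    free(scratch);
    return 0;
}

/* Caller's loop with the in-place variant: one allocation per iteration. */
static int loop_in_place(int *buf, size_t n, int iters)
{
    for (int it = 0; it < iters; it++)
        if (rotate_in_place(buf, n) != 0)
            return -1;
    return 0;
}

/* Caller's loop with separate buffers: both buffers exist before the
 * loop starts and are swapped after each step, so nothing is allocated
 * inside the loop. Returns the buffer holding the final result. */
static int *loop_separate(int *a, int *b, size_t n, int iters)
{
    int *src = a, *dst = b;
    for (int it = 0; it < iters; it++) {
        rotate_separate(src, dst, n);
        int *tmp = src;
        src = dst;
        dst = tmp;
    }
    return src;
}
```

In this sketch loop_in_place performs one malloc/free pair per
iteration, whereas loop_separate performs none. A real library might
cache its scratch space between calls, which would reduce the cost,
but it would still have to discover the size at run time.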
-- Jim
James Cownie
BBN UK Ltd
Phone : +44 117 9071438
E-Mail: jcownie@bbn.com