RE: [mpi-21] Proposal EH2: add const keyword to the C bindings
Doug:
Re:
> > OK. How hard is it for you to fix? Is it really worth a
> > potential performance penalty?
>
> It will affect the performance of my code to fix it. The pattern I have
> here is that I'm sending the same buffer to a subset of the processes
> within a communicator. If this restriction is lifted, I can do that with
> a bunch of MPI_Isends to the various processes followed by a
> MPI_Waitall. To fix this code, I need to either clone the buffer N times
> (ouch!) or do N MPI_Sends in succession.
>
> > While the argument to "fix" it is that there is no penalty,
> > the argument that there are no errors is at least as valid.
>
> Well, perhaps now we have a new argument to fix it: by lifting this
> restriction, I can write some obvious code that performs better than the
> workaround I would have to do to avoid tripping over this restriction.
I will grant that your pattern is interesting but it is not
clear to me that you are concerned with the correct problem.
What you want to do is a multicast. I can understand why you
might not want to create a communicator on which to call an
MPI_Bcast for this pattern (not only is the bcast blocking,
but you may want to do many of these on slightly different
communicators, so creating communicators would be inefficient).
However, alternatives are possible - extending send operations
with "send multiple" semantics would be reasonable. Basically,
instead of providing a destination rank you would provide an
array of destination ranks. This solution allows the MPI
implementation to muck with the buffer if it wants to do so
without causing semantic violations.
You are likely to get performance benefits from in-standard
support. Memory performance of large messages can be improved
through cache blocking of the sends. Also, if the number
of sends is sufficiently large, the implementation could use a
tree-based algorithm (i.e., it could broadcast the message).
While you could do this yourself, putting the call in the
standard allows it to be implemented far fewer times than
it otherwise might have to be...
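To make the tree-based idea concrete, here is a rough sketch in plain C of how
an implementation might turn an array of destination ranks into a
binomial-tree forwarding schedule. Nothing here is an MPI call or a proposed
binding: mcast_step and mcast_tree_schedule are hypothetical names invented
for illustration, and a real implementation would pick its algorithm based on
message size and the number of destinations.

```c
#include <assert.h>
#include <stddef.h>

/* One point-to-point transfer in the multicast schedule (hypothetical type). */
typedef struct {
    int round;  /* transfers in the same round can proceed concurrently */
    int from;   /* rank that already holds the message */
    int to;     /* rank that receives it in this round */
} mcast_step;

/*
 * Hypothetical helper, not part of MPI: fill `steps` with a binomial-tree
 * schedule that delivers a message from `root` to dests[0..ndest-1].
 * `steps` must have room for ndest entries (one transfer per destination).
 * Returns the number of rounds used.
 */
int mcast_tree_schedule(int root, const int *dests, size_t ndest,
                        mcast_step *steps)
{
    size_t n = ndest + 1;   /* participants: root at index 0, then dests */
    size_t nsteps = 0;
    int round = 0;

    assert(ndest == 0 || (dests != NULL && steps != NULL));

    /* At the start of each round, participants with index below `have`
       hold the data; each forwards it to the participant `have` slots on. */
    for (size_t have = 1; have < n; have *= 2, round++) {
        for (size_t i = 0; i < have && i + have < n; i++) {
            steps[nsteps].round = round;
            steps[nsteps].from = (i == 0) ? root : dests[i - 1];
            steps[nsteps].to = dests[i + have - 1];
            nsteps++;
        }
    }

    assert(nsteps == ndest);  /* every destination receives exactly once */
    return round;
}
```

With seven destinations the schedule takes three rounds instead of seven
back-to-back sends from the root, which is the kind of saving a library can
provide once it sees the whole destination list in a single call.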
The real question is "Are there instances where the multiple
accesses are by the user code and not by the MPI library?"
I think these are much less likely.
Bronis