->
-> I suspect that there is probably no greater sin in the MPI community than
-> proposing Yet Another Send Mode, but here's one anyway:
->
-> In the 1-sided chapter, we have introduced the notion of "good" memory vs.
-> "bad" memory for gets and puts. The idea is that if your system supports shared
-> memory, then you can call MPI_RMA_MALLOC to request a shared buffer and as a
-> reward for doing this your gets and puts will be faster.
Nothing like the above paragraph was ever voted in. The spec has no "good"
vs. "bad" memory and no "reward", and it gives no guarantee that any
particular memory will be "faster" than any other memory.
Please refrain from projecting your own wishes onto the spec when several
votes have gone the other way.
->
-> The thought occurs that 2-sided communication could benefit from this as well.
-> In MPI-1, for example, many shared memory implementations still use a two-copy
-> algorithm to pass messages around; the sender copies the data into a shared
-> buffer, and the receiver copies it out again. While some systems are able to do
-> better than this in some cases, many are not.
->
-> So, given that we already are going to be adding MPI_RMA_MALLOC to MPI-2, why
-> not also add a new send mode to take advantage of it? Something like this:
I for one am willing to vote out MPI_RMA_MALLOC if it is going to cause these
kinds of problems!
->
-> MPI_Xsend
-> MPI_Ixsend
-> MPI_Xsend_init (We can debate the proper value of 'X' later...)
->
-> In this mode, the sender asserts that the buffer being passed is one which was
-> returned by a call to MPI_RMA_MALLOC. It otherwise has the same semantics as
-> standard mode.
I would think an implementation would be smart enough to tell when it can
take advantage of this kind of special hardware without burdening the
application writer.
-> With this mode, processes calling MPI_RECV on shared memory
-> machines would be able to copy the data directly out of the send buffer. There
-> could also be advantages on NOWs, because (for example) high performance
-> implementations might opt to pin down the send buffers in advance to prepare
-> for subsequent DMAs. Machines not capable of optimizing this case could simply
-> use MPI_SEND to implement the new calls, much as often happens today with
-> MPI_RSEND.
->
-> Adding additional smarts to the already-existing send modes is probably not a
-> practical solution because of the added latency that would be involved in
-> building internal lists of shared buffers and comparing against them with every
-> sent message.
The lists can be built during MPI_RMA_MALLOC. If comparing an address costs
that much latency, then there can be little advantage in avoiding the copies
you mention above.
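To make the cost concrete, here is a rough sketch in C of the kind of lookup
I mean. It is only an illustration under my own assumptions: the names
region_register and region_contains, the fixed-size table, and the idea that
the allocator calls region_register once per buffer are all hypothetical and
come from neither the draft nor any implementation. The table is kept sorted
at allocation time, so the check on the send path is a binary search over a
handful of entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical registry of buffers handed out by the special allocator.
 * None of these names appear in the MPI-2 draft. */
#define MAX_REGIONS 64

typedef struct {
    uintptr_t base;  /* address of the first byte of the region */
    size_t    len;   /* length of the region in bytes */
} region_t;

static region_t regions[MAX_REGIONS];
static size_t   nregions = 0;

/* Assumed to be called once per allocation. Inserts the region while
 * keeping the table sorted by base address.
 * Returns 0 on success, -1 if the table is full. */
static int region_register(const void *base, size_t len)
{
    uintptr_t b = (uintptr_t)base;
    size_t i;

    assert(len > 0);
    if (nregions == MAX_REGIONS)
        return -1;

    /* Shift larger entries up by one slot to make room. */
    i = nregions;
    while (i > 0 && regions[i - 1].base > b) {
        regions[i] = regions[i - 1];
        i--;
    }
    regions[i].base = b;
    regions[i].len  = len;
    nregions++;
    return 0;
}

/* Assumed to be called on the send path. Binary search, O(log n) compares.
 * Returns 1 if [buf, buf + len) lies wholly inside one registered region,
 * 0 otherwise. */
static int region_contains(const void *buf, size_t len)
{
    uintptr_t p = (uintptr_t)buf;
    size_t lo = 0, hi = nregions;

    /* Find the number of regions whose base is <= p. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (regions[mid].base <= p)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return 0;  /* p is below every registered region */

    /* The only candidate is the last region starting at or before p. */
    {
        const region_t *r = &regions[lo - 1];
        size_t off = (size_t)(p - r->base);
        return off <= r->len && len <= r->len - off;
    }
}
```

A standard-mode send could call region_contains(buf, nbytes) first and take
the no-copy path only when it returns 1. Whether those few compares matter
next to the copies being avoided is something to measure, not assume.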
->
-> Doing this would probably involve making some minor changes to MPI_RMA_MALLOC.
-> For example, it would probably have to become a collective function, and we
-> would perhaps want to tweak the syntax slightly.
So now creating a pt2pt message buffer is a collective operation?! Ugh!
->
-> Does anyone else think that this idea has merit?
Need I say more?
joel clark
->
-> --
-> Eric Salo Silicon Graphics Inc. "Do you know what the
-> (415)933-2998 2011 N. Shoreline Blvd, 8U-808 last Xon said, just
-> salo@sgi.com Mountain View, CA 94043-1389 before he died?"