I apologize for getting so hot under the collar. I am very frustrated with the
process and direction of the Forum. I feel the way Eric did a couple of months
ago when he proposed tossing out the whole 1-sided chapter. We keep
re-arguing many discussions and adding new functionality for (to me)
borderline cases and specialized, limited hardware. One comment I felt like
making in my last mail was that we might as well call it SGI-MPI-2.
This obviously is a result of my limited experience with shared memory
architectures.
While many of these new functions may be useful on certain shared memory
machines and (simply) no-ops on distributed memory machines, the complexity
and wording of the spec. is beginning to greatly obscure the usability of
MPI for our system.
Eric's email further emphasized this bias by suggesting systems have
good and bad, faster and slower memory. It's just not true for our system,
and many of the functions being added and required by the spec are (to me)
useless and irrelevant. I'm afraid many users will find separating the
useful functions from the irrelevant ones overwhelming, especially as the
spec more and more implies requirements and biases that don't exist for
our system.
I also seriously object to any effort to add complexity to the user API
in order to make implementations easier.
I am also biased to resist anything that looks like a change to MPI-1.
As to providing optimized data paths being a problem,
there are many trade-offs that have to be made in writing a portable interface.
If the Forum insists on providing all the functionality required to
implement the absolute optimal communication for each type of hardware for now
and into the future, MPI will be unusable. 20%-50% of the latency of
MPI in our system comes from implementing MPI semantics above our primitives. This
is something we have to live with (though we keep grinding away at it) for
the sake of a widely accepted portable interface.
Again, I don't know much about shared memory architectures. I assumed an
MPI implementation would be able to tell which kind of memory (shared or
not shared) was being used without comparing each address to a list of
addresses. Aren't the physical addresses of shared memory somehow different
from non-shared physical addresses?
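To make the cost being argued about concrete, here is a minimal sketch in C of
the kind of address check under discussion. It is an illustration only, not
anything from the MPI draft or from any vendor's implementation:
region_register, region_contains, and MAX_REGIONS are hypothetical names, and
the sketch assumes the special regions never overlap. The table is kept sorted
when a region is allocated so that the per-message check is a binary search
rather than a walk over every entry.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 64  /* hypothetical fixed table size */

/* Hypothetical bookkeeping for "special" memory; not part of MPI. */
typedef struct {
    uintptr_t base;  /* first byte of the region */
    size_t    len;   /* length of the region in bytes */
} region_t;

static region_t regions[MAX_REGIONS];
static size_t   nregions = 0;

/* Record a region at allocation time, keeping the table sorted by base.
 * Assumes regions do not overlap. Returns 0 on success, -1 if full. */
int region_register(const void *base, size_t len)
{
    uintptr_t b = (uintptr_t)base;
    size_t i = nregions;

    if (nregions == MAX_REGIONS)
        return -1;
    /* Insertion sort step: shift larger bases up by one slot. */
    while (i > 0 && regions[i - 1].base > b) {
        regions[i] = regions[i - 1];
        i--;
    }
    regions[i].base = b;
    regions[i].len = len;
    nregions++;
    return 0;
}

/* Per-message check: return 1 if [buf, buf + len) lies wholly inside one
 * registered region, else 0. */
int region_contains(const void *buf, size_t len)
{
    uintptr_t p = (uintptr_t)buf;
    size_t lo = 0, hi = nregions;

    /* Binary search for the first region whose base is greater than p. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (regions[mid].base <= p)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return 0;  /* p is below every registered region */

    /* The candidate is the last region whose base is <= p. */
    const region_t *r = &regions[lo - 1];
    size_t off = (size_t)(p - r->base);
    return off <= r->len && len <= r->len - off;
}
```

With 20 registered buffers this is about five compares per message instead of
20, but it is still extra memory references on every send, which is the cost
Eric objects to.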
Again, I apologize for responding off-the-cuff to Eric's email with my frustration.
joel
- --- Your Message
- -> From: "Eric Salo" <salo@mrjones.engr.sgi.com>
- -> Message-Id: <9606131534.ZM1406@mrjones.engr.sgi.com>
- -> Date: Thu, 13 Jun 1996 15:34:11 -0700
- -> To: mpi-core@mcs.anl.gov
- -> Subject: Re: YASM
- ->
- -> > We voted nothing like the above paragraph. There is no "good" vs "bad"
- -> > memory in the spec. There is no "reward" in the spec. There is no
- -> > guarantee that any particular memory will be "faster" than any other memory
- -> > in the spec.
- ->
- -> I would suggest that you reread the first paragraph in section 4.2:
- ->
- -> "In some systems, RMA operations will run faster when accessing
- -> specially allocated memory (e.g., memory that is shared by the other processes
- -> in the communicating group, on an SMP). MPI provides a mechanism for
- -> allocating and freeing such special memory. The use of such memory for RMA is
- -> not mandatory."
- ->
- -> Seems pretty straightforward to me. Which part don't you understand?
- ->
- -> > Please refrain from projecting your desire despite several votes to the
- -> > contrary.
- ->
- -> Joel, I am doing what I can to *embrace* the current proposal. My own
- -> preference would still be to mandate RMA_MALLOC for every put/get message. That
- -> has been voted down and so I have moved on; the subcommittee has made it quite
- -> clear that restricting get/put windows is not acceptable, we shall instead
- -> allow codes to specify "special" memory for added performance. I am simply
- -> extending this idea forward to the next logical step.
- ->
- -> > I for one am willing to vote out MPI_RMA_MALLOC if it is going to cause these
- -> > kinds of problems!
- ->
- -> I was unaware that providing optimized data paths was a "problem". Are you
- -> saying that MPI should force all implementations to run uniformly slowly?
- ->
- -> > I would think an implementation would be smart enough to tell when it can
- -> > take advantage of this kind of special hardware without burdening the
- -> > application writer.
- ->
- -> Well, then perhaps you should think again. In principle, an MPI implementation
- -> could certainly maintain a list of all RMA_MALLOC buffers, yes. But now you
- -> have to check that list every time you send a message. So, what if the
- -> application allocates 20 different buffers? Now MPI has to check all 20 entries
- -> every time a message is sent to see if it can optimize it, which rather defeats
- -> the whole purpose.
- ->
- -> > The lists can be built during MPI_RMA_MALLOC. If the compare of an address
- -> > costs so much latency there must be no advantage to avoiding the copies you
- -> > mention above.
- ->
- -> This is just plain bullshit. The problem is not in building the list, it is in
- -> checking it. Extra memory references are a *big* problem, which is my whole
- -> point! I don't want to have to check some list every time I call MPI_SEND, and
- -> I don't want to have to copy my send buffer twice if I can just do it once.
- ->
- -> > So now creating a pt2pt message buffer is a collective operation?! Ugh!
- ->
- -> No, it's only collective if you choose to take advantage of this new option.
- -> Nothing else in MPI-1 changes.
- ->
- -> --
- -> Eric Salo Silicon Graphics Inc. "Do you know what the
- -> (415)933-2998 2011 N. Shoreline Blvd, 8U-808 last Xon said, just
- -> salo@sgi.com Mountain View, CA 94043-1389 before he died?"
- --- End Of Your Message
------- End of Forwarded Message