Re: Public Defined

Greg Burns (gdburns@tbag.osc.edu)
Mon, 13 Jan 1997 21:35:02 -0500 (EST)

I would like to throw my support behind the rebel David DiNucci's
original remarks. I am not surprised by the silence of the public
comment period, though it is necessary that we have such a period
for the sake of credibility.

The reason for the lack of feedback is that not enough people care.
Parallel computing is a niche technology dominated by transient users
in the academic sector. The vendors all attend the Forum.
What is lacking are businesses that intend to build products on
or around MPI but, for one reason or another, do not attend the Forum.
If they existed in greater numbers, they would certainly care enough
to dive into the MPI-2 document, no matter how large or complex.

I am quite disappointed that every HPC cycle center does not have
a person carefully reviewing MPI-2 and returning comments. Users
at these centers are not being properly informed or prepared for key
developments in HPC software, of which the MPI standard is certainly one.

MPI has supplanted PVM for the traditional parallel applications.
Yet there still seem to be more PVM users in the world, and many
are not using it for scientific computing. Interestingly, PVM's
future may be brighter because of this.

Non-blocking operations, beginning in MPI-1 and culminating in
generalized requests, are turning MPI into an operating system
wanna-be. Synchronization and blocking processes are major parts
of what an OS does. We want to wait on an arbitrary subset
of communication events and preferably release the processor in
the meantime. In MPI-2, we also include I/O devices. Oh, and
because we want underlying progress, don't forget to watch all
other outstanding requests at the same time. These things are
best and most efficiently coordinated at the lowest level, the
operating system that is fielding all the interrupts from the
devices. But no OS lets the user tap into all of this in
a completely flexible way [Win32 WaitForMultipleObjects() is more
powerful than UNIX select(); let's require NT :-)]. So we decided
to force all of this into MPI, at the cost of considerable complexity
for both the user and the implementor.
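As a rough illustration of the OS-level facility referred to above, here
is a minimal Python sketch of waiting on an arbitrary subset of channels
with select(). The process sleeps in the kernel until one of them is
readable. The two socket pairs and the wait_any() helper are hypothetical
stand-ins for communication endpoints, not anything from MPI.

```python
import select
import socket


def wait_any(readers, timeout=None):
    """Block in the OS until at least one socket in `readers` is readable."""
    ready, _, _ = select.select(readers, [], [], timeout)
    return ready


# Two hypothetical "communication channels", modelled as local socket pairs.
a_recv, a_send = socket.socketpair()
b_recv, b_send = socket.socketpair()

b_send.sendall(b"hello")  # only channel b has data pending

# Wait on both channels at once; the kernel reports which one is ready.
ready = wait_any([a_recv, b_recv], timeout=1.0)
first_message = ready[0].recv(16)

for s in (a_recv, a_send, b_recv, b_send):
    s.close()
```

Note that select() only covers file descriptors, which is part of the
complaint: there is no single call that also covers every other kind of
event an MPI request might depend on.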

Fortunately, the total lack of correct implementations has resulted
in the acceptance of the weak interpretation of the progress rule.
Otherwise, you would need to have or to build an OS to implement MPI
- forget about lean, mean embedded implementations running with
the interrupts turned off. But this has caused all sorts of agony
in designing one-sided communication to be suitable for distributed-memory
machines.

Many of the implementations have single-threaded progress engines.
Every time MPI-2 adds another non-blocking operation, we have to extend
the progress engine with an additional state machine for that operation.
This only increases the opportunity for users to create multi-flavoured
request arrays that cause the implementation to degrade to polling
the operating system from a user process. Very few users realise
that this is going to suck in terms of chewing up the CPU and increasing
mean latency to respond to a synchronization on one of the requests.
Of course it all looks yummy at the MPI layer.
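To make the cost concrete, here is a toy model in Python, under heavy
assumptions: each request belongs to a hypothetical device class ("net"
or "io") and completes at a known simulated time. A wait on requests of
one class is modelled as a single blocking OS call. A wait on a mixed
array is modelled as a polling loop that queries every device class once
per interval. None of the names below come from a real implementation.

```python
from collections import namedtuple

# kind: hypothetical device class; done_at: simulated completion time
Request = namedtuple("Request", ["kind", "done_at"])


def wait_any(requests, poll_interval=3):
    """Return (index, os_calls, detection_time) for the first completion."""
    kinds = {r.kind for r in requests}
    if len(kinds) == 1:
        # One device class: a single blocking call, woken at completion.
        idx = min(range(len(requests)), key=lambda i: requests[i].done_at)
        return idx, 1, requests[idx].done_at
    # Mixed classes: poll every device class once per interval.
    now, os_calls = 0, 0
    while True:
        os_calls += len(kinds)
        for i, r in enumerate(requests):
            if r.done_at <= now:
                return i, os_calls, now
        now += poll_interval


same = wait_any([Request("net", 10), Request("net", 7)])
mixed = wait_any([Request("net", 10), Request("io", 7)])
```

In this model the homogeneous wait costs one OS call and sees the
completion at time 7, while the mixed wait costs eight calls and does
not notice until time 9. The numbers are artefacts of the model, but the
shape of the trade-off is the one described above.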

My wish, in hindsight, is that MPI_I*() only provided information
to the implementation for possible overlapping or idle-consuming progress.
There would be zero guarantee of progress until a Wait or Test was called.
This would at least allow implementations to concentrate on the request
array used in the Wait/Test call, and not the entire outstanding request list.
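A small Python sketch of the difference, again purely hypothetical: ten
outstanding requests each need five state-machine steps, and the user
waits on only two of them. The run() function counts how many steps the
progress engine performs when it must advance everything outstanding,
versus only the requests named in the wait call.

```python
class Request:
    """A hypothetical request that needs a fixed number of steps to finish."""

    def __init__(self, steps):
        self.steps_left = steps

    def advance(self):
        if self.steps_left > 0:
            self.steps_left -= 1


def run(progress_everything):
    """Return the number of advance() calls made while waiting on two requests."""
    outstanding = [Request(5) for _ in range(10)]
    waited = outstanding[:2]
    calls = 0
    while any(r.steps_left for r in waited):
        targets = outstanding if progress_everything else waited
        for r in targets:
            r.advance()
            calls += 1
    return calls


all_outstanding = run(progress_everything=True)
waited_only = run(progress_everything=False)
```

Under these assumptions the engine does 50 steps in the first case and
10 in the second.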

Even in MPI-1, we surpassed the goal of standardizing common practice.
We must keep in mind that MPI's number one selling point is that it
is a standard. Any particular chapter or function is a distant number two.
The _standard_ is what can catalyze the niche parallel computing industry,
which is the primary goal in my opinion. The very existence of
MPI-2 shows that we want more. We want to design an all-encompassing
parallel interface. It's message-passing. It's shared memory (subtle
distinctions aside). It's I/O. Did we forget anything? No worry, it
can be layered on generalized requests. Now we may get this all bang-on
right, be adopted
as POSIX.n+1 and turn parallel computing into the next WWW. Or we
might muck it up, scare off the users with the complexity, piss off
the implementors with work, save PVM, and get bumped off by some new
DCE-2 standard with more hype. I don't know which is right. My gut
tells me to keep it simple, let the market have a go at it, see where
they really get stuck and then step in.

-=-

A tidbit of MPI-2 feedback: I, like many, felt that the lack of dynamic
processes was a major drawback, marketing-wise, for MPI-1. As an (the only?)
implementor
of MPI-2 dynamic processes, I can report that we have had near-zero
feedback on these functions, which leads me to believe that they are
not being used very much. Maybe Marc's unconventional wisdom on this
chapter will prove correct...

--
Greg