Re: dynamic counter-proposal

Rusty Lusk (lusk@mcs.anl.gov)
Fri, 10 May 1996 15:54:21 -0500

|
| I don't understand Rusty's view that the new proposal is a major
| departure from the old one.
| OK, it changes the one thing that was probably the hardest to design
| and make acceptable to most people.
|

I consider it a major departure because it changes MPI_Spawn, the most
fundamental function in the chapter, in a fundamental way.

| From an implementor's standpoint, I think of the new proposal as identical
| to the old one, except after getting the application started, instead
| of putting in a big comment introducing the communicator synchronization
| phase, I put a return statement and put the code in another function
| (MPI_Child_attach).

Certainly not how I would do it. I would put the communication information in
the new process's environment as it was started, so that it could begin doing
what it was supposed to do, not wait for further communication.
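
As an illustration of the approach described above, here is a minimal sketch
in Python rather than MPI. It assumes the communication information can be
serialized into a single environment variable. The variable name
SKETCH_COMM_INFO, the function spawn_with_comm_info, and the rank/size/parent
fields are all hypothetical and come from no MPI draft or implementation. The
child reads its information at startup and proceeds at once, with no separate
attach step.

```python
import json
import os
import subprocess
import sys

# Hypothetical variable name, chosen only for this sketch.
ENV_KEY = "SKETCH_COMM_INFO"

# Child program: it finds its communication info already present at startup.
CHILD_SOURCE = """
import json, os
info = json.loads(os.environ["SKETCH_COMM_INFO"])
print("rank %d of %d, parent at %s" % (info["rank"], info["size"], info["parent"]))
"""


def spawn_with_comm_info(rank, size, parent):
    """Start one child with its communication info in its environment."""
    env = dict(os.environ)
    env[ENV_KEY] = json.dumps({"rank": rank, "size": size, "parent": parent})
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(spawn_with_comm_info(0, 2, "host0:5000"))
```

A real implementation would carry whatever the underlying transport needs
instead of these toy fields; the sketch only shows that the child needs no
further message from the parent before it can start work.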

| We have implemented and distributed a spawn capability as part of
| LAM 6.0 that has exactly the same communicator usage as is proposed by
| the current draft (and unchanged in the new proposal). I foresee no
| problems breaking it into separate creation and communicator hookup
| functions. In fact, I think it is trivial.
|

But you get to write your own daemons. We want to send instructions to some
process manager to start processes and communication. For example, on the SP
we interface to IBM's system that reserves part of the hardware switch as well
as starting processes connected to the switch and able to use IBM's mpl
library. I don't know how to do this in two steps, and I can't rewrite IBM's
software to take a different interface.

| Please explain exactly how the process manager complicates this division.
| I would not expect third party process managers to be the synchronization
| point for comm_world construction, since they know nothing about MPI.
| Thus I don't anticipate a problem with them doing the actual creation
| and the MPI parent processes doing the communicator construction.

I certainly would. For one example, on the SP mpirun invokes a scheduler
which eventually starts processes with the switch reserved for them. We know
how to build a communicator out of information available from mpl. I don't
know how to ask it to start processes without building this information, nor
how to get it without asking it to start processes at the same time.
Expanding one's communicator may have two components, but they may be quite
intertwined. Ergo, one call is best.
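
The intertwining described above can be modeled with a toy sketch, again in
Python and entirely hypothetical: ToyProcessManager, Started, switch_window,
and spawn are invented names and do not reflect IBM's software or any MPI
draft. The assumption is a process manager with one entry point that both
starts processes and hands back the communication resource reserved for each.
Under that assumption a single spawn call maps onto it directly, while a
separate "create only" step has nothing to call.

```python
from dataclasses import dataclass


@dataclass
class Started:
    pid: int
    switch_window: int  # stand-in for a reserved communication resource


class ToyProcessManager:
    """Hypothetical manager: starting a process also reserves its resource."""

    def __init__(self):
        self._next_pid = 100
        self._next_window = 0

    def start(self, count):
        # The only entry point: creation and reservation happen together.
        started = []
        for _ in range(count):
            started.append(Started(self._next_pid, self._next_window))
            self._next_pid += 1
            self._next_window += 1
        return started


def spawn(manager, count):
    """Single-call spawn: one request yields the rank-to-resource table."""
    started = manager.start(count)
    return {rank: s.switch_window for rank, s in enumerate(started)}


if __name__ == "__main__":
    print(spawn(ToyProcessManager(), 3))
```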

| So you are asserting that environment portability will not work without
| combining the two steps in a single MPI function? I take this
| seriously since you are travelling down this road, but please explain
| the details.

See above.

Rusty