Re: dynamic counter-proposal

Greg Burns (gdburns@tbag.osc.edu)
Sat, 11 May 1996 11:00:05 -0400 (EDT)

>| From an implementor's standpoint, I think of the new proposal as identical
>| to the old one, except after getting the application started, instead
>| of putting in a big comment introducing the communicator synchronization
>| phase, I put a return statement and put the code in another function
>| (MPI_Child_attach).
>
>Certainly not how I would do it. I would put the communication information in
>the new process's environment as it was started, so that it could begin doing
>what it was supposed to do, not wait for further communication.

My statement does not preclude this detail, nor do I see how this
detail supports your argument.
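
For concreteness, the environment detail in the quoted text could look
roughly like the sketch below. This is only an illustration using ordinary
local process creation; the variable name SPAWN_PARENT_ADDR and the helper
start_child are hypothetical and are not part of LAM, the draft, or any MPI
implementation.

```python
import os
import subprocess
import sys


def start_child(parent_addr):
    """Start a child with the parent's contact info in its environment.

    Returns what the child reports it found, so the caller can check
    that the information arrived without any further communication.
    """
    env = dict(os.environ)
    # Hypothetical variable name, chosen only for this sketch.
    env["SPAWN_PARENT_ADDR"] = parent_addr

    # The child reads the address from its environment at startup.
    child_code = "import os; print(os.environ['SPAWN_PARENT_ADDR'])"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Nothing here depends on whether the communicator hookup happens in the same
call as process creation or in a later one.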

| We have implemented and distributed a spawn capability as part of
| LAM 6.0 that has exactly the same communicator usage as is proposed by
| the current draft (and unchanged in the new proposal). I foresee no
| problems breaking it into separate creation and communicator hookup
| functions. In fact, I think it is trivial.
|

> But you get to write your own daemons.

Our daemon helps us do a remote fork()/exec() and provides a basic
remote communication layer. It is not a party to the building of
communicators for our spawn.

>We want to send instructions to some
>process manager to start processes and communication. For example, on the SP
>we interface to IBM's system that reserves part of the hardware switch as well
>as starting processes connected to the switch and able to use IBM's mpl
>library. I don't know how to do this in two steps, and I can't rewrite IBM's
>software to take a different interface.

>I certainly would. For one example, on the SP mpirun invokes a scheduler
>which eventually starts processes with the switch reserved for them. We know
>how to build a communicator out of information available from mpl. I don't
>know how to ask it to start processes without building this information, nor
>how to get it without asking it to start processes at the same time.
>Expanding one's communicator may have two components, but they may be quite
>intertwined. Ergo, one call is best.

OK, here is the meat. Note that there are two communicator building
requirements, the children's world and the inter-comm. You _do_ know, by
way of a flag, whether or not to build the child world communicator
in the MPI_Spawn function. What you do not know is whether or not
you have to build the inter-communicator later in an MPI_Child_attach
function. You are telling us, then, that there is no way to break
out the intercomm part with IBM's resource manager? Can the vendor confirm
this roadblock? I assume IBM is planning to implement dynamic processes
and planning to use their resource manager. Will someone there
speak up if we are about to screw your implementation?
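
A toy model of the split described above, as a sketch only. It uses plain
Python lists in place of communicators and makes no MPI calls. The names
spawn and child_attach mirror MPI_Spawn and MPI_Child_attach from this
thread; SpawnHandle, the first_id parameter, and the process ids are
hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SpawnHandle:
    """What step one leaves behind for a possible step two."""
    child_ids: List[int]
    # Ranks of the children's world, or None if the flag said not to build it.
    child_world: Optional[List[int]] = None


def spawn(count: int, build_child_world: bool, first_id: int = 100) -> SpawnHandle:
    """Step one: start the children; the flag decides the child world."""
    child_ids = list(range(first_id, first_id + count))
    world = list(range(count)) if build_child_world else None
    return SpawnHandle(child_ids, world)


def child_attach(parent_ids: List[int], handle: SpawnHandle) -> Tuple[List[int], List[int]]:
    """Step two: pair the parent and child groups (inter-communicator stand-in)."""
    return (list(parent_ids), list(handle.child_ids))
```

In this model the first call knows from its flag whether to build the child
world, but cannot tell whether child_attach will ever be called, which is the
open question for a resource manager that builds everything at start time.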

--
Greg