Thank you *very* much for providing some specific implementation issues for the
new proposal. Some of them had definitely not occurred to us before, but some of
them had, and so I'll attempt to address them now...
I agree that MPPs are the big question mark here, as usual. For the moment,
let's talk about current systems. Do we have any specific machines today for
which this would be a definite problem? We've not yet heard from any MPP
vendors regarding the new proposal, which is unfortunate.
If the only issue here is the fact that we now do not know at spawn time
whether we intend to attach or detach, there is a simple fix for that. We
already have a 'flags' argument in the new proposal to distinguish between
MPI/non-MPI processes. It would be a simple matter to add an ATTACH/DETACH flag
as well, which would restore this missing information to implementations. I
personally don't believe that we really need one, but it's certainly something
to consider.
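To make the idea concrete, here is a rough sketch of how such a flag could sit
alongside the existing MPI/non-MPI bit. The names and values below are
hypothetical, invented purely for illustration; nothing like them appears in
the proposal text, and a real binding might look quite different.

```c
/* Hypothetical flag bits for the spawn call's 'flags' argument.
 * Names and values are illustrative assumptions, not part of any proposal. */
enum {
    SPAWN_FLAG_NON_MPI = 1 << 0, /* children are not MPI processes */
    SPAWN_FLAG_DETACH  = 1 << 1  /* caller intends the children to be detached */
};

/* Returns 1 if the caller has NOT asked for detached children, i.e. the
 * implementation may treat the spawn as an extension of the existing set. */
static int spawn_will_attach(int flags)
{
    return (flags & SPAWN_FLAG_DETACH) == 0;
}

/* Returns 1 if the children are to be MPI processes. */
static int spawn_is_mpi(int flags)
{
    return (flags & SPAWN_FLAG_NON_MPI) == 0;
}
```

With something like this, an implementation could check spawn_will_attach(flags)
at spawn time and choose its process-creation strategy up front, rather than
finding out only when the attach or detach happens.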
> 2) There are other issues that are likely to need to know if the spawn is
> creating processes that are an extension of an existing set or a new set.
> For example, the user may want to gang or co-schedule the new processes with
> the existing set.
But surely the application writer already knows at the time of the spawn call
whether he/she wants (for example) gang scheduling? In the old proposal, this
would probably be indicated via the 'info' argument, right? Same deal with the
new proposal.
> 3) An interesting model that is popping up on some shared memory machines is
> "threads with private memory". To the MPI user, this looks like an MPI
> process (since it has its own memory space); to the implementation, it offers
> major advantages in performance (both with shared memory and intelligent
> scheduling). If you are doing a spawn of the same executable for the purposes
> of expanding your communicator, you can create a new such "thread"; if you are
> spawning to create a new independent process, you want to fork/exec. It is
> too late to do this when the attach/detach occurs. Technically, it is
> possible to change the operating system to allow threads to be stripped off
> into another process, but I can't see why it should be done just to support
> MPI. Also note that in both cases the user is likely to use the NULL info
> argument, making it impossible to know what to do until it is too late.
As an implementor who uses *exactly* this model, let me just say that this will
not be a problem for us. The reasons are many, so maybe we should take this
particular discussion offline. Other vendors who use this approach should
comment on this as well...
--
Eric Salo        Silicon Graphics Inc.             "Do you know what the
(415)933-2998    2011 N. Shoreline Blvd, 7L-802     last Xon said, just
salo@sgi.com     Mountain View, CA 94043-1389       before he died?"