1) I now claim that the new spawn proposal is at least no worse than the old
one in all respects, and is probably better even in the case of MULTIPLE.
Consider the case (in the current proposal) where the user wishes to spawn 6
a.outs, 6 b.outs and 6 c.outs with a single spawn_multiple() call, but the
resource manager will only allow the creation of 12 processes. What should the
distribution of processes be? There are several reasonable answers, and no
single obviously correct one.
I think the new proposal simply exposes the fact that we haven't really come
up with a good solution for soft errors such as this. Saying that the behavior
is implementation-dependent is highly unsatisfying. In the new proposal, at
least, the user has explicit control over the recovery (see the sketch below).
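To make "explicit control" concrete, here is a rough sketch using a
spawn_multiple() that hands back per-process error codes, in the style the
proposal has been leaning toward. The exact names (MPI_Comm_spawn_multiple,
MPI_ARGVS_NULL, the errhandler business) are my guesses at how it would
look, not anything we've agreed on:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        char     *cmds[3]     = { "a.out", "b.out", "c.out" };
        int       maxprocs[3] = { 6, 6, 6 };
        MPI_Info  infos[3]    = { MPI_INFO_NULL, MPI_INFO_NULL,
                                  MPI_INFO_NULL };
        int       errcodes[18];
        MPI_Comm  children;
        int       i;

        MPI_Init(&argc, &argv);

        /* Ask for error codes back rather than an abort on failure. */
        MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

        MPI_Comm_spawn_multiple(3, cmds, MPI_ARGVS_NULL, maxprocs, infos,
                                0, MPI_COMM_WORLD, &children, errcodes);

        /* If the resource manager grants only 12 of the 18 slots, the
         * user sees exactly which slots failed and can decide what to
         * do about it -- retry, run degraded, or give up. */
        for (i = 0; i < 18; i++)
            if (errcodes[i] != MPI_SUCCESS)
                fprintf(stderr, "slot %d was not created\n", i);

        MPI_Finalize();
        return 0;
    }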
2) Observation: Combining the spawn and attach does not free implementors from
having to solve the problem of joining two distinct communicators whose
relationship is unknown at spawn time, because the client/server routines will
require exactly this behavior anyway.
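For reference, here is the kind of thing the client/server routines must do
anyway, whatever we decide about spawn. The routine names follow the current
proposal and are sketched from memory, so treat them as illustrative:

    #include <mpi.h>

    /* Server side: open a port and wait for some unknown client.  The
     * resulting intercommunicator joins two groups whose relationship
     * was unknown when either side started -- exactly the problem
     * spawn implementors must solve. */
    void be_server(MPI_Comm *joined)
    {
        char port[MPI_MAX_PORT_NAME];

        MPI_Open_port(MPI_INFO_NULL, port);
        /* ... hand "port" to the client out of band ... */
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, joined);
        MPI_Close_port(port);
    }

    /* Client side: connect using the out-of-band port name. */
    void be_client(const char *port, MPI_Comm *joined)
    {
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, joined);
    }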
3) As Marc points out, an interesting possibility is to defer the actual spawns
until the attach/detach call is made. I used to think that we needed to decide
on this possibility one way or the other, but it now seems to me that it would
be very much in the spirit of MPI to leave this up to the implementations! So
it *may* be that the spawn call actually does create the new processes, but
implementations are also allowed to defer the spawns until the attach/detach. I
have no idea whether this is doable, but it's a very amusing idea and has a
twisted sort of appeal.
4) I also like Marc's suggestion that MPI_COMM_PARENT always refer to a valid
intercommunicator. Some users will probably complain that testing whether
MPI_Comm_remote_size(MPI_COMM_PARENT) returns zero is more work than comparing
MPI_COMM_PARENT to MPI_COMM_NULL, but I really like the idea of making
MPI_COMM_PARENT a predefined constant; it's a cleaner design.
The only possible gotcha is that implementations must now be able to support
intercommunicators whose remote group is empty. I don't believe such a
communicator can exist in MPI-1, but I also don't think that allowing it would
be a Big Deal for our implementation. Would anyone have trouble implementing
it?
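Here's the idiom I have in mind, assuming the proposed MPI_COMM_PARENT
constant (which, to be clear, is not in the standard today -- this is a
sketch of the proposal, not code you can compile against current headers):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int nremote;

        MPI_Init(&argc, &argv);

        /* Under Marc's suggestion MPI_COMM_PARENT is always a valid
         * intercommunicator; an empty remote group means "not spawned". */
        MPI_Comm_remote_size(MPI_COMM_PARENT, &nremote);
        if (nremote == 0)
            printf("started directly, not spawned\n");

        /* The alternative being argued against would look like:
         *     if (MPI_COMM_PARENT == MPI_COMM_NULL) ...
         */

        MPI_Finalize();
        return 0;
    }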
--
Eric Salo              Silicon Graphics Inc.            "Do you know what the
(415)933-2998          2011 N. Shoreline Blvd, 7L-802    last Xon said, just
email@example.com      Mountain View, CA 94043-1389      before he died?"