Re: using MPI_Init() without mpirun

Al Geist (geist@msr.epm.ornl.gov)
Tue, 30 Apr 1996 11:23:46 -0400 (EDT)

> From: Gary Oberbrunner <garyo@avs.com>
>
> If I could start the visual programming environment as usual and have it
> "become" an MPI process dynamically

I think becoming an MPI process dynamically is a very useful feature,
and I think we should consider the request to put it in MPI-2,
using MPI_INIT() as the mechanism.

I don't see any fundamental problems with defining the behavior of
MPI_INIT() when the calling process was neither spawned nor started
by mpirun. In the absence of set environment variables or "magic args"
passed by mpirun, MPI_INIT() would form an MPI_COMM_WORLD of size one.
As Gary points out, the behavior in this case is presently undefined,
so this proposed change should not break any MPI-1 applications
(a mandate of MPI-2).
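
A minimal sketch of the decision described above, written as plain C
so it does not depend on any MPI implementation. The variable name
SKETCH_MPI_WORLD_SIZE and the function names are hypothetical; real
launchers each pass this information in their own way, and returning
-1 for a malformed value is only a choice made for this sketch.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical name: stands in for whatever "magic" a launcher sets. */
#define LAUNCHER_SIZE_VAR "SKETCH_MPI_WORLD_SIZE"

/*
 * Size that MPI_COMM_WORLD would get under the proposal.
 * launcher_value is the string left behind by the launcher, or NULL
 * when the process was started by hand.
 * Returns 1 for a hand-started process, the parsed size when the
 * launcher supplied one, and -1 when the value is malformed.
 */
int initial_world_size(const char *launcher_value)
{
    char *end;
    long n;

    if (launcher_value == NULL || *launcher_value == '\0')
        return 1;               /* no launcher info: world of size one */

    n = strtol(launcher_value, &end, 10);
    if (*end != '\0' || n < 1 || n > INT_MAX)
        return -1;              /* launcher info present but unusable */

    return (int)n;
}

/* Convenience wrapper that consults the (hypothetical) variable. */
int initial_world_size_from_env(void)
{
    return initial_world_size(getenv(LAUNCHER_SIZE_VAR));
}
```

An implementation's MPI_INIT() could make a check of this kind before
deciding whether to wait for peers or to proceed alone.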

Note this is a little different from Joel's reply about how the Paragon
presently works. In his case MPI_INIT() blocks forever
if the partition size is 4 and I manually start just one MPI process
in this partition (hoping this process would spawn others to fill
the partition). Did I interpret this correctly, Joel?

Implementations will have to be a little smarter with this change,
but I don't see any fundamental hurdles.

I have a few comments on Bill's reply:
>I think we don't want to encourage people to spawn one process
>and have it spawn the rest,

Encouraged or not, MPI-2 allows people to do this, as well it should.
But the real issue is how to make a non-spawned single process
an MPI process. MPI presently doesn't define how to do this.

>0. MPI-1 has already established the standard "look and feel" of
>an MPI application.

MPI-1 has NO dynamic functionality. MPI-2 is changing all of that,
so we are now defining the "look and feel" of MPI-2 applications.

>1. this is non-portable, especially if the info argument is used.
>Much easier to put the non-portability in mpirun arguments
>than inside a program.

We have already agreed in the forum that using "info" is non-portable.
There is no "especially if" about it.
I don't see how it is easier for the user to use mpirun rather than info.

>2. it is more complicated (since you have to mess with
>intercommunicators)

Yes, it is more complicated than starting all processes at once.
But sometimes you don't know how many processes the application
needs until after the input files are read.
So I vote to give the power to the programmer to do complicated
things if she wants.

>3. it might be impossible to get high performance.

In this instance, the argument that PVM proves that this is the case
is somewhat flawed. It's true that in PVM,
tasks started with multiple spawns on an MPP usually communicate slower.
But this is because PVM uses the native MPP communication and startup routines.
And on many MPPs, the native routines assume all processes start together.
(It is a tribute to PVM that multiple spawns can talk at all.)

In defining MPI, we are defining the native routines of the future.
I would like to see future MPPs not lock applications into
starting all processes at once. In that case, MPI (and PVM)
could get high performance even with multiple spawns.

We are defining a standard that will last longer than the present
generation of MPPs. We need to consider what features applications
of the future will need and make sure MPI doesn't handicap them.

And to Rolf about the argument:
>In Fortran 77 it is not possible to use a string as an
>actual argument if the formal argument is an array of strings.
>I tested the following example...

My experience is that Fortran 77 compilers are quite varied in
how they handle strings. In the meetings we have heard several times
that Cray has trouble with some of the existing string args
(not even thinking about arrays of string args).

I would like to see the number of SPAWN functions decreased
just as Rolf and others have asked for.
If we replace the char* info argument with an info-handle arg
(this was a suggestion at the last MPI meeting that awaits
a written proposal to the dynamic committee),
then we might consider using the same string handling functions,
such as Insert_String_In_Handle(handle, string),
Num_of_Strings_In_Handle(handle, count),
etc.,
to make command_line a handle arg of the same type as info.
Users could get SPAWN_MULTIPLE functionality by inserting
multiple command lines in the handle.
Using the info handle makes our problems with (char* void*) info go away.
And Fortran programmers never have to work with arrays of strings. (-:
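
To make the handle idea concrete, here is a rough C sketch of an
opaque string handle. Only the names Insert_String_In_Handle and
Num_of_Strings_In_Handle come from the suggestion above; the struct
layout, the return codes, and the Create/Get/Free helpers are
assumptions made up for illustration, not a proposed binding.

```c
#include <stdlib.h>
#include <string.h>

/* Opaque handle holding an ordered list of strings (layout assumed). */
typedef struct string_handle {
    char **items;
    int count;
    int capacity;
} *String_Handle;

/* Hypothetical helper: make an empty handle, or NULL on failure. */
String_Handle Create_String_Handle(void)
{
    return calloc(1, sizeof(struct string_handle));
}

/* Append a copy of s. Returns 0 on success, -1 on error. */
int Insert_String_In_Handle(String_Handle h, const char *s)
{
    char *copy;

    if (h == NULL || s == NULL)
        return -1;
    if (h->count == h->capacity) {
        int new_cap = h->capacity ? 2 * h->capacity : 4;
        char **grown = realloc(h->items, (size_t)new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        h->items = grown;
        h->capacity = new_cap;
    }
    copy = malloc(strlen(s) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, s);
    h->items[h->count++] = copy;
    return 0;
}

/* Store the number of strings in *count. Returns 0 or -1. */
int Num_of_Strings_In_Handle(String_Handle h, int *count)
{
    if (h == NULL || count == NULL)
        return -1;
    *count = h->count;
    return 0;
}

/* Hypothetical helper: string at index, or NULL if out of range. */
const char *Get_String_From_Handle(String_Handle h, int index)
{
    if (h == NULL || index < 0 || index >= h->count)
        return NULL;
    return h->items[index];
}

/* Hypothetical helper: release the handle and its strings. */
void Free_String_Handle(String_Handle h)
{
    int i;

    if (h == NULL)
        return;
    for (i = 0; i < h->count; i++)
        free(h->items[i]);
    free(h->items);
    free(h);
}
```

With something like this, a caller would insert one command line per
program to start, and a Fortran binding would only ever pass one
string at a time plus an integer handle.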

The above suggestion could combine the SPAWN and SPAWN_MULTIPLE versions.
We voted to use a flag to distinguish the NONMPI vs MPI versions.
The two distinctions left in SPAWN would be nonblocking and INDEPENDENT.

I don't see how to collapse INDEPENDENT because it returns a group
rather than an intercomm.
I would prefer not to collapse ISPAWN and SPAWN together
because this is not done anywhere else in the MPI standard.

These are just some ideas, and maybe not even good ones.
Let's try to get some consensus from the dynamic committee
this week on how to rewrite the chapter for the next meeting.

Al "flame bait" Geist