Re: MPI_FINALIZE
(This mail responds to Jeff's mail and to the replies from Hubert and Tony.)
As far as I know, saying that an MPI routine is "collective" over a
group of processes means that all of these processes have to call the
routine, and that the routine is allowed to perform a barrier
synchronization internally.
It does __NOT__ mean that the MPI routine must perform such a barrier.
I.e., it is always permissible to implement MPI_Finalize so that it
delivers all pending messages to their destination processes and then
exits the process without affecting (or killing, etc.) the other MPI
processes.
Therefore, the sentence "MPI_FINALIZE is collective over 'the
union of all processes that have been and continue to be connected'"
has no impact on the implementation.
It also has no impact on the application, because it is trivially true
that every process has to call MPI_Finalize exactly once.
And I do not see any difference between "are connected" and
"have been and continue to be connected".
Each implementor should check whether it makes sense to do a barrier
inside MPI_Finalize, or whether such a barrier is mainly a
performance drawback.
Is my answer correct, or have I overlooked some lines in
the standard that imply a barrier inside MPI_Finalize?
Rolf
> A few points:
>
> 1. Implementing the change in MPI_FINALIZE to make it collective over "the
> union of all processes that have been and continue to be connected" is a
> non-trivial distributed algorithm, since it is essentially a barrier over
> potentially unrelated and not-directly-connected processes.
>
> 2. Is there a difference between "have been and continue to be connected"
> and "are connected"?
>
> 3. This change can potentially drastically change the semantics of
> currently-valid MPI programs.
>
> As one example: currently-valid "task-farm" programs may unintentionally
> cause a lot of "zombied" MPI processes that are simply waiting for an
> MPI_FINALIZE from their ancestor(s). Consider what happens if a root
> process continually spawns short-lived MPI processes to perform some task
> in a "fire and forget" kind of model. The short-lived child processes
> could previously invoke MPI_FINALIZE and die. With the proposed change,
> the short-lived processes will now block waiting for the parent to invoke
> MPI_FINALIZE as well.
>
> This program can be fixed by having the root and child processes invoke
> MPI_COMM_DISCONNECT right after spawning (or after whenever the last
> message between the root and children finishes) so that the child can
> MPI_FINALIZE by itself, and then die.
>
> But my concern is backwards compatibility: we have no idea how many
> programs exist that rely on MPI_FINALIZEing over just MPI_COMM_WORLD.
> Changing the spec now could cause unintended side-effects in
> currently-valid MPI programs.
>
> {+} Jeff Squyres
> {+} jsquyres@lam-mpi.org
> {+} http://www.lam-mpi.org/
>
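To make the notion of "connected" in the task-farm example concrete,
here is a small toy model in plain C. It makes no MPI calls; model_t,
model_connect, model_disconnect and finalize_group_size are names
invented for this sketch. It assumes, as I read the MPI-2 text, that
"connected" is transitive (A connected to B and B connected to C
implies A connected to C), and it represents each communicator as a
simple link between two processes. The function finalize_group_size
then reports how many processes a given process's MPI_Finalize would be
collective over under the proposed wording.

```c
#include <assert.h>

/* Toy model only, NOT MPI API. Processes are integers 0..nprocs-1;
 * each "link" stands for one communicator joining two processes. */
enum { MAX_PROCS = 32, MAX_LINKS = 64 };

typedef struct {
    int nprocs;
    int nlinks;
    int a[MAX_LINKS];     /* first endpoint of each link  */
    int b[MAX_LINKS];     /* second endpoint of each link */
    int live[MAX_LINKS];  /* 1 until the link is disconnected */
} model_t;

void model_init(model_t *m, int nprocs) {
    assert(nprocs > 0 && nprocs <= MAX_PROCS);
    m->nprocs = nprocs;
    m->nlinks = 0;
}

/* Stands in for spawning/connecting: returns an id for the new link. */
int model_connect(model_t *m, int a, int b) {
    assert(m->nlinks < MAX_LINKS);
    assert(a >= 0 && a < m->nprocs && b >= 0 && b < m->nprocs);
    int id = m->nlinks++;
    m->a[id] = a;
    m->b[id] = b;
    m->live[id] = 1;
    return id;
}

/* Stands in for both sides calling MPI_COMM_DISCONNECT on that link. */
void model_disconnect(model_t *m, int id) {
    assert(id >= 0 && id < m->nlinks);
    m->live[id] = 0;
}

/* Union-find lookup with path halving. */
int find_root(int *parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

/* Size of the connected component containing p, counting p itself. */
int finalize_group_size(const model_t *m, int p) {
    int parent[MAX_PROCS];
    assert(p >= 0 && p < m->nprocs);
    for (int i = 0; i < m->nprocs; i++)
        parent[i] = i;
    for (int k = 0; k < m->nlinks; k++) {
        if (m->live[k]) {
            int ra = find_root(parent, m->a[k]);
            int rb = find_root(parent, m->b[k]);
            parent[ra] = rb;  /* merge the two components */
        }
    }
    int root = find_root(parent, p);
    int count = 0;
    for (int i = 0; i < m->nprocs; i++)
        if (find_root(parent, i) == root)
            count++;
    return count;
}
```

With a root (process 0) linked to two children (processes 1 and 2),
every process is in a group of three, because the children are
connected to each other indirectly through the root. Once the link
between the root and child 1 is disconnected, child 1 is in a group by
itself and could finalize alone, while the root and child 2 still form
a group of two.
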
Dr. Rolf Rabenseifner High Performance Computing
Parallel Computing Center Stuttgart (HLRS)
Rechenzentrum Universitaet Stuttgart (RUS) Phone: ++49 711 6855530
Allmandring 30 FAX: ++49 711 6787626
D-70550 Stuttgart rabenseifner@rus.uni-stuttgart.de
Germany http://www.hlrs.de/people/rabenseifner