Re: dynamic counter-proposal

Raja Daoud (raja@tbag.convex.com)
Tue, 14 May 1996 3:06:34 CDT

The way I understand it, Bill's comment against the split spawn/attach
design can be summarized as: if I know at spawn-time the set of
parents, I can do some optimizations. As examples of such optim., he
presents: 1) reconfigurable switches, 2) gang scheduling, and 3) thread
vs. process spawning. Another point was the doubt about future
architectures. I'd like to address these points.

While it is generally true that, given more information, we can pull
off further optimizations, we also have to draw the line somewhere. In
this particular case, we voted out many resource management features
and added the "info" argument. It compromises portability, but allows
system-specific optimizations that may not be of a general nature.
This certainly fits the three cases above. An implementation for
a system that supports any (or all) of the above features is free
to define specific "info" arguments that give spawn the necessary
hints, including providing it the eventual set of parents.
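
To make the "info" idea concrete, here is a minimal sketch in Python of
how an implementation might consume such hints. Everything in it is an
assumption for illustration: the function plan_spawn, the hint keys
"gang" and "eventual_parents", and the plan layout are hypothetical and
come from no MPI draft. The point it shows is that hints an
implementation understands can steer placement, while hints it doesn't
understand are ignored rather than treated as errors.

```python
# Hypothetical model of an "info" argument: a set of (key, value) string
# hints that an implementation may use or silently ignore.
# The key names below are assumptions, not taken from any MPI draft.
KNOWN_HINTS = {"gang", "eventual_parents"}


def plan_spawn(nprocs, info=None):
    """Return (plan, ignored_keys) for a spawn request.

    Unknown hints are collected in ignored_keys and never cause failure,
    so a program passing system-specific hints still runs elsewhere.
    """
    info = info or {}
    ignored = sorted(k for k in info if k not in KNOWN_HINTS)
    plan = {"nprocs": nprocs, "gang": None, "parents": []}
    if "gang" in info:
        # Hint to a gang scheduler: schedule the children with this gang.
        plan["gang"] = info["gang"]
    if "eventual_parents" in info:
        # Hint naming the ranks that will later attach as parents.
        plan["parents"] = [int(r) for r in info["eventual_parents"].split(",")]
    return plan, ignored
```

For example, plan_spawn(4, {"eventual_parents": "0,1,2"}) records the
three parent ranks in the plan, and a call with no info at all still
yields a valid default plan.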

In any event, having the old spawn doesn't solve the problem since
a user may later merge communicators (not known at spawn time), thus
requiring further 1) reconfiguration, 2) hints to scheduler to form
new gang. Since these would have to be done at merge time, they
might just as well also be done at child_attach time (consider
it another merge). This scenario makes the I-have-to-know-it-at-spawn-
time argument weaker.

The point about reconfigurable networks deserves special mention. The
argument is that MPPs (high performance + scalability) would quickly
restructure their switches to accommodate new process groupings. From my
experience, this is a slow serialized process, requiring the network to
be brought to a "quiet" state before changing paths and route tables
(whether in s/w or h/w). So it seems this 'requirement' pulls in two
opposite directions. The "user hostile" solution of configuring at
application-startup time is usually there because of concerns about
performance. Reconfiguration requires global coordination, and will be
costly. I'm not comfortable with an argument using it to back up a
concern about performance at spawn time (another typically slow call).

In my opinion, the examples given are less prevalent scenarios than many
other resource management issues that we tossed out from the earlier
drafts. If for all of these we said "use the 'info' argument", the same
approach should be used to handle the three special cases above.

Eric partially addressed the thread-vs-process case, but even for that
case, and on systems that exhibit different behaviour, the point is
still: "if you need to know more for further optimizations, have the
user pass it in the 'info' argument." This is still a resource
management issue. A "high quality" implementation will provide all
the extensions it needs in 'info' to pull off all the optimizations its
resource manager has up its sleeve.

To handle the "fear of the future" part:

1- At the risk of sounding like a broken record, "info" provides
lots of flexibility for all esoteric systems.
2- Further MPI std. efforts will add the necessary extensions
(think Fortran-Forever).
3- We are all experienced people; whatever our differences of
opinion, the 'sky isn't falling'. [yet?] :-)

I support Eric's proposal for the following reasons:

- It provides all the functionality we have converged on
(single/multiple, blocking/non-blocking, mpi/indep/non-mpi).
- It does so with only 4 functions (not 12).
- It provides an elegant "multiple" mechanism (group unions).
- It's portably fast on the vast majority of parallel systems.
- It's no less parallel than the old proposal since the spawn
prerequisites are only valid at the root, a single point.
- It's easy to extend its functionality for systems that
require more info for further optimizations (using "info").
All this info-sharing is really one-way, from the root parent
to the external resource manager, and thus can also be done from
the new spawn call.
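
As a small illustration of the group-union idea behind the "multiple"
mechanism, here is a sketch in Python. It does not reproduce the
proposal's interface, which is not quoted in this message; it only
models union semantics in the style of MPI_Group_union (all members of
the first group in order, then the members of the second group that are
not already present). The function name and the process labels are
made up for the example.

```python
def group_union(first, second):
    """Ordered union of two process groups.

    Keeps every member of `first` in its original order, then appends
    the members of `second` that are not already in the result.
    Groups are modelled as plain lists of process labels (an assumption
    made for this sketch only).
    """
    seen = set(first)
    result = list(first)
    for proc in second:
        if proc not in seen:
            seen.add(proc)
            result.append(proc)
    return result
```

With this model, combining the children of two separate spawns is just
group_union(children_a, children_b), with no need for a dedicated
"spawn multiple" entry point.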

--Raja