notes from meeting

Steve Huss-Lederman (lederman@cs.wisc.edu)
Tue, 30 Jan 1996 10:01:45 -0600

This note is to pass along what happened during the discussions of the
Forum in Vienna concerning collective operations. A later note will
discuss status and cancel since this involves several committees.

The first topic was whether we want collective put/get. One question
raised was whether an MPI function would be any faster than just layering
it on top of the one-sided operations proposed. The humorous comment
was made that the one-sided bcast is the ultimate mcast! The final
sentiment was that a written proposal would be needed if we are to
seriously consider this.

We again had a discussion about whether to have non-blocking collective
calls. It was raised that CMMD has them and users would like them. It
was raised whether, unlike pt-to-pt, we could allow implementations to
do a blocking collective instead. With pt-to-pt this will break
codes; however, with collective it was suspected that the codes would
still work (since you can't interleave calls) but the performance
advantage in the user algorithm would be lost. The issue of mixing
non-blocking with blocking was also discussed. The question was whether
it limits hardware solutions, i.e., would it stop vendors from using a
hardware-assisted bcast? What resources would be needed (buffers,
etc.)? Right now you can only have one collective at a time, so you can
use one set of system resources. With non-blocking you cannot do
this. (This isn't directly an issue with mixing calls.) In the end
no votes were taken.
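
The argument that a blocking fallback keeps codes correct but loses
overlap can be sketched with a toy, single-process model. This is not
MPI code and moves no real data: the names ibcast_deferred,
ibcast_as_blocking, Request and run are hypothetical and exist only
for illustration.

```python
class Request:
    """Handle returned by a hypothetical non-blocking collective."""

    def __init__(self, pending=None):
        self._pending = pending  # work left for wait(), or None

    def wait(self):
        if self._pending is not None:
            self._pending()
            self._pending = None


def ibcast_deferred(buffers, root, log):
    """Model of a truly non-blocking bcast: it completes at wait()."""
    def finish():
        for rank in range(len(buffers)):
            buffers[rank] = buffers[root]
        log.append("bcast done")

    log.append("bcast started")
    return Request(finish)


def ibcast_as_blocking(buffers, root, log):
    """Model of the fallback: the whole bcast runs before returning."""
    log.append("bcast started")
    for rank in range(len(buffers)):
        buffers[rank] = buffers[root]
    log.append("bcast done")
    return Request()  # nothing left to do at wait()


def run(ibcast):
    """User code: start the bcast, compute, then wait for completion."""
    buffers, log = ["x", None, None], []
    req = ibcast(buffers, 0, log)
    log.append("user compute")  # work the user hoped to overlap
    req.wait()
    return buffers, log
```

Both versions leave the same data in every buffer after wait(). Only
the order of events differs: with the fallback, "user compute" runs
after the bcast has finished, so the overlap is gone.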

Can you probe for non-blocking collective calls? Everyone voted
against this.

The current draft has two proposals for intercomm collective ops.
They are similar except for non-rooted ops, which are half duplex in
one proposal and full duplex in the other. It was noted that the full
duplex case is a superset of the half duplex case. The vote was:

full: 15
half: 0
abstain: 7

Thus, the next draft should use Marc's proposal as the basis for the
text.

I raised another issue. For intercomms we use MPI_ROOT to indicate
the root of a bcast. For consistency, should we allow the root in an
intracomm to do the same? Currently, it gives its own rank. There
seem to be no implications for performance. The vote was:

Allow: 6
Don't: 3

This vote is clearly mixed but the proposal should probably be written
up.

The final issue is whether all processes should be involved in the
rooted intercomm collective ops. Unlike the intracomm case, not all
processes are involved - only the one process that is the root in the
source group. The argument for not including the others is that it is
strange that they make a call for which they do not get data. The
arguments for including them were that it is consistent with
intracomms so the rule is simply that all processes make the call in a
comm for collective ops. Also, it allows an implementation to use an
algorithm that uses the other processes. The vote called was whether
all processes should be involved:

yes: 14
no: 4
abstain: 7
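
To make the "all processes make the call" outcome concrete, here is a
minimal sketch in Python (not MPI code) of which root argument each
process might pass to a rooted intercomm op. It assumes the convention
that MPI-2 later standardized: the root passes MPI_ROOT, the other
processes in the root's group pass MPI_PROC_NULL, and processes in the
other group pass the root's rank. The draft under discussion may have
differed. The helper names and the string sentinels are hypothetical.

```python
# Sentinels standing in for the MPI constants; these values are made up.
MPI_ROOT = "MPI_ROOT"
MPI_PROC_NULL = "MPI_PROC_NULL"


def root_argument(in_root_group, my_rank, root_rank):
    """Return the `root` value one process passes to the collective."""
    if in_root_group:
        # The root names itself with MPI_ROOT. Its group-mates still
        # make the call, but they neither send nor receive data.
        return MPI_ROOT if my_rank == root_rank else MPI_PROC_NULL
    # Processes in the other group name the root by its rank
    # within the root's group.
    return root_rank


def bcast_arguments(root_group_size, remote_group_size, root_rank):
    """List the `root` argument for every process in both groups."""
    root_side = [root_argument(True, r, root_rank)
                 for r in range(root_group_size)]
    remote_side = [root_argument(False, r, root_rank)
                   for r in range(remote_group_size)]
    return root_side, remote_side
```

For a root group of three with rank 1 as root and a remote group of
two, bcast_arguments(3, 2, 1) gives MPI_PROC_NULL, MPI_ROOT,
MPI_PROC_NULL on the root side and 1, 1 on the remote side.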

This is all that we had time for.

Steve