Use non-blocking collectives or threads
---------------------------------------
This issue has been discussed before. Again, the vote was strongly in
favor of having non-blocking collective calls. This should settle the
issue.
Tags or no tags
---------------
If one has tags in non-blocking collective ops, then one can mix the
order of calls on different processes, as with point-2-point. If not,
then the order of calls must be used to match collective
operations. For good or bad, MPI-1 does not have tags in the blocking
calls. Mostly for this reason, the group voted not to have tags in
the non-blocking calls. MPI will match calls by order of execution.
On a threaded system, the user will have to be careful about
locks. This only applies to the same collective call. MPI will have
to separate different collective calls.
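The difference between the two matching rules can be sketched with a small
simulation. This is Python, not MPI, and every name in it (`match_by_order`,
`match_by_tag`, the `(tag, name)` call records) is hypothetical and chosen
only for illustration. It assumes two ranks that issue the same two
broadcasts in opposite order.

```python
def match_by_order(calls_per_rank):
    """Pair the n-th call on each rank with the n-th call on every other rank."""
    return list(zip(*calls_per_rank))


def match_by_tag(calls_per_rank):
    """Pair calls that carry the same tag, whatever order they were issued in."""
    by_tag = {}
    for calls in calls_per_rank:
        for tag, name in calls:
            by_tag.setdefault(tag, []).append((tag, name))
    return [tuple(by_tag[tag]) for tag in sorted(by_tag)]


# Hypothetical call records: (tag, description of the intended operation).
# Both ranks issue the same two broadcasts, but in opposite order.
rank0 = [(1, "bcast x"), (2, "bcast y")]
rank1 = [(2, "bcast y"), (1, "bcast x")]

by_order = match_by_order([rank0, rank1])
by_tag = match_by_tag([rank0, rank1])

for group in by_order:
    print("order:", group)
for group in by_tag:
    print("tag:  ", group)
```

Under order matching, rank 0's "bcast x" is paired with rank 1's "bcast y",
which is the kind of mismatch the user must avoid when there are no tags.
Under tag matching, the two "bcast x" calls are paired with each other.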
Persistent collective ops
-------------------------
Since the overhead on collective ops is generally higher than pt-2-pt
and since the time spent to optimize them is potentially higher, the
group thought this was a good idea. It needs a good proposal to flesh
it out.
Mcast
-----
It seems that the current proposal does not allow one to give a list
of processes to do the mcast to. This seemed like one of the primary
drivers for using this - the ability to do a bcast over subsets of
processes without creating a subcommunicator. This whole call is for
compatibility reasons. If this is not possible, then the question
raised is whether this does what users want. Since the committee only
wants this for users, we have to make sure we meet users' needs or
just drop it.
Inter-comm ops
--------------
The group thought these were ok. The objection that they lacked
symmetry with intra-comm was not a problem. Thinking of them more as
acting on a single group helps with this. The fact that changing from
an inter-comm to an intra-comm may cause problems was viewed as ok.
Finally, since the inter-comm ops are somewhat different, the group
thought that we should consider using different names. However, the
feeling about this was mixed.
Min # of outstanding requests
-----------------------------
This was viewed as a quality of implementation issue. The group
didn't think specifying a value was a good idea.
Cancelling collective operations
--------------------------------
Since we are pursuing non-blocking collective operations, the issue of
cancelling these operations has come up. Reasons discussed for
wanting this feature are symmetry with cancel in pt-2-pt in MPI-1,
fault tolerance, and getting back resources from MPI. The only
argument given against it is that it is tough to do. Three
basic options were discussed:
- At one end of the spectrum is a global cancel operation. If one
process issues a cancel to a collective operation then MPI has to
clean up the participation of all processes involved. This was not
supported by most for several reasons. First, it is very hard to do.
What does MPI do about nodes that have never called the collective
operation but are in the communicator? Second, there are many
potential race conditions. Third, unlike pt-2-pt, it is not as simple
to say that the operation will either be cancelled or completed. In
pt-2-pt, if an operation is cancelled then it is as if the operation
never occurred. Since a collective operation may partially complete
but never be able to finish (e.g., the log spanning tree begins but
some members are missing), this cannot be done.
- The other end of the spectrum is not to allow cancel. This was a
practical solution from the MPI implementors' point of view but,
because of the above discussion, not a clean design for MPI.
- The middle solution is to treat cancel as a local operation. If a
process issues a cancel then all the resources associated with that
collective operation on that process will be freed up. This can
happen either by completing the operation or by cancelling all
remaining operations associated with it. Logically (but not
necessarily in the implementation) the cancel is like issuing a cancel
on each pt-2-pt operation one uses to perform the collective operation.
The only difference from pt-2-pt is that some may have already
started or completed, so buffers can be left in an indeterminate state.
A potential negative is that it would be easy to hang a code by
cancelling only some of the calls on different processes. The vision
was that the programmer would have to make sure that all processes
that had started the collective operation call cancel, to be sure
that it is safe. One problem raised is that the current proposal
allows one to mix blocking and non-blocking calls, so one cannot call
cancel for processes that issued the blocking call. This middle
ground proposal got the most interest but there are still several
issues that need resolution.
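The local-cancel option can be illustrated with a rough model of one
process's share of a collective operation. This is a sketch in Python
under stated assumptions, not a proposed MPI interface: the class
`LocalCollective`, its `steps` dictionary and the step names are all
hypothetical. It only models the logical view described above, in which the
cancel acts on each underlying pt-2-pt step that has not yet completed.

```python
from dataclasses import dataclass, field


@dataclass
class LocalCollective:
    """One process's part of a collective, modeled as named pt-2-pt steps.

    Each step is "pending", "done" or "cancelled" (hypothetical states).
    """
    steps: dict = field(default_factory=dict)

    def cancel(self):
        """Local cancel: cancel every step that has not completed.

        Returns True when some steps had already completed and others were
        cancelled, i.e. when the buffers may be left in an indeterminate
        state.
        """
        had_done = any(state == "done" for state in self.steps.values())
        cancelled_any = False
        for name, state in self.steps.items():
            if state == "pending":
                self.steps[name] = "cancelled"
                cancelled_any = True
        return had_done and cancelled_any


# A process that already received from its parent in the spanning tree
# but has not yet forwarded the data to its child.
op = LocalCollective({"recv from parent": "done", "send to child": "pending"})
indeterminate = op.cancel()
print(op.steps, indeterminate)
```

In this model the cancel frees only local state. It says nothing about the
other processes, which is why every process that started the operation
would also have to call cancel to avoid a hang.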