more non-blocking collective discussion

W. Saphir (wcs@nersc.gov)
Wed, 5 Feb 1997 18:03:06 -0800 (PST)

I need to put together a new dynamic chapter tomorrow and I don't
think we've come to a consensus on what to do about eliminating
non-blocking collective operations. Here is my summary of the
options.

1. Eliminate all non-blocking operations.

a. add something that gives the equivalent of iaccept functionality.
b. don't add anything else

2. Eliminate all but MPI_Iaccept

3. Punt port_open/accept/connect and go with MPI_Join

4. Replace with "two-face" versions

My assessment of these possibilities is (in reverse order):

4. Doesn't solve IACCEPT problem (as discussed in previous note).

3. May be a good proposal, technically, but dead until proven
otherwise (through an outpouring of support that hasn't materialized).
We have a mandate from the Forum to eliminate non-blocking operations,
but I don't think the mandate yet extends to a complete rewrite.

2. Simple, but doesn't solve any problems

1. Fairly easy to do, if not perfect. Despite Eric's assertion
that we don't need a real client/server interface, based on
previous discussions, I think people want the functionality
(though we can debate this).

***************
After a careful reading of the current text, I believe the
current proposal (without IACCEPT) does what we want without
any changes, or with only minor ones.
(In particular, I don't think we need MPI_LISTEN and friends).
***************

Current text says:

Under iconnect: "If the port exists, but does not have a pending
ACCEPT, the connection attempt will eventually time out after an
implementation-defined time, or succeed when the server calls
ACCEPT. In the case of a time out, MPI_CONNECT returns a soft error of
class MPI_ERR_PORT and sets newcomm to MPI_COMM_NULL. "

(Side issue: We don't currently have soft errors, so I will eliminate "soft"
and "sets newcomm to MPI_COMM_NULL")

Then under advice to implementors:
"The time out period may be arbitrarily short or long. A high quality
implementation will ensure that a CONNECT will usually succeed if the
ACCEPT happens at "about the same time." A high quality
implementation may also provide a mechanism, through the info
arguments to PORT_OPEN, ACCEPT and CONNECT, for the user to specify
this behavior."

This text seems a little weird, but was arrived at after extensive
discussions, and I have begun to believe that it was wise, though it
could be tweaked slightly.

The text says explicitly that a connection attempt without a
corresponding ACCEPT will block for some period of time, as long as
a port is open. It doesn't provide any guarantees, but
certainly a high-quality implementation is allowed to block
the client for an arbitrarily long time while the server
processes the previous request.

I have done a few tests with one particular socket implementation
on one machine, and the behavior is as follows:

- a socket+bind+listen establishes a queue of finite length.
- a connection attempt on such a socket will *succeed* whether
or not there is an accept, if there is room in the queue. I mean
*succeed* in the sense that connect() does not block and leaves the
client with a valid, connected socket [Note: this is not what we want
in MPI, where CONNECT() must not return until there is a corresponding
ACCEPT()].
- if the queue is full, a connection attempt will block, and then
time out after about a minute.
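
For illustration, the first two observations above can be reproduced
with a short sketch. It is not the test program described in this note:
the function name connect_without_accept, the loopback address, and the
backlog of 1 are all assumptions made for the example. The queue-full
case is deliberately left out, since its timing depends on the platform.

```python
import socket


def connect_without_accept():
    """Return True if connect() completes although accept() is never called.

    Hypothetical helper for illustration only.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)               # socket+bind+listen sets up the finite queue

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(5.0)           # do not hang forever if the attempt blocks
    try:
        # The server side never calls accept(); the connection is completed
        # by the operating system and parked in the listen queue.
        client.connect(listener.getsockname())
        return True
    except OSError:
        return False
    finally:
        client.close()
        listener.close()
```

On a system that behaves like the one described above, the function
would be expected to return True, which is exactly the behavior that
MPI's CONNECT must not have.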

So a natural implementation of the current MPI specification
on top of sockets will work fine, and will allow a server
to process multiple requests, with clients blocking until the
server can get around to them.
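
One way such an implementation could be layered on sockets is sketched
below, under stated assumptions. The names open_port, rendezvous_accept,
rendezvous_connect and the one-byte ACK handshake are hypothetical and
appear in no MPI text or implementation. The idea is that the client
does not return from its connect until the server has actually called
accept and sent the acknowledgement; meanwhile the listen queue holds
the other clients.

```python
import socket
import threading
import time

ACK = b"A"  # hypothetical one-byte handshake sent by the server on accept


def open_port(backlog=5):
    """Create a listening socket; the backlog is the connection queue."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(backlog)
    return listener


def rendezvous_accept(listener):
    """Accept one queued client and release it from rendezvous_connect."""
    conn, _ = listener.accept()
    conn.sendall(ACK)
    return conn


def rendezvous_connect(address, timeout):
    """Block until the server accepts; raise OSError on time out."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(address)        # may complete before any accept()
        if sock.recv(1) != ACK:      # so wait here for the server's ACK
            raise ConnectionError("port closed before the accept")
    except OSError:
        sock.close()
        raise
    sock.settimeout(None)
    return sock


def demo(n_clients=3, service_time=0.05):
    """Serve several simultaneous clients one at a time."""
    listener = open_port()
    address = listener.getsockname()
    served = []
    lock = threading.Lock()

    def client(i):
        sock = rendezvous_connect(address, timeout=5.0)
        with lock:
            served.append(i)
        sock.close()

    threads = [threading.Thread(target=client, args=(i,))
               for i in range(n_clients)]
    for t in threads:
        t.start()
    for _ in range(n_clients):
        conn = rendezvous_accept(listener)
        time.sleep(service_time)     # server busy with this request
        conn.close()
    for t in threads:
        t.join()
    listener.close()
    return sorted(served)
```

Calling demo() starts three clients at once and serves them in turn. A
call to rendezvous_connect against a port on which nobody ever accepts
raises a time out error once the chosen timeout expires, which plays
the role of the MPI_ERR_PORT case in the current text.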

I would like to tweak the advice to implementors slightly to
deemphasize the usefulness of implementations with very short
timeouts, and to mention the desirability of queueing connection
requests. So my suggestion would be to replace the advice to
implementors with:

"The time out period may be arbitrarily short or long. However, a high
quality implementation will try to queue connection attempts so that a
server can handle simultaneous requests from several clients. A high
quality implementation may also provide a mechanism, through the info
arguments to PORT_OPEN, ACCEPT and/or CONNECT, for the user to
control timeout and queueing behavior."

=============================
Summary (for those who got lost in the text above):

My proposal is:

1. remove all nonblocking calls as requested by the Forum.
2. leave the other text as is, except change the "advice to
implementors" under "connect" as above.

If there is no discussion, I'll go ahead and make these changes
(saving the deleted text of course) for the Friday version of the
document.

Bill