The inetd model includes a daemon process that advertises well-known services
and waits in select() for requests for those services arriving via sockets.
When a request arrives, inetd creates a new process to service the request,
passing along the socket to be used to communicate with the client. The new
server process may have to detect when the client process arbitrarily closes
the socket.
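
For comparison, the "wait in select()" step of the socket-based model can be sketched in plain C. This is only an illustration: wait_for_request is a hypothetical helper name, and a real inetd would go on to accept() the connection and fork/exec the server, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Block in select() until one of the n descriptors becomes readable.
 * Returns the index (into fds) of the first ready descriptor,
 * or -1 if select() fails. */
int wait_for_request(const int *fds, int n)
{
    fd_set readable;
    int i, maxfd = -1;

    FD_ZERO(&readable);
    for (i = 0; i < n; i++) {
        /* select() can only watch descriptors below FD_SETSIZE */
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &readable);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    /* NULL timeout: wait indefinitely, as a daemon would */
    if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0)
        return -1;

    for (i = 0; i < n; i++) {
        if (FD_ISSET(fds[i], &readable))
            return i;
    }
    return -1;
}
```

The MPI_Waitany() call in the listing below plays the same role as select() here: one blocking call that reports which of several outstanding requests completed.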
/* "inetd" MPI server */
#define MAX_SERVERS 16
{
    int i, index, num_reqs;
    MPI_Port stock_ticker, other_ports;
    MPI_Request *new_req, *wait_reqs[MAX_SERVERS];
    MPI_Comm *new_comm, mycomm;
    MPI_Status status;
    .
    . /* Initialization stuff */
    .
    MPI_Port_init("stock_ticker", &stock_ticker);
    /* I think this name stuff is a big hole unless we either
     * piggyback on existing IP stuff, explicitly using
     * IP socket addresses, or re-implement it all again in MPI.
     * The equivalent socket routines specifying "MPI" protocol:
     *
     *   struct servent *stock_service;
     *   struct hostent *stock_host;
     *   char hostname[MAXHOSTNAMELEN];
     *   char MPI_server_name[64];
     *
     *   stock_service = getservbyname("stock_ticker", "MPI");
     *   gethostname(hostname, MAXHOSTNAMELEN);
     *   stock_host = gethostbyname(hostname);
     *   sprintf(MPI_server_name, "%s:%d",
     *           inet_ntoa(*(struct in_addr *)stock_host->h_addr),
     *           ntohs(stock_service->s_port));
     *   MPI_Port_init(MPI_server_name, &stock_ticker);
     */
    new_req = (MPI_Request *)malloc(sizeof(MPI_Request));
    new_comm = (MPI_Comm *)malloc(sizeof(MPI_Comm));
    MPI_Iaccept(mycomm, stock_ticker, rank_0, new_comm, new_req);
    /* The last forum discussed not allowing caching on MPI_Request
     * types for performance reasons (send and recv) so I have used
     * a request cache (req_cache) below that is an array of pointers
     * to structures containing information I want to associate with
     * specific MPI_Requests.
     */
    for (i = 0; i < MAX_SERVERS; i++) {
        wait_reqs[i] = NULL;
    }
    num_reqs = 0;
    for (i = 0; i < MAX_SERVERS; i++) {
        if (req_cache[i]->used == 0) {
            wait_reqs[i] = new_req;
            req_cache[i]->used++;
            req_cache[i]->class = ACCEPT;
            req_cache[i]->comm1 = new_comm;
            req_cache[i]->req = new_req;
            req_cache[i]->port = &stock_ticker;
            req_cache[i]->server = "stockd";
            num_reqs++;
            break;
        }
    }
    /* "Other" services provided */
    MPI_Port_init("other_services", &other_ports);
    new_req = (MPI_Request *)malloc(sizeof(MPI_Request));
    new_comm = (MPI_Comm *)malloc(sizeof(MPI_Comm));
    MPI_Iaccept(mycomm, other_ports, rank_0, new_comm, new_req);
    for (i = 0; i < MAX_SERVERS; i++) {
        if (req_cache[i]->used == 0) {
            wait_reqs[i] = new_req;
            req_cache[i]->used++;
            req_cache[i]->class = ACCEPT;
            req_cache[i]->comm1 = new_comm;
            req_cache[i]->req = new_req;
            req_cache[i]->port = &other_ports;
            req_cache[i]->server = "otherd";
            num_reqs++;
            break;
        }
    }
    /* Now wait for client requests */
    while (1) {
        MPI_Waitany(num_reqs, wait_reqs, &index, &status);
        switch (req_cache[index]->class) {
        case ACCEPT:
        {   /* Spawn new server */
            /* The completed accept request is reused for the spawn,
             * so only a new communicator is allocated here. */
            new_comm = (MPI_Comm *)malloc(sizeof(MPI_Comm));
            /* Notes on this Ispawn call below */
            MPI_Ispawn(req_cache[index]->server,
                       args, 1, "where",
                       rank_0,
                       req_cache[index]->comm1,
                       new_comm,
                       req_cache[index]->req);
            req_cache[index]->class = SPAWN;
            req_cache[index]->comm2 = new_comm;
            /* repost Accept for this service (same port and server
             * as the accept that just completed) */
            new_req = (MPI_Request *)malloc(sizeof(MPI_Request));
            new_comm = (MPI_Comm *)malloc(sizeof(MPI_Comm));
            MPI_Iaccept(mycomm, *req_cache[index]->port, rank_0,
                        new_comm, new_req);
            for (i = 0; i < MAX_SERVERS; i++) {
                if (req_cache[i]->used == 0) {
                    wait_reqs[i] = new_req;
                    req_cache[i]->used++;
                    req_cache[i]->class = ACCEPT;
                    req_cache[i]->comm1 = new_comm;
                    req_cache[i]->req = new_req;
                    req_cache[i]->port = req_cache[index]->port;
                    req_cache[i]->server = req_cache[index]->server;
                    num_reqs++;
                    break;
                }
            }
            break; /* ACCEPT */
        }
        case SPAWN:
        {   /* free MPI_Comm and post Notify request */
            MPI_Comm_free(req_cache[index]->comm1);
            free(req_cache[index]->comm1);
            MPI_Inotify(MPI_Process_exit, 1,
                        req_cache[index]->comm2,
                        req_cache[index]->req);
            req_cache[index]->class = NOTIFY;
            break; /* SPAWN */
        }
        case NOTIFY:
        {   /* Free communicators, requests and req_cache */
            MPI_Comm_free(req_cache[index]->comm2);
            free(req_cache[index]->comm2);
            free(req_cache[index]->req);
            /* Pack out the freed request from wait_reqs array */
            save_cache_req = req_cache[index];
            for (; index < (MAX_SERVERS - 1); index++) {
                wait_reqs[index] = wait_reqs[index + 1];
                req_cache[index] = req_cache[index + 1];
            }
            /* The last valid slot is MAX_SERVERS - 1 */
            wait_reqs[MAX_SERVERS - 1] = NULL;
            req_cache[MAX_SERVERS - 1] = save_cache_req;
            req_cache[MAX_SERVERS - 1]->used = 0;
            num_reqs--;
            break; /* NOTIFY */
        }
        case RECV:
        {   /* Might be useful */
            /* ... */
            break;
        }
        case SEND:
        {   /* If Recv, must Send too! */
            /* ... */
            break;
        }
        } /* end of switch() */
    } /* end of while(1) */
} /* end of inetd server */
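
The listing uses req_cache without declaring it. A minimal sketch of what such a cache could look like follows, assuming one entry per outstanding request. The names struct req_info, cache_claim and cache_release are hypothetical, the MPI handle types are replaced by void * so the sketch compiles without an MPI library, and the field called class in the listing is spelled cls here.

```c
#include <assert.h>
#include <stddef.h>

/* Request classes used by the switch in the server loop. */
enum req_class { ACCEPT, SPAWN, NOTIFY, RECV, SEND };

/* One cache entry per outstanding request (hypothetical layout). */
struct req_info {
    int used;             /* nonzero while the entry is in use      */
    enum req_class cls;   /* what kind of request is outstanding    */
    void *comm1;          /* stand-in for MPI_Comm * (client side)  */
    void *comm2;          /* stand-in for MPI_Comm * (server side)  */
    void *req;            /* stand-in for MPI_Request *             */
    void *port;           /* stand-in for MPI_Port *                */
    const char *server;   /* name of the server program to spawn    */
};

/* Mark the first unused entry as used and return its index,
 * or return -1 if every entry is taken. */
int cache_claim(struct req_info *cache[], int n,
                enum req_class cls, const char *server)
{
    for (int i = 0; i < n; i++) {
        if (cache[i]->used == 0) {
            cache[i]->used = 1;
            cache[i]->cls = cls;
            cache[i]->server = server;
            return i;
        }
    }
    return -1;
}

/* Pack out entry `index`: shift the later entries down by one and
 * park the freed entry in the last slot, which is n - 1. */
void cache_release(struct req_info *cache[], void *wait_reqs[],
                   int n, int index)
{
    assert(index >= 0 && index < n);
    struct req_info *freed = cache[index];

    for (int i = index; i < n - 1; i++) {
        wait_reqs[i] = wait_reqs[i + 1];
        cache[i] = cache[i + 1];
    }
    wait_reqs[n - 1] = NULL;
    freed->used = 0;
    cache[n - 1] = freed;
}
```

With helpers like these, each of the three "find a free slot" loops in the listing would collapse to one cache_claim() call, and the NOTIFY case to one cache_release() call.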
Comments:
This seems fairly straightforward and a relatively close copy of the
existing inetd (IP) model. The client and stock_ticker server code
can be inferred from the inetd server code shown above.
There are a couple of problems.
The first problem is with the spawn call to create the server process for
the connecting client. In order for the client to communicate with the
server being spawned, it must participate in the spawn call, specifying as
input to MPI_Ispawn() the intercommunicator created by MPI_Iaccept().
It seems a little funny to have the client call MPI_Ispawn() to create
the server from which it is requesting service. The server will talk to the
client via the MPI_PARENT_COMM. The client will talk to the server via the
intercommunicator returned by MPI_Ispawn(). MPI_Ispawn() must be able to
work with intercommunicators specified by the parent processes.
Can an intercommunicator be the input to a call that creates another
intercommunicator? Or, put another way, can an intercommunicator become
one side of a new intercommunicator?
The problem of how the client specifies the root rank of the
intercommunicator for MPI_Ispawn() is solved in the Extended Collective
Chapter, Section 6.4.3 "Rooted Example" of Intercommunicator Collective
operations.
This requires that the "inetd" process that accepts the client connection send
its rank to the client so the client can use it in the collective
MPI_Ispawn() call. An interesting possibility, since the MPI_Accept() call
specifies an input parameter "root", would be for the MPI_Connect() call
to specify a matching output parameter "root", or to define that rank 0 of
the intercommunicator returned by MPI_Connect() is always the "root" of
the MPI_Accept().
An alternative to having servers talk to clients via MPI_PARENT_COMM
would be to have "inetd" spawn the server by itself and then send to both
the server and client a new "name" with which to establish a new MPI
connection:
    sprintf(server_name, "%s:%s", hostname, next_avail_port);
    MPI_Spawn(req_cache[index]->server, args, 1, "where",
              rank_0, MPI_COMM_SELF, new_comm);
    /* Send strlen + 1 bytes so the terminating NUL travels with the name */
    MPI_Send(server_name, strlen(server_name) + 1, MPI_CHAR, 0,
             NEW_NAME_TAG, req_cache[index]->comm1);
    MPI_Send(server_name, strlen(server_name) + 1, MPI_CHAR, 0,
             NEW_NAME_TAG, new_comm);
In this case the client would not participate in the MPI_Spawn call.
The client would call MPI_Connect() specifying the server_name given to
it by inetd. The new "stock_ticker" server would call MPI_Accept()
specifying the same server_name. Not really the current inetd model,
but it would work.
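
The name that inetd hands to both sides could be built along these lines. This is a sketch under assumptions: make_server_name is a hypothetical helper, and the port is taken to be a number here, whereas the fragment above formats next_avail_port as a string. The return value is the byte count to pass to a send, including the terminating NUL, so the receiver gets a usable C string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format "host:port" into buf.
 * Returns the number of bytes to transmit, including the terminating
 * NUL, or 0 if the name does not fit in a buffer of `size` bytes. */
size_t make_server_name(char *buf, size_t size,
                        const char *host, unsigned port)
{
    int n = snprintf(buf, size, "%s:%u", host, port);

    /* snprintf reports the untruncated length; n >= size means the
     * name was cut short and must not be handed out. */
    if (n < 0 || (size_t)n >= size)
        return 0;
    return (size_t)n + 1;
}
```

Both the client and the newly spawned server would receive the same bytes and pass the resulting string to their connect and accept calls respectively.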
We still have not solved the problem of what MPI_Status means for all
these new requests. As a first step in that direction I would like to
propose a new routine for the External Interfaces Chapter:
MPI_Request_class(request, class)
    IN   request   An MPI_Request structure
    OUT  class     Defined constant specifying "class" of request.
This routine returns the class of an MPI_Request structure. Possible
classes are: MPI_SEND, MPI_RECV, MPI_ACCEPT, MPI_SPAWN, MPI_NOTIFY,
and MPI_INACTIVE. This routine can be used after an MPI_Waitany(),
MPI_Waitsome(), MPI_Waitall(), MPI_Testany(), MPI_Testsome(), or
MPI_Testall() call to determine what kind of request has completed.
This may also be useful to determine the validity of fields of
the corresponding MPI_Status structure.
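
To show how the proposed routine might be used, here is a mock that compiles without an MPI library. Everything in it is an assumption made for illustration: the CLASS_* constants stand in for the proposed MPI_* class constants, struct mock_request stands in for the opaque MPI_Request, and treating a NULL request as inactive is a choice of this sketch, not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the proposed class constants (hypothetical values). */
enum request_class {
    CLASS_INACTIVE, CLASS_SEND, CLASS_RECV,
    CLASS_ACCEPT, CLASS_SPAWN, CLASS_NOTIFY
};

/* Mock request: a real implementation would keep the class inside
 * its opaque request object. */
struct mock_request {
    enum request_class cls;
};

/* Mock of the proposed MPI_Request_class(request, class).
 * A NULL request is reported as inactive (assumption of this sketch). */
void mock_request_class(const struct mock_request *request,
                        enum request_class *cls)
{
    assert(cls != NULL);
    *cls = (request != NULL) ? request->cls : CLASS_INACTIVE;
}

/* Name the follow-up step the inetd loop takes for a completed request,
 * using the class query instead of a user-maintained request cache. */
const char *next_action(const struct mock_request *request)
{
    enum request_class cls;

    mock_request_class(request, &cls);
    switch (cls) {
    case CLASS_ACCEPT: return "spawn server";
    case CLASS_SPAWN:  return "post notify";
    case CLASS_NOTIFY: return "free resources";
    default:           return "no action";
    }
}
```

With such a query available, the class field of the request cache in the inetd listing would no longer need to be tracked by the application.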
Discussion:
There is already a proposed routine called "MPI_Request_type", used
for Generalized Requests, so this routine's name is proposed to be
MPI_Request_class. It could also be MPI_Get_request_class.
It can certainly be argued that there are or will be more classes
defined than those listed above.
joel clark