Re: generalized request

Steve Huss-Lederman (lederman@cs.wisc.edu)
Fri, 20 Sep 1996 15:36:34 -0500

I am very glad to see that someone is thinking about the external
chapter. I'll give my first thoughts on the comments.

>
> In an effort of trying to use the new general request to layer I/O on top
> of MPI-2, I have a couple of questions on generalized request.
>
> 1) Two more callback functions may be needed, wait_fn and test_fn.
>
> wait_fn. This callback function is invoked when MPI_Wait is called on
> the request.
> test_fn. This callback function is invoked when the MPI_Test is called on
> the request.
>
> typedef void MPI_Wait_function(MPI_Request meta_req,
> MPI_Request request, MPI_Status *status);
>
> typedef void MPI_Test_function(MPI_Request meta_req,
> MPI_Request request, MPI_Status *status);
>
> This is useful for systems that support async I/O but do not support
> notification. That is, I/O is done asynchronously, but you have to poll
> to find out whether the I/O is done. (And you don't want to spin a thread
> to do the polling).

I want to make sure I understand the desired functionality. I
interpret this to mean that you need external checking to know when
progress has occurred. If you simply call Wait/Test it will not finish
because the request will not be marked as done. I think we are
talking about a capability that is more than GR (generalized request)
provides right now. Currently, it is assumed that MPI can progress the
request. This means that through its own polling (with threads) or
when an MPI function is called, MPI causes the request to make
progress (if it can). Here it seems you want progress to happen
outside of MPI. If this is the case, why not create a new user
function, io_test? In io_test you first do whatever is needed to cause
progress (which may include checking for completion), call
MPI_Request_mark_complete if the I/O is done, and then call MPI_Test.
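
To make the io_test idea concrete, here is a minimal sketch in plain C.
It does not use MPI at all: toy_request, toy_io_poll, toy_mark_complete
and toy_test are hypothetical stand-ins I made up for the request, the
poll-only async I/O query, MPI_Request_mark_complete and MPI_Test. The
only point is the ordering inside io_test: progress, mark, then test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a generalized request; not an MPI type. */
typedef struct {
    bool complete;        /* set by toy_mark_complete() */
    int  polls_remaining; /* simulates async I/O that finishes after N polls */
} toy_request;

/* Stand-in for a poll-only async I/O query: true once the I/O is done. */
bool toy_io_poll(toy_request *req)
{
    if (req->polls_remaining > 0)
        req->polls_remaining--;
    return req->polls_remaining == 0;
}

/* Stand-in for MPI_Request_mark_complete from the draft. */
void toy_mark_complete(toy_request *req)
{
    req->complete = true;
}

/* Stand-in for MPI_Test: it only reports the flag, it never drives the I/O. */
bool toy_test(const toy_request *req)
{
    return req->complete;
}

/* User-level io_test: cause progress, mark complete if done, then test. */
bool io_test(toy_request *req)
{
    if (!req->complete && toy_io_poll(req))
        toy_mark_complete(req);
    return toy_test(req);
}

/* Calls io_test until it reports completion; returns the number of calls. */
int io_wait_count(toy_request *req)
{
    assert(req != NULL);
    int calls = 1;
    while (!io_test(req))
        calls++;
    return calls;
}
```

Calling toy_test alone on a fresh request never succeeds, which is the
problem raised in the question. Calling io_test instead lets the user
code do the polling, so no extra thread and no new callback is needed.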

>
> 2) How to return error is not very well defined. Is there an error field in
> MPI_Status that the user can get at?

I can clarify the text. Each MPI_Status object has a non-opaque field
that holds the error information. It is accessed via status.MPI_ERROR in
C and STATUS(MPI_ERROR) in Fortran.

>
> 3) Maybe it is close enough to the final draft that I can talk a little
> about binding issues. There are these 6 calls in the draft
>
> MPI_Set_request_tag(request, tag)
> MPI_Set_request_source(request, source)
> MPI_Set_request_error(request, error)
> MPI_Set_request_count(request, count, datatype)
>
> MPI_Set_status_count(status, count, datatype)
>
> MPI_Request_mark_complete(request)
>
> It is probably preferable to have just these 3 functions
>
> MPI_Set_status_error(status, error)
> MPI_Set_status_count(status, count, datatype)
>
> MPI_Request_mark_complete(request, status)

The fundamental difference seems to be whether the info is associated
with the request or with the status. With the proposed system you save
4 functions (you don't need MPI_Set_status_error since error is not
opaque). The tradeoff is that you need to have a status at the
MPI_Request_mark_complete call. When you do a Test/Wait, what is put in
the supplied status? I assume it is the same info that was provided
in the status at the MPI_Request_mark_complete. I think the current
design is cleaner since you only have 1 status. In the new design you
need to pass or create a status object wherever
MPI_Request_mark_complete is called. I am happy to hear counter
arguments.
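
A rough way to compare the two designs is the toy model below, again in
plain C with no MPI. All the sketch_ names and the struct layouts are
hypothetical and only mirror the draft calls loosely. It also bakes in
my assumption that Test/Wait hands back exactly the info given at
completion time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status and request layouts; not the MPI definitions. */
typedef struct { int source, tag, error, count; } sketch_status;
typedef struct { bool complete; sketch_status info; } sketch_request;

/* Current draft design: the info is attached to the request by setters. */
void sketch_set_request_error(sketch_request *r, int error) { r->info.error = error; }
void sketch_set_request_count(sketch_request *r, int count) { r->info.count = count; }
void sketch_mark_complete_current(sketch_request *r) { r->complete = true; }

/* Proposed design: the caller must supply a status at completion time. */
void sketch_mark_complete_proposed(sketch_request *r, const sketch_status *s)
{
    r->info = *s;        /* assumed: the request keeps a copy of the status */
    r->complete = true;
}

/* Test in either design: copy the stored info into the caller's status. */
bool sketch_test(const sketch_request *r, sketch_status *out)
{
    assert(out != NULL);
    if (!r->complete)
        return false;
    *out = r->info;
    return true;
}
```

In this model both designs give the same result from sketch_test. The
difference is only that the proposed one needs a sketch_status object to
exist at every place that marks the request complete.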

>
> 4) I also argue that you will need MPI_Status_create and MPI_Status_free
> for layerability.

I don't think you need such functions. You can create a status object
by mallocing the MPI_Status structure or with static declaration if
you are in Fortran and cannot malloc. You do something like:

status = (MPI_Status *)malloc(sizeof(MPI_Status));

Steve