Re: caching and naming other handles

Rajeev Thakur (thakur@mcs.anl.gov)
Fri, 6 Dec 1996 11:19:21 -0600

> I would like to raise a new issue. At the last meeting we voted to
> restrict naming to communicators and put the caching capability on new
> handles (other than communicators) in the JOD. I view these two
> decisions as related. Several people have expressed regret at these
> decisions. I would be interested in gathering examples and reasons to
> support putting the capability back on other handles and which handles
> are really needed. I would also be interested in hearing from
> implementors that these capabilities are or are not a problem. I
> think we can make a push at the next meeting to reverse the decision
> (if that is what comes out of the discussion), but we need to be armed
> and ready this time.

To refresh people's memories, I am resending the mail I sent a couple
of months ago about caching on datatypes and file handles...

--------------------------------------
At the last MPI meeting, the Forum voted out the section on "Caching
on MPI Handles" in the external interfaces chapter. Part of the reason
may be that the section allowed caching on all MPI handles.
Though it may not be necessary to support caching on all handles, it
is definitely needed on some handles. As explained below, I need
caching on datatypes in my MPI-IO implementation. I can also see a
need for caching on file handles. Therefore, I suggest that at the
next MPI meeting we reconsider the issue of allowing caching on
select MPI handles. My short list includes datatypes and file
handles. I heard that there may be some technical difficulties in
caching on datatypes, but those need to be addressed.

Rajeev

-----------for those interested in the details----------

1. Caching on datatypes:

In my "layerable" implementation of MPI-IO, I create a flattened
version of derived datatypes, and I need to store it somewhere for use
later in the program. The ideal way would be to cache the flattened
version on the datatype itself, so that when the datatype is freed,
the flattened version is also freed. Since caching on datatypes is not
currently available, I maintain a global linked list (maybe a hash
table later on) of all flattened datatypes, indexed by the datatype
handle. To make sure this list gets freed at the end of the program,
I maintain an attribute on MPI_COMM_WORLD (it could be MPI_COMM_SELF,
or anything else that would be freed by MPI_Finalize) and provide a
delete function for it.

This works fine, except that when the user frees a derived datatype,
the flattened version doesn't get freed automatically. If the system
then reuses the same datatype handle for a newly created derived
datatype, the new handle will match the key for the flattened version
of the old datatype in my linked list!
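
The stale-key problem can be illustrated with a minimal sketch in plain
C. This is not the actual MPI-IO code: the names type_handle, flat_node,
flat_insert, flat_lookup and flat_free_all are hypothetical, handles are
modeled as plain ints, and the flattened datatype is reduced to a single
block count.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a datatype handle (real handles are opaque). */
typedef int type_handle;

typedef struct flat_node {
    type_handle key;          /* datatype handle used as the lookup key */
    int block_count;          /* stand-in for the flattened representation */
    struct flat_node *next;
} flat_node;

static flat_node *flat_list = NULL;   /* the global linked list */

/* Record a flattened version for handle h. Returns 0 on success. */
static int flat_insert(type_handle h, int block_count) {
    flat_node *n = malloc(sizeof *n);
    if (n == NULL) return -1;
    n->key = h;
    n->block_count = block_count;
    n->next = flat_list;
    flat_list = n;
    return 0;
}

/* Return the entry recorded for handle h, or NULL if there is none. */
static flat_node *flat_lookup(type_handle h) {
    for (flat_node *n = flat_list; n != NULL; n = n->next)
        if (n->key == h) return n;
    return NULL;
}

/* Free the whole list, as a delete function on the communicator
   attribute would do at the end of the program. */
static void flat_free_all(void) {
    while (flat_list != NULL) {
        flat_node *next = flat_list->next;
        free(flat_list);
        flat_list = next;
    }
}

/* Demo: the old datatype got handle 7 and was flattened into 3 blocks.
   The user frees it, but the list is never told. When handle 7 is
   reused for a new datatype, the lookup returns the old entry. */
static int demo_stale_lookup(void) {
    flat_free_all();
    flat_insert(7, 3);                 /* old datatype, handle 7 */
    /* ... old datatype freed by the user; entry stays in the list ... */
    flat_node *hit = flat_lookup(7);   /* new datatype, same handle */
    int stale = (hit != NULL) ? hit->block_count : -1;
    flat_free_all();
    return stale;
}
```

In this sketch demo_stale_lookup() returns the block count of the old
datatype, because nothing removes the entry when the datatype is freed.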

One solution to this problem is to define a new MPI function,
MPI_Type_free_callback(datatype, callback_fn),
which allows me to supply a callback function that will be called when
the datatype is freed. The callback function will delete the flattened
version in my linked list.

A cleaner solution, I think, is to support caching on datatypes.
Wasn't attribute caching intended for these kinds of purposes in the
first place?
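
To show what caching on the datatype would buy, here is a toy model in
plain C, again only a sketch under stated assumptions. It does not use
any MPI call: toy_type, toy_type_cache_flat, toy_type_get_flat and
toy_type_free are hypothetical names, and the object has a single
attribute slot with a delete function, loosely modeled on the way
attribute caching works on communicators.

```c
#include <stdlib.h>

/* Delete function invoked on a cached attribute when its owner is freed. */
typedef void (*delete_fn)(void *attr);

/* Hypothetical toy datatype object with one attribute slot. */
typedef struct {
    void *attr;           /* cached value, or NULL */
    delete_fn on_delete;  /* called with attr when the type is freed */
} toy_type;

/* Stand-in for the flattened representation. */
typedef struct { int block_count; } flat_rep;

static int flat_live = 0;   /* number of flattened copies currently alive */

static void flat_delete(void *attr) {
    free(attr);
    flat_live--;
}

static toy_type *toy_type_create(void) {
    return calloc(1, sizeof(toy_type));   /* attr and on_delete start NULL */
}

/* Cache a flattened version on the type itself. Returns 0 on success. */
static int toy_type_cache_flat(toy_type *t, int block_count) {
    flat_rep *f = malloc(sizeof *f);
    if (f == NULL) return -1;
    f->block_count = block_count;
    if (t->attr != NULL && t->on_delete != NULL)
        t->on_delete(t->attr);            /* replace any previous value */
    t->attr = f;
    t->on_delete = flat_delete;
    flat_live++;
    return 0;
}

static flat_rep *toy_type_get_flat(const toy_type *t) {
    return (flat_rep *)t->attr;
}

/* Freeing the type runs the delete function, so the flattened version
   goes away with it and no stale entry can survive. */
static void toy_type_free(toy_type *t) {
    if (t->attr != NULL && t->on_delete != NULL)
        t->on_delete(t->attr);
    free(t);
}
```

Because the flattened version lives on the object rather than in a
separate list keyed by handle, a reused handle value can never find
data that belonged to a freed datatype.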

2. Caching on file handles:

In the I/O chapter, there is a proposal by Leslie Hart on I/O
Service Interception (Section 10.8). This proposal allows the user to
add alternate service routines for each of the I/O functions in the
chapter. This is intended to support special purpose I/O services
such as remote I/O, data compression, etc. The ability to cache on
file handles is essential for this proposal to work.

In general, I think special-purpose I/O libraries built on top of
MPI-IO would need caching on file handles.