sequential I/O

Terry R. Jones (trj@nimble.llnl.gov)
Tue, 4 Feb 97 11:07:11 PST

On Jan-10-97, Nick Maclaren (nmm1@cam.ac.uk) proposed additions
to the I/O chapter with the intent of improving support for
sequential I/O. The proposal was especially interesting since
it incorporated Nick's extensive experience with operating systems
other than Unix.

Due primarily to time constraints, we were not able to adequately
discuss the proposal as part of the formal proceedings of the January
meeting. However, a number of I/O sub-committee members did get together
for an extended informal discussion regarding the proposal after hours.
(Hey, who needs sleep :-))

At this informal discussion, there was general agreement that sequential
I/O was important and should be addressed -- but a number of questions
were raised. It seemed to us that most/all of the functionality could be
provided by coding the applications to use the following two items:

1) shared file pointers as currently specified (i.e. every node
must have the same filetype).

2) adding a new info (say "MPI_IO_MATCHED") to the open info list.
Rationale
Since the read and write routines have datatype and count,
a new info could be added to allow applications to specify
when the datatype and count will be the same for collective
operations. This allows the potentially important optimization
that a collective operation may start once the first member has
checked in.

For example, on collective writes you can avoid synchronization
by doing memcopies of each node's buffers into a single large
contiguous buffer as the nodes check in -- but the contents
for any given node can only be flushed from the single large
contiguous buffer to a pipe or tape once all the nodes below
it have checked in. A similar strategy applies to collective
reads.
End Rationale
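
To make the write strategy in the rationale concrete, here is a rough
sketch in plain C. It is only an illustration, not proposed text and not
MPI code: the names staged_write, staged_init, staged_checkin, NNODES and
CHUNK are all made up for this example, "nodes below it" is assumed to
mean lower-numbered nodes, and an in-memory array stands in for the pipe
or tape. It assumes every node contributes the same number of bytes,
which is what the "matched" info would promise.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NNODES 4   /* hypothetical number of nodes in the collective */
#define CHUNK  8   /* bytes per node; identical on all nodes (the "matched" case) */

typedef struct {
    char   staging[NNODES * CHUNK]; /* the single large contiguous buffer */
    int    arrived[NNODES];         /* arrived[i] is 1 once node i has checked in */
    int    next_to_flush;           /* lowest-numbered node not yet flushed */
    char   out[NNODES * CHUNK];     /* stand-in for the pipe or tape */
    size_t out_len;                 /* bytes flushed to out so far */
} staged_write;

/* Reset all state: nothing has arrived, nothing has been flushed. */
void staged_init(staged_write *w)
{
    memset(w, 0, sizeof *w);
}

/*
 * Node `rank` checks in with CHUNK bytes of data. Its data is copied into
 * its slot at once, with no waiting on the other nodes. Data is then
 * flushed strictly in node order, and only as far as the contiguous run
 * of checked-in nodes reaches. Returns the bytes flushed by this call.
 */
size_t staged_checkin(staged_write *w, int rank, const char *data)
{
    assert(rank >= 0 && rank < NNODES && !w->arrived[rank]);

    memcpy(w->staging + (size_t)rank * CHUNK, data, CHUNK);
    w->arrived[rank] = 1;

    size_t flushed = 0;
    while (w->next_to_flush < NNODES && w->arrived[w->next_to_flush]) {
        memcpy(w->out + w->out_len,
               w->staging + (size_t)w->next_to_flush * CHUNK, CHUNK);
        w->out_len += CHUNK;
        flushed += CHUNK;
        w->next_to_flush++;
    }
    return flushed;
}
```

With four nodes checking in in the order 2, 0, 1, 3, the calls flush 0, 8,
16 and 8 bytes respectively: node 2's data sits in the buffer until nodes
0 and 1 have both arrived, and the output still comes out in node order.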

Unfortunately, none of the people present were able to say with certainty
that the above two items capture the necessary functionality (especially
for non-Unix concerns). There was concern that applications which
exhibited a "blocked-sequential" access style (the sequential accesses
occur in groups of operations which form the file as a collection of
indistinguishable sequential portions) may find item 1) overly restrictive.

If there are reasons that the above two items are insufficient, it would
be desirable to address them and get the necessary functionality into the
standard if at all possible. Since we need the proposal by Friday Feb-7-97,
it may be necessary to organize a conference call. I'm willing
to do that if people think that it would be beneficial. Nick, how do you
suggest we proceed?

-terry