eof/file-size consistency semantics

Ian E. Stockdale (ies@nas.nasa.gov)
Sat, 19 Apr 1997 16:26:52 -0700 (PDT)

There have been two similar proposals floated by John May and
Bill Nitzberg for sharpening the end of file / file size
consistency semantics. I enclose below proposed text for
implementing these proposals. Please send me comments on
any errors and/or omissions which you might find.

Thanks,
Ian

Definitions:
-----------------------------------------------------------------
The {\it size} of an \MPI/ file is measured in bytes from the
beginning of the file. A newly created file has a size of zero
bytes. Using the size as a displacement gives the position of
the byte immediately following the last byte in the file. This
location is the {\it end of file}.
The file size, and thus the end of file, may differ between
processes, subject to the consistency semantics
given in Section \ref{sec:io-consistency-filesize}.
=================================================================

Consistency and semantics - file consistency - file size
-----------------------------------------------------------------
\subsubsection{File Size}
%--------------------------
\label{sec:io-consistency-filesize}

The size of a file may be increased by writing to the file after the
current end of file. The size may also be changed by calling
\MPI/ {\it size changing} routines,
such as MPI\_File\_set\_size. A call to a size changing routine
does not necessarily change the file size. For example, calling
MPI\_File\_preallocate with a size less than the current size does
not change the size.

Consider the set of bytes that have been written to a file since
the most recent call to a size changing routine,
or since MPI\_Open if no such routine has been called.
Let the {\it high byte} be the byte
in that set with the largest displacement. The file size
is the larger of
\begin{itemize}
\item One plus the displacement of the high byte.
\item The size immediately after the size changing routine,
or MPI\_Open, returned.
\end{itemize}

In applying consistency semantics to determine the file size, calls to
MPI\_File\_set\_size and MPI\_File\_preallocate are considered writes
to the file.

\begin{users}
Any sequence of operations containing either of the collective routines
MPI\_File\_set\_size or MPI\_File\_preallocate is a write sequence.
As such, sequential consistency in non-atomic mode is not
guaranteed unless the conditions in Section \ref{sec:io-filecntl-atomicity}
are satisfied.
\end{users}
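
As an illustration only (not part of the proposed text), the size rule
above can be sketched as a small single-process model in C. The type
file_model and the model_* functions are hypothetical names made up for
this sketch; they are not MPI routines, and the model ignores the
question of when different processes see the same size. It assumes that
a set-size call sets the size exactly and that a preallocate call never
shrinks the file, as described above.

```c
#include <assert.h>

/* Hypothetical single-process model of the proposed size rule.
 * None of these names are part of MPI. */
typedef struct {
    long long base_size; /* size right after the last size changing
                            routine (or open) returned */
    long long high_byte; /* largest displacement written since then,
                            or -1 if nothing has been written */
} file_model;

/* Open a file whose size at open time is existing_size bytes. */
static file_model model_open(long long existing_size)
{
    file_model f = { existing_size, -1 };
    return f;
}

/* File size: the larger of (one plus the displacement of the high
 * byte) and the size left by the last size changing routine. */
static long long model_size(const file_model *f)
{
    long long from_writes = f->high_byte + 1; /* 0 if no writes yet */
    return from_writes > f->base_size ? from_writes : f->base_size;
}

/* Record a write of nbytes bytes starting at displacement disp. */
static void model_write(file_model *f, long long disp, long long nbytes)
{
    assert(disp >= 0);
    if (nbytes <= 0)
        return; /* an empty write touches no bytes */
    long long last = disp + nbytes - 1;
    if (last > f->high_byte)
        f->high_byte = last;
}

/* Models a set-size call: the size becomes exactly `size`, and the
 * set of bytes written "since the last size change" starts over. */
static void model_set_size(file_model *f, long long size)
{
    f->base_size = size;
    f->high_byte = -1;
}

/* Models a preallocate call: the file may grow but never shrinks. */
static void model_preallocate(file_model *f, long long size)
{
    long long current = model_size(f);
    f->base_size = size > current ? size : current;
    f->high_byte = -1;
}
```

For example, starting from a new (empty) file, writing 10 bytes at
displacement 100 gives a size of 110. Preallocating 50 bytes leaves the
size at 110, while setting the size to 40 truncates it to 40. A later
write of 5 bytes at displacement 10 leaves the size at 40, and a write
of 8 bytes at displacement 40 raises it to 48.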
=================================================================

Significant text modifications in:

MPI_Set_size:
-----------------------------------------------------------------
Delete the first sentence from the first paragraph. It
now reads:

If \mpiarg{size} is smaller than the current file size,
the file is truncated at the position defined by \mpiarg{size}.
The implementation is free to deallocate file blocks located
beyond this position.

Modify last sentence of last paragraph. It now reads:

All nonblocking requests
and split collective operations
on \mpiarg{fh} must be completed
before calling \func{MPI\_FILE\_SET\_SIZE}.
Otherwise, calling \func{MPI\_FILE\_SET\_SIZE} is erroneous.
The consistency semantics governing
\func{MPI\_FILE\_SET\_SIZE} are described in
Section \ref{sec:io-consistency-filesize}.
=================================================================

misc:
-----------------------------------------------------------------
In MPI_Seek:
Modify to obtain:
\const{MPI\_SEEK\_END}:
the pointer is set to the end of file plus \mpiarg{offset}
Delete definition of end of view.

In MPI_Seek_shared:
Modify to obtain:
\const{MPI\_SEEK\_END}:
the pointer is set to the end of file plus \mpiarg{offset}
Delete definition of end of view.

In "random vs. sequential files":
"end-of-file" becomes "end of file"
=================================================================