Re: eof/file-size consistency semantics

Jean-Pierre Prost (jpprost@watson.ibm.com)
Tue, 22 Apr 1997 15:14:00 -0400

Basically, with your revised definition, you call "end of file"
what I call "end of view". However, as you mention yourself,
this new definition, specifying that the end of file might be
different for different processes (according to their own
view of the file), does not seem to be the notion that people
commonly have of the end of file, which used to be unique for any
given file.
Therefore, I would prefer to keep the term "end of
view" for this new concept.
Consequently, there is no real need to define the end of file. However, if
you think that we should define it for historical reasons, it would
be "the position of the byte following the last byte in the file."

Assuming you keep both definitions:
In the last paragraph of the "Data Access Conventions" section,
on 233/24-25, we could use either end of file or end of view, and
in seek operations, end of view should be used.

Jean-Pierre

ies @ nas.nasa.gov
04/22/97 02:42 PM

To: mpi-io @ mcs.anl.gov
cc: (bcc: Jean-Pierre Prost/Watson/IBM Research)
Subject: Re: eof/file-size consistency semantics

John May wrote:
> So seeking with MPI_SEEK_END can drop you into the middle of
> a hole? And presumably the next read or write would access
> that spot?

Whoosh... I hear the sound of a missed point going whizzing by my ear.
I answered your first question incorrectly. MPI_SEEK is written in
terms of offsets, and I was thinking displacements when I replied.

I agree with the point implicit in your note. There is a problem with
the first version of the definition in that it is in terms of a
displacement (bytes, absolute) rather than an offset (etypes, current
view). MPI_SEEK, of course, uses offsets. If the definition is
changed to something like:

Definitions:
-----------------------------------------------------------------
The {\it size} of an \MPI/ file is measured in bytes from the
beginning of the file. A newly created file has a size of zero
bytes. Using the size as an absolute displacement gives
the position of the byte immediately following the last byte in
the file. For any given view, the {\it end of file} is the next
accessible etype following this byte. The file size, and thus the
end of file, may vary between different processes subject to the
consistency semantics given in \ref{sec:io-consistency-filesize}.
=================================================================

then it works much better (correctly, even...) with SEEK. Suggestions
for further improvements are encouraged.
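
To make the displacement/offset distinction concrete, here is a minimal
sketch of how the end of file could be computed for a given view under
the definition above. It is not taken from the draft. The function
names, the parameters, and the representation of a filetype as a tile of
`tile_len` etype slots, of which only the indices in `accessible` are
visible, are assumptions made for illustration. It also assumes that
"next accessible etype following this byte" means the first accessible
etype that starts at or after the byte position given by the file size.

```python
def eof_offset(size, disp, esize, tile_len, accessible):
    """Offset (in etypes, relative to the view) of the end of file.

    size       -- file size in bytes (an absolute displacement)
    disp       -- view displacement in bytes
    esize      -- size of one etype in bytes
    tile_len   -- filetype extent, measured in etypes (hypothetical model)
    accessible -- sorted etype indices within one tile that the view sees
    """
    if size <= disp:
        # The file ends at or before the start of the view.
        return 0
    tile_bytes = tile_len * esize
    # Whole tiles that lie entirely before byte `size`, plus the remainder.
    tiles, within = divmod(size - disp, tile_bytes)
    # Accessible etypes in the partial tile that start before byte `size`.
    before = sum(1 for i in accessible if i * esize < within)
    return tiles * len(accessible) + before


def offset_to_byte(offset, disp, esize, tile_len, accessible):
    """Absolute byte position at which the etype at `offset` starts."""
    tile, k = divmod(offset, len(accessible))
    return disp + (tile * tile_len + accessible[k]) * esize


# A 10-byte file, 4-byte etypes, filetype of 4 slots with slots 2 and 3 as holes.
view_a = dict(disp=0, esize=4, tile_len=4, accessible=[0, 1])
# A second process sees the complementary slots of the same file.
view_b = dict(disp=0, esize=4, tile_len=4, accessible=[2, 3])

eof_a = eof_offset(10, **view_a)
eof_b = eof_offset(10, **view_b)
byte_a = offset_to_byte(eof_a, **view_a)
byte_b = offset_to_byte(eof_b, **view_b)
```

In this example byte 10 (the file size used as a displacement) falls in
a hole of the first view, whereas the offset-based end of file for that
view lands on the next accessible etype. The second view reaches a
different byte position for the same file, which is the per-view
behavior discussed in this thread.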

> If the choice is between an easy-to-write definition that
> gives you this behavior and a hard-to-write (or hard-to-read)
> definition that keeps you aligned with your filetype,
> I'd definitely choose the latter.

Indubitably. The definition is now slightly longer, and much more
useful.

Jean-Pierre followed up:
> Hope this clarifies things. It seems to me that it will be difficult
> to get rid of the concept of end of view.

I think the revised definition repairs the confusion. (My reply to
John was incorrect in terms of both the current text and my own
proposed modifications.)

It seems like we may have a choice between defining an "end of file"
as above, or defining the "end of view" as above and not using the
term "end of file". EOF has both the advantage, and the disadvantage,
that people already know what it means.

-Ian

P.S. Vis a vis the "whole filetypes/no etypes" question, this text
would need to be changed by substituting "filetype" for
"etype".