Hello all,
Here are my suggestions from yesterday’s forum meeting. In addition
to the two issues I raised yesterday, I have added two more: I
reiterate the ABI issue and raise a new one related to
globalization. All these suggestions target the “minor” revision
that Rich was talking about.
Your comments are welcome.
ISSUE 1: const correctness in the C and C++ bindings
Const-correctness guarantees are missing from the C and C++ bindings
for MPI. This means that the contract between the user and the MPI
library is weaker than it should be. Users need to cast away
const’ness before calling MPI, and compilers cannot optimize the
caller or the library implementation by taking into account the
const’ness of the parameter.
I suggest adding the const keyword to the C and C++ MPI bindings
wherever appropriate. There is no impact on backward compatibility:
already compiled programs will run without a problem with new
dynamically loaded libraries. Recompiled programs should see no new
compilation errors or warnings, since the const keyword guarantees a
stronger contract.
A few examples: the following interfaces,
int MPI_Send(void* buf, int count, MPI_Datatype datatype, int dest,
    int tag, MPI_Comm comm);
int MPI_Gatherv(void* sendbuf, int sendcnt, MPI_Datatype sendtype,
    void* recvbuf, int* recvcnts, int* displs, MPI_Datatype recvtype,
    int root, MPI_Comm comm);
should be declared as,
int MPI_Send(const void* buf, int count, MPI_Datatype datatype,
    int dest, int tag, MPI_Comm comm);
int MPI_Gatherv(const void* sendbuf, int sendcnt, MPI_Datatype sendtype,
    void* recvbuf, const int* recvcnts, const int* displs,
    MPI_Datatype recvtype, int root, MPI_Comm comm);
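To make the cast concrete, here is a minimal sketch that does not call
MPI at all. send_nonconst and send_const are hypothetical stand-ins for
today's and the proposed MPI_Send prototypes; they only sum the bytes
of the buffer so that the example is self-contained.

```c
/* Hypothetical stand-in for today's prototype: the buffer is not const. */
static int send_nonconst(void *buf, int count)
{
    const unsigned char *p = buf;
    int sum = 0;
    for (int i = 0; i < count; i++)
        sum += p[i];
    return sum;
}

/* Hypothetical stand-in for the proposed prototype: the buffer is const. */
static int send_const(const void *buf, int count)
{
    const unsigned char *p = buf;
    int sum = 0;
    for (int i = 0; i < count; i++)
        sum += p[i];
    return sum;
}

/* A caller that only holds read-only data, using today's prototype. */
int send_table_today(void)
{
    static const unsigned char table[4] = { 1, 2, 3, 4 };
    /* Without the cast the compiler diagnoses the discarded const. */
    return send_nonconst((void *)table, 4);
}

/* The same caller with the proposed prototype. */
int send_table_proposed(void)
{
    static const unsigned char table[4] = { 1, 2, 3, 4 };
    /* No cast: the prototype promises not to modify the buffer. */
    return send_const(table, 4);
}
```

Callers that hold non-const buffers are unaffected, since a pointer to
non-const data converts implicitly to a pointer to const data.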
ISSUE 2: MPI is not secure
The C runtime library (CRT) has gone through a revision to provide
more secure interfaces for many of its functions. For example,
strcpy is considered insecure and a new function named strcpy_s has
been defined. This new function takes an extra parameter that
specifies the destination buffer size, and returns errno_t. This
change helps implementations avoid buffer overruns.
Similarly, the MPI interface does not enable applications to secure
their implementations by specifying the actual buffer size. The
buffer size that MPI uses must be calculated from the datatype and
the count, which may not match what the programmer thinks the size
should be or the buffer actually provided to the call.
I suggest adding a secure flavor for some of the APIs.
For example, the MPI_Recv,
int MPI_Recv(void* buf, int count, MPI_Datatype datatype, int
source, int tag, MPI_Comm comm, MPI_Status* status);
can be extended by adding a secured call,
int MPI_Recv_s(void* buf, MPI_Aint byte_count, int count,
MPI_Datatype datatype, int source, int tag, MPI_Comm comm,
MPI_Status* status);
An MPI implementation can then verify that the data described by
count and datatype is contained within the address range
[buf, buf+byte_count).
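The check itself could look roughly like the sketch below. This is
only an illustration under simplifying assumptions: the datatype is
treated as contiguous, so its footprint is count times a per-element
size, and ptrdiff_t stands in for MPI_Aint. recv_fits_in_buffer is a
hypothetical helper name, not a proposed API.

```c
#include <stddef.h>

/* Returns 1 when count elements of type_size bytes each fit inside a
   buffer of byte_count bytes, 0 otherwise. Negative inputs are rejected.
   Assumes a contiguous datatype (an assumption of this sketch). */
int recv_fits_in_buffer(ptrdiff_t byte_count, int count, ptrdiff_t type_size)
{
    if (byte_count < 0 || count < 0 || type_size < 0)
        return 0;
    if (count == 0 || type_size == 0)
        return 1;   /* nothing would be written */
    /* Divide instead of multiplying so the check cannot overflow. */
    return (ptrdiff_t)count <= byte_count / type_size;
}
```

For example, ten 4-byte elements fit in a 40-byte buffer but not in a
39-byte one. A real implementation would have to work from the full
extent of the datatype rather than a single element size.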
ISSUE 3: Globalization
The MPI standard says nothing about globalization/
internationalization, although some of the APIs take a character
string as an input/output parameter. This leaves the behavior of MPI
with characters outside the ASCII character set undefined.
Suggestion 3.1
The spec should declare the strings in the existing APIs to be UTF-8.
MPI implementations would then need to be updated to handle UTF-8
strings.
Note: UTF-8 is compatible with ASCII within the ASCII character range.
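A small sketch of what that compatibility means in practice: a string
made only of bytes below 0x80 is already valid UTF-8, while characters
outside ASCII occupy several bytes each, so byte length and character
count differ. The helper names are hypothetical, and the counter
assumes its input is valid UTF-8.

```c
#include <stddef.h>

/* Returns 1 if every byte is below 0x80, i.e. the string is plain ASCII
   and therefore reads identically as UTF-8. */
int is_ascii_only(const char *s)
{
    for (; *s != '\0'; s++) {
        if ((unsigned char)*s >= 0x80)
            return 0;
    }
    return 1;
}

/* Counts code points in a NUL-terminated UTF-8 string by skipping
   continuation bytes (bit pattern 10xxxxxx). Assumes valid UTF-8. */
size_t utf8_code_points(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

One consequence for the existing APIs is that any length limit on a
name would count bytes, not characters, once names may contain
non-ASCII text.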
Suggestion 3.2
Add a Unicode flavor for the APIs that take a string. For example,
int MPI_Comm_set_name(MPI_Comm comm, char* comm_name);
can be extended as,
int MPI_Comm_set_name_u(MPI_Comm comm, const wchar_t* comm_name);
using the ‘_u’ suffix for Unicode and const for const correctness.
ISSUE 4: ABI
I know that this issue has been discussed before; however, with the
success of MPI it is becoming more and more important to ISVs. From
the feedback I’m getting from vendors, the most important issues are
related to the C and FORTRAN bindings. The issues are:
Calling Convention
There is a need to standardize the calling convention, that is,
__cdecl versus __stdcall and the parameter passing order on
different architectures.
I think that the problem is easier to solve for x64 (where there is
only one calling convention, fastcall) and for the C language
bindings, where compilers allow decorating the function with the
calling convention.
I suggest that the standard harden the calling convention of the
language bindings.
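As an illustration of what a hardened convention might look like in a
header, here is a minimal sketch. EXAMPLE_MPI_CALL and
example_comm_rank are hypothetical names made up for this example; the
choice of __stdcall is likewise only an assumption, and the point is
simply that every binding carries one agreed decoration.

```c
/* EXAMPLE_MPI_CALL is a hypothetical macro, not part of any standard. */
#if defined(_MSC_VER) && defined(_M_IX86)
#  define EXAMPLE_MPI_CALL __stdcall  /* 32-bit x86: pin one convention */
#else
#  define EXAMPLE_MPI_CALL            /* elsewhere: the platform default */
#endif

/* Every binding is declared with the same decoration. */
int EXAMPLE_MPI_CALL example_comm_rank(int comm, int *rank);

/* Placeholder body so that the sketch compiles and links on its own. */
int EXAMPLE_MPI_CALL example_comm_rank(int comm, int *rank)
{
    (void)comm;
    *rank = 0;
    return 0;
}
```

With the decoration in the header, an application compiled with a
different default convention still calls the library correctly.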
Parameter Size
The standard leaves it open for the implementation to define the
underlying type for some common types like MPI_Comm. This is fine,
but the problem with that approach is that it is very difficult for
vendors to use different implementations (especially with late/
dynamic binding to the library). The standard can do better by
defining the size of these types. For example, it can define MPI_Comm
as equal in size to MPI_Aint (which allows pointers or integers to
be used as handles).
I suggest that the standard define the size of the various types.
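A minimal sketch of the idea, assuming a C11 compiler for
_Static_assert. example_aint and example_comm are hypothetical
stand-ins for MPI_Aint and MPI_Comm, and intptr_t is used here only as
a convenient pointer-sized integer.

```c
#include <stdint.h>

typedef intptr_t example_aint;      /* stand-in for MPI_Aint */
typedef example_aint example_comm;  /* stand-in for MPI_Comm */

/* The size rule is checked at compile time. */
_Static_assert(sizeof(example_comm) == sizeof(example_aint),
               "handle must be the same size as the address integer");

/* An implementation that uses pointers as handles... */
example_comm handle_from_pointer(void *p)
{
    return (example_comm)(intptr_t)p;
}

void *pointer_from_handle(example_comm h)
{
    return (void *)(intptr_t)h;
}

/* ...and one that uses small integers both fit the same handle type. */
example_comm handle_from_index(int index)
{
    return (example_comm)index;
}
```

Because both kinds of implementation share one handle size, an
application binary would not need to know which one it is bound to.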
Other issues, like the name of the MPI library to link with, be it a
static or a dynamic library, seem to be less of an issue for the
vendors I’ve talked to.
Thanks,
.Erez
Microsoft