
Re: [mpi-21] MPI Forum suggestions 11/14



Short answers:

1. const: Sure, this seems like a good idea. But it *is* an API change, and I agree with Bill Gropp that there should be no API changes for MPI 2.1 (except for bug fixes), even if the changes supposedly won't matter, because someone will find a case where they do matter. More specifically: this is a slippery slope for 2.1. Making one "minor" change will lead to more minor changes, which will lead to ...

2. byte_count: This seems both redundant and syntactic sugar for a #define that a user could implement. Additionally, what if the buffer is not contiguous -- what should byte_count be? Let's not fall into the same morass as the MPI extent, true extent, etc.

3. Globalization: Just like #1 -- seems like a good idea, but I'm quite hesitant to add it to 2.1. FWIW: when we started the Open MPI project, we had i18n as an explicit goal for all of our help messages, etc. We even did a bunch of preliminary work in the i18n arena. We were then *overwhelmingly* told by all the non-native English speakers on the project (spanning many countries in Europe and Asia) that they did not want non-English capabilities because English is the language of HPC. I was quite surprised at the vehemence of the response that we got to *not* i18n-ize Open MPI. Can someone cite strong customer demand for internationalization? (an honest question) Note: I'm not asking about corporate policies; I'm asking about actual customer demand.

4. ABI: this is a Very Large Debate(tm), and likely worthy of face-to-face discussion (can we avoid a repeat of weeks of 10-page e-mails on the subject? See the old Beowulf list threads). For the same reasons as #1 and #3, it seems dangerous to discuss this within the context of 2.1. [Further] Discussion on ABI can certainly occur -- there are a lot of passions about this topic on both sides -- but I do not think that it is appropriate for 2.1. MPI 2.1 should be strictly limited to MPI-1 and MPI-2 bug fixes only and putting out new, consolidated documents (IMHO).

5. New language bindings (from Tony's mail): ditto to #1 and #3. Can someone cite strong customer demand for new language bindings? (an honest question)



On Nov 15, 2007, at 1:36 PM, Erez Haba wrote:

Hello all,

Here are my suggestions from yesterday’s forum meeting. In addition to the two issues I raised yesterday, I have added two more: I reiterate the ABI issue and raise a new one related to globalization. All these suggestions target the “minor” revision that Rich was talking about.

Your comments are welcome.

ISSUE 1: const correctness in the C and C++ bindings
const correctness guarantees are missing from the C and C++ bindings for MPI. This means that the contract between the user and the MPI library is weaker than it should be. Users need to cast away constness before calling MPI, and compilers cannot optimize the caller or the library implementation by taking the constness of the parameter into account.


I suggest adding the const keyword to the C and C++ MPI bindings wherever appropriate. There is no impact on backward compatibility: already compiled programs will run without a problem with new dynamically loaded libraries. Recompiled programs should see no new compilation errors or warnings, since the const keyword guarantees a stronger contract.

A few examples: the following interfaces,

int MPI_Send(void* buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm);

int MPI_Gatherv(void* sendbuf, int sendcnt, MPI_Datatype sendtype, void* recvbuf, int* recvcnts, int* displs, MPI_Datatype recvtype, int root, MPI_Comm comm);

should be declared as,

int MPI_Send(const void* buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm);

int MPI_Gatherv(const void* sendbuf, int sendcnt, MPI_Datatype sendtype, void* recvbuf, const int* recvcnts, const int* displs, MPI_Datatype recvtype, int root, MPI_Comm comm);
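
To make the casting problem concrete, here is a minimal sketch in plain C. The functions send_nonconst, send_const, and publish are hypothetical stand-ins invented for illustration; they are not MPI calls and only copy bytes into a caller-supplied buffer. The point is the call site: a caller holding read-only data has to cast away const for the first prototype and does not for the second.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for a send-style binding; these are NOT MPI calls. */

/* Current style: the buffer is non-const even though it is only read. */
static size_t send_nonconst(void *buf, size_t nbytes, unsigned char *wire)
{
    memcpy(wire, buf, nbytes);
    return nbytes;
}

/* Proposed style: const states that the callee only reads the buffer. */
static size_t send_const(const void *buf, size_t nbytes, unsigned char *wire)
{
    memcpy(wire, buf, nbytes);
    return nbytes;
}

/* A caller that holds read-only data. Returns the bytes sent, or 0 if the
 * two calls disagree. */
static size_t publish(const int *values, size_t n, unsigned char *wire)
{
    /* With the non-const prototype the caller must cast away const. */
    size_t a = send_nonconst((void *)values, n * sizeof *values, wire);

    /* With the const prototype no cast is needed. */
    size_t b = send_const(values, n * sizeof *values, wire);

    return a == b ? b : 0;
}
```

Both calls behave identically at run time; the difference is only in what the prototype promises and in whether the cast is required.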


ISSUE 2: MPI is not secure
The C runtime library (CRT) has been revised to provide more secure interfaces for many of its functions. For example, strcpy is considered insecure, and a new function named strcpy_s has been defined. This new function takes an extra parameter that specifies the destination buffer size, and returns errno_t. This change helps implementations avoid buffer overruns.


Similarly, the MPI interface does not let applications secure their implementations by specifying the actual buffer size. The actual buffer size must be calculated from the datatype and the count, which may not match the size the programmer has in mind or the size actually provided to the call.

I suggest adding a secure flavor for some of the APIs.

For example, the MPI_Recv,

int MPI_Recv(void* buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status* status);

can be extended by adding a secured call,

int MPI_Recv_s(void* buf, MPI_Aint byte_count, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status* status);

An MPI implementation can verify that count x datatype is contained within the address range [buf, buf+byte_count).
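
As a rough sketch of what such a check might look like, the helper below tests whether count elements fit in byte_count bytes. The name recv_fits and its signature are hypothetical and not part of any MPI implementation. It assumes a contiguous datatype whose extent equals its size (type_size); it does not address non-contiguous datatypes.

```c
#include <stddef.h>

/* Hypothetical helper, not part of MPI: returns 1 when `count` elements of
 * `type_size` bytes each fit in a buffer of `byte_count` bytes, else 0.
 * Assumes a contiguous datatype whose extent equals its size. */
static int recv_fits(size_t byte_count, int count, size_t type_size)
{
    if (count < 0)
        return 0;                 /* a negative count is never valid */
    if (count == 0 || type_size == 0)
        return 1;                 /* nothing will be written */

    /* Divide instead of multiplying so the check itself cannot overflow. */
    return (size_t)count <= byte_count / type_size;
}
```

For example, ten 4-byte elements fit in a 40-byte buffer but not in a 39-byte one.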

ISSUE 3: Globalization
The MPI standard says nothing about globalization/internationalization, although some of the APIs take a character string as an input/output parameter. This leaves the behavior of MPI with characters outside the ASCII character set undefined.


Suggestion 3.1
The spec should declare the strings in the existing APIs as UTF-8. Thus, MPI implementations would need to be updated to handle UTF-8 strings.
Note: UTF-8 is compatible with ASCII within the ASCII character range.
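
The compatibility note can be illustrated with a small sketch. The function is_ascii is a hypothetical helper, not an MPI call: a string whose bytes are all below 0x80 is plain ASCII and is byte-for-byte the same string in UTF-8, while any non-ASCII character is encoded in UTF-8 using only bytes at or above 0x80.

```c
#include <stddef.h>

/* Hypothetical helper, not part of MPI: returns 1 when every byte of the
 * NUL-terminated string is in the ASCII range (below 0x80), else 0.
 * Such a string has the same bytes whether read as ASCII or as UTF-8. */
static int is_ascii(const char *s)
{
    for (; *s != '\0'; ++s) {
        if ((unsigned char)*s >= 0x80)
            return 0;   /* lead or continuation byte of a multi-byte sequence */
    }
    return 1;
}
```

A name such as "MPI_COMM_WORLD" passes this test unchanged, whereas "caf\xC3\xA9" (the UTF-8 encoding of "café") does not and occupies five bytes for four characters, which matters for any API with a fixed maximum string length.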


Suggestion 3.2
Add a Unicode flavor for the APIs taking a string. For example,

int MPI_Comm_set_name(MPI_Comm comm, char* comm_name);

can be extended as,

int MPI_Comm_set_name_u(MPI_Comm comm, const wchar_t* comm_name);

using ‘_u’ suffix for Unicode and const for const correctness.


ISSUE 4: ABI
I know that this issue has been discussed before; however, with the success of MPI this issue is becoming more and more important to ISVs. From the feedback I’m getting from vendors, the most important issues are related to the C and FORTRAN bindings. The issues are,


Calling Convention
There is a need to standardize the calling convention, that is, __cdecl, __stdcall, and the parameter passing order on different architectures.
I think that the problem is easier to solve for x64 (where there is only one calling convention, fastcall) and for the C language bindings, where the C standard enables decorating the function with the calling convention.


I suggest that the standard harden the calling convention of the language bindings.

Parameter Size
The standard leaves it open for the implementation to define the underlying type for some common types like MPI_Comm. This is fine, but the problem with that approach is that it is very difficult for vendors to use different implementations (especially with late/dynamic binding to the library). The standard can do better by defining the size of these types. For example, it can define MPI_Comm as equal in size to MPI_Aint (which allows pointers or integers to be used as handles).


I suggest that the standard define the size of the various types.
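
One way to picture a fixed-size handle is the sketch below. All names (example_aint, example_comm, example_comm_s, comm_from_ptr, comm_to_ptr) are hypothetical stand-ins and do not come from any MPI header; intptr_t plays the role of MPI_Aint here as an assumption. The compile-time assertions pin the handle size, and the two helpers show that an implementation using pointers internally could still fit its handles into that size.

```c
#include <stdint.h>

/* Hypothetical stand-ins; real MPI headers define their own types. */
typedef intptr_t example_aint;          /* plays the role of MPI_Aint */
typedef example_aint example_comm;      /* fixed-size handle, as proposed */

struct example_comm_s { int id; };      /* an implementation's internal object */

/* The handle has exactly the size of the address-sized integer ... */
_Static_assert(sizeof(example_comm) == sizeof(example_aint),
               "handle must be the size of the address-sized integer");
/* ... so a pointer-based implementation can store its pointer in it. */
_Static_assert(sizeof(struct example_comm_s *) <= sizeof(example_comm),
               "a pointer must fit in the handle");

/* Pointer-based implementation: the handle carries the object's address. */
static example_comm comm_from_ptr(struct example_comm_s *p)
{
    return (example_comm)(intptr_t)(void *)p;
}

static struct example_comm_s *comm_to_ptr(example_comm h)
{
    return (struct example_comm_s *)(void *)(intptr_t)h;
}
```

An implementation that uses small integers as handles would satisfy the same size constraint by storing the integer directly in example_comm.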


Other issues, like the name of the MPI library to link with, be it a static or a dynamic library, seem to be less of an issue for the vendors I’ve talked to.


Thanks,
.Erez
Microsoft




--
Jeff Squyres
Cisco Systems