
Re: [mpi-21] ABI - call for working group



As far as I understand it, a morph layer such as MorphMPI or PnMPI is a valid implementation of MPI from the point of view of any application. The layer itself then chooses the local MPI implementation that it will use to implement MPI functionality. As such, the application can come pre-linked to its morph layer, and the morph layer can do all the platform-specific work without the need to recompile anything. While I still see a need to standardize things like mpirun arguments, which are something the user actually sees, most of the other use-cases for the MPI ABI can be satisfied by open-source software that already exists: MorphMPI and PnMPI.

It seems to me that a more feasible MPI ABI proposal would exclude any features that solve problems already solved by the above morph layers. Instead, it should focus on things that cannot be done unless they are added to the standard. mpirun arguments are one. What are the others?
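
The morph-layer idea described above can be sketched in C roughly as follows. This is a minimal illustration only and is not taken from MorphMPI or PnMPI: every name in it (morph_comm, morph_backend, morph_select, stub_backend) is hypothetical, and the stub backend simply pretends to be a single-process run. In a real adapter, the backend functions would translate the fixed handle into the native MPI_Comm and call the local MPI_Comm_rank and MPI_Comm_size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed ABI that the application is compiled against. */
typedef int morph_comm;
#define MORPH_COMM_WORLD 0

/* Entry points that each backend adapter must provide. */
typedef struct {
    const char *name;
    int (*comm_rank)(morph_comm comm, int *rank);
    int (*comm_size)(morph_comm comm, int *size);
} morph_backend;

/* Stand-in backend for illustration: behaves like a single-process run. */
static int stub_rank(morph_comm comm, int *rank) { (void)comm; *rank = 0; return 0; }
static int stub_size(morph_comm comm, int *size) { (void)comm; *size = 1; return 0; }
const morph_backend stub_backend = { "stub", stub_rank, stub_size };

/* The backend chosen at startup; NULL until morph_select() succeeds. */
static const morph_backend *active_backend = NULL;

/* Select the backend that all later calls are forwarded to. */
int morph_select(const morph_backend *backend)
{
    if (backend == NULL)
        return -1;
    active_backend = backend;
    return 0;
}

/* Application-facing calls: forward to whichever backend is active. */
int morph_comm_rank(morph_comm comm, int *rank)
{
    assert(rank != NULL);
    if (active_backend == NULL)
        return -1;
    return active_backend->comm_rank(comm, rank);
}

int morph_comm_size(morph_comm comm, int *size)
{
    assert(size != NULL);
    if (active_backend == NULL)
        return -1;
    return active_backend->comm_size(comm, size);
}
```

The application only ever sees morph_comm and the morph_* functions, so swapping the backend does not require recompiling it.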

Greg Bronevetsky
Post-Doctoral Researcher
1028 Building 451
Lawrence Livermore National Lab
(925) 424-5756
bronevetsky1@xxxxxxxx

At 10:35 AM 1/10/2008, William Yu wrote:
Quite a detailed justification.

I'm just curious why writing a "morph" layer is difficult when other libraries that use MPI are in use too.

Does this mean the difficulty arises because MATLAB needs (for example) a parallel BLAS, and both of them need MPI?

I see Linux device driver folk have the same issue, but they seem to work around it sufficiently well, particularly in cases that involve both X and the Linux kernel. (I worked for a capture card vendor in the past that distributed binary driver cores and open source wrappers.)

Just curious because fixing an ABI obviously leaves less room for implementations.

Thanks.

________________ Reply Header ________________
Subject:        [mpi-21] ABI - call for working group
Author: Edric Ellis <Edric.Ellis@xxxxxxxxxxxxxxx>
Date:           January 10th 2008 5:21 pm

Hi All,

This is a call for organisations to support a working group working
towards defining an ABI for MPI. To make this happen, I need the support
of 3 other organisations.

I realise that this is a large issue with many facets, many of which I
don't understand fully (although I have tried to work through the
various sources of information out there - the various mailing list
discussions, and stuff like Greg Lindahl's paper). However, I hope that
a working group would be able to develop a reasonable proposal. Below I
include an outline from the perspective of The MathWorks as to what we
would like to see:

1. Motivation

Many current users of MPI are perfectly happy to recompile their
application to use different MPIs. However, a growing number of users
are unable or unwilling to recompile an application to use a different
MPI. Additionally, parallel processing is becoming increasingly
important to software vendors since they can no longer rely on
increasing single-core performance.

Since it's the case I'm most familiar with, I will describe how MATLAB
attempts to handle this sort of situation:

1.1 An example application: MATLAB

MATLAB is introducing explicit parallelism as our customers demand ever
larger data sets and ever higher performance. One part of the value of
our software is that we bring together various 3rd party libraries and
put them into a convenient environment for scientists and engineers.
Wherever possible, we prefer to allow our users to substitute their own
favourite versions of BLAS, LAPACK etc.

1.1.1 Why MATLAB can't currently use all MPIs

Here are the main difficulties encountered when trying to make MATLAB
work with the widest range of available MPI implementations:

*       We do not support customer recompilation of any part of MATLAB,
therefore any library that we use must be binary compatible with the one
that we ship
*       We do support a "de facto" ABI defined by the implementation
that we ship; several other MPIs match that one (but many don't).
*       Adding a "morph" layer is not simple, since we also build on
many other libraries which use MPI (such as BLACS)
*       Our choice of "de facto" ABI is limited by the fact that we only
want to build and qualify one MPI across our 6 supported (commodity)
platforms.
*       Aside: we provide means of dealing with the vagaries of
differing mpirun/mpiexec schemes, and have a means of bypassing that
altogether using MPI_Comm_connect/accept.

1.1.2 What MATLAB expects to handle

We fully expect to handle many of the issues relating to switching
libraries:

*       we give a flexible means of specifying which binary to select

*       we attempt to avoid making assumptions about the details of the
MPI implementation (i.e. we stick closely to the standard)

2. Elements of a solution

2.1 Scope

Limit the scope to only those situations where it sensibly applies. This
may mean that it is not feasible to produce a solution for applications
written in Fortran due to the following compiler issues:

1.      name mangling issues (MPI_INIT vs. mpi_init_ vs. ...)
2.      value of .TRUE.
3.      issues to do with calling convention
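
To make the name mangling issue in item 1 concrete, here is a small C sketch that produces the external symbol a Fortran compiler might emit for one routine under four common conventions. The helper mangle_fortran and the mangle_scheme enum are hypothetical names introduced for this example; which convention a given compiler uses depends on the compiler and its flags.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Common external-name conventions for a Fortran routine such as MPI_INIT. */
typedef enum {
    MANGLE_UPPER,      /* MPI_INIT   */
    MANGLE_LOWER,      /* mpi_init   */
    MANGLE_LOWER_1US,  /* mpi_init_  */
    MANGLE_LOWER_2US   /* mpi_init__ */
} mangle_scheme;

/* Write the mangled form of `name` into `out`.
 * Returns 0 on success, -1 if `out` is too small. */
int mangle_fortran(const char *name, mangle_scheme scheme,
                   char *out, size_t outSize)
{
    assert(name != NULL && out != NULL);

    size_t len = strlen(name);
    size_t extra = (scheme == MANGLE_LOWER_1US) ? 1
                 : (scheme == MANGLE_LOWER_2US) ? 2 : 0;

    if (len + extra + 1 > outSize)
        return -1;

    /* Fold case according to the scheme. */
    for (size_t i = 0; i < len; ++i) {
        unsigned char c = (unsigned char)name[i];
        out[i] = (char)(scheme == MANGLE_UPPER ? toupper(c) : tolower(c));
    }

    /* Append the trailing underscores, if any, and terminate. */
    memset(out + len, '_', extra);
    out[len + extra] = '\0';
    return 0;
}
```

A binary that expects one of these spellings will fail to link or load against a library that exports another, which is why an ABI would either have to pick one or leave Fortran out of scope.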

2.2 mpi.h

Define the contents of mpi.h more closely

1.      Define values of constants (such as MPI_COMM_WORLD etc.)
2.      Define size of MPI_Status
3.      Define size and types of MPI handles (such as MPI_Datatype etc.)
4.      Define calling convention (e.g. cdecl vs. stdcall on WIN32
platforms)
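
As a rough illustration of what items 1 to 4 could mean in practice, the fragment below pins down handle types, constant values, the status layout and the calling convention, and checks them at compile time. It is a sketch under stated assumptions, not a proposed header: all ABI_ names, the chosen constant values, the 32-byte status size and the reserved field are invented for this example and are not part of any MPI implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-width handle types (item 3). */
typedef int32_t ABI_Comm;
typedef int32_t ABI_Datatype;

/* Hypothetical fixed constant values (item 1). */
#define ABI_COMM_NULL  ((ABI_Comm)0)
#define ABI_COMM_WORLD ((ABI_Comm)1)
#define ABI_COMM_SELF  ((ABI_Comm)2)

/* Hypothetical fixed status layout (item 2): three public fields plus
 * private space reserved for the implementation. */
typedef struct {
    int32_t source;
    int32_t tag;
    int32_t error;
    int32_t reserved[5];
} ABI_Status;

/* Calling convention (item 4): spelled out only where it matters. */
#if defined(_WIN32)
#define ABI_CALL __cdecl
#else
#define ABI_CALL
#endif

/* Compile-time checks that every build agrees on sizes and offsets. */
_Static_assert(sizeof(ABI_Comm) == 4, "handle must be 4 bytes");
_Static_assert(sizeof(ABI_Status) == 32, "status must be 32 bytes");
_Static_assert(offsetof(ABI_Status, tag) == 4, "tag must follow source");

/* Example accessor compiled with the fixed calling convention. */
int ABI_CALL abi_status_tag(const ABI_Status *status)
{
    assert(status != NULL);
    return status->tag;
}
```

Using integer handles rather than pointers is one of the implementation choices such a header would fix, which is the restriction noted under Cons in section 3.2.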

2.3 Query implementation in use

It may prove useful to provide a means of querying the MPI layer for a
description of itself. For example, something along the lines of:

int MPI_Version_info( char * buf, int bufSize )

would allow an application to query at runtime the MPI implementation
that it is using. The intention is that the output is for information
only.
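
A minimal sketch of how an implementation might provide such a call is shown below. The function is named Example_Version_info to make clear that it is not a real MPI routine; it mirrors the proposed signature, while the return codes, the truncation behaviour and the description string are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical return codes for this sketch. */
#define EXAMPLE_SUCCESS 0
#define EXAMPLE_ERR_ARG 1

/* Hypothetical description string that an implementation would supply. */
static const char example_description[] = "ExampleMPI 1.0 (hypothetical build)";

/* Copy a human-readable description into buf, truncating if necessary.
 * The result is always NUL-terminated when bufSize > 0. */
int Example_Version_info(char *buf, int bufSize)
{
    if (buf == NULL || bufSize <= 0)
        return EXAMPLE_ERR_ARG;
    snprintf(buf, (size_t)bufSize, "%s", example_description);
    return EXAMPLE_SUCCESS;
}

/* Typical use: record the MPI layer in a log, for information only. */
void example_log_version(FILE *out)
{
    char buf[128];
    int rc = Example_Version_info(buf, (int)sizeof buf);
    assert(rc == EXAMPLE_SUCCESS); /* buf is non-NULL and non-empty */
    (void)rc;
    fprintf(out, "MPI layer: %s\n", buf);
}
```

Because the output is informational, the sketch truncates silently rather than reporting an error when the buffer is short.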

3. Pros, Cons and Others

3.1 Pros

1.      Allows application developers to ship binaries which can work
with any MPI implementation

2.      MPI developers can test their implementation against those
applications

3.      MPI implementors can benefit because of the potential expanded
user base

4.      Implementations already have to support a translation layer for the
Fortran interface, so there may not necessarily be a large
implementation overhead (ref:
http://www.open-mpi.org/community/lists/users/2005/03/0040.php)

5.      Hardware vendors can ensure that their hardware can be used with
the widest range of applications simply by ensuring that an ABI
conforming MPI is available.

3.2 Cons

1.      Standardizing types of MPI handles involves a significant amount
of work for MPI implementors (but see 3.1.4)

2.      Standardizing types of MPI handles may restrict MPI
implementation choices.

3.      May not be able to resolve C++ / F90 name mangling issues

3.3 Others

1.      Applications still must qualify against each MPI implementation
that they wish to support.

2.      Naming of the shared library - applications may be reasonably
expected to handle finding the right library using dlopen() or similar.
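
The library lookup mentioned in item 2 might look something like the sketch below: the application walks an ordered list of candidate names and keeps the first one that loads. The loader is passed in as a function pointer so that the search order can be exercised without any MPI library installed; in a real application it would wrap dlopen(name, RTLD_NOW | RTLD_LOCAL) or the platform's equivalent. The function names and the library names used with fake_loader are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Loader callback: returns a non-NULL handle on success, NULL on failure. */
typedef void *(*lib_loader)(const char *name);

/* Try each candidate in order and return the first handle that loads.
 * If `chosen` is non-NULL it receives the winning name, or NULL if none. */
void *load_first_available(const char *const *names, size_t count,
                           lib_loader load, const char **chosen)
{
    assert(load != NULL);

    for (size_t i = 0; i < count; ++i) {
        if (names[i] == NULL)
            continue; /* e.g. an unset user override */
        void *handle = load(names[i]);
        if (handle != NULL) {
            if (chosen != NULL)
                *chosen = names[i];
            return handle;
        }
    }
    if (chosen != NULL)
        *chosen = NULL;
    return NULL;
}

/* Stand-in loader for illustration: pretends that only one library exists. */
static int fake_token;
void *fake_loader(const char *name)
{
    return strcmp(name, "libmpi_site.so") == 0 ? (void *)&fake_token : NULL;
}
```

Putting a user-specified name first in the list, followed by site and default names, gives the "flexible means of specifying which binary to select" described in section 1.1.2.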

Best regards,

Edric.

--
Edric M. Ellis
The MathWorks,
Matrix House,
Cambridge Business Park,
Cambridge CB4 0HH, UK
Tel: +44 (0)1223 226751