
RE: [mpi-21] ABI (was: Call for MPI 2.2 and 3.0...)



Hi,

We theoretically have compile time compatibility with the current
standard. I think we're aiming at (again, theoretical) partial link time
compatibility, with all these ABIs and argument sizes.

Best regards.

Alexander

--
Dr Alexander Supalov
Intel GmbH
Hermuelheimer Strasse 8a
50321 Bruehl, Germany
Phone:          +49 2232 209034
Mobile:          +49 173 511 8735
Fax:              +49 2232 209029
 

-----Original Message-----
From: owner-mpi-21@xxxxxxxxxxxxx [mailto:owner-mpi-21@xxxxxxxxxxxxx] On
Behalf Of William Yu
Sent: Wednesday, December 05, 2007 12:21 PM
To: mpi-21@xxxxxxxxxxxxx
Cc: mpi-21@xxxxxxxxxxxxx
Subject: RE: [mpi-21] ABI (was: Call for MPI 2.2 and 3.0...)

What is the objective of the forum? Shouldn't compile time be
sufficient?

Maybe this is a good question for our planned survey.

________________ Reply Header ________________
Subject:	RE: [mpi-21] ABI (was: Call for MPI 2.2 and 3.0...)
Author:	"Supalov, Alexander" <alexander.supalov@xxxxxxxxx>
Date:		December 05th 2007 10:38 am

Hi everybody,

It's a pleasure to read this trail, even though I may not agree with
everything that was said so far. For example, there are issues with
dlopen called from within statically linked apps, etc. However, that's a
detail we should probably leave for a subcommittee to dwell upon.

More important is that most of the big ISVs I know of have already developed
their internal chameleon layers for so many MPIs, so that a common MPI
ABI is not really their top priority. The load they have to expend
validating so many MPIs - this is what eats up the time, resources, and
money.

In this context, what is important to me here is that we clearly
understand - and if possible agree on - what a common ABI is aiming at,
what aspects and levels of MPI compatibility exist, what they entail
technically, and whether they will bring any advantages to the MPI
itself and its users.

A modern parallel system without an MPI is as naked as a serial one
without an OS and assembler/compiler. Like those components, MPI is
eventually going to become a part of the standard middleware, like the
sockets library, etc. This is definitely a growing need in the volume
cluster space. Note that proprietary systems may cast the following in a
different light, for various reasons. Let's stay with the masses for
now.

I tend to think of theoretical and practical compatibility. This is a
separation between "it may work in principle for an expert" and "it
works in practice for a novice user". It is practical compatibility that
matters for typical end users. They simply want things to work out of the
box and earn them money. Otherwise they go for the next offer.

A common ABI is but a part of the practical MPI compatibility that also
includes common library names and dependencies, process startup
mechanism, environment query/control, input/output convention, command
line options and environment variables, GUI utilities and documentation,
and - oh dear - expected functional behavior and performance. I have no
doubt this list is incomplete.

I usually speak of three levels of MPI compatibility, separating the
theoretical and practical aspects:

1) Compile time compatibility. You rebuild your app over a new MPI, and
it works. This level of compatibility is guaranteed by the MPI standard
at the theoretical level (i.e., for an expert). It is not guaranteed
practically speaking (for a novice user), because MPIs differ in their
behavior, and application developers tend to program for the MPI they
like most, thus making possibly incorrect assumptions. In other words,
an application may not work upon transition to a new MPI. In fact, very
often it does not.

2) Link time compatibility. You bring your precompiled application,
relink it against an MPI found on the system, and the application works.
The MPI standard does not really help here, either theoretically or
practically. Theoretically this entails a common ABI: common calling
convention and common argument sizes. Common library names and
dependencies are desirable. Note however that the interface to the
process management and some other system services may yet break your
application practically speaking. It simply won't start as expected, and
the user will go. The semantic difference between the MPIs is exactly as
deadly here as above, too.

3) Runtime compatibility. You bring your prebuilt application to a new
system, and it works. Either you bring all your stuff with you in a
static application (problematic if you use GPL libraries), or you
dynamically link against what is available. Here you're faced with the
full range of aspects to address, some of which I mentioned above. There
may be sublevels inside this happy state, but that's a largely academic
matter. If you're not faithfully emulating an MPI users worked with
before, they will find a way to break their application and claim this
to be your problem.

Now, practically speaking, with all respect, we have barely scratched
the surface of compile time compatibility (1) with the current MPI
standard. We have an
API but we have enough dark corners and pages intentionally left blank
for yet another MPI to screw up every n-th existing application on the
critical first pass.

What we're speaking about here is most likely some variety of the link
time compatibility (2). How far are we prepared to go here? What will we
achieve if we do only the theoretical part of the job? Practically
speaking, whatever we do, we'll simply make yet another step to the
runtime compatibility (3), that's it. This is a worthy secondary goal if
thought of within this context. I don't think a common ABI is good for
anything else.

Now, runtime compatibility (3), if driven to the limit, is effectively
either the One Ring option, or a complete commoditization of the MPI,
with all that this entails: a strict and complete functional
specification, validation, certification, etc. Having read this trail,
I'm not sure we're quite ready for either just yet. But we should get
there eventually.

Best regards.

Alexander


-----Original Message-----
From: owner-mpi-21@xxxxxxxxxxxxx [mailto:owner-mpi-21@xxxxxxxxxxxxx] On
Behalf Of William Yu
Sent: Wednesday, December 05, 2007 12:05 AM
To: mpi-21@xxxxxxxxxxxxx
Subject: Re: [mpi-21] ABI (was: Call for MPI 2.2 and 3.0...)

Of course, it would be nice to just have standard shared object
libraries and do away with wrapper compilers too ;-)

But seriously, why can't we standardize stuff like parameter size and
const values? I can't seem to find the historical discussions. 

I am however aware of the int/long versus pointer argument which is
present in all things C.

________________ Reply Header ________________
Subject:	Re: [mpi-21] ABI (was: Call for MPI 2.2 and 3.0...)
Author:	Jeff Squyres <jsquyres@xxxxxxxxx>
Date:		December 04th 2007 10:14 pm

On Dec 4, 2007, at 4:48 PM, Erez Haba wrote:

>> [erezh] __cdecl vs __stdcall vs __fastcall vs your parameter passing
>> convention.
>
> Are those Microsoft-specific conventions?  I don't recognize them.  In
> POSIX, there's only one way to call the C MPI API functions.
>
> [erezh] see http://en.wikipedia.org/wiki/X86_calling_conventions

Ah, thanks!

> As I said, this probably should be defined per processor type/os type

Why wouldn't we use whatever the default convention is for calling  
standard C functions on a given platform (e.g., printf)?  Isn't that  
what we do today?

>>> 2.       Size of input parameters (e.g., what's the size of  
>>> MPI_Comm)
>>
>> Isn't that kinda useless without standardized values for the pre-
>> defined constants?  :-)
>>
>> [erezh] Yes, you need the predefined constants and defining the size
>> of the parameters.
>> This is a significant issue for customers. I think we can do that
>> for "C" without limiting implementations.
>
> This is where the MPI implementors start having religious
> debates...  :-)
>
> (e.g., int vs. pointer)
>
> [erezh] You don't define type, just the size of the parameter. For  
> example, we can define it as 4 bytes on 32bit and 8 bytes on 64bit.  
> The implementation can define the type as long as it matches the size.

For an ABI, you need to standardize both the size and the value.  It  
may not be difficult to agree on the sizes: sizeof(void*)/64 bits  
because you can treat it as an int or a pointer.  But the value is  
where the religious debates come in.  For example: MPI_COMM_WORLD --  
is it "0" or is it a pointer resolved at compile/link time?
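
To make the size-versus-value distinction concrete, here is a minimal C
sketch. Every name in it (abi_handle_t, ABI_COMM_WORLD_AS_INT,
struct comm_object, world_object and the helper functions) is
hypothetical and taken from no MPI implementation. It only illustrates
that one pointer-sized integer slot can carry either an integer-coded
or a pointer-coded handle, while the value stored for "world" still
differs between the two schemes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ABI-level handle: one pointer-sized integer slot. */
typedef intptr_t abi_handle_t;

/* Scheme A (assumption): handles are small integers; "world" is 0. */
enum { ABI_COMM_WORLD_AS_INT = 0 };

/* Scheme B (assumption): handles are addresses of internal objects. */
struct comm_object { int size; };
static struct comm_object world_object = { 4 };

/* Store an integer-coded handle in the common slot. */
static abi_handle_t handle_from_int(int index)
{
    return (abi_handle_t)index;
}

/* Store a pointer-coded handle in the common slot. */
static abi_handle_t handle_from_ptr(struct comm_object *obj)
{
    return (abi_handle_t)(void *)obj;
}

/* Recover the pointer from the common slot. */
static struct comm_object *ptr_from_handle(abi_handle_t h)
{
    return (struct comm_object *)(void *)h;
}

/* Returns 1 when the two schemes disagree on the value of "world". */
static int world_values_differ(void)
{
    return handle_from_int(ABI_COMM_WORLD_AS_INT)
           != handle_from_ptr(&world_object);
}
```

Both schemes fit the same slot, so agreeing on the size alone does not
tell an application which value to pass for the predefined handle.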

>> No they don't!  This is the Big Lie of an ABI.  ISV's *STILL* need to
>> test against every MPI implementation.  Fine; you don't have to
>> compile the application against each MPI implementation, and ISV's can
>> have one executable that supports multiple MPI's -- I see the "win" in
>> this.  But ISV's still have to *test* the application against each MPI
>> implementation.  And that's the much more time-consuming step (vs.
>> recompiling against each MPI).
>>
>> [erezh] Yes, I agree. They still need to test against different MPI
>> implementations. However, they don't have to test their code
>> "marshaling" the interface for every different flavor of MPI
>> implementation.
>>
>> [erezh] The barrier to move to another MPI implementation on a
>> tested platform is much lower. Authoring a new adaptation layer with
>> new bugs, re-compile and test is a high barrier.
>
> What adaptation layer?  <mpi.h> and 'mpif.h' are all you need.
>
> Are you referring to something else, such as how to compile and link
> MPI applications?
>
> [erezh] Who's <mpi.h>?
> Vendor A takes a pointer as MPI_Comm the other takes an int as  
> MPI_Comm.

But if the size and values of these things are standardized (which  
they must be for an ABI), why is this an issue?

>> 2. provide a mechanism for MPI applications to optionally identify (at
>> run-time) which MPI they are using (e.g., so that they can choose not
>> to run, or display a warning that this MPI implementation has not been
>> QA tested with the application, etc.)
>>
>> [erezh] I think that this is already achieved today by identifying
>> the binary that you're dynamically binding to.
>
> Can you explain how to do that?  Remember that you have to be able to
> do it from Fortran, too.
>
> [erezh] The application dynamically loads the library, thus it  
> specifies the library name.
> I'm not an expert on Fortran but I'm sure that the Fortran vendors  
> provide a way to dynamically load and bind libraries.

Ah -- you're talking about having applications be responsible for  
dynamically loading the Right libraries, resolving all symbols and  
constants at run-time, and using indirect methods for invocation and  
usage (e.g., you cited LoadLibrary() and GetProcAddress()).  I  
understand your point of view now.  I believe that some of the  
derivative language bindings may have used this approach (Python,  
Perl, etc.)...?

1. Why not hide all that gorp in an MPI implementation that can front  
other MPI implementations?  (i.e., try explaining loading DLLs and  
function pointers to physicists/engineers/those who are not computer  
scientists and just want to write their MPI apps and run)  MorphMPI  
has been proposed in various forms throughout the years.  Someone even  
implemented a MorphMPI recently -- see
http://www.clustermonkey.net//content/view/213/1/ .  Job done -- no ABI
needed.  :-)

I honestly don't remember the arguments against using a MorphMPI  
approach (instead of doing an ABI)...

2. There is no standardization on the names of DLLs across MPI  
implementations.  Some MPI's require only one DLL; others require  
more.  Indeed, in both LAM and Open MPI, we have changed the names of  
the underlying DLLs over the course of different releases (and hide  
all these details in wrapper compilers).  Having customers chase these  
DLL names across different releases seems like a difficult idea.   
Implementors certainly *could* be forced to fix their DLL names across  
releases, but it certainly is convenient having the freedom to change  
things under the covers.
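
The MorphMPI idea in point 1 can be sketched as a thin front layer that
forwards one fixed interface to whichever backend is selected.
Everything below is hypothetical (front_comm_t, FRONT_COMM_WORLD,
struct backend_ops and the two toy backends); no real MPI library is
called, and a real front layer would wrap actual MPI functions instead
of these stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Front-layer handle with a fixed size and a fixed predefined value
 * (both are assumptions made for this sketch). */
typedef intptr_t front_comm_t;
#define FRONT_COMM_WORLD ((front_comm_t)1)

/* Operations every backend adapter must supply. */
struct backend_ops {
    const char *name;
    int (*comm_size)(front_comm_t comm, int *size);
};

/* Toy backend A: native handles are ints, "world" is 0. */
static int a_native_comm_size(int comm, int *size)
{
    if (comm != 0) return 1;          /* unknown handle */
    *size = 4;
    return 0;
}
static int a_comm_size(front_comm_t comm, int *size)
{
    /* Translate the front handle into backend A's int handle. */
    int native = (comm == FRONT_COMM_WORLD) ? 0 : -1;
    return a_native_comm_size(native, size);
}

/* Toy backend B: native handles are pointers to internal objects. */
struct b_comm { int size; };
static struct b_comm b_world = { 8 };
static int b_native_comm_size(struct b_comm *comm, int *size)
{
    if (comm == NULL) return 1;       /* unknown handle */
    *size = comm->size;
    return 0;
}
static int b_comm_size(front_comm_t comm, int *size)
{
    /* Translate the front handle into backend B's pointer handle. */
    struct b_comm *native = (comm == FRONT_COMM_WORLD) ? &b_world : NULL;
    return b_native_comm_size(native, size);
}

static const struct backend_ops backend_a = { "toy-a", a_comm_size };
static const struct backend_ops backend_b = { "toy-b", b_comm_size };

/* The single entry point the application is compiled against. */
static int front_comm_size(const struct backend_ops *ops,
                           front_comm_t comm, int *size)
{
    return ops->comm_size(comm, size);
}
```

The application only ever sees front_comm_t, so switching backends
needs no recompilation of application code. Each backend still has to
be tested separately, which is the cost discussed earlier in this
message.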

-- 
Jeff Squyres
Cisco Systems
---------------------------------------------------------------------
Intel GmbH
Dornacher Strasse 1
85622 Feldkirchen/Muenchen Germany
Sitz der Gesellschaft: Feldkirchen bei Muenchen
Geschaeftsfuehrer: Douglas Lusk, Peter Gleissner, Hannes Schwaderer
Registergericht: Muenchen HRB 47456 Ust.-IdNr.
VAT Registration No.: DE129385895
Citibank Frankfurt (BLZ 502 109 00) 600119052

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
---------------------------------------------------------------------