RE: Proposal for communicator id

Raja Daoud (raja@alex.osc.edu)
Mon, 12 Jun 95 18:44:50 EDT

> I have to agree with Jim, this proposal won't work - I don't want
> MPI_COMM_ID() to be collective (as this proposal implies).

I agree it's ugly in theory, though not in the details of the current
implementations. For both LAM & MPICH (and I presume other implementations
as well, correct me if I'm wrong about other real impl.), it won't be a
collective call, it would just return the comm. ID already stored in the
communicator structure. The collective step is taken when the communicator
is created. MPI_COMM_ID() would be a collective call in some theoretical
future implementation that decides to handle communicators differently.
But let's forget this proposal for a second.
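
To make the current-implementation case above concrete, here is a minimal
sketch in C. It is not LAM or MPICH source: struct toy_comm, toy_comm_create()
and toy_comm_id() are hypothetical names, and the agreement on the ID is
assumed to have happened already during creation. It only shows that the
lookup itself is a local read with no communication.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an implementation's private communicator
 * record; real LAM/MPICH structures are different and not shown here. */
struct toy_comm {
    int comm_id; /* assumed: fixed during the collective creation step */
    int size;    /* number of processes in the group */
    int rank;    /* this process's rank in the group */
};

/* Models the end of communicator creation. In a real implementation the
 * group members would agree on agreed_id collectively; here the caller
 * simply passes in the already agreed value. */
struct toy_comm toy_comm_create(int agreed_id, int size, int rank)
{
    struct toy_comm comm;
    assert(size > 0 && rank >= 0 && rank < size);
    comm.comm_id = agreed_id;
    comm.size = size;
    comm.rank = rank;
    return comm;
}

/* Hypothetical analogue of the proposed MPI_COMM_ID(): a purely local
 * read of the stored value, with no communication. Returns 0 on success. */
int toy_comm_id(const struct toy_comm *comm, int *id)
{
    assert(comm != NULL && id != NULL);
    *id = comm->comm_id;
    return 0;
}
```

An implementation that does not store such a value at creation time would
have to compute it inside the lookup, which is where the collective behaviour
would come from.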

The key issue is: what model do we want to have for a communicator? This
is, as I understand it, Jim's main concern. In MPI-1, there is no model
for what a communicator is "in real life" down at the byte level. On the
other hand, a profiling or debugging tool needs to access this information.
So it seems the solution can fall into one of several categories:

1- Keep it as in MPI-1 and too bad for prof./debug tools, they can't be
portable in this respect and have to have impl.-dependent modules.

2- Ditch the MPI-1 opaqueness and force a communicator model (integer)
on everybody (I don't think too many of us want this :-) ).

3- A compromise: keep opaqueness for production runs so an implementation
that doesn't want to take advantage of the MPI-2 extensions to the
standard profiling interface is free to do what it pleases. Remove
opaqueness and enforce a model to be followed (comm. ID == integer
having certain properties) for profiling and debugging in MPI-2.
This is where the ugly stuff comes in.

4- ??? other approaches ???

In real life, (1) is livable. LAM and MPICH would have their non-standard
MPI_COMM_ID() calls that return an integer having the properties described
previously. Interested parties would code tools that work on both systems
and that's the end of the story. Other implementations that want to
use these tools would have to hack the code and port them. Obviously
I would prefer a standard profiler access to what a communicator really is,
just like what has been proposed with datatypes <--> string specification.

> An alternative is to specify the hook functions, but allow their behaviour to
> be a "no-op". A debugging compliant implementation would support the full
> semantics of the functions, while a high performance implementation would only
> perform "no-ops". The user would link to the desired version.

That's fine, but we still have to specify what the profiling function
dealing with the communicator looks like. What info does it return?
You seem to be taking the "integer" comm. ID interface for granted but
this is nowhere described in MPI-1.
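
A rough sketch of the "no-op" idea in C, to show where the open question
sits. All names are hypothetical, and the int output parameter is exactly the
assumption being questioned above: even the no-op version has to commit to
some return type. In practice the two variants would carry the same name in
two separate libraries and the user would link one of them; they are named
differently here only so both fit in one file.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_COMM_ID_UNDEFINED (-1) /* hypothetical "no information" value */

/* Hypothetical communicator record, reduced to the one field we need. */
struct toy_comm {
    int comm_id;
};

/* Debugging variant: full semantics, reports the stored ID. */
int toy_comm_id_debug(const struct toy_comm *comm, int *id)
{
    assert(comm != NULL && id != NULL);
    *id = comm->comm_id;
    return 0;
}

/* High-performance variant: same signature, but a no-op that reports
 * nothing useful about the communicator. */
int toy_comm_id_noop(const struct toy_comm *comm, int *id)
{
    (void)comm; /* deliberately ignored */
    assert(id != NULL);
    *id = TOY_COMM_ID_UNDEFINED;
    return 0;
}
```

A tool written against this interface would have to treat
TOY_COMM_ID_UNDEFINED as "this implementation tells me nothing".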

In summary, the data points are:

- communicator details are not defined in MPI-1 (on purpose)
- profiling/debugging tools need detailed info to do a good job
- current implementations use group-wide and process-wide unique
integer comm. IDs

--Raja