Proposal for Datatype Compacting

Nathan E. Doss (doss@ERC.MsState.Edu)
Fri, 14 Jul 1995 09:43:11 -0500 (CDT)

(I have no idea whether this is the right list for this. None
of the others seem to come anywhere close to applying.)

Proposal for Datatype Compacting Function

Rationale
---------

Suppose we implement gather as a spanning tree. Since the receive
buffer is only valid on the root process, we will need to allocate some
temporary space for receiving data on intermediate nodes. If the
datatype has big holes, that temporary space has to span the holes as
well as the data, so we may not be able to allocate it. In this case,
we might want to do something like the following in our gather:

    if (datatype has big holes)
        use a linear gather
    else
        use a tree gather

The MPI-1 standard states that for collective operations such as
Bcast, Gather, Scatter, etc., the type *signature* for all processes
calling the function must match. Before we can execute our gather, we
must make sure that *all* processes make the same decision. Since
only the type signature must match, some processes could be provided
with a datatype with holes while others are provided with a contiguous
datatype.

Some possible solutions:

  1) Always use a linear gather.
  2) Use an allreduce to collectively decide which algorithm to
     use.
  3) Change the MPI-1 document to state that the type maps must
     match.
  4) Write a datatype squashing function that returns a (mostly)
     hole-free datatype.
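
To make option 2 concrete, here is a minimal sketch of the decision
rule in plain C. It only simulates the reduction: in a real
implementation each process would contribute its own flag and the
combination would be done by an MPI_Allreduce with MPI_LAND. The names
gather_alg_t and choose_gather are hypothetical and exist only for
this illustration.

```c
#include <stddef.h>

/* Hypothetical names, used only for this illustration. */
typedef enum { GATHER_LINEAR, GATHER_TREE } gather_alg_t;

/* tree_ok[i] is nonzero if process i's datatype has no big holes.
 * The flags are combined with a logical AND, which is the value an
 * allreduce with a logical-AND operation would hand to every process,
 * so all processes arrive at the same algorithm. */
gather_alg_t choose_gather(const int *tree_ok, size_t nprocs)
{
    int all_ok = 1;
    for (size_t i = 0; i < nprocs; i++)
        all_ok = all_ok && tree_ok[i];
    return all_ok ? GATHER_TREE : GATHER_LINEAR;
}
```

The cost of this option is one extra collective operation on every
gather call, which is part of why it is not obviously the best choice.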

Currently, option 4 seems to me the most likely to be the best choice
for an implementor, and of course implementors are already able to do
this internally. However, if the MPI Forum intends that application
programmers be able to write efficient collective operations on top of
point-to-point functions, this datatype squashing function should be
part of the MPI standard.

A related problem occurs if you want to know how big the holes are in
a datatype in order to decide whether or not to squash the type. You
could compare the extent with the size, but since the extent may have
been manipulated using MPI_UB and MPI_LB, that comparison is not
reliable. The ability to determine the real extent of a type would be
useful in this case.
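
As a rough illustration of the "real extent versus size" test, the
sketch below models a type map as a list of byte blocks. This is a
simplification and not an MPI interface: block_t, type_size,
real_extent and has_big_holes are hypothetical names, the blocks are
assumed not to overlap, and alignment padding is ignored.

```c
#include <stddef.h>

/* Hypothetical model of one type-map entry: a run of data bytes. */
typedef struct {
    long disp;  /* byte displacement from the start of the buffer */
    long len;   /* number of data bytes in this block */
} block_t;

/* Total number of data bytes (the analogue of the type's size). */
long type_size(const block_t *b, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += b[i].len;
    return total;
}

/* Span from the lowest to the highest data byte. Only the data is
 * looked at, so explicit lower/upper bound markers cannot change it. */
long real_extent(const block_t *b, size_t n)
{
    if (n == 0)
        return 0;
    long lo = b[0].disp;
    long hi = b[0].disp + b[0].len;
    for (size_t i = 1; i < n; i++) {
        if (b[i].disp < lo)
            lo = b[i].disp;
        if (b[i].disp + b[i].len > hi)
            hi = b[i].disp + b[i].len;
    }
    return hi - lo;
}

/* Nonzero if the holes add up to more than max_hole_bytes. */
int has_big_holes(const block_t *b, size_t n, long max_hole_bytes)
{
    return real_extent(b, n) - type_size(b, n) > max_hole_bytes;
}
```

For example, two 4-byte blocks at displacements 0 and 100 have a size
of 8 bytes but a real extent of 104 bytes, so 96 bytes are holes.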

Proposed Functions
------------------

MPI_TYPE_COMPACT ( type, compact_type )
    IN    type            datatype to be compacted
    OUT   compact_type    datatype with identical type signature
                          as "type", but with smaller real extent
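
One way to picture what such a function might do is sketched below,
again on a simplified byte-block model rather than on real MPI
datatypes. The names block_t and type_compact are hypothetical. The
sketch keeps the blocks in their original order with their original
lengths (standing in for "identical type signature") and packs them
back to back. A real implementation would also have to respect
alignment, which is presumably why the result can only be "(mostly)"
hole-free.

```c
#include <stddef.h>

/* Hypothetical model of one type-map entry: a run of data bytes. */
typedef struct {
    long disp;  /* byte displacement from the start of the buffer */
    long len;   /* number of data bytes in this block */
} block_t;

/* Fill out[0..n-1] with a packed version of in[0..n-1]: the same block
 * lengths in the same order, starting at displacement 0 with no gaps.
 * Returns the real extent of the packed layout, which equals the total
 * number of data bytes. Alignment is ignored in this sketch. */
long type_compact(const block_t *in, size_t n, block_t *out)
{
    long next = 0;
    for (size_t i = 0; i < n; i++) {
        out[i].disp = next;
        out[i].len = in[i].len;
        next += in[i].len;
    }
    return next;
}
```

An intermediate node in a tree gather could then size its temporary
buffer from the packed layout instead of from the original extent.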

Note
----

Paul Pierce has suggested something like MPI_TYPE_COMPACT as a way to
solve the user-defined reduce problem. There was also a lot of
discussion about determining the real limits of a datatype at the last
meeting. Hopefully this points out that this type of functionality is
needed for more than just user-defined reduce operations.

-- 
Nathan Doss                  doss@ERC.MsState.Edu