Enquiries on MPI_Window

Joel Malard (joel@epcc.ed.ac.uk)
Wed, 19 Feb 1997 13:46:05 +0000 (GMT)

My apologies for the previous email, which ended up empty for whatever reason.
Hopefully the text of this second email is more to the point.

It was mentioned at the 3rd Euro information meeting last week that cached
attributes in MPI windows could be dropped from the MPI-2
specification. Here is an example where I would like to find out
information about an MPI window passed as an argument to a procedure.

The type of program I am looking at consists of a main loop (as below) where
the norms of the columns of a matrix A are computed and stored in the rows
of another matrix R. This type of computation occurs in linear algebra
computations where the matrix A(i:m,i:n) is an 'active' submatrix and the
matrix R is some triangular factor.

    double A[m][n], R[n][n];
    for (i = 0; i < n; i++) {
        /* ... some computation ... */
        /* active submatrix: rows i..m-1 and columns i..n-1 (0-based) */
        some_sort_of_reduce(&A[i][i], &R[i][i], m - i, n - i, ... );
        /* ... some computation ... */
    }

The prototype of the callee may look like:

    void some_sort_of_reduce (double *B, double *S, int m, int n, ... )

and the procedure some_sort_of_reduce calls MPI_Put on the array S.
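
To make the local part of this computation concrete, here is a minimal sketch of what each process might compute before any one-sided communication takes place. The name local_column_sumsq, the row-major layout, and the leading-dimension argument ld are assumptions of mine for illustration only. It accumulates sums of squares rather than norms, on the assumption that partial sums are combined across processes before the square root is taken.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of MPI): for an nrows x ncols block B
 * stored row-major with leading dimension ld, store the sum of squares
 * of column j in S[j]. The square root is left to the caller. */
void local_column_sumsq(const double *B, double *S,
                        int nrows, int ncols, int ld)
{
    assert(nrows >= 0 && ncols >= 0 && ld >= ncols);
    for (int j = 0; j < ncols; j++) {
        double sum = 0.0;
        for (int i = 0; i < nrows; i++) {
            double v = B[(size_t)i * (size_t)ld + (size_t)j];
            sum += v * v;
        }
        S[j] = sum;
    }
}
```

With 0-based C indexing, the active submatrix at step i would be handled by a call such as local_column_sumsq(&A[i][i], &R[i][i], m - i, n - i, n).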

Some alternatives seem to be:

1. Initialize a window near to where R is declared and
   1.a: get the base address from the MPI_WINDOW_BASE attribute
   1.b: pass the base address of the mpi_window to the called subroutine
   1.c: change the call to MPI_Put so it will compute the appropriate
        offset from a pointer into the local window.
2. Create a window inside the called subroutine
   2.a: delete the window upon return of some_sort_of_reduce
   2.b: create a persistent window (static in C) that is resized whenever
        the base address changes or its current length is too small.

Option 1.a corresponds to the current standard. Option 1.b is awkward because
the call to some_sort_of_reduce appears long after R has been allocated
storage.
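
As an illustration of the offset computation behind options 1.a and 1.c, here is a minimal sketch in plain C. The function window_displacement and its arguments are hypothetical, not part of any MPI draft. It assumes the base address, size, and displacement unit of the local window are already known to the callee, and that the same offset is valid in the target window.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an MPI call): return the displacement of ptr
 * from the window base, in units of disp_unit bytes. Returns -1 if ptr
 * lies outside the window or is not a multiple of disp_unit from base. */
long window_displacement(const void *base, size_t win_size,
                         size_t disp_unit, const void *ptr)
{
    assert(disp_unit > 0);
    uintptr_t b = (uintptr_t)base;
    uintptr_t p = (uintptr_t)ptr;
    if (p < b)
        return -1;                      /* pointer precedes the window */
    size_t off = (size_t)(p - b);
    if (off >= win_size || off % disp_unit != 0)
        return -1;                      /* past the end, or misaligned */
    return (long)(off / disp_unit);
}
```

For a window covering all of R with a displacement unit of sizeof(double), the pointer &R[i][i] would map to displacement i*n + i, which could then be passed to MPI_Put as the target displacement.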

One drawback with passing down a window to some_sort_of_reduce
(1.a & 1.b) is that the caller must be aware of the callee using one-sided
communications.

One drawback with creating all necessary windows inside
some_sort_of_reduce is that the buffer S can be large. Option 1.c has been
discussed on the mpi-1sided list.

Another issue:

I would like to be able to find out whether a window argument is still
valid. For example, should the 'relevant' communicator be passed as an
argument along with the window? The communicator that is passed to
MPI_Win_init is only a performance hint: should one pass the
underlying group along with the window instead? Could the relevant
information be extracted from the window, e.g. with something of the sort:

    mpi_win_comm (window, &comm) ;
    mpi_barrier (comm) ;

or

    mpi_win_group (window, &group) ;
    mpi_group_comm (group, &comm) ;
    mpi_barrier (comm) ;

A last point is that when the procedure some_sort_of_reduce implements
a recursive doubling where each process talks in turn to one of
its neighbors (i.e. Pi <-> P(i^(1<<j))), the MPI_Group argument of the
synchronization procedures MPI_Start and MPI_Post becomes a bit heavy.
It seems that for p processes, log(p) groups need to be created, at least
on the initial call to some_sort_of_reduce or in a separate initialization
procedure.
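
To show where the log(p) groups come from, here is a minimal sketch of the partner computation, assuming p is a power of two. The function recursive_doubling_partners is a hypothetical name of mine. Each entry of the output would correspond to one single-member group to be built for the synchronization calls.

```c
#include <assert.h>

/* Hypothetical helper (not an MPI call): fill partners[] with the ranks
 * that process `rank` exchanges with in each round of recursive doubling,
 * i.e. rank ^ (1 << j) for j = 0, 1, ... Returns the number of rounds,
 * which is log2(p). Assumes p is a power of two. */
int recursive_doubling_partners(int rank, int p, int *partners, int max_rounds)
{
    assert(p > 0 && (p & (p - 1)) == 0);   /* p must be a power of two */
    assert(rank >= 0 && rank < p);
    int rounds = 0;
    for (int j = 0; (1 << j) < p; j++) {
        assert(rounds < max_rounds);       /* caller's array is large enough */
        partners[rounds++] = rank ^ (1 << j);
    }
    return rounds;
}
```

For p = 8 and rank 5, this gives three rounds with partners 4, 7 and 1, so three distinct groups would be needed on that process.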

Is it correct to describe MPI_Window objects as communication contexts,
given that a specific window is not tied to any one communicator except as a
performance hint in MPI_Win_init? Would it make sense to define window
counterparts of MPI_Comm_compare, MPI_Comm_size, MPI_Comm_dup
(possibly with a new user-allocated buffer area), etc.?

Joel.