-----Original Message-----
From: owner-mpi-21@xxxxxxxxxxxxx
[mailto:owner-mpi-21@xxxxxxxxxxxxx] On Behalf Of Torsten Hoefler
Sent: Thursday, August 30, 2007 1:52 PM
To: mpi-21@xxxxxxxxxxxxx
Cc: lums@xxxxxxxxxxxxxx; dgregor@xxxxxxxxxxxxxx
Subject: Generalized Request Progress
Hello,
Douglas Gregor and I tried to port LibNBC [1] to MPI generalized
requests so that MPI_Wait{all|some}() can be used in a specific
application scenario.
LibNBC uses an internal state machine to represent the stage (round)
of the collective operation. A progress function (in this case
NBC_Test()) needs to be called to ensure internal progress of the
collective operation.
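To illustrate the round-based scheme described above, here is a minimal
sketch in plain C. It is not LibNBC's actual code: the names
sketch_handle, sketch_poll and sketch_test are hypothetical, and the
"network" is a stand-in that retires one pending operation per call.
The point is only that the state machine advances when, and only when,
somebody calls the test function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handle for a collective that runs in a fixed number of rounds. */
typedef struct {
    int round;        /* index of the round currently in flight */
    int num_rounds;   /* total number of rounds in the schedule */
    int pending_ops;  /* operations still outstanding in the current round */
} sketch_handle;

/* Stand-in for polling the network: each call retires one pending operation. */
static void sketch_poll(sketch_handle *h) {
    if (h->pending_ops > 0)
        h->pending_ops--;
}

/* Progress function: returns 1 once all rounds are done, 0 if it must be
 * called again. Nothing advances unless this function is called. */
int sketch_test(sketch_handle *h, int ops_per_round) {
    assert(h != NULL);
    if (h->round >= h->num_rounds)
        return 1;
    sketch_poll(h);
    if (h->pending_ops == 0) {
        h->round++;                       /* current round finished */
        if (h->round >= h->num_rounds)
            return 1;
        h->pending_ops = ops_per_round;   /* start the next round */
    }
    return 0;
}

/* Counts how many sketch_test() calls are needed to finish the operation. */
int sketch_calls_until_done(int num_rounds, int ops_per_round) {
    sketch_handle h = { 0, num_rounds, ops_per_round };
    int calls = 1;
    while (!sketch_test(&h, ops_per_round))
        calls++;
    return calls;
}
```

In this toy model, sketch_calls_until_done(3, 2) needs six calls: two
per round for three rounds. If the caller stops calling sketch_test(),
the handle simply stays in its current round.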
We realized that there is no way to ensure progress with generalized
requests (without threads) because there is no function that is
called every time the user calls test or wait on the request (the
query_fn is only called after the request has completed). This seems
to force a threaded implementation of any non-blocking operation that
uses generalized requests, which is suboptimal in some use cases.
First, not all parallel environments support threads (e.g.,
Catamount). Second, we show in [2] that, at least for some
applications, a non-threaded implementation may perform better than a
threaded one.
We would like to add some way to progress the operation without
threads.
We consider this issue a bug that renders the current interface
useless for several use cases, and it is easy to fix (thus, we'd like
to see a fix in MPI 2.1). Two possible fixes seem reasonable and easy
to implement:
1) change the behavior of query_fn so that it is called every time the
associated request is tested or waited on (spinning in case of wait).
This may be suboptimal because it changes the defined behavior and
could break existing code.
2) add a new function "MPI_Grequest_start_progress(progress_fn,
query_fn, free_fn, cancel_fn, data, request)" that registers a
progress function to be called whenever the associated request is
tested or waited on.
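The difference between the current behavior and option 2 can be
sketched with a small plain-C model. This is not MPI code and not a
proposed implementation: model_request, model_test and the signature
of model_progress_fn are hypothetical stand-ins, and the query, free
and cancel callbacks are left out. A NULL progress hook models a
request created with today's MPI_Grequest_start; a non-NULL hook
models the proposed MPI_Grequest_start_progress.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signature for the proposed progress callback. */
typedef int (*model_progress_fn)(void *extra_state);

/* Toy stand-in for a generalized request. */
typedef struct {
    int complete;                /* set by model_grequest_complete() */
    model_progress_fn progress;  /* NULL models the current interface */
    void *extra_state;           /* user data handed to the callback */
} model_request;

/* Models MPI_Grequest_complete: the user marks the request as finished. */
void model_grequest_complete(model_request *req) {
    req->complete = 1;
}

/* Models a test call on the request. With the current interface no user
 * code runs here; with the proposed hook the progress function is called
 * on every test of an incomplete request. */
int model_test(model_request *req) {
    assert(req != NULL);
    if (!req->complete && req->progress != NULL)
        req->progress(req->extra_state);
    return req->complete;
}

/* Example operation that finishes after a fixed number of progress steps. */
typedef struct {
    int steps_left;
    model_request *req;
} model_op;

static int model_op_progress(void *extra_state) {
    model_op *op = extra_state;
    if (op->steps_left > 0 && --op->steps_left == 0)
        model_grequest_complete(op->req);
    return 0;
}

/* Polls the request up to max_polls times. Returns the number of polls
 * needed to see completion, or -1 if the request never completed. */
int model_poll(int steps, int with_progress, int max_polls) {
    model_request req = { 0, NULL, NULL };
    model_op op = { steps, &req };
    if (with_progress) {
        req.progress = model_op_progress;
        req.extra_state = &op;
    }
    for (int i = 1; i <= max_polls; i++)
        if (model_test(&req))
            return i;
    return -1;
}
```

In this model, model_poll(3, 1, 10) returns 3 because each test call
drives one step, while model_poll(3, 0, 10) returns -1: without the
hook nothing ever calls model_grequest_complete() unless a separate
thread does the work.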
Any comments? We are of course open to other suggestions and
discussion!
Thanks & Best,
Torsten
[1]: http://www.unixer.de/research/nbcoll/libnbc/
[2]: T. Hoefler, P. Kambadur, R. L. Graham, G. Shipman, and
A. Lumsdaine: A Case for Standard Non-Blocking Collective Operations
(accepted for publication at EuroPVM/MPI 2007)
--
bash$ :(){ :|:&};: --------------------- http://www.unixer.de/ -----
Indiana University | http://www.indiana.edu
Open Systems Lab | http://osl.iu.edu/
150 S. Woodlawn Ave. | Bloomington, IN, 474045-7104 | USA
Lindley Hall Room 135 | +01 (812) 855-3608