mpi-io, mpi-dynamic generalizations

Richard Frost (frost@SDSC.EDU)
Mon, 23 Oct 1995 10:32:40 +0059 (PDT)

Hi folks,

I'm working on a project for ARPA and IBM involving the integration
of applications, web-browsers, databases (DB2 & Illustra), and
archival storage (Unitree & HPSS).

MPI will play a key role and I hope to leverage MPI-IO as well.
In particular, we are driven by the use of MPI derived data types.

The following 2 components are necessary to complete a prototype system
using MPI. I'm interested in learning about similar efforts before
defining an API.

1. An on-the-wire protocol for transmitting derived data types,
and a mechanism for run-time parsing of these types on the
receiving end.

For example, consider a parallel DBMS responding to a
request from a parallel computation. We would like the DBMS
to store the derived data type as metadata for reuse at a
later run time instead of linking the DBMS to an intractable
number of hardcoded data types. (Yes, this is an MPI/SQL
interface.)

In particular, DB2 would need an MPI-IO-like module (MDIO)
for interfacing with computational applications and
browsers; and another module for interfacing with archival
storage. SQL transactions would only occur between the
DBMS modules and the DBMS.
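
To make item 1 concrete, here is a rough sketch of what such a wire
format might look like. It is only an illustration, not a proposed API:
the type codes, the tuple representation, and the names encode/decode
are all hypothetical. It assumes a derived type can be described as a
tree of constructors (mirroring MPI_Type_contiguous and MPI_Type_vector)
over primitive types, and that the receiver rebuilds the tree at run time.

```python
import struct

# Hypothetical one-byte codes for this sketch; none of these are defined by MPI.
PRIM = {"int32": 1, "float64": 2}
PRIM_NAME = {code: name for name, code in PRIM.items()}
T_CONTIG, T_VECTOR = 100, 101


def encode(t):
    """Serialize a nested type description to bytes in network byte order.

    A type is a tuple: ("int32",), ("contiguous", count, base),
    or ("vector", count, blocklength, stride, base).
    """
    kind = t[0]
    if kind in PRIM:
        return struct.pack("!B", PRIM[kind])
    if kind == "contiguous":
        _, count, base = t
        return struct.pack("!BI", T_CONTIG, count) + encode(base)
    if kind == "vector":
        _, count, blocklen, stride, base = t
        header = struct.pack("!BIIi", T_VECTOR, count, blocklen, stride)
        return header + encode(base)
    raise ValueError("unknown type constructor: %r" % (kind,))


def decode(buf, pos=0):
    """Parse one type description starting at pos; return (type, next_pos)."""
    (code,) = struct.unpack_from("!B", buf, pos)
    pos += 1
    if code in PRIM_NAME:
        return (PRIM_NAME[code],), pos
    if code == T_CONTIG:
        (count,) = struct.unpack_from("!I", buf, pos)
        base, pos = decode(buf, pos + 4)
        return ("contiguous", count, base), pos
    if code == T_VECTOR:
        count, blocklen, stride = struct.unpack_from("!IIi", buf, pos)
        base, pos = decode(buf, pos + 12)
        return ("vector", count, blocklen, stride, base), pos
    raise ValueError("unknown type code: %d" % code)
```

The DBMS side could store the encoded bytes as metadata and call
decode() when the type is needed again, rather than being linked against
each application's types.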

2. A generalization of files; i.e., we need to overload
MPI-IO's concept of files so that we can perform I/O
transfers between agents (databases, web-browsers, archival
storage systems) along with actual file systems.

For example, a file might sometimes be designated by a Unix
file descriptor, and at other times by an external communicator
containing an arbitrary number of processes. We need to
facilitate M-to-N I/O transactions between arbitrary
information sources and consumers in a way that gives
applications a simplified view of open, read, write, close.
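
A minimal sketch of the idea in item 2, under heavy assumptions: the
names FdFile, GroupFile and gfile_open are hypothetical, and a list of
in-memory buffers stands in for the processes of an external
communicator. The point is only that the caller sees the same
read/write/close calls whichever kind of target is behind the handle.

```python
import os


class FdFile:
    """Generalized file backed by a Unix file descriptor."""

    def __init__(self, fd):
        self.fd = fd

    def write(self, data):
        return os.write(self.fd, data)

    def read(self, n):
        return os.read(self.fd, n)

    def close(self):
        os.close(self.fd)


class GroupFile:
    """Generalized file backed by N peers.

    Each peer is a bytearray here; in a real system it would be a process
    in an external communicator (an assumption of this sketch).
    """

    def __init__(self, peers):
        self.peers = peers

    def write(self, data):
        # Block distribution: peer i receives the i-th contiguous slice.
        n = len(self.peers)
        chunk = -(-len(data) // n)  # ceiling division
        for i, peer in enumerate(self.peers):
            peer += data[i * chunk:(i + 1) * chunk]
        return len(data)

    def read(self, n):
        # Gather up to n bytes by draining the peers in rank order.
        out = bytearray()
        for peer in self.peers:
            take = min(n - len(out), len(peer))
            out += peer[:take]
            del peer[:take]
        return bytes(out)

    def close(self):
        self.peers = []


def gfile_open(target):
    """Return a handle for an integer file descriptor or a list of peers."""
    if isinstance(target, int):
        return FdFile(target)
    return GroupFile(list(target))
```

An application would call gfile_open() with whatever designates the
source or consumer and then use read/write/close without caring which
kind it got.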

See the "Example system architecture" on
http://www.sdsc.edu/EnablingTech/MassDataAnal/MassDataAnal.html for a
diagram.

Thanks,

Richard Frost
SDSC