The question is how many applications will be written to interface with them,
including O/S software. The choices will be Parallel SQL, OLE, and MPI.
- SQL has the capability I am pressing you for.
- OLE is a common-object approach: any player must be linked to
the object library module for that datatype.
MPI will be useless for this task unless it can instantiate foreign
types. The same goes for parallel archival storage systems.
> Heterogeneous Parallel Computation
> I just don't believe that this is an issue at all for normal
> heterogeneous computing. MPI already allows you easily to send
> structures between hetero machines. As I tried to say before, most
> applications don't handle data they don't understand the layout of.
Future systems, those which we characterize as "data mining", will
be plentiful. Again, choose PSQL or MPI -- if MPI can meet the
challenge.
>
> I still think I'm missing something in this scenario of yours :-
>
Thanks for being open!
> > 5. The DBMS has stored the CHAR_DATATYPE as metadata. It
> > constructs the datatype and signals the application
> > to initiate an asynchronous receive for the data.
> What is this CHAR_DATATYPE, and who is using it ?
This is the flattened representation of the MPI datatype in the
current MPI external proposal, used by MPI_GET_CHAR_DATATYPE.
The DBMS needs this datatype so that it can construct it even
when the application that created it is not "around". It especially
needs it when doing M-node to N-node collective I/O transfers
between other 3rd parties, other applications, and even when
moving data (in the backend) from its disk cache to archival
storage and back.
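
To make the idea concrete, here is a minimal sketch of flattening a
datatype description to a character string and constructing it again
without the creating application. It does not use the format of the MPI
external proposal or of MPI_GET_CHAR_DATATYPE; the descriptor layout,
the JSON encoding, and the names flatten and construct are all
assumptions made for illustration.

```python
import json
import struct

# Hypothetical descriptor: (field name, struct format code, element count).
particle_type = [("id", "i", 1), ("position", "d", 3), ("mass", "d", 1)]


def flatten(descriptor):
    """Serialize a datatype descriptor to a character string (metadata)."""
    return json.dumps(descriptor)


def construct(flat):
    """Rebuild a packed, little-endian struct layout from the flattened string."""
    fields = [tuple(f) for f in json.loads(flat)]
    fmt = "<" + "".join(f"{count}{code}" for _, code, count in fields)
    return struct.Struct(fmt), fields


# The creating application stores the flattened form alongside the data.
stored_metadata = flatten(particle_type)
stored_record = struct.pack("<i3dd", 42, 1.0, 2.0, 3.0, 0.5)

# Later, a third party constructs the type from the metadata alone.
layout, fields = construct(stored_metadata)
values = layout.unpack(stored_record)
```

The point of the sketch is only that the stored string is sufficient:
the reader needs neither the original program nor its headers.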
When the fine structure of the data layout is known, the DBMS can
achieve significant optimizations in data transfer.
> (Given that the user asked only for a slice of the
> data this information can only be calculated once we have the user
> request).
This latency can be avoided, and is avoided by current DBMS systems.
But the layout needs to be known in advance at both ends.
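
One kind of optimization this permits can be sketched as follows: if
both ends know the field offsets in advance, a request for a slice can
be turned into a short list of byte ranges, with adjacent ranges merged
so that fewer transfers are issued. This is an illustration under
assumed names (byte_ranges, the layout tuples) and an assumed fixed
record size, not a description of what any particular DBMS does.

```python
def byte_ranges(field_layout, wanted, record_size, n_records):
    """Return coalesced (start, length) byte ranges covering the wanted fields.

    field_layout is a list of (name, offset, size) tuples sorted by offset;
    records are assumed to be stored back to back with a fixed record_size.
    """
    ranges = []
    for r in range(n_records):
        base = r * record_size
        for name, offset, size in field_layout:
            if name not in wanted:
                continue
            start = base + offset
            if ranges and ranges[-1][0] + ranges[-1][1] == start:
                # Adjacent to the previous range: extend it instead of adding one.
                ranges[-1] = (ranges[-1][0], ranges[-1][1] + size)
            else:
                ranges.append((start, size))
    return ranges


# Hypothetical 36-byte record: id (4 bytes), position (24), mass (8).
layout = [("id", 0, 4), ("position", 4, 24), ("mass", 28, 8)]

slice_ranges = byte_ranges(layout, {"position", "mass"}, 36, 2)
full_ranges = byte_ranges(layout, {"id", "position", "mass"}, 36, 2)
```

With the layout known ahead of time, the range computation needs only
the slice request itself; nothing about the type has to be exchanged
when the request arrives.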
>
> If the vis program is going to display the data it better know how it
> is laid out in its store. If it knows it can tell MPI. If it doesn't
> it can't do anything.
Future systems need not know the full data type in advance. If they
"discover" a data set containing metadata of interest, and subcomponents
which they can visualize, then the ability to instantiate a foreign
datatype and extract known subtypes is very interesting. Consider a
web browser scanning and a DBMS browsing an archival storage array.
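
A rough sketch of what "extract known subtypes" could look like: the
consumer reads a descriptor it has never seen, decodes the fields whose
kind it recognizes, and skips the rest by size. The descriptor format,
the kind names, and the function extract_known are hypothetical; the
only requirement assumed here is that the metadata records the size of
every field, including the opaque ones.

```python
import struct

# Kinds this consumer knows how to decode (hypothetical names).
KNOWN = {"int32": "i", "float64": "d"}


def extract_known(descriptor, record):
    """Decode fields of a known kind; skip unknown fields using their size.

    descriptor is a list of (name, kind, count, element_size) tuples
    describing a packed, little-endian record.
    """
    out = {}
    offset = 0
    for name, kind, count, elem_size in descriptor:
        if kind in KNOWN:
            values = struct.unpack_from(f"<{count}{KNOWN[kind]}", record, offset)
            out[name] = values[0] if count == 1 else list(values)
        # Advance past the field whether or not it was decoded.
        offset += count * elem_size
    return out


# A foreign type containing one field the consumer cannot interpret.
foreign = [
    ("id", "int32", 1, 4),
    ("vendor_blob", "opaque", 16, 1),
    ("position", "float64", 3, 8),
]
record = struct.pack("<i16s3d", 7, b"\x00" * 16, 1.0, 2.0, 3.0)

result = extract_known(foreign, record)
```

Here the visualizer would get the id and position and simply ignore the
vendor-specific part, without linking against a class library for it.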
Another trend for future systems is the elimination of file names as
we know them. The desktop hasn't seen the limit yet, but larger systems
have already hit inode and other file count limits. Further, users
hit their own limit long before this and turn to DBMS-like tools to
manage the hordes of data sets (formerly files) they are creating.
Personally, I would rather be able to instantiate foreign types on
the fly instead of having to link with a limited number of data classes
(CORBA, OLE) or use PSQL. If I use the latter, there is no need
for MPI.
Richard Frost
SDSC