Message Passing over SCI (fwd) [and realtime]

Tony Skjellum (tony@Aurora.CS.MsState.Edu)
Tue, 5 Mar 1996 15:14:47 -0600 (CST)

Forwarded message:
Date: Tue, 05 Mar 96 13:13:10 cst
From: "dwadams" <dwadams@cacd.rockwell.com>
Encoding: 75 Text
Message-Id: <9602058260.AA826053353@ccmgw1.cacd.rockwell.com>
To: sci@sunrise.scu.edu, tony@CS.MsState.Edu, green@xtp.com,
strayer@ca.sandia.gov, roark@mcopn.dseg.ti.com, weaver@virginia.edu,
bcm@sei.cmu.edu
Cc: dsburkha@cacd.rockwell.com, sghoffma@cacd.rockwell.com,
gejeffre@cacd.rockwell.com, mpi-impl@mcs.anl.gov
Subject: Message Passing over SCI
Sender: owner-mpi-impl@mcs.anl.gov
Precedence: bulk
Content-Type: text
Content-Length: 3944

We are working on a project in which we plan to use SCI in an
embedded, real-time, distributed processing application. For a myriad
of reasons I won't go into here, we want to use message passing
instead of the global addressability SCI offers. To meet our real-time
constraints, we require latencies to be as low as possible; under 100
microseconds would be a reasonable goal. We will be running the
message passing software in conjunction with a lean, deterministic,
microkernel-based real-time executive.

We would like to use either an open or de facto message passing
standard if we can find one that meets our latency and other
requirements. We think these other requirements are simple and
basically consist of the following:

(1) We want to create read/write "virtual associations/connections"
between concurrently executing processes in different SCI nodes. The
time required to establish (or tear down) associations isn't
particularly critical, e.g., could be in the millisecond range.
(2) Once established, the associations will remain in effect for
relatively long periods of time (minutes or hours).
(3) The low latencies are needed while the associations are
established, as we need to get data through (potentially) several
nodes in as short a time as possible, and to do so repeatedly, e.g.,
tens of times per second.
(4) There should be no limit on the number of virtual associations
established concurrently.
(5) During the time some associations are established, other
associations may come and go either between the same or different
processes on the same or different nodes.
(6) We would like (but don't require) multicast.
(7) Our (current) bandwidth requirements aren't particularly high,
perhaps 20% of the bandwidth of our initial SCI implementation.
(8) We need it NOW (!).

That's about it.
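
The association lifecycle in requirements (1) through (5) can be
sketched as follows. This is only an illustration under stated
assumptions: AssociationTable and its methods are hypothetical names,
not part of MPI, XTP, or any SCI software, and an in-memory dictionary
stands in for the real transport.

```python
import itertools


class AssociationTable:
    """Hypothetical registry of long-lived, point-to-point associations."""

    def __init__(self):
        self._next_id = itertools.count(1)
        # association id -> (source, destination, pending messages)
        self._assocs = {}

    def open(self, src, dst):
        # Setup may be slow (requirement 1); it only allocates state.
        assoc_id = next(self._next_id)
        self._assocs[assoc_id] = (src, dst, [])
        return assoc_id

    def send(self, assoc_id, payload):
        # Fast path (requirement 3): one lookup, no per-message setup.
        self._assocs[assoc_id][2].append(payload)

    def receive(self, assoc_id):
        pending = self._assocs[assoc_id][2]
        return pending.pop(0) if pending else None

    def close(self, assoc_id):
        # Teardown leaves all other associations untouched (requirement 5).
        del self._assocs[assoc_id]

    def active(self):
        return sorted(self._assocs)


table = AssociationTable()
a = table.open(("node0", "sensor"), ("node1", "filter"))
b = table.open(("node1", "filter"), ("node2", "display"))
table.send(a, b"sample")
table.close(b)  # association a keeps working after b goes away
```

The point of the sketch is the split between a slow, infrequent
open/close path and a per-message path that does nothing but a lookup;
that split is where the latency budget would have to be met.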

It seems there should be existing message passing protocols out there
we could use. If we were really lucky, there would even already be SCI
implementations.

To date, the primary contenders we've looked at are MPI (Message
Passing Interface) and XTP (eXpress Transfer Protocol). The following
are our current conclusions regarding these:

(1) MPI seems so general and so oriented towards parallel processing
that we frankly don't know if it can handle a job as simple as ours
without building layers on top of it to create the functionality
required.
(2) We are pretty convinced XTP provides all the needed functionality.
However, even though it is considered an LWP (LightWeight Protocol),
the only latency figures we've seen have been in the hundreds of
microseconds (on Intel 80x86 and IBM RS/6000 processors). Although it
could have been OS overhead (DOS and AIX respectively) causing these
figures to be so high, we'd rather not gamble.
We also don't know whether a version of XTP exists for SCI (and, if
so, what platform(s) it runs on).

In addition to the above, we understand there is a POSIX activity
underway (1003.21) to define a message passing standard for real-time
applications. It is my current understanding, however, that any
implementations of an approved standard are at least a couple of years
off. I also have concerns about whether the 1003.21 definition of
"real-time" meets mine.

In light of all the above, this message is being sent to SCI people
(on the reflector), people in the MPI and XTP forums, and to the chair
of the 1003.21 activity.

Soooo, if anybody out there has any information that would help me
out, please let me know.


Dan Adams
dwadams@cacd.rockwell.com
(319) 395-8226
(319) 395-2087