I also agree with Kelvin's technical observations on message scheduling
(can we perhaps include these in the user-defined profiles as we discussed,
as one typical scenario?).
I also have a minor observation that one of the reasons RMA does not extend
in the same provable framework to parallel machines is that it is a
scheduling-theoretic technique for perfect uniprocessors. In that sense, it
is well-known that proofs do not scale up (provably) to multiple processors
or any system with at least two blocking resources.
Alex.
-------------------------------------------------------------------------
Dr. Alexander D. Stoyen
President & CEO
21st Century Systems, Inc.
420 Hardscrabble Road
Chappaqua, NY 10514-3030
(914) 769-2939 (office)
(914) 769-0949 (fax)
Email: alex@21csi.com
----------
> From: Arkady Kanevskyby way of Kelvin Nilsen <kelvin@newmonics.com>
<arkady@linus.mitre.org>
> To: mpi-realtime@mcs.anl.gov
> Subject: Re: Mixed-model admission tests
> Date: Monday, December 16, 1996 3:36 PM
>
> Kelvin,
>
> These are a very good comemnts.
> Many of them we had discussed in the past many times.
> I will provide my feedback to them.
>
> The main difficulty comes from the underlying assumptions of the
> CPU priority scheduling models. Communication that we are "trying"
> to schedule is not a single resource. For one type of platform,
distributed
> shared memory, scheduling may require to schedule: application processor
> CPU (sender and receiver), data bus (sender and receiver),
> memory (sender and receiver), communication co-processor (sender and
> receiver),
> network interface chip (sender and receiver),
> underlying physical network that can have various topologies (like
> ring, mesh, cross bars). The schedulers (if they exist) for all these
> components of a platform potencially use different technologies that are
> most suitable for their own design. The RMA is very hard to adopt to
analyze
> all these resource that need to be schedule to provide "predictable"
> real-time behavior for communication. This is probably the reason there
> is no theoretical work that extended RMA to parallel platforms.
>
> The communication scheduling problem need to address scheduling of many
> heterogeneous resources and scheduling algorithms need to be flexible
> enough to handle scalability issues that are in the heart of parallel
> processing. Ever since the beginning of the MPI/RT meetings I advocated
> the position that the standard should not dictate how to schedule
> communication.
> Instead it should provide an API that supports user perspective for
> timing (and fault-tolerance). Then implementors can map user requests
> onto the scheduling primitives that their platforms use: priority,
calendars,
> events, signals and so on.
>
> A simple mapping that Kelvin presents for priority is a good starting
point
> and can be included as advice to implementors. But this example
> also illustrates the difficulties of scheduling communication. The worst
> case available utilization (WCAU) preseneted only covers single resource
-
> CPU.
> A simple extentions of WCAU to handle multiple resources is very
pessimistic
> with very low utilization which most of the high-performance users
> will not accept. MITRE looked into this problem and decided that for
> a large class of signal processing applications it is easier to take
> advantage of time-driven scheduling utilizing synchronized clocks that
> most high-performance platforms for real-time applications provide.
> That also allows MPI/RT implementation to utilize multiple routes to
avoid
> hot spots at the channel creation time to provide maximal support for
> real-time users.
>
> The MPI/RT implementation also releave users from the necessity of
figuring
> out "right" priorities and doing the "scheduling" and its analysis
> and allows then to concentrate on the "application". However, MPI/RT
> allows users to "play" with detail scheduling if they so desire.
> My vision that MPI/RT implementors know a lot more about the platform
> and hence should be able to provide real-time predictable behavior while
> maximizing "platform utilization" (not just individual resources like
CPU)
> better than the users. MPI/RT API allows to provide application
information
> detailed enough for an implementation to "utilize" the platform.
>
> Notice that for some platforms, like shared memory, the simple single
resource
> model is sufficient and hence Kelvin's suggestion will work well.
> But in general, to support scalability of applications a more
comprehensive
> analysis is needed.
>
> Arkady
>
> > From mpi-realtime-human@mcs.anl.gov Mon Dec 9 23:52:11 1996
> > X-Sender: kelvin@newmonics.com
> > X-Mailer: Windows Eudora Pro Version 3.0 (32)
> > Date: Mon, 09 Dec 1996 22:40:54 -0600
> > To: mpi-realtime@mcs.anl.gov
> > From: Kelvin Nilsen <kelvin@newmonics.com>
> > Subject: Mixed-model admission tests
> > Cc: kelvin@newmonics.com
> > Mime-Version: 1.0
> > Content-Type> : > text/plain> ; > charset="us-ascii">
> > Sender: owner-mpi-core@mcs.anl.gov
> > Content-Length: 7798
> >
> >
> > I attended the MPI/RT meeting on Dec. 5 and 6. This was my first real
> > involvement with MPI/RT. Since there are a lot of similarities in the
> > objectives and approaches of MPI/RT and my work with a real-time
variant of
> > Java (PERC), I am hopeful that our two groups can learn from each
other...
> > (To participate in the real-time-java mailing list, send a "subscribe"
> > message to real-time-java-request@iastate.edu. To review past
discussions,
> > refer to www.newmonics.com:Technologies:Real-Time Java.) I'm going to
use
> > the PERC name from this point forward (rather than real-time Java)
because
> > I do not have Sun's permission to use their trademarked name in this
> context.
> >
> > Both PERC and MPI/RT propose to accept responsibility for reliable
> > execution of dynamic real-time workloads. In both environments, a
workload
> > is proposed to the system, the system analyzes the proposed workload,
and
> > the system either accepts or rejects the workload. Once a workload has
> > been accepted, real-time constraints are "guaranteed" to be satisfied.
> >
> > In comments describing PERC implementation techniques, we point out
that
> > real-time threads can be scheduled either through the use of a static
> > cyclic schedule, or a dynamic static priority schedule, or a hybrid
between
> > the two. (Dynamic priority implementations would also be acceptable.)
The
> > current MPI/RT draft supports two modes of operation: time-driven and
> > event-driven. (I'm intentionally ignoring priority-driven for now).
> >
> > In the MPI/RT meeting, Arkady suggested that his hope was that we would
> > come to some sort of agreement as to how each mode of operation should
be
> > handled separately at first, and that we would later tackle the problem
of
> > mixing them together. I told him individually, and now in public, that
I'd
> > like to see some thought given to how the modes would eventually mix.
(I
> > have a sense that one reason it may be difficult to reach consensus on
> > certain design issues is because we don't have a good understanding of
the
> > inherent implementation complexities associated with particular API
> features.)
> >
> > These are my thoughts regarding one possible implementation of a
> > mixed-model admissions test. I welcome feedback and suggestions for
> > improvement.
> >
> > Assumptions:
> > 1. A static cyclic schedule has already been constructed
> > 2. The static schedule was constructed so as to minimize the
durations
> > of CPU time that are allocated to static tasks. e.g. the static
> > cyclic scheduler prefers:
> > _______ ________
> > ________|___A___|________|____B___|_________
> > 0 10 20 30 40 50
> >
> > over:
> > _______ ________
> > ________|___A___|____B___|__________________
> > 0 10 20 30 40 50
> >
> > 3. Our job, given an existing static schedule and a proposed workload
> > of dynamic tasks, is to decide whether to accept responsibility
for
> > execution of the dynamic tasks.
> >
> > The analysis is based on the techniques used by Liu-Layland to derive
the
> > standard RMA result. To illustrate the analysis, consider the
following
> > static cyclic schedule, of duration 70
> > ____ ____ ________
> > [__A_|____|_B__|_______|____C___|__]
> > 0 10 20 30 45 65
> >
> > 1. At the time the static schedule is constructed, we precompute some
> > tables which characterize the ways in which this static schedule
> > might interfere with dynamic tasks. In particular, we consider
that
> > the Liu-Layland critical instant might occur at time 0, time 20,
or
> > time 45 so we build interference graphs for each of those cases:
> >
> > Available CPU Time for Critical Point A
> >
> > 30 | .
> > | ____/
> > 20 | /
> > | /
> > 10 | __/
> > | /
> > 0 |_/
> > |_._._._._._._.
> > 0 20 40 60
> > 10 30 50 70
> >
> > Available CPU Time for Critical Point B
> >
> > 30 | .
> > | /
> > 20 | __/
> > | ____/
> > 10 | /
> > | /
> > 0 |_/
> > |_._._._._._._.
> > 0 20 40 60
> > 10 30 50 70
> >
> > Available CPU Time for Critical Point C
> >
> > 30 | .
> > | /
> > 20 | /
> > | __/
> > 10 | /
> > | __/
> > 0 |___/
> > |_._._._._._._.
> > 0 20 40 60
> > 10 30 50 70
> >
> > Now, what we really want to know is the worst-case
> > interference imposed by the existing static schedule on execution
> > of dynamic real-time tasks. In this case, the worst-case
interference
> > is represented by the third graph. In more complicated cases, it
is
> > possible that the worst-case interference graph would need to be
> > computed by merging the above graphs. The general technique is to
> > define the function as the minimum of the values of the individual
> > graphs for each possible value of the x coordinate. Note that
this
> > chart can be repeated to allow analysis of dynamic workloads with
> > task periods that exceed the length of the cyclic schedule.
> >
> > Worst-Case Available CPU Time
> >
> > 30 | .
> > | /
> > 20 | /
> > | __/
> > 10 | /
> > | __/
> > 0 |___/
> > |_._._._._._._.
> > 0 20 40 60
> > 10 30 50 70
> >
> > 2. Note that, in terms of the "Worst-Case Available CPU Time"
illustrated
> > above, the dynamic tasks will, at worst, be able to run only
during
> > times represented by upward slanting graph segments. Suppose
> > a certain dynamic workload can be sheduled in Y time in the
absence of
> > interference from statically scheduled tasks. On the above graph,
> > find the rightmost x coordinate that maps to the given Y and call
this
> > X. The worst-case amount by which this event will be delayed by
> > interference from statically scheduled tasks is X-Y.
> >
> > 3. Sort the proposed dynamic workload in terms of decreasing
execution
> > frequency. In order from highest to lowest priority, examine each
> > task:
> >
> > a) Compute its "raw" completion time in terms of its WCET and the
> > WCET and frequencies of all higher "priority" tasks.
> >
> > If its completion time adjusted for static schedule
interference
> > does not satisfy the task's latency requirement, reject the
> > workload as unschedulable.
> >
> > b) Add this task into the "interference" workload that is used to
> > compute completion times for all lower priority tasks.
> >
> > Some observations:
> >
> > 1. This assumes tasks (message transmissions) can be preempted at
arbitrary
> > times. If this is not the case, more detailed bookkeeping is
required.
> >
> > 2. This ignores the overhead of distributed dynamic scheduling. If
> multiple
> > nodes desire to transmit dynamic messages during time slots not
reserved
> > in the static schedule, there is a certain (in some cases
> > nondeterministic)
> > overhead associated with contention resolution protocols.
> >
> > 3. This analysis can be easily modified to support deadlies shorter
than
> > periods, such as is the case with so-called "emergency messages."
> >
> > I'd like discussion and comments on this... I do not propose this as
the
> > ultimate solution.
> >
> >
> > --
> > Kelvin Nilsen, President voice: 515-296-0897
> > NewMonics Inc. fax: 515-296-9910
> > 2501 N. Loop Dr., Suite 614A http://www.newmonics.com
> > Ames, IA 50010 email: kdn@newmonics.com
> >
>