
Re: [mpi-21] MPI Forum suggestions 11/14



If one cannot read from the send buffer while an MPI send is in progress, then it seems to me that all Java and C# MPI bindings are in violation of the MPI standard. The garbage collector can come through and scan memory at any time.

	- Doug

On Nov 30, 2007, at 2:20 AM, Erez Haba wrote:

Do you favor implementers over MPI users?

The programming model is very important for users. This specific restriction puts users knowingly (unlikely) or unknowingly (very likely) in direct violation of the standard. We might choose not to change it, but this will continue to surprise application developers. I expect to find many existing applications violating this MPI restriction unknowingly.

That issue is delicate, that's true. I would like to hear about one implementation and/or application that takes advantage of this restriction, and why.

Thanks,
.Erez

-----Original Message-----
From: owner-mpi-21@xxxxxxxxxxxxx [mailto:owner-mpi-21@mpi-forum.org] On Behalf Of Rusty Lusk
Sent: Thursday, November 29, 2007 7:25 PM
To: mpi-21@xxxxxxxxxxxxx
Subject: Re: [mpi-21] MPI Forum suggestions 11/14


Hi Steven,

An explicit goal of the MPI Forum was to constrain implementations as
little as possible.   This is just an example.  I think it was a good
goal, and encouraged implementations.  If we break existing
implementations by invalidating them, we will expose ourselves to
being ignored.   This is not to say that it must be one way or the
other, only that the issue is a delicate one, and that the
conservative approach (not changing the existing bindings) has merit.

Rusty

On Nov 29, 2007, at 5:54 PM, Steven Ericsson-Zenith wrote:


Of course, Rusty, you know that this is an implementation detail invading the programming model. Is this pragmatic accommodation of early implementations still required?

With respect,
Steven

--
Dr. Steven Ericsson-Zenith
Institute for Advanced Science & Engineering
http://iase.info
http://senses.info



On Nov 29, 2007, at 2:18 PM, Rusty Lusk wrote:

Unless I misunderstand the issue you are discussing, the original
intent of the MPI Forum did not have to do with cache issues at
all, but deliberately intended to make the buffer available to the
MPI implementation for it to use as it saw fit.  One illustrative
example is that at least one implementation, and maybe more, used
the buffer to do in-place byte swapping before sending the buffer
to another process running on a machine with a different byte
order.  If you envision a set of sends going to different machines
with different byte orders, you can see why each send buffer must
be left alone by the application until the Isend completes.

On Nov 29, 2007, at 4:10 PM, Hubert Ritzdorf wrote:



Erez Haba wrote:
The 'const' keyword is a contract saying that the called
function will not alter the buffer in any way. A copy of the
buffer is required only if the function implementation needs to
change the buffer before sending.

I fail to see how Hubert's sequence of events implies the
restrictions put in the standard.
This implementation can result in different values being sent to
process B and process A (assuming sending to self is implemented
as reading from the cache) which is incorrect.

This would be no problem since the cache is also bypassed when
sending data to self.
The description below seems more like an application issue
rather than MPI library issue. Such an application should flush
the cache at the right time before calling MPI_Send, if it wants
to maintain coherency.

I agree.

Hubert
Thanks,
.Erez

-----Original Message-----
From: owner-mpi-21@xxxxxxxxxxxxx [mailto:owner-mpi-21@mpi-forum.org] On Behalf Of Steven Ericsson-Zenith
Sent: Tuesday, November 20, 2007 11:20 AM
To: mpi-21@xxxxxxxxxxxxx
Subject: Re: [mpi-21] MPI Forum suggestions 11/14


The second step here is a violation of any parallel construction - since it is a write. The question here did not allow for a concurrent write. It was rather: is there a set of state transitions in which a concurrent read would cause a cache flush and potentially apprehend a differing value. The answer appears to be no.

The coherency problem would indeed be solved if the API had specified a const buffer. The only problem I see with a use of const is that it may require an extra copy operation at the function interface - and this is undesirable. It's an old problem too (and one of the reasons I originally invented the Ease primitives).


With respect,
Steven

--
Dr. Steven Ericsson-Zenith
Institute for Advanced Science & Engineering
http://iase.info
http://senses.info



On Nov 19, 2007, at 11:40 PM, Hubert Ritzdorf wrote:


(*) Assume MPI process A has an int buffer "buf" and has set "buf[0] = 0". The cache and main memory both contain buf[0] = 0.
(*) Another MPI process or thread sets buf[0] = 1 after it was triggered to do this. buf[0] is now 1 in memory. But if MPI process A now performed a = buf[0], the value of a would be 0 (since the SX system is not cache coherent and the cache still contains buf[0] = 0).
(*) Now MPI process A calls MPI_Send (buf, ....)
(*) If buf[0] is still in the cache and MPI_Send performed b = buf[0], b would still get the value 0.
(*) Now, the buffer "buf" is sent to the destination process. When "sending" the data to another MPI process B in the same node, the cache is bypassed and the correct value buf[0] = 1 is written ("sent") to MPI process B. In addition, the value of buf[0] is flushed within the cache.
(*) If MPI process A now performed c = buf[0] within MPI_Send, c would get the value 1.
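
The sequence above can be replayed with a toy model in C. This is a sketch under stated assumptions, not a description of the SX hardware or of any MPI library: the type word_t and the functions a_store, a_load, remote_store, send_read and replay are hypothetical names invented for illustration, and the "flush" is modelled as a simple invalidation of process A's cached copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only (hypothetical names): one word of main memory plus
 * process A's private, non-coherent cached copy of it. */
typedef struct {
    int  memory;       /* value held in main memory            */
    int  cached;       /* value held in A's cache line         */
    bool cache_valid;  /* true if A's cache holds the line     */
} word_t;

/* Process A stores through its cache (write-through in this model). */
static void a_store(word_t *w, int v) {
    assert(w != NULL);
    w->memory = v;
    w->cached = v;
    w->cache_valid = true;
}

/* Process A loads: a valid cache line wins, even if it is stale. */
static int a_load(word_t *w) {
    assert(w != NULL);
    if (!w->cache_valid) {
        w->cached = w->memory;   /* refill from memory on a miss */
        w->cache_valid = true;
    }
    return w->cached;
}

/* Another process or thread writes main memory; A's cache is not told. */
static void remote_store(word_t *w, int v) {
    assert(w != NULL);
    w->memory = v;
}

/* The transfer reads main memory directly (cache bypassed) and, as a
 * side effect, invalidates A's cached copy. Assumption of this model. */
static int send_read(word_t *w) {
    assert(w != NULL);
    w->cache_valid = false;
    return w->memory;
}

/* Replays the steps and records what each read observes. */
static void replay(int *a, int *b, int *sent, int *c) {
    word_t buf0;
    a_store(&buf0, 0);         /* buf[0] = 0 in cache and memory      */
    remote_store(&buf0, 1);    /* memory is 1, A's cache still holds 0 */
    *a = a_load(&buf0);        /* read before the send                 */
    *b = a_load(&buf0);        /* read at the start of the send        */
    *sent = send_read(&buf0);  /* value that goes to process B         */
    *c = a_load(&buf0);        /* read after the cache was flushed     */
}
```

In this model a reader in process A observes two different values for buf[0] around the transfer even though nothing in process A wrote to it, which is the effect the restriction on reading the send buffer is meant to cover.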


If I understand the meaning and definition of the keyword const correctly, the keyword const would not be applicable to "*buf". If I remember correctly, other vector systems were also not cache coherent. As far as I know, the IBM Cell processor is also not always cache coherent. Given the memory wall, future (scalar) systems may also not be cache coherent.

Hubert

Steven Ericsson-Zenith wrote:

I think that there are two issues here.

1. What exactly is the behavior that relates to this cache coherency that would distinguish the use of const from the absence of const?

2. Why would the standard accommodate such a proprietary behavior?

If the send buffer is read from memory and causes a cache flush that has not maintained coherency, isn't this a problem in any case? It appears to suggest that any value in the cache is irrelevant.

Perhaps you mean it the other way around: that it is the value in the cache that is the effective value of the send buffer, and that this value may be flushed if a read causes a cache update? This *would* potentially cause the effective value of the send buffer to change during the send.


With respect,
Steven


On Nov 18, 2007, at 4:55 PM, Erez Haba wrote:


Can you elaborate or send a pointer related to that architecture?

The const keyword is a contract between the caller and the callee. It means in our example that the MPI_Send function will not change the user supplied buffer. Is this the case with the SX systems?

-----Original Message-----
From: owner-mpi-21@xxxxxxxxxxxxx [mailto:owner-mpi-21@mpi-forum.org] On Behalf Of Hubert Ritzdorf
Sent: Sunday, November 18, 2007 11:27 AM
To: mpi-21@xxxxxxxxxxxxx
Subject: Re: [mpi-21] MPI Forum suggestions 11/14

This is correct. The SX systems are not cache coherent. The send buffer is read directly from memory and may cause a cache flush. Thus, the const keyword would not be correct.


Hubert Ritzdorf
NEC Laboratories Europe
IT Research Division
NEC Europe Ltd.


Snir, Marc wrote:

If I remember correctly, NEC or some other Japanese vendor had a system with non-coherent shared memory where this added restriction was important. I guess that a read could cause a flush to be canceled.

On 11/17/07 1:45 AM, "Steven Ericsson-Zenith"
<steven@xxxxxxxxxxxxx> wrote:



I think there is an intersection here with Erez's interest in const.

In a parallel language it should be possible to send a constant value and read it at the same time (formally that would be fine). Indeed, it is possible in Occam and Ease/Carnap. I don't know the semantics of const in C functions; I assume it requires either a constant parameter, a pass-by-value copy operation, or some other guarantee that the const is not written to (even by concurrent processes).


The issue in MPI has to do with the termination properties, as I recall; i.e., the validity of the send buffer is not deterministic since the read is not synchronized with the send (so you can't know in an interleaving that a subsequent alteration - including deallocation - to the buffer value has not occurred). Futzing with the send buffer during a send operation seems to me to be an unforgivable rationale.

It really is equivalent to the usage rules one finds in a process oriented language (like CSP/Occam or Ease/Carnap): an asynchronous operation is, in effect, a distinct process and the buffer (if it is a variable) ceases to be in scope in all other processes during the period of concurrency. Indeed, such variables can only come back into scope if there is a cooperative barrier; they cannot come back into scope from subordinate processes. Occam overcomes this by not allowing subordinates; Ease/Carnap overcomes this by only allowing constants (i.e., pass by value) to be passed to subordinates.

With respect,
Steven
--
Dr. Steven Ericsson-Zenith
Institute for Advanced Science & Engineering
http://iase.info
http://senses.info


On Nov 16, 2007, at 3:25 PM, Erez Haba wrote:



Thanks for the pointer!!!

Can someone elaborate on the rationale for prohibiting read access to a buffer being sent?

I think that it's more than a little loss of functionality. Consider a multi-threaded application that wishes to send some data to other ranks and at the same time process this data locally. Also consider the scenario where rank x sends the same data to rank y and z concurrently. Does the text below prohibit that?


Quote --------------

In a multi-threaded implementation of MPI, the system may de-schedule a thread that is blocked on a send or receive operation, and schedule another thread for execution in the same address space. In such a case it is the user's responsibility not to access or modify a communication buffer until the communication completes. Otherwise, the outcome of the computation is undefined.


Rationale.

We prohibit read accesses to a send buffer while it is being used, even though the send operation is not supposed to alter the content of this buffer. This may seem more stringent than necessary, but the additional restriction causes little loss of functionality and allows better performance on some systems --- consider the case where data transfer is done by a DMA engine that is not cache-coherent with the main processor. (End of rationale.)


------------


-----Original Message-----
From: Lisandro Dalcin [mailto:dalcinl@xxxxxxxxx]
Sent: Friday, November 16, 2007 3:03 PM
To: mpi-21@xxxxxxxxxxxxx
Cc: Erez Haba
Subject: Re: [mpi-21] MPI Forum suggestions 11/14

On 11/16/07, Erez Haba <erezh@xxxxxxxxxxxxx> wrote:


Such implementations would be really bad for multi-threaded applications, or those using async send. Is there an MPI contract that the application should not access the send buffer for read while the send is not complete?


AFAIK, there is. Please, see:

http://www.mpi-forum.org/docs/mpi-11-html/node40.html



-----Original Message-----
From: owner-mpi-21@xxxxxxxxxxxxx [mailto:owner-mpi-21@mpi-forum.org] On Behalf Of Darius Buntinas
Sent: Friday, November 16, 2007 1:11 PM
To: mpi-21@xxxxxxxxxxxxx
Subject: Re: [mpi-21] MPI Forum suggestions 11/14


I believe the standard allows an implementation to modify the buffer in a send call, as long as it's changed back before it completes.

I heard (don't remember from whom) that there was an implementation that would do endian conversion in place on the source buffer before sending, then change it back.

Changing the standard to require the buffer pointer to be const could theoretically disallow that implementation. So adding consts wouldn't be a minor change.
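
A minimal sketch of what such an implementation might look like, for illustration only. It does not reproduce any real MPI library: toy_send_swapped, swap32_in_place, wire_fn and capture_first are hypothetical names, and the "wire" is just a callback that observes the buffer during the transfer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of each 32-bit word in place. */
static void swap32_in_place(uint32_t *buf, size_t n) {
    for (size_t i = 0; i < n; i++) {
        uint32_t v = buf[i];
        buf[i] = (v >> 24) | ((v >> 8) & 0x0000FF00u) |
                 ((v << 8) & 0x00FF0000u) | (v << 24);
    }
}

/* Hypothetical stand-in for the actual transfer of the bytes. */
typedef void (*wire_fn)(const uint32_t *data, size_t n, void *ctx);

/* Sketch of a send that converts inside the user's buffer, transmits,
 * then restores it. Note that buf cannot be declared const here. */
static void toy_send_swapped(uint32_t *buf, size_t n, wire_fn wire, void *ctx) {
    assert(buf != NULL && wire != NULL);
    swap32_in_place(buf, n);  /* user buffer now holds the foreign byte order  */
    wire(buf, n, ctx);        /* a concurrent reader would see swapped data    */
    swap32_in_place(buf, n);  /* restore the original bytes before returning   */
}

/* Example wire callback: records the first word as seen during the transfer. */
static void capture_first(const uint32_t *data, size_t n, void *ctx) {
    if (n > 0) {
        *(uint32_t *)ctx = data[0];
    }
}
```

Calling toy_send_swapped(buf, 2, capture_first, &seen) on a buffer whose first word is 0x11223344 leaves the buffer unchanged afterwards, but the callback observes 0x44332211 in between. That intermediate state is what a thread reading the send buffer during the send could see, and it is also why a const-qualified buffer parameter would rule this strategy out.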


-d

On 11/16/2007 02:06 PM, David Gingold wrote:


On Nov 16, 2007, at 2:42 PM, Jeff Squyres wrote:



1. const: Sure, this seems like a good idea. But it *is* an API change, and I am in agreement with Bill Gropp that there should be no API changes for MPI 2.1 (except for bug fixes), even if the API changes supposedly won't matter (because someone will find a case where it does matter). More specifically: this is a slippery slope for 2.1. Making one "minor" change will lead to more minor changes which will lead to ...



If an implementation declares in its mpi.h:

int MPI_Send(const void *, int, MPI_Datatype, int, int, MPI_Comm);

does this violate the existing standard?

I don't see how this would break code that calls MPI_Send(). C libraries made a similar transition with the advent of ANSI C; I believe the const semantics are defined in a way that lets one do just this.

The const declaration allows an important optimization when compiling the calling functions. But if an implementation is allowed to do this, then perhaps there is no need to change the standard.
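
The point about existing callers can be checked with a small stand-alone sketch. It deliberately avoids mpi.h: toy_send is a hypothetical stand-in with a const-qualified buffer parameter, not the MPI_Send of any implementation, and the two caller functions are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a send routine declared with a const buffer.
 * It only reads the bytes and returns their sum. */
static unsigned toy_send(const void *buf, size_t len) {
    assert(buf != NULL || len == 0);
    const unsigned char *p = buf;
    unsigned sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += p[i];
    }
    return sum;
}

/* Caller code written for a non-const "void *" prototype still compiles:
 * a pointer to non-const data converts implicitly to "const void *". */
static unsigned caller_with_mutable_buffer(void) {
    unsigned char data[4] = {1, 2, 3, 4};
    return toy_send(data, sizeof data);
}

/* With the const prototype, read-only data can be passed without a cast. */
static unsigned caller_with_const_buffer(void) {
    static const unsigned char table[3] = {10, 20, 30};
    return toy_send(table, sizeof table);
}
```

Ordinary calls are therefore unaffected. One case that would notice the change is code that stores the address of the send routine in a function pointer declared with the old non-const parameter type, since the two function types are not compatible in C.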

-dg



--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería
(CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química
(INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas
(CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594









Marc Snir
Director, Illinois Informatics Initiative
U Illinois at Urbana-Champaign
LIS Bldg., Room 123
501 E Daniel St
Champaign, IL 61820
(217) 244 6568




--
Dr. Steven Ericsson-Zenith
Institute for Advanced Science & Engineering
http://iase.info
http://senses.info