Because there are sometimes big differences between
the thinking of parallel programming specialists :-(see annex A)
and the thinking of compiler specialists (see annex B ;-)
about the task and implementation of compilers,
I think the following advice is necessary:
- in chapter 4.3 PUT/GET:
Advice to users. Normally an optimizing compiler is allowed
to keep the target window on the target host in a
register. Therefore the application on the target host
may never see the new values stored by
MPI_PUT. In ANSI C the application should therefore
declare the target window as volatile. In Fortran there
are three possibilities: 1. use the non-standard volatile
declaration (supported by most compilers, except e.g. Cray),
2. allocate the target window in a common block and call a
library routine (e.g. MPI_TEST or MPI_GET_COUNTER) before
accessing the target window, or 3. call a library
routine (e.g. MPI_TEST or MPI_GET_COUNTER) before accessing
the target window and pass the target window as an additional
argument in that call. (End of advice to users.)
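The C part of this advice could look roughly like the
following sketch. It is only an illustration under stated
assumptions: simulated_remote_put and expose_window are
hypothetical stand-ins for a remote MPI_PUT (they are not MPI
routines), and the window size of 4 is arbitrary. The point is
only the volatile declaration of the target window, which
forces every read to go to memory.

```c
#include <assert.h>
#include <stddef.h>

/* Target window, declared volatile so that the compiler may not
   keep its contents in a register across calls. */
static volatile int target_window[4];

/* Hypothetical stand-in for the address a remote MPI_PUT writes to. */
static volatile int *exposed_addr = NULL;

/* Hypothetical: remember where the window lives. */
static void expose_window(volatile int *base)
{
    exposed_addr = base;
}

/* Hypothetical: store a value into the window "from outside". */
static void simulated_remote_put(size_t index, int value)
{
    assert(exposed_addr != NULL);
    exposed_addr[index] = value;
}

/* Mirrors the sequence of annex B: set, double, external update, read. */
int read_after_put(void)
{
    expose_window(target_window);
    target_window[0] = 11;
    target_window[0] = 2 * target_window[0];            /* now 22 */
    simulated_remote_put(0, 9000 + target_window[0]);   /* stores 9022 */
    return target_window[0];    /* volatile: re-read from memory */
}
```

With the volatile qualifier, the final read must fetch the
value from memory, so the update made behind the application's
back is visible.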
- in chapter 4.10 Handlers
(To be appended to the advice to users:)
Because the handler routine can be called at any time,
the application itself and a handler routine must
be very careful with global variables that both of them use.
In addition to using MPI_HANDLER_LOCK,
in C such non-extern global variables should be declared
as volatile, because the compiler can eliminate memory
accesses due to its knowledge that MPI_HANDLER_LOCK
cannot access these global variables; in Fortran this
problem does not arise if the application uses common
blocks and accesses them only between a call to
MPI_HANDLER_LOCK and a call to MPI_HANDLER_UNLOCK.
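A minimal C sketch of this pattern, under stated assumptions:
handler_lock and handler_unlock are hypothetical no-op
stand-ins for MPI_HANDLER_LOCK and MPI_HANDLER_UNLOCK (their
real signatures are not assumed here), and request_handler
stands for a handler that may run at any time. The relevant
detail is the file-scope (non-extern) variable declared as
volatile.

```c
#include <assert.h>

/* Non-extern global shared between application and handler.
   Without volatile the compiler could cache it in a register,
   because it knows the lock routines cannot access it. */
static volatile int pending_requests = 0;

/* Hypothetical stand-ins for MPI_HANDLER_LOCK / MPI_HANDLER_UNLOCK. */
static int lock_depth = 0;

static void handler_lock(void)
{
    lock_depth++;
}

static void handler_unlock(void)
{
    assert(lock_depth > 0);
    lock_depth--;
}

/* Handler routine: may be invoked at any time outside the lock. */
void request_handler(void)
{
    pending_requests = pending_requests + 1;
}

/* Application side: touch the shared variable only inside the lock. */
int drain_requests(void)
{
    int seen;

    handler_lock();
    seen = pending_requests;    /* volatile: read from memory */
    pending_requests = 0;       /* volatile: written to memory */
    handler_unlock();
    return seen;
}
```

For example, after two calls of request_handler, drain_requests
returns 2, and a second call of drain_requests returns 0.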
Annex C comments on the history behind this advice.
-----------------------------------------------------------
Annex A
Draft mpi-1sided, 26-FEB-96, page 8, lines 23-37
"This problem, however, will not occur ...;
this will cause the registers to be saved in memory,
before the put occurs."
-----------------------------------------------------------
Annex B
The following example shows that there are compilers that
DO NOT SAVE THE REGISTERS INTO THE MEMORY BEFORE
CALLING A SUBROUTINE (e.g. any MPI routine)!
awsws6 18% uname -a
ULTRIX awsws6.rus.uni-stuttgart.de 4.3 0 RISC
awsws6 19% cat vola_test.f
      program volatest
      volatile vola
      integer vola
      integer vola_addr
      integer new_vola
      integer nonvola
      integer nonvola_addr
      integer new_nonvola
      vola_addr = loc(vola)
      vola = 11
      write (*,*) 'vola=',vola
      vola = 2*vola
      call handler(vola_addr)
      new_vola = vola
      write (*,*) 'new_vola=',new_vola
      nonvola_addr = loc(nonvola)
      nonvola = 44
      write (*,*) 'nonvola=',nonvola
      nonvola = 2*nonvola
      call handler(nonvola_addr)
      new_nonvola = nonvola
      write (*,*) 'new_nonvola=',new_nonvola
      stop
      end
      subroutine handler(vvv_addr)
      integer vvv_addr
      integer mem(1), mem_addr
      mem_addr = loc(mem)
      write(*,*) ' handler old value=',mem(1+(vvv_addr-mem_addr)/4)
      mem(1+(vvv_addr-mem_addr)/4) = 9000+mem(1+(vvv_addr-mem_addr)/4)
      write(*,*) ' handler new value=',mem(1+(vvv_addr-mem_addr)/4)
      return
      end
awsws6 20% f77 -o vola_test vola_test.f
awsws6 21% vola_test
vola= 11
handler old value= 22
handler new value= 9022
new_vola= 9022
nonvola= 44
handler old value= 44 ! Marc would expect 88 here
handler new value= 9044 ! and 9088 here
new_nonvola= 88 ! and 9088 here
awsws6 22%
(This compiler has the "philosophy" that the called routine
must preserve the registers' contents.)
------------------------------------------------------------
Annex C
We had to learn in the DFN-RPC project that such advice does not
prevent users from getting wrong results: they did
not understand our advice and therefore did not use volatile,
but they did find such optimizing compilers! :-(
------------------------------------------------------------
Comments?
Rolf Rabenseifner (Computer Center )
Rechenzentrum Universitaet Stuttgart (University of Stuttgart)
Allmandring 30 Phone: ++49 711 6855530
D-70550 Stuttgart 80 FAX: ++49 711 6787626
Germany rabenseifner@rus.uni-stuttgart.de