1) Does this apply only to MPI_COMM_WORLD? I think that is the intent.
2) Is MPI_COMM_SELF better? Maybe, maybe not.
3) When do the call-backs get executed?
I think we need to make this painfully clear. We need to say
that the call-backs are called in reverse order (which is already
said). We also need to say that it is AS THOUGH they were
called immediately before MPI_Finalize. We also need to say
that MPI_COMM_WORLD is not yet freed (maybe we need to rephrase
the part that indicates that the call-backs are invoked by
something like a free of MPI_COMM_WORLD).
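
To make the intended ordering concrete, here is a toy model in plain C. It is
only a sketch and does not use the MPI API: toy_runtime, toy_register,
toy_record, toy_finalize, and toy_demo are hypothetical names invented for
illustration, and world_valid merely stands in for "MPI_COMM_WORLD is not yet
freed". The point is the sequence inside toy_finalize: call-backs run in
reverse order of registration first, and only afterwards is the world
communicator marked as gone.

```c
#include <stdio.h>
#include <string.h>

#define TOY_MAX_CALLBACKS 8

struct toy_runtime;
/* Hypothetical call-back type; not an MPI signature. */
typedef void (*toy_callback)(struct toy_runtime *rt, int key);

typedef struct toy_runtime {
    toy_callback cbs[TOY_MAX_CALLBACKS];
    int keys[TOY_MAX_CALLBACKS];
    int count;
    int world_valid;   /* 1 while the toy "world communicator" is usable */
    int finalized;     /* 1 once toy_finalize has completed */
    char trace[64];    /* records the order in which call-backs ran */
} toy_runtime;

void toy_init(toy_runtime *rt) {
    memset(rt, 0, sizeof *rt);
    rt->world_valid = 1;
}

/* Register a call-back; returns 0 on success, -1 if full or finalized. */
int toy_register(toy_runtime *rt, toy_callback cb, int key) {
    if (rt->finalized || rt->count >= TOY_MAX_CALLBACKS) return -1;
    rt->cbs[rt->count] = cb;
    rt->keys[rt->count] = key;
    rt->count++;
    return 0;
}

/* Example call-back: appends the key, plus 'v' if the world is still valid
   at the time of the call or 'x' if it is not. */
void toy_record(toy_runtime *rt, int key) {
    char buf[16];
    snprintf(buf, sizeof buf, "%d%c ", key, rt->world_valid ? 'v' : 'x');
    strncat(rt->trace, buf, sizeof rt->trace - strlen(rt->trace) - 1);
}

/* Run call-backs in reverse registration order, as though they were called
   immediately before finalization; only then invalidate the world. */
void toy_finalize(toy_runtime *rt) {
    for (int i = rt->count - 1; i >= 0; i--) {
        rt->cbs[i](rt, rt->keys[i]);
    }
    rt->count = 0;
    rt->world_valid = 0;
    rt->finalized = 1;
}

/* Registers keys 1, 2, 3, finalizes, and returns the recorded trace. */
const char *toy_demo(void) {
    static toy_runtime rt;
    toy_init(&rt);
    toy_register(&rt, toy_record, 1);
    toy_register(&rt, toy_record, 2);
    toy_register(&rt, toy_record, 3);
    toy_finalize(&rt);
    return rt.trace;
}
```

In this model toy_demo() yields the trace "3v 2v 1v ": the call-backs fire in
reverse order, and each one still sees a valid world. If the wording instead
implied that the call-backs are triggered by freeing the communicator, a
reader could reasonably expect 'x' here, which is the ambiguity point 3 asks
us to remove.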