Charlie Fineman
Rajeev Thakur writes:
>
> All this sounds fine. My only concern is the MPI_COMM_WORLD case,
> where the implementation must figure out the sets of processes
> printing out identical data and display it accordingly.
> I think that, in the general case, finding these subsets is quite a
> task, best left to a good quality debugger.
>
> A simpler approach is to require that in the MPI_COMM_WORLD case,
> all processes must output identical data, and only one copy gets
> printed. If they don't output identical data, the implementation flags
> an error. To find out the erroneous process(es), the user could print
> with MPI_COMM_SELF in the next run.
>
> Note that this is similar to the fsingl and fmulti modes in
> CUBIX. In fsingl mode (the default), all processes output the same
> data and only one copy is displayed. An error is flagged if the
> outputs are different. In fmulti mode, processes can print whatever
> they want, and it all gets printed. These modes also work for input.
>
> Rajeev
>
>
>
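To make the fsingl-style rule above concrete, here is a minimal sketch of the consistency check, assuming the per-rank output has already been gathered as text on one process. The helper name first_mismatched_rank and the array-of-strings layout are hypothetical; nothing here is part of MPI or CUBIX.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not an MPI or CUBIX routine.
 * lines[r] holds the text that rank r asked to print; n is the number
 * of ranks.  Returns -1 if every rank supplied identical text, otherwise
 * the lowest rank whose text differs from rank 0's. */
int first_mismatched_rank(const char *const lines[], int n)
{
    assert(lines != NULL);
    for (int r = 1; r < n; r++) {
        if (strcmp(lines[r], lines[0]) != 0)
            return r;
    }
    return -1;
}
```

A return of -1 means one copy can be printed; anything else means the implementation would flag an error. Note that the check only detects a disagreement. If rank 0 is the odd one out it reports rank 1, so the user would still rerun with MPI_COMM_SELF to see who printed what.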
> > From: John M May <johnmay@coral.llnl.gov>
> > Date: Wed, 19 Jun 1996 09:26:25 -0700 (PDT)
> >
> > Many programs use print statements for debugging or for reporting results.
> > Just as MPI's datatypes are useful for describing data in message passing
> > and file I/O operations, they could also be useful for rudimentary
> > formatting of output data. Furthermore, MPI's communicators can be used
> > to define which processors participate in an output function. This would
> > allow the system to combine output from multiple nodes into a single line.
> > Therefore, I would like to propose an MPI_Print function. Here are some
> > ideas on how it might work:
> >
> > MPI_PRINT( comm, buf, datatype, count, status )
> > IN comm [SAME] Set of processes to participate in output
> > IN buf Address of buffer to be written
> > IN datatype [SAME] Type of output data
> > IN count [SAME] Repetition count of type over buffer
> > OUT status Status information
> >
> > The communicator not only defines which nodes will participate in the
> > output but also how data will be combined. The output from each
> > node will be preceded by its global rank (i.e., its rank in
> > MPI_COMM_WORLD). However, if multiple nodes in the given communicator
> > output identical data, the data will be merged to a single line, and
> > the line will be preceded by a specification of the nodes that sent
> > that data (see example below). The buf parameter specifies the
> > output buffer in the usual way, and the datatype tells the system
> > not only where to find the data in the buffer but also what its type
> > is, much as format specifications like %d and %f do in a C printf
> > function. The count allows a datatype to be repeated multiple times
> > over the buffer, and the status will tell the caller how many items
> > were printed.
> >
> > Some examples:
> >
> > #define BUFSIZE 80
> > char string[] = "Hello world";
> > float floats[6] = { 1, 2, .03, 400, 5e5, .00000006 };
> > struct fancy { int i;
> > char string[BUFSIZE]; };
> > struct fancy mystruct;
> > MPI_Datatype fancy_type;
> > MPI_Status status;
> >
> > /* Initialize fancy_type to have the type signature
> > * { MPI_INT, MPI_CHAR, ... , MPI_CHAR }
> > */
> >
> > MPI_Comm_rank( MPI_COMM_WORLD, &mystruct.i );
> > gethostname( mystruct.string, BUFSIZE );
> >
> > /* Assume running on 4 nodes */
> > /* Nodes independently print first element in arbitrary order */
> > MPI_Print( MPI_COMM_SELF, floats, MPI_FLOAT, 1, &status );
> > /* Output:
> > 0: 1
> > 3: 1
> > 2: 1
> > 1: 1
> > */
> >
> > /* Nodes print the same array element collectively */
> > MPI_Print( MPI_COMM_WORLD, floats, MPI_FLOAT, 1, &status );
> > /* Output:
> > 0..3: 1
> > */
> >
> > /* Make one array element different on node zero */
> > if( mystruct.i == 0 ) {
> > floats[1] = 0;
> > }
> >
> > /* Nodes print the entire array collectively; they're the same
> > * on all nodes except for node 0.
> > */
> > MPI_Print( MPI_COMM_WORLD, floats, MPI_FLOAT, 6, &status );
> > /* Output:
> > 0: 1 0 0.03 400 500000 6e-08
> > 1..3: 1 2 0.03 400 500000 6e-08
> > */
> >
> > /* Printing a C-string works as you'd hope; characters after a null
> > * are ignored. How should this work in Fortran?
> > */
> > MPI_Print( MPI_COMM_WORLD, string, MPI_CHAR, sizeof(string), &status );
> > /* Output:
> > 0..3: Hello world
> > */
> >
> > /* Print a complex type that's different on every node. Lines in a
> > * collective print always appear in rank order.
> > */
> > MPI_Print( MPI_COMM_WORLD, &mystruct, fancy_type, 1, &status );
> > /* Output:
> > 0: 0 myhost
> > 1: 1 myhost
> > 2: 2 myhost
> > 3: 3 myhost
> > */
> >
> >
> > Notes
> >
> > We can argue about the exact output format; here I've chosen a
> > pretty common format for naming nodes and groups of nodes, followed by
> > a colon, a space, and then each element described by the datatype
> > separated by a space. The floats come out in the form you'd get with
> > %g in printf, and a newline is automatically added at the end of each
> > line. I'm certainly open to suggestion on this format, but I would
> > argue against making it too complex or highly configurable, since I
> > expect the main use of this function will be for quick debugging.
> >
> > Merging multiple lines is likely to be expensive, but I think it's
> > a reasonable cost in this context because the amount of data to
> > merge will be relatively small, and print statements are not usually
> > time-critical operations anyway. Merging the data saves the user
> > from the error-prone task of comparing multiple output lines by eye.
> >
> > One could also argue for letting the user send the output to stderr
> > or some other location; again, I like keeping the number of parameters
> > small given the intended use of the function.
>
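For what it's worth, the merge step in John's proposal could be sketched as below, assuming the formatted lines have already been collected in rank order on one process. The function merge_rank_lines and its buffer-based interface are hypothetical, and the sketch only collapses consecutive ranks with identical text. Grouping non-adjacent ranks is the harder general case Rajeev is worried about, and it is not attempted here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of MPI.
 * lines[r] is the already-formatted text from rank r; n is the number of
 * ranks.  Runs of consecutive ranks with identical text are collapsed
 * into a single "lo..hi: text" line; a lone rank prints as "r: text".
 * The result is written to out (always NUL-terminated) and the number
 * of lines emitted is returned.  A line that does not fit is dropped. */
int merge_rank_lines(const char *const lines[], int n,
                     char *out, size_t outsize)
{
    size_t used = 0;
    int emitted = 0;
    int start = 0;

    assert(lines != NULL && out != NULL && outsize > 0);
    out[0] = '\0';

    while (start < n) {
        /* Extend the run while the next rank printed the same text. */
        int end = start;
        while (end + 1 < n && strcmp(lines[end + 1], lines[start]) == 0)
            end++;

        int written;
        if (end == start)
            written = snprintf(out + used, outsize - used,
                               "%d: %s\n", start, lines[start]);
        else
            written = snprintf(out + used, outsize - used,
                               "%d..%d: %s\n", start, end, lines[start]);

        if (written < 0 || (size_t)written >= outsize - used) {
            out[used] = '\0';   /* discard the partial line and stop */
            break;
        }
        used += (size_t)written;
        emitted++;
        start = end + 1;
    }
    return emitted;
}
```

Fed the four lines from the "different on node 0" example, this produces one line for rank 0 and one "1..3:" line for the rest. The cost is one string comparison per rank, which fits the argument that merging is cheap when the data is small.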