Also, MPI_Abort currently does not return the error code to the run
environment. This enhancement was requested by Walter Spector and Ken
Taylor for FNMOC; see PVs 799785 and 802953. FNMOC has some operational
jobs that need to query the exit status and then take appropriate action.
This feature will be implemented for MPI on both IRIX and Linux.
if (unexpdeath){
MPI_SGI_printf("MPI: MPI_COMM_WORLD rank %d has terminated without calling MPI_Finalize()\n",
               mpi_sgi_base_grank[mpi_sgi_my_hrank]+unexpdeath-1);
label = MPI_CMD_EXITSTAT;
MPI_SGI_ctrl_send(&label, sizeof(int));
MPI_SGI_ctrl_send(&mpi_sgi_exit_stat, sizeof(int));
The MPI daemon will then terminate itself and all its child processes.
The mpirun process will get the exit status and execute the following function:
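static int
get_exitstat(xmpi_arg_t *arg, void *dummy)
{
    int len;

    /* Read the wait()-style status word sent by the MPI daemon. */
    len = sizeof(int);
    xmpi_net_recv(arg->recv_fd, &MPI_exitstat, len);

    if (WIFEXITED(MPI_exitstat)) {
        /* Child exited normally: propagate its exit code. */
        int estat = WEXITSTATUS(MPI_exitstat);
        exit(estat);
    } else if (WIFSIGNALED(MPI_exitstat)) {
        /* Child was killed by a signal: report it and exit with 1. */
        int sig_num = WTERMSIG(MPI_exitstat);
        xmpi_all_error(0,"Received signal %1d\n", sig_num);
        exit(1);
    }
    return 0;   /* only reached if the status is neither exited nor signaled */
}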
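The two branches of get_exitstat() can be tried out on an ordinary child process. The following is a minimal sketch, not the mpirun code: the helper names status_to_exit_code() and run_child() are hypothetical, the status word comes from waitpid() rather than from the MPI daemon, and returning 0 for a status that is neither "exited" nor "signaled" is an assumption of this sketch.

```c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: map a wait()-style status word to the exit code
 * a launcher would return, following the branches of get_exitstat(). */
int status_to_exit_code(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);          /* normal exit: pass code through */
    if (WIFSIGNALED(status)) {
        fprintf(stderr, "Received signal %d\n", WTERMSIG(status));
        return 1;                            /* killed by a signal */
    }
    return 0;                                /* assumption: any other status */
}

/* Hypothetical helper: fork a child that raises `sig` (if non-zero) or
 * exits with `code`, then return the raw status from waitpid(). */
int run_child(int code, int sig)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork");
        exit(EXIT_FAILURE);
    }
    if (pid == 0) {
        if (sig != 0)
            raise(sig);                      /* default action terminates child */
        _exit(code);
    }
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        exit(EXIT_FAILURE);
    }
    return status;
}
```

For example, status_to_exit_code(run_child(42, 0)) yields 42, while a child that raises SIGABRT is reported as a signal and mapped to 1.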
A message will be displayed if a child process was terminated by receipt
of a signal:
$ mpirun -np 2 ./mtest1
groupsize= 2
lib-4051 : UNRECOVERABLE library error
The file must not exist prior to OPEN if STATUS is 'NEW'.
Encountered during an OPEN of unit 1
Fortran unit 1 is not connected
IOT Trap
MPI: MPI_COMM_WORLD rank 0 has terminated without calling MPI_Finalize()
MPI: Received signal 6
If a child process terminated via MPI_Abort, the exit status will be
returned to the calling environment. Any comm argument to MPI_Abort will
be treated as if the comm were MPI_COMM_WORLD. This is standard-compliant
(see page 197 of the MPI standard).
$ cat mpibug.f
      program mpibug
      use mpi
      implicit none
      integer:: ierr
      call mpi_init (ierr)
      call mpi_abort (MPI_COMM_WORLD, 42, ierr)
      end
$ mpirun -np 2 ./mpibug
MPI: MPI_COMM_WORLD rank 1 has terminated without calling MPI_Finalize()
$ echo $?
42
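An operational job written in C could query the same exit status and choose an action from it. The sketch below is only an illustration under stated assumptions: run_and_get_exit(), action_for(), and the job_action values are hypothetical names, and the policy (0 means continue, 42 means the abort code used by mpibug, anything else is a failure) is invented for the example.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical actions a job might take after the run. */
enum job_action { JOB_CONTINUE, JOB_HANDLE_ABORT, JOB_FAIL };

/* Hypothetical helper: run `cmd` through the shell and return its exit
 * code, or -1 if it could not be run or did not exit normally. */
int run_and_get_exit(const char *cmd)
{
    int status = system(cmd);   /* returns a wait()-style status word */

    if (status == -1)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}

/* Hypothetical policy: map an exit code to an action. */
enum job_action action_for(int code)
{
    if (code == 0)
        return JOB_CONTINUE;
    if (code == 42)             /* the code passed to mpi_abort in mpibug */
        return JOB_HANDLE_ABORT;
    return JOB_FAIL;
}
```

A job would call run_and_get_exit("mpirun -np 2 ./mpibug") and branch on action_for() of the result, much as a shell script would branch on $?.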