Segmentation fault when using MPI_Barrier in Fortran

Time: 2017-05-24 23:04:43

Tags: fortran mpi

I have seen people get segmentation faults from MPI_Barrier in C (Segmentation fault while using MPI_Barrier in `libpmpi.12.dylib`) and in C++ (Why does MPI_Barrier cause a segmentation fault in C++). However, I could not reproduce the errors they got.

Now, however, I get the same error with MPI_Barrier in Fortran. My code is simple:

program main

implicit none

include 'mpif.h'


! local variables
!
character(len=80) :: filename, input
character(len=4) :: command
integer :: ierror, i, l, cmdunit
logical :: terminate
integer :: num_procs, my_id, impi_error
real :: program_start, program_end
call MPI_INIT(impi_error)
call MPI_COMM_RANK(MPI_COMM_WORLD,my_id,impi_error)
call MPI_COMM_SIZE(MPI_COMM_WORLD,num_procs,impi_error)
call MPI_Barrier(MPI_COMM_WORLD)
program_start = MPI_Wtime()
filename='sc.cmd'
cmdunit=8
print *, my_id, cmdunit
call MPI_Barrier(MPI_COMM_WORLD)
call MPI_Barrier(MPI_COMM_WORLD)
call MPI_Barrier(MPI_COMM_WORLD)
call MPI_Barrier(MPI_COMM_WORLD)
call MPI_Barrier(MPI_COMM_WORLD)
program_end = MPI_Wtime()
if (my_id == 0) then
    write(*,'(a,F25.16,a)') "MDStressLab runs in ", program_end - program_start, " s."
endif
call MPI_FINALIZE(impi_error)
end program

There is nothing special about the code. However, when I compile it with mpif90 tmp.f90 and then run it with mpirun -n 2 ./a.out, it gives me:

           0           8
           1           8

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x7FBF2C700E08
#1  0x7FBF2C6FFF90
#0  0x7F2EDF972E08
#2  0x7FBF2C3514AF
#1  0x7F2EDF971F90
#2  0x7F2EDF5C34AF
#3  0x7FBF2CA4F808
#4  0x400EB4 in MAIN__ at tmp.f90:?
#3  0x7F2EDFCC1808
#4  0x400EB4 in MAIN__ at tmp.f90:?
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 35660 on node min-virtual-machine exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

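Interestingly, it only crashes with 2 processes. With any other process count from 1 to 10 it runs fine. Since this also seems to happen randomly in C and C++, I suspect there may be a hidden bug in the MPI library. That is only my guess. Can anyone help?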

1 answer:

Answer 0: (score: 1)

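Just replace

call MPI_Barrier(MPI_COMM_WORLD)

with

call MPI_Barrier(MPI_COMM_WORLD, impi_error)

In the mpif.h bindings every MPI subroutine takes a trailing integer error argument. Because include 'mpif.h' provides no explicit interfaces, the compiler accepts the call without it, and the library then most likely writes its return code through an argument that was never passed, which is undefined behavior and can crash for some process counts and not others.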
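For reference, a trimmed version of the program from the question with this fix applied might look like the sketch below. It keeps the question's variable names (my_id, num_procs, impi_error), drops the unused variables, and declares the timers as double precision because MPI_Wtime returns that type; the shortened timing message is an illustrative change, not part of the original.

```fortran
program main
  implicit none
  include 'mpif.h'

  integer :: num_procs, my_id, impi_error
  double precision :: program_start, program_end

  call MPI_INIT(impi_error)
  call MPI_COMM_RANK(MPI_COMM_WORLD, my_id, impi_error)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, num_procs, impi_error)

  ! With mpif.h, every MPI subroutine takes a trailing integer
  ! error argument; omitting it is not caught at compile time.
  call MPI_BARRIER(MPI_COMM_WORLD, impi_error)
  program_start = MPI_WTIME()

  print *, my_id, num_procs

  call MPI_BARRIER(MPI_COMM_WORLD, impi_error)
  program_end = MPI_WTIME()

  if (my_id == 0) then
     ! Illustrative message text (assumption, shortened from the question)
     write(*,'(a,F25.16,a)') "Elapsed: ", program_end - program_start, " s."
  end if

  call MPI_FINALIZE(impi_error)
end program main
```

It can be built and run the same way as in the question, with mpif90 tmp.f90 followed by mpirun -n 2 ./a.out.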

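Note that if your Fortran compiler and MPI library support Fortran 2008, you also have the option of replacing

include 'mpif.h'

with

use mpi_f08

and you will no longer need the impi_error argument, since the Fortran 2008 bindings make it optional.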
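As a rough sketch of that second option, assuming an MPI library built with the mpi_f08 module, the same trimmed program could be written without any error arguments. Note that the use statement has to come before implicit none, and that the shortened timing message is again only illustrative.

```fortran
program main
  use mpi_f08          ! Fortran 2008 bindings; must precede implicit none
  implicit none

  integer :: num_procs, my_id
  double precision :: program_start, program_end

  ! ierror is an optional argument in the mpi_f08 bindings,
  ! so it can simply be left out of every call.
  call MPI_Init()
  call MPI_Comm_rank(MPI_COMM_WORLD, my_id)
  call MPI_Comm_size(MPI_COMM_WORLD, num_procs)

  call MPI_Barrier(MPI_COMM_WORLD)
  program_start = MPI_Wtime()

  print *, my_id, num_procs

  call MPI_Barrier(MPI_COMM_WORLD)
  program_end = MPI_Wtime()

  if (my_id == 0) then
     ! Illustrative message text (assumption, shortened from the question)
     write(*,'(a,F25.16,a)') "Elapsed: ", program_end - program_start, " s."
  end if

  call MPI_Finalize()
end program main
```

Because the module supplies explicit interfaces, a call with the wrong number or type of arguments is rejected at compile time instead of failing at run time.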