MPI_Comm_rank always writes 0

Time: 2012-11-14 14:36:18

Tags: c openmpi mpich

How can I get the expected output

rank 0
size 2
rank 1
size 2

or some permutation of those lines?

ranktest.c

#include <mpi.h>
#include <stdio.h>
int main(int argc, char *argv[]){
    MPI_Init(NULL, NULL);
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    printf("rank %d\n", world_rank);
    printf("size %d\n", world_size);
    MPI_Finalize();
    return 0;
}

Compile and run:

tsbertalan@hustlenbustle:~$ mpicc ranktest.c
tsbertalan@hustlenbustle:~$ mpirun -np 2 ./a.out 
rank 0
size 1
rank 0
size 1

On another host:

tsbertalan@stamp:~$ mpicc ranktest.c
tsbertalan@stamp:~$ mpirun -np 2 ./a.out 
rank 0
size 2
rank 1
size 2

I tried

tsbertalan@hustlenbustle:~$ sudo aptitude reinstall openmpi-bin libopenmpi-dev

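but nothing changed. /etc/openmpi/openmpi-default-hostfile and /etc/openmpi/openmpi-mca-params.conf contain only comments on both hosts. What could be different between the two machines?

Changing to MPI_Init(&argc, &argv), or changing to int main(), had no effect either.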

The real problem, thanks to user3469194:

linuxmint@linuxmint ~ $ sudo aptitude remove libopenmpi-dev mpich2
linuxmint@linuxmint ~ $ sudo aptitude install libmpich2-dev openmpi-bin
linuxmint@linuxmint ~ $ mpicc ranktest.c
linuxmint@linuxmint ~ $ mpirun -np 2 ./a.out
rank 0
size 1
rank 0
size 1
linuxmint@linuxmint ~ $ sudo aptitude remove libmpich2-dev openmpi-bin
linuxmint@linuxmint ~ $ sudo aptitude install libopenmpi-dev mpich2
linuxmint@linuxmint ~ $ mpicc ranktest.c
linuxmint@linuxmint ~ $ mpirun -np 2 ./a.out
[linuxmint:16539] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required executable either could not be found or was not executable by this user in file ../../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line 357

(plus more)

In response to some suggestions (see this github repo, the commits from December 1, 2012):

  

Try moving the definitions of world_rank and world_size before MPI_Init(). Does that change anything?

Naturally, this does not work well:

tsbertalan@perrin:~/svn/524hw4$ git checkout 7b5e229 ranktest.c
(reverse-i-search)`clean': tsbertalan@hustlenbustle:~/Documents/School/12fall2012/524apc524/hw/hw4$ make ^Cean && make ranktest && mpirun -np 2 ranktest
tsbertalan@perrin:~/svn/524hw4$ make clean && make ranktest && mpirun -np 2 ranktest
rm -f heat_serial heat_omp heat_mpi heat_serial_O* ranktest
mpicc ranktest.c -o ranktest
*** An error occurred in MPI_Comm_rank
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[perrin:15206] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Comm_rank
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[perrin:15207] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
tsbertalan@perrin:~/svn/524hw4$ git checkout HEAD ranktest.c

Or, on my home computer:

tsbertalan@hustlenbustle:~/Documents/School/12fall2012/524apc524/hw/hw4$ git checkout 7b5e229 ranktest.c
tsbertalan@hustlenbustle:~/Documents/School/12fall2012/524apc524/hw/hw4$ vim ranktest.c 
tsbertalan@hustlenbustle:~/Documents/School/12fall2012/524apc524/hw/hw4$ make clean && make ranktest && mpirun -np 2 ranktest
rm -f heat_serial heat_omp heat_mpi heat_serial_O* ranktest
mpicc ranktest.c -o ranktest
Attempting to use an MPI routine before initializing MPICH
Attempting to use an MPI routine before initializing MPICH
tsbertalan@hustlenbustle:~/Documents/School/12fall2012/524apc524/hw/hw4$ git checkout HEAD ranktest.c
  

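This is almost always a problem of running a program compiled with one MPI implementation under the mpirun of another. Does the first machine (hustlenbustle) also have mpich2 installed? Where do things show up on the path? In particular, what are the results of which mpicc and which mpirun?

I recompile on each machine before every attempt. I went ahead and made a make target. However, as requested: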

tsbertalan@hustlenbustle:~$ which mpicc
/usr/bin/mpicc
tsbertalan@hustlenbustle:~$ which mpirun
/usr/bin/mpirun

tsbertalan@perrin:~/svn/524hw4$ which mpicc
/usr/bin/mpicc
tsbertalan@perrin:~/svn/524hw4$ which mpirun
/usr/bin/mpirun

And, for shiggles, here is the output of some aptitude searches on hnb and perrin. Let me know if I should search for something else.

  

Under Open MPI, the following command should print the version: mpirun -V. If it does not print mpiexec (OpenRTE) 1.x.x., you probably have a runtime mismatch.

tsbertalan@hustlenbustle:~$ mpirun -V
mpirun (Open MPI) 1.4.3

tsbertalan@perrin:~/svn/524hw4$ mpirun -V
mpirun (Open MPI) 1.4.1

However, I am recompiling for every test.

Maybe sudo aptitude reinstall SOMETHING would help?

2 answers:

Answer 0: (score: 1)

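I found this code in the version of mpic.c on my computer (it caused me trouble when I installed a new package that came bundled with MPI). It looks like you have something like this on your computer, while the other host has the correct version.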

int MPI_Comm_rank( MPI_Comm comm, int *rank)
{
  *rank=0;
  return 0;
}

As you can see, rank is always set to 0 (and the analogous size function probably sets its variable to 1).

Answer 1: (score: 1)

I had this problem too. The problem was that mpicc was OpenMPI (to see this, just run mpicc -v), while mpirun was MPICH2 (mpirun -V). I solved it simply by uninstalling MPICH2.