How does MPI_Barrier() work?

Time: 2016-05-23 22:14:37

Tags: mpi

I have this code:

#include <cstdint>
#include <mpi.h>
#include <iostream>
using namespace std;

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)
        MPI_Barrier(MPI_COMM_WORLD);
    cout << "Some output\n";
    if (rank == 1)
        MPI_Barrier(MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);
    cout << "end\n";
    MPI_Finalize();
    return 0;
}

When I run it as

mpiexec -n 2 MPI.exe

the program works; the output is:

Some output
End
Some output
End

However, when I run it as

mpiexec -n 3 MPI.exe

the program does not work correctly. I expected it to print its output and then stop.

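1 answer:

Answer 0: (score: 3)

You need to make sure that every process makes the same number of barrier calls. In your particular case, when n = 3, ranks 0 and 1 each make two barrier calls, but rank 2 makes only one. Ranks 0 and 1 will block in their second call until rank 2 also reaches a barrier, which it never does.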

Here is what should happen for n = 3:

together:
    rank 0 will reach barrier 1 then block
    rank 1 will print "some output", reach barrier 2 then block
    rank 2 will print "some output", reach barrier 3 then block
together:
    rank 0 will print "some output", reach barrier 3 then block
    rank 1 will reach barrier 3 then block
    rank 2 will print "end" then hit finalize

Having one process in finalize while the others are still blocked is undefined behavior.

The same analysis for n = 2:

together:
    rank 0 will reach barrier 1 then block
    rank 1 will print "some output", reach barrier 2 then block
together:
    rank 0 will print "some output", reach barrier 3 then block
    rank 1 will reach barrier 3 then block
together:
    rank 0 will print "end" then hit finalize
    rank 1 will print "end" then hit finalize

For n = 2, this analysis suggests the output should be:

some output
some output
end 
end

But you got:

some output
end 
some output
end

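This has to do with how the MPI infrastructure buffers the stdout it forwards from each rank. If we introduce a delay so that MPI decides it should collect the results, we can see the behavior more clearly: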

#include <cstdint>
#include <unistd.h>   // usleep
#include <mpi.h>
#include <iostream>
using namespace std;

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        // Barrier 1: entered by rank 0 only
        cout << rank << " Barrier 1\n" << flush;
        MPI_Barrier(MPI_COMM_WORLD);
    }
    cout << rank << " Some output \n" << flush;
    usleep(1000000);  // wait 1 s so the flushed output can be collected
    if (rank == 1) {
        // Barrier 2: entered by rank 1 only
        cout << rank << " Barrier 2\n" << flush;
        MPI_Barrier(MPI_COMM_WORLD);
    }
    // Barrier 3: entered by every rank
    cout << rank << " Barrier 3\n" << flush;
    MPI_Barrier(MPI_COMM_WORLD);
    cout << rank << " end\n" << flush;
    usleep(1000000);
    MPI_Finalize();
    return 0;
}

This produces:

$ mpiexec -n 2 ./a.out 
0 Barrier 1
1 Some output 
0 Some output 
1 Barrier 2
1 Barrier 3
0 Barrier 3
0 end
1 end

$ mpiexec -n 3 ./a.out 
2 Some output 
0 Barrier 1
1 Some output 
0 Some output 
1 Barrier 2
1 Barrier 3
2 Barrier 3
2 end
0 Barrier 3
^Cmpiexec: killing job...

Alternatively, look at the timestamps printed by the following C++11 code:

#include <cstdint>
#include <chrono>
#include <mpi.h>
#include <iostream>
using namespace std;

// Current high_resolution_clock reading as a raw tick count. Named
// timestamp() rather than time() to avoid clashing with ::time from <ctime>.
inline long long timestamp() {
    return std::chrono::high_resolution_clock::now().time_since_epoch().count();
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        MPI_Barrier(MPI_COMM_WORLD);
    }
    cout << rank << " " << timestamp() << " Some output\n";
    if (rank == 1) {
        MPI_Barrier(MPI_COMM_WORLD);
    }
    MPI_Barrier(MPI_COMM_WORLD);
    cout << rank << " " << timestamp() << " end\n";
    MPI_Finalize();
    return 0;
}

Output:

$ mpiexec -n 2 ./a.out 
0 1464100768220965374 Some output
0 1464100768221002105 end
1 1464100768220902046 Some output
1 1464100768221000693 end

Sorted by timestamp:

$ mpiexec -n 2 ./a.out 
1 1464100768220902046 Some output
0 1464100768220965374 Some output
1 1464100768221000693 end
0 1464100768221002105 end

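The conclusion is that the barrier behaves as expected, and that the order of the print statements will not necessarily show you that.

Edit: 2016-05-24, to show a detailed analysis of the program's behavior.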