MPI_Scatter invalid pointer error: maybe something wrong with MPI_Type_vector

Date: 2018-06-08 15:58:08

Tags: c mpi

I am writing a program that scatters an 84 * 84 matrix across 4 processes, so that each process receives an 84 * 21 column block. Scaled down to an 8 * 8 example (each number marks the rank that owns that column), the distribution looks like this:

1 1 2 2 3 3 4 4
1 1 2 2 3 3 4 4
1 1 2 2 3 3 4 4
1 1 2 2 3 3 4 4
1 1 2 2 3 3 4 4
1 1 2 2 3 3 4 4
1 1 2 2 3 3 4 4
1 1 2 2 3 3 4 4

To scatter the data in this strided pattern, I need to create a derived datatype that describes one column block. I build it with MPI_Type_vector and MPI_Type_create_resized. However, an invalid pointer error is raised inside MPI_Scatter, and I don't think I am doing any illegal memory operations. Is there some pitfall I am missing?
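
To sanity-check the constructed type, its extents can be printed after MPI_Type_commit (this check is not in my program below; MPI_Type_get_extent and MPI_Type_get_true_extent are standard MPI calls). For n = 84 and m = 21 the extent should be m*sizeof(double) = 168 bytes, while the true extent still spans ((n-1)*n + m)*sizeof(double) bytes of the source matrix, since resizing changes only the extent:

MPI_Aint lb, extent, true_lb, true_extent;
MPI_Type_get_extent(vect_mpi_t, &lb, &extent);
MPI_Type_get_true_extent(vect_mpi_t, &true_lb, &true_extent);
printf("lb=%ld extent=%ld true_extent=%ld\n",
       (long)lb, (long)extent, (long)true_extent);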

Incidentally, the problem only occurs when n is greater than 80.

Here is the code.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>     /* for memset, which was previously called without a declaration */
#include <sys/time.h>
/*
 * compile command:
 * mpiicc -g -c matvectmul_col_mul_version.c -o matvectmul_col_mul_version.o
 * mpiicc  matvectmul_col_mul_version.o -o matvectmul_col_mul_version
 *
 * run command:
 *      srun -n 4 -l ./matvectmul_col_mul_version 84
 */
int main(int argc, char* argv[]) {
        int n = atoi(argv[1]);
        double *loc_matrix = NULL;
        int my_rank, comm_sz;
        MPI_Init(NULL, NULL);
        MPI_Comm comm = MPI_COMM_WORLD;
        MPI_Comm_size(comm, &comm_sz);
        MPI_Comm_rank(comm, &my_rank);
        int m = n / comm_sz;    /* width of each rank's column block */
        MPI_Datatype vect_mpi_t;

        /* each rank receives one n x m column block */
        loc_matrix = (double*)malloc(n*m*sizeof(double));
        double *matrix = NULL;
        if (my_rank == 0) {
                matrix = (double*)malloc(n*n*sizeof(double));
                memset(matrix,0,n*n*sizeof(double));
        }
        /* n blocks of m doubles with stride n: one n x m column block
         * of the row-major n x n matrix */
        MPI_Datatype mpi_tmp_t;
        MPI_Type_vector(n,m,n,MPI_DOUBLE,&mpi_tmp_t);
        /* shrink the extent to m doubles so consecutive column blocks
         * start m*sizeof(double) bytes apart in the scatter */
        MPI_Type_create_resized(mpi_tmp_t,0,m*sizeof(double),&vect_mpi_t);
        MPI_Type_commit(&vect_mpi_t);

        /* root sends one resized vector per rank; each rank receives n*m doubles */
        MPI_Scatter(matrix,1,vect_mpi_t,loc_matrix,n*m,MPI_DOUBLE,0,MPI_COMM_WORLD);
        //MPI_Scatter(matrix,1,vect_mpi_t,loc_matrix,1,vect_mpi_t,0,MPI_COMM_WORLD); also does not work

        if (my_rank == 0) {
                free(matrix);
        }
        MPI_Finalize();
        return 0;
}
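
For reference, the same column-block distribution can also be written with MPI_Scatterv, which makes the per-rank displacements (counted in units of the resized extent) explicit and can be easier to step through when debugging. This is only a sketch, not code from my program: it assumes vect_mpi_t has been built and committed exactly as above, and the helper name scatter_col_blocks is made up for illustration.

/* scatter one n x m column block to each of comm_sz ranks */
void scatter_col_blocks(double *matrix, double *loc_matrix,
                        int n, int m, int comm_sz,
                        MPI_Datatype vect_mpi_t, MPI_Comm comm)
{
        int *counts = (int*)malloc(comm_sz*sizeof(int));
        int *displs = (int*)malloc(comm_sz*sizeof(int));
        for (int i = 0; i < comm_sz; i++) {
                counts[i] = 1;  /* one resized vector per rank */
                displs[i] = i;  /* in units of the extent, i.e. m doubles */
        }
        MPI_Scatterv(matrix, counts, displs, vect_mpi_t,
                     loc_matrix, n*m, MPI_DOUBLE, 0, comm);
        free(counts);
        free(displs);
}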

Here is the error message:

srun: error: cn003: task 0: Aborted (core dumped)
0: *** Error in `/home/2016011275/workspace/HW1/task2/./matvectmul_col_mul_version': free(): invalid pointer: 0x0000000001917840 ***
0: ======= Backtrace: =========
0: /lib64/libc.so.6(+0x7c503)[0x7fce91e51503]
0: /apps/rm/intel/compilers_and_libraries_2018.0.128/linux/mpi/intel64/lib/libmpi.so.12(+0x2f08ab)[0x7fce92fc68ab]
0: /apps/rm/intel/compilers_and_libraries_2018.0.128/linux/mpi/intel64/lib/libmpi.so.12(+0x4d97c6)[0x7fce931af7c6]
0: /apps/rm/intel/compilers_and_libraries_2018.0.128/linux/mpi/intel64/lib/libmpi.so.12(+0x4de03f)[0x7fce931b403f]
0: /apps/rm/intel/compilers_and_libraries_2018.0.128/linux/mpi/intel64/lib/libmpi.so.12(PMPI_Scatter+0x360)[0x7fce931b2c00]
0: /home/2016011275/workspace/HW1/task2/./matvectmul_col_mul_version[0x400e63]
0: /lib64/libc.so.6(__libc_start_main+0xf5)[0x7fce91df6b35]
0: /home/2016011275/workspace/HW1/task2/./matvectmul_col_mul_version[0x400ae9]
0: ======= Memory map: ========
0: 00400000-00402000 r-xp 00000000 00:29 11118498385                        /home/2016011275/workspace/HW1/task2/matvectmul_col_mul_version
0: 00601000-00602000 r--p 00001000 00:29 11118498385                        /home/2016011275/workspace/HW1/task2/matvectmul_col_mul_version
0: 00602000-00603000 rw-p 00002000 00:29 11118498385                        /home/2016011275/workspace/HW1/task2/matvectmul_col_mul_version

0 Answers:

No answers