Sequential inner loop inside a parallel outer loop - OMP

Date: 2017-02-18 16:27:34

Tags: c parallel-processing openmp

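I have to parallelize the first (outer) for loop with OMP, but inside it there is a for loop that cannot be parallelized because of a data dependency. I tried to parallelize the outer loop, but I ran into problems with the pointers.

Minimal example of the problem: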

#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <omp.h>

int main()
{

 int N = 5;
 int size = 6;
 int n, j, i;

 double t[] = {1,2,3,4,5,6};


 double z, h2M, R2M, dz;
 int *dynamic_d;
 int *dynamic_A;
 int *dynamic_B;
 int *output;

 dynamic_d = (int *) calloc (N+1, sizeof(int));

 for(i = 0; i < N+1; i++){
    *(dynamic_d + i) = i;
 }

 dynamic_A = (int*) calloc (N+2, sizeof(int));
 dynamic_B = (int*) calloc (N+2, sizeof(int));
 output = (int*) calloc (size, sizeof(int));


 for (j = 0; j < size; j++) {  
    z = t[j] + 1;
    *dynamic_A = 0;
    *dynamic_B = 1;                   

    *(dynamic_A + 1) = *dynamic_d;
    *(dynamic_B + 1) = 1;

    for (n = 2; n <= N+1; n++) {
          dz = *(dynamic_d + n-1)*z;
          *(dynamic_A + n) = *(dynamic_A + n-1) + dz + (*(dynamic_A + n-2));
          *(dynamic_B + n) = *(dynamic_B + n-1) + dz + (*(dynamic_B + n-2));
    }

    h2M = z + *(dynamic_d + N-1) - *(dynamic_d + N);
    R2M = -h2M + z + *(dynamic_d + N);

    *(dynamic_A + N+1) = *(dynamic_A + N) + R2M + *(dynamic_A + N-1);
    *(dynamic_B + N+1) = *(dynamic_B + N) + R2M + *(dynamic_B + N-1);

    *(output + j) = t[j] + *(dynamic_A + N+1) + *(dynamic_B + N+1);
 }

 printf("\n\noutput:\n");
 for (j = 0; j < size; j++){
    printf("| %d ", output[j]);
 }
 printf("\n");

 free(dynamic_d);
 free(dynamic_A);
 free(dynamic_B);
 free(output);

 return 0;
}

1 answer:

Answer 0 (score: 0)

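The only data dependencies involve the two arrays dynamic_A and dynamic_B, because they are the only arrays that are both written and read inside the loop. dynamic_d is only read and output is only written (so neither causes a problem).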

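However, if you look closely at the dependencies on dynamic_A and dynamic_B, you can see that they are not loop-carried with respect to the outer loop: any value dynamic_A[i] computed in iteration j is used only within that same iteration. The whole array is overwritten in the next iteration of the outermost loop.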
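One way to convince yourself of this is the rough sketch below. It moves the body of one outer iteration into a helper and then checks that the results do not change when the iterations run in reverse order on scratch arrays filled with garbage. The names compute_output and order_independent are made up for this illustration and do not appear in the question; the arithmetic is copied from the question's loop body, with the int conversions written out explicitly.

```c
#include <stdlib.h>

/* Hypothetical helper: the body of one outer iteration from the question.
   A and B are scratch arrays of N+2 ints, d holds N+1 ints. */
int compute_output(double tj, const int *d, int N, int *A, int *B)
{
    double z = tj + 1;
    int n;

    /* The first two entries are always written before they are read. */
    A[0] = 0;     B[0] = 1;
    A[1] = d[0];  B[1] = 1;

    /* Sequential recurrence: entry n needs entries n-1 and n-2. */
    for (n = 2; n <= N + 1; n++) {
        double dz = d[n - 1] * z;
        A[n] = (int)(A[n - 1] + dz + A[n - 2]);
        B[n] = (int)(B[n - 1] + dz + B[n - 2]);
    }

    double h2M = z + d[N - 1] - d[N];
    double R2M = -h2M + z + d[N];

    A[N + 1] = (int)(A[N] + R2M + A[N - 1]);
    B[N + 1] = (int)(B[N] + R2M + B[N - 1]);

    return (int)(tj + A[N + 1] + B[N + 1]);
}

/* Returns 1 if visiting j backwards, with poisoned scratch arrays,
   gives the same results as the ordinary forward loop. */
int order_independent(const double *t, int size, const int *d, int N)
{
    int *A = malloc((N + 2) * sizeof *A);
    int *B = malloc((N + 2) * sizeof *B);
    int *fwd = malloc(size * sizeof *fwd);
    int ok = (A && B && fwd);
    int j, n;

    if (ok) {
        for (j = 0; j < size; j++)
            fwd[j] = compute_output(t[j], d, N, A, B);

        for (j = size - 1; j >= 0; j--) {
            for (n = 0; n < N + 2; n++)   /* poison the scratch space */
                A[n] = B[n] = -12345;
            if (compute_output(t[j], d, N, A, B) != fwd[j])
                ok = 0;
        }
    }

    free(A);
    free(B);
    free(fwd);
    return ok;
}
```

If order_independent returns 1 for your inputs, nothing an iteration leaves behind in the scratch arrays is read by a later iteration. The arrays are just per-iteration workspace, and that is what makes it safe to give each thread its own copy.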

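You need to rewrite the code so that each thread has its own private copy of dynamic_A and dynamic_B. The scalar temporaries written inside the loop must be private as well; that includes dz and the inner loop counter n, which are declared at function scope and would otherwise be shared between threads. For example:

#pragma omp parallel private(dynamic_A, dynamic_B, z, h2M, R2M, dz, n)
{
  /* each thread allocates its own scratch arrays */
  dynamic_A = (int*) calloc (N+2, sizeof(int));
  dynamic_B = (int*) calloc (N+2, sizeof(int));

  /* j is the loop variable of the omp for, so it is private automatically */
  #pragma omp for
  for (j = 0; j < size; j++) {
    ...
  }

  free(dynamic_A);
  free(dynamic_B);
}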
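For reference, here is a fuller sketch of the same idea with the loop body written out. It assumes the computation is wrapped in a function, here called fill_output_parallel (a hypothetical name, not from the question), and it declares the temporaries inside the parallel region instead of listing them in a private clause, which makes them private to each thread automatically. The allocation check and the failed flag are additions for the sketch.

```c
#include <stdlib.h>

/* Hypothetical wrapper around the question's outer loop.
   t has `size` entries, d has N+1 entries, output has `size` entries.
   Returns 0 on success, -1 if a thread could not allocate scratch space. */
int fill_output_parallel(const double *t, int size, const int *d, int N,
                         int *output)
{
    int failed = 0;   /* shared flag, only ever set to 1 */

    #pragma omp parallel
    {
        /* Declared inside the region: one private pair per thread. */
        int *A = calloc(N + 2, sizeof *A);
        int *B = calloc(N + 2, sizeof *B);

        #pragma omp for
        for (int j = 0; j < size; j++) {
            if (!A || !B) {
                #pragma omp atomic write
                failed = 1;
                continue;
            }

            /* Block-scope temporaries are private to the thread. */
            double z = t[j] + 1;

            A[0] = 0;     B[0] = 1;
            A[1] = d[0];  B[1] = 1;

            /* The inner loop stays sequential inside each thread. */
            for (int n = 2; n <= N + 1; n++) {
                double dz = d[n - 1] * z;
                A[n] = (int)(A[n - 1] + dz + A[n - 2]);
                B[n] = (int)(B[n - 1] + dz + B[n - 2]);
            }

            double h2M = z + d[N - 1] - d[N];
            double R2M = -h2M + z + d[N];

            A[N + 1] = (int)(A[N] + R2M + A[N - 1]);
            B[N + 1] = (int)(B[N] + R2M + B[N - 1]);

            /* Each j writes a distinct element, so there is no conflict. */
            output[j] = (int)(t[j] + A[N + 1] + B[N + 1]);
        }

        free(A);
        free(B);
    }

    return failed ? -1 : 0;
}
```

With GCC this needs -fopenmp to actually run in parallel; without the flag the pragmas are ignored and the function runs serially with the same results. With the question's inputs (N = 5, t = {1,2,3,4,5,6}, d[i] = i) the output array should hold 110, 153, 196, 239, 282, 325.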