CPU utilization does not reach 100% when using OpenMP

Time: 2021-04-27 21:22:56

Tags: parallel-processing openmp

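I am using OpenMP to parallelize a for loop in my code by adding "#pragma omp parallel for" before the loop. The code runs without producing any errors, but I noticed that core utilization is not constant throughout execution. At the start I get 100% utilization on all threads, but as execution continues this drops to ~70% on all threads. I monitored this behavior with htop (I am running on a Linux machine). At first, every thread shows a 100% green bar on the right. After a while, the green starts to shrink and some red starts to creep up, like this: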

[Image: htop screenshot]

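I looked this up and found that red is shown whenever kernel threads are being used. However, I still do not understand why it appears in the first place. This behavior occurs only when the for loop is parallelized. When I take out the #pragma statement, the code runs at 100% the whole time. My for loop has the following form: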

  double xe1, ye1, ze1, te1, e1;
  double xe2, ye2, ze2, te2, e2;
  double xi1, yi1, zi1, ti1;
  double xi2, yi2, zi2, ti2;
  bool calculate_signal = true;
  time_t loop_start, loop_end;
  int max = omp_get_max_threads();
  omp_set_dynamic(0);
  omp_set_num_threads(max);
  #pragma omp parallel for
  for (int k = 0; k < index; k++) {
    time(&loop_start);
    AvalancheMicroscopic aval;
    aval.SetSensor(&sensor);
    aval.EnableSignalCalculation(calculate_signal);
 
    aval.AvalancheElectron(electrons.at(k*4 + 0),electrons.at(k*4 + 1),electrons.at(k*4 + 2),electrons.at(k*4 + 3), 0.1, 0,0,0);
    int np = aval.GetNumberOfElectronEndpoints();
    DriftLineRKF drift;
    drift.SetSensor(&sensor);
    drift.EnableSignalCalculation(calculate_signal);
    
    for (int j = np; j--;) {
      aval.GetElectronEndpoint(j, xe1, ye1, ze1, te1, e1, 
                                  xe2, ye2, ze2, te2, e2, status);
      drift.DriftIon(xe1, ye1, ze1, te1);
    }
    time(&loop_end);
    std::cout << "Time for " << np << " electrons: " << double(loop_end - loop_start) << std::endl;
  }
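
A note on the loop above: variables declared before the parallel region (xe1, ye1, loop_start and so on) are shared between threads by default in OpenMP, so all threads read and write the same copies. The following is a minimal sketch, not the Garfield++ code itself, of declaring scratch variables inside the loop body so that each iteration gets its own. The names fake_endpoint and process_events are hypothetical stand-ins introduced only for illustration.

```cpp
#include <vector>

// Hypothetical stand-in for aval.GetElectronEndpoint(...): fills the outputs
// from the loop indices. It is not part of Garfield++.
inline void fake_endpoint(int k, int j, double& x, double& t) {
  x = 0.5 * k + j;
  t = 0.25 * j;
}

// Computes one value per event; result[k] depends only on iteration k.
inline std::vector<double> process_events(int n_events, int n_endpoints) {
  std::vector<double> result(n_events, 0.0);
  #pragma omp parallel for
  for (int k = 0; k < n_events; k++) {
    // Declared inside the loop body, so every thread works on its own copy.
    double x = 0.0, t = 0.0;
    double sum = 0.0;
    for (int j = n_endpoints; j--;) {
      fake_endpoint(k, j, x, t);
      sum += x + t;
    }
    result[k] = sum;  // distinct element per iteration, so no write conflict
  }
  return result;
}
```

Compiled without OpenMP support the pragma is ignored and the function runs serially with the same result, which makes it easy to compare the two cases.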

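This code is written using a toolkit called Garfield++ (https://garfieldpp.web.cern.ch/garfieldpp/; https://gitlab.cern.ch/garfield/garfieldpp/-/tree/master/). Here, electrons is a 1D vector holding the coordinates. The index variable can range from 1 to 7000, depending on the user. I have run with index = 1000 and index = 200 and got the same behavior. In addition, np can range from 1 to ~2000 and is determined by the outer loop, so it is not under the user's control.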
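
Regarding the red portion of the htop bars described earlier: in htop's default CPU meter, red is time spent in the kernel, while green is normal user-space time. One way to check the same split from inside the program is the POSIX getrusage call, which reports user and system CPU time for the whole process. This is only a sketch; CpuTimes, cpu_times and system_fraction are hypothetical helper names.

```cpp
#include <sys/resource.h>  // getrusage, RUSAGE_SELF (POSIX)
#include <sys/time.h>      // struct timeval

// CPU time consumed so far by this process (all threads), in seconds.
struct CpuTimes {
  double user;    // time spent executing user-space code
  double system;  // time spent in the kernel on behalf of the process
};

// Hypothetical helper: reads the process-wide counters via getrusage().
inline CpuTimes cpu_times() {
  rusage ru{};
  getrusage(RUSAGE_SELF, &ru);
  auto seconds = [](const timeval& tv) {
    return static_cast<double>(tv.tv_sec) +
           1e-6 * static_cast<double>(tv.tv_usec);
  };
  return {seconds(ru.ru_utime), seconds(ru.ru_stime)};
}

// Fraction of the CPU time between two samples that was spent in the kernel.
inline double system_fraction(const CpuTimes& before, const CpuTimes& after) {
  const double user = after.user - before.user;
  const double sys = after.system - before.system;
  const double total = user + sys;
  return total > 0.0 ? sys / total : 0.0;
}
```

Sampling cpu_times() before and after the parallel loop and passing both samples to system_fraction gives a rough number to compare against what htop shows.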

I tried schedule(dynamic), but that did not seem to help. I would greatly appreciate any insight into what causes this behavior and how to fix it. Thanks.
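
For reference, the schedule clause mentioned above goes on the same pragma as the parallel for. The sketch below uses a deliberately uneven, made-up workload (busy_work and run_dynamic are hypothetical names) and a chunk size of 1, so a thread that finishes early picks up the next unclaimed iteration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical uneven workload: performs `units` additions and returns the sum
// 1 + 2 + ... + units.
inline std::uint64_t busy_work(int units) {
  std::uint64_t acc = 0;
  for (int i = 1; i <= units; i++) acc += static_cast<std::uint64_t>(i);
  return acc;
}

// Runs one iteration per entry of `cost`, writing each result to its own slot.
inline std::vector<std::uint64_t> run_dynamic(const std::vector<int>& cost) {
  std::vector<std::uint64_t> out(cost.size());
  const int n = static_cast<int>(cost.size());
  // dynamic, 1: an idle thread takes the next unclaimed iteration.
  #pragma omp parallel for schedule(dynamic, 1)
  for (int k = 0; k < n; k++) {
    out[k] = busy_work(cost[k]);
  }
  return out;
}
```

Dynamic scheduling only addresses uneven iteration cost; it does not change how much time each iteration spends in the kernel.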

Edit 1: Edited the code to provide more detail.

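Edit 2: I ran the code again to find the average time each loop takes. As expected, the answer is not very straightforward (at least not to me). The code was run twice: once without parallelization (the #pragma statement commented out) and once with it. Both runs used the same electrons vector. Below, I print the time each loop took, in seconds. The number of electrons is to the left of the colon and the time taken to complete the loop is to the right. I measured the time with time_t, and I edited the code above to show what I did. One more thing to note: the first case (no parallelization) took 6065 seconds in total, while the second case took 2183 seconds.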
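
Since time() counts whole seconds, per-iteration readings are coarse. The following sketch shows one way to get sub-second wall-clock timings with std::chrono::steady_clock, together with the usual speedup and efficiency ratios for comparing the two total runtimes. The helper names elapsed_seconds, speedup and efficiency are hypothetical.

```cpp
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in seconds.
template <typename F>
double elapsed_seconds(F&& work) {
  const auto start = std::chrono::steady_clock::now();
  work();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}

// Speedup of a parallel run relative to the serial run.
inline double speedup(double serial_seconds, double parallel_seconds) {
  return serial_seconds / parallel_seconds;
}

// Parallel efficiency: speedup divided by the number of threads used.
inline double efficiency(double serial_seconds, double parallel_seconds,
                         int threads) {
  return speedup(serial_seconds, parallel_seconds) / threads;
}
```

With the totals reported above, speedup(6065, 2183) is about 2.78, and efficiency(6065, 2183, 10) is about 0.28 for the 10-core run.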

Using 1 core                        Using 10 cores
Time for 1 electrons: 7             Time for 1 electrons: 11
Time for 1565 electrons: 91         Time for 1 electrons: 0
Time for 1 electrons: 8             Time for 1 electrons: 10
Time for 1 electrons: 7             Time for 92 electrons: 12
Time for 3229 electrons: 162        Time for 1 electrons: 12
Time for 1 electrons: 7             Time for 1 electrons: 12
Time for 1 electrons: 7             Time for 397 electrons: 21
Time for 1 electrons: 6             Time for 563 electrons: 41
Time for 1 electrons: 8             Time for 902 electrons: 79
Time for 1 electrons: 7             Time for 572 electrons: 18
Time for 1 electrons: 7             Time for 1 electrons: 14
Time for 2028 electrons: 103        Time for 1207 electrons: 80
Time for 898 electrons: 57          Time for 391 electrons: 4
Time for 1 electrons: 7             Time for 1392 electrons: 1
Time for 1 electrons: 8             Time for 297 electrons: 9
Time for 1 electrons: 7             Time for 1439 electrons: 39
Time for 1 electrons: 8             Time for 1 electrons: 14
Time for 1739 electrons: 91         Time for 1 electrons: 13
Time for 1977 electrons: 99         Time for 1 electrons: 14
Time for 3053 electrons: 152        Time for 393 electrons: 21
Time for 166 electrons: 20          Time for 543 electrons: 31
Time for 1 electrons: 8             Time for 1767 electrons: 38
Time for 1 electrons: 7             Time for 744 electrons: 5
Time for 573 electrons: 39          Time for 436 electrons: 44
Time for 31 electrons: 16           Time for 2444 electrons: 19
Time for 774 electrons: 51          Time for 2469 electrons: 4
Time for 680 electrons: 44          Time for 368 electrons: 30
Time for 514 electrons: 36          Time for 895 electrons: 29
Time for 1281 electrons: 69         Time for 2223 electrons: 6
Time for 2847 electrons: 148        Time for 1 electrons: 8
Time for 1004 electrons: 57         Time for 1394 electrons: 9
Time for 1422 electrons: 76         Time for 1 electrons: 4
Time for 762 electrons: 48          Time for 3283 electrons: 9
Time for 1817 electrons: 96         Time for 1 electrons: 3
Time for 2133 electrons: 110        Time for 1 electrons: 11
Time for 895 electrons: 52          Time for 1010 electrons: 5
Time for 939 electrons: 55          Time for 1 electrons: 12
Time for 743 electrons: 44          Time for 484 electrons: 5
Time for 1121 electrons: 63         Time for 112 electrons: 19
Time for 2482 electrons: 126        Time for 733 electrons: 0
Time for 1449 electrons: 79         Time for 1 electrons: 14
Time for 2037 electrons: 106        Time for 143 electrons: 9
Time for 4227 electrons: 204        Time for 1 electrons: 11
Time for 611 electrons: 39          Time for 394 electrons: 36
Time for 738 electrons: 44          Time for 1489 electrons: 3
Time for 1416 electrons: 75         Time for 540 electrons: 66
Time for 519 electrons: 39          Time for 1 electrons: 13
Time for 676 electrons: 45          Time for 1835 electrons: 27
Time for 497 electrons: 36          Time for 837 electrons: 5
Time for 303 electrons: 27          Time for 559 electrons: 18
Time for 2695 electrons: 144        Time for 828 electrons: 8
Time for 120 electrons: 18          Time for 811 electrons: 28
Time for 142 electrons: 19          Time for 136 electrons: 49
Time for 809 electrons: 48          Time for 1881 electrons: 9
Time for 27 electrons: 14           Time for 1 electrons: 12
Time for 343 electrons: 26          Time for 816 electrons: 13
Time for 168 electrons: 19          Time for 651 electrons: 2
Time for 78 electrons: 15           Time for 1 electrons: 11
Time for 883 electrons: 50          Time for 878 electrons: 18
Time for 563 electrons: 36          Time for 361 electrons: 14
Time for 418 electrons: 30          Time for 2581 electrons: 26
Time for 808 electrons: 46          Time for 155 electrons: 10
Time for 1977 electrons: 96         Time for 518 electrons: 50
Time for 610 electrons: 38          Time for 1241 electrons: 8
Time for 326 electrons: 26          Time for 722 electrons: 4
Time for 84 electrons: 16           Time for 223 electrons: 6
Time for 3116 electrons: 145        Time for 362 electrons: 10
Time for 979 electrons: 53          Time for 865 electrons: 15
Time for 259 electrons: 24          Time for 1 electrons: 10
Time for 761 electrons: 44          Time for 791 electrons: 3
Time for 2690 electrons: 128        Time for 1 electrons: 11
Time for 795 electrons: 48          Time for 1 electrons: 10
Time for 1 electrons: 8             Time for 826 electrons: 83
Time for 1762 electrons: 80         Time for 2428 electrons: 2
Time for 2410 electrons: 124        Time for 1159 electrons: 17
Time for 454 electrons: 33          Time for 149 electrons: 10
Time for 1 electrons: 7             Time for 1355 electrons: 4
Time for 2912 electrons: 146        Time for 457 electrons: 25
Time for 1624 electrons: 85         Time for 523 electrons: 26
Time for 857 electrons: 54          Time for 561 electrons: 16
Time for 4517 electrons: 221        Time for 235 electrons: 43
Time for 2083 electrons: 113        Time for 708 electrons: 4
Time for 1268 electrons: 75         Time for 1388 electrons: 51
Time for 7548 electrons: 358        Time for 369 electrons: 3
Time for 557 electrons: 40          Time for 403 electrons: 7
Time for 1 electrons: 7             Time for 1234 electrons: 13
Time for 1 electrons: 7             Time for 3203 electrons: 13
Time for 1 electrons: 8             Time for 262 electrons: 27
Time for 1167 electrons: 65         Time for 444 electrons: 43
Time for 1410 electrons: 77         Time for 686 electrons: 7
Time for 1 electrons: 7             Time for 7654 electrons: 26
Time for 4490 electrons: 412        Time for 731 electrons: 23
Time for 3093 electrons: 146        Time for 1336 electrons: 77
Time for 1170 electrons: 67         Time for 2117 electrons: 111
Time for 1 electrons: 7             Time for 1750 electrons: 96
Time for 1 electrons: 8             Time for 1089 electrons: 58
Time for 1 electrons: 7             Time for 3823 electrons: 185
Time for 1 electrons: 7             Time for 996 electrons: 54
Time for 713 electrons: 45          Time for 1 electrons: 7
Time for 973 electrons: 57          Time for 1 electrons: 8

0 answers:

No answers