Question

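I have N threads executing various tasks, and these threads must periodically synchronize at a thread barrier, as illustrated below with 3 threads and 8 tasks. The || marks the barrier in time: all threads must wait until all 8 tasks have completed before starting again.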

Thread#1  |----task1--|---task6---|---wait-----||-taskB--|          ...
Thread#2  |--task2--|---task5--|-------taskE---||----taskA--|       ...
Thread#3  |-task3-|---task4--|-taskG--|--wait--||-taskC-|---taskD   ...

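I could not find a workable solution, though I found The Little Book of Semaphores (http://greenteapress.com/semaphores/index.html) inspiring. I came up with the solution shown below, which "seems" to work using three std::atomic counters. I am worried that my code breaks in corner cases, hence the quoted verb. Can you share advice on how to verify this kind of code? Do you have simpler, foolproof code available?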

std::atomic<int> barrier1(0);
std::atomic<int> barrier2(0);
std::atomic<int> barrier3(0);

void my_thread()
{

  while(1) {
    // pop task from queue
    ...
    // and execute task 
    switch(task.id()) {
      case TaskID::Barrier:
        barrier2.store(0);
        barrier1++;
        while (barrier1.load() != NUM_THREAD) {
          std::this_thread::yield();
        }
        barrier3.store(0);
        barrier2++;
        while (barrier2.load() != NUM_THREAD) {
          std::this_thread::yield();
        }
        barrier1.store(0);
        barrier3++;
        while (barrier3.load() != NUM_THREAD) {
          std::this_thread::yield();
        }
        break;
      case TaskID::Task1:
        ...
    }
  }
}

Answer 1

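Boost provides a barrier implementation as an extension to the C++11 standard thread library. If using Boost is an option, you should look no further than that.

If you have to rely on standard library facilities, you can roll your own implementation based on std::mutex and std::condition_variable without too much hassle.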

#include <condition_variable>
#include <mutex>

class Barrier {
    int wait_count;
    int const target_wait_count;
    std::mutex mtx;
    std::condition_variable cond_var;

public:
    explicit Barrier(int threads_to_wait_for)
     : wait_count(0), target_wait_count(threads_to_wait_for) {}

    void wait() {
        std::unique_lock<std::mutex> lk(mtx);
        ++wait_count;
        if(wait_count != target_wait_count) {
            // not all threads have arrived yet; go to sleep until they do
            cond_var.wait(lk, 
                [this]() { return wait_count == target_wait_count; });
        } else {
            // we are the last thread to arrive; wake the others and go on
            cond_var.notify_all();
        }
        // note that if you want to reuse the barrier, you will have to
        // reset wait_count to 0 now before calling wait again
        // if you do this, be aware that the reset must be synchronized with
        // threads that are still stuck in the wait
    }
};

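The advantage of this implementation over the atomics-based solution is that threads waiting in condition_variable::wait should be put to sleep by the operating system's scheduler, so you do not block CPU cores by having waiting threads spin on the barrier.

A few words on resetting the barrier: the simplest solution is to have a separate reset() method and let the user ensure that reset and wait are never invoked concurrently. But in many use cases, this is not easy for the user to achieve.

For a self-resetting barrier, you have to consider races on the wait count: if the wait count is reset before the last thread has returned from wait, some threads might get stuck in the barrier. A clever solution here is to make the terminating condition independent of the wait count variable itself. Instead, you introduce a second counter that is only increased by the thread calling notify. The other threads then watch that counter for changes to determine whether to exit the wait: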
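
As a rough sketch of the separate reset() approach, assuming the caller can guarantee that no thread is inside wait() when reset() runs: here that guarantee comes from joining the workers after every phase. ResettableBarrier and run_phases are hypothetical names chosen for illustration, not part of any library.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical name: a barrier that must be re-armed explicitly via reset().
class ResettableBarrier {
public:
    explicit ResettableBarrier(int threads_to_wait_for)
        : wait_count(0), target_wait_count(threads_to_wait_for) {}

    void wait() {
        std::unique_lock<std::mutex> lk(mtx);
        ++wait_count;
        if (wait_count != target_wait_count) {
            // sleep until the last thread has arrived
            cond_var.wait(lk, [this]() { return wait_count == target_wait_count; });
        } else {
            // last thread to arrive wakes everyone else
            cond_var.notify_all();
        }
    }

    // Assumption: the caller guarantees no thread is inside wait() right now.
    void reset() {
        std::lock_guard<std::mutex> lk(mtx);
        wait_count = 0;
    }

private:
    int wait_count;
    int const target_wait_count;
    std::mutex mtx;
    std::condition_variable cond_var;
};

// Runs `rounds` phases and returns how many completed.
// Joining the workers is what makes the reset() call safe in this sketch.
int run_phases(int num_threads, int rounds) {
    ResettableBarrier barrier(num_threads);
    int completed = 0;
    for (int r = 0; r < rounds; ++r) {
        std::vector<std::thread> workers;
        for (int i = 0; i < num_threads; ++i)
            workers.emplace_back([&barrier]() { barrier.wait(); });
        for (auto& t : workers) t.join();  // every thread has left wait()
        barrier.reset();
        ++completed;
    }
    return completed;
}
```

Recreating the threads for every phase is exactly the kind of restriction that makes this variant awkward for long-lived worker threads.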

// assumes one additional member: unsigned int m_inter_wait_count, initialized to 0
void wait() {
    std::unique_lock<std::mutex> lk(mtx);
    unsigned int const current_wait_cycle = m_inter_wait_count;
    ++wait_count;
    if(wait_count != target_wait_count) {
        // wait condition must not depend on wait_count
        cond_var.wait(lk, 
            [this, current_wait_cycle]() { 
                return m_inter_wait_count != current_wait_cycle;
            });
    } else {
        // reset for the next cycle; this is safe because the waiting
        // threads watch m_inter_wait_count, not wait_count
        wait_count = 0;
        // increasing the second counter allows waiting threads to exit
        ++m_inter_wait_count;
        cond_var.notify_all();
    }
}

This solution is correct under the (very reasonable) assumption that all threads leave the wait before m_inter_wait_count overflows.
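
Put together, the self-resetting barrier might look like the sketch below. It assumes that wait_count is reset by the last arriving thread, before it notifies the others. ReusableBarrier and check_rounds are hypothetical names; check_rounds is only a small sanity check, not a proof of correctness.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical name: self-resetting barrier built on the two-counter idea.
class ReusableBarrier {
public:
    explicit ReusableBarrier(int threads_to_wait_for)
        : wait_count(0), target_wait_count(threads_to_wait_for),
          m_inter_wait_count(0) {}

    void wait() {
        std::unique_lock<std::mutex> lk(mtx);
        unsigned int const current_wait_cycle = m_inter_wait_count;
        if (++wait_count != target_wait_count) {
            // exit condition depends only on the cycle counter
            cond_var.wait(lk, [this, current_wait_cycle]() {
                return m_inter_wait_count != current_wait_cycle;
            });
        } else {
            wait_count = 0;        // re-arm for the next cycle
            ++m_inter_wait_count;  // releases the waiting threads
            cond_var.notify_all();
        }
    }

private:
    int wait_count;
    int const target_wait_count;
    unsigned int m_inter_wait_count;
    std::mutex mtx;
    std::condition_variable cond_var;
};

// Returns true if, after every barrier, each thread saw that all threads had
// finished the current round and that nobody was more than one round ahead.
bool check_rounds(int num_threads, int rounds) {
    ReusableBarrier barrier(num_threads);
    std::atomic<int> arrived(0);
    std::atomic<bool> ok(true);
    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i) {
        workers.emplace_back([&]() {
            for (int r = 1; r <= rounds; ++r) {
                ++arrived;  // stands in for the task of round r
                barrier.wait();
                int const seen = arrived.load();
                if (seen < r * num_threads || seen >= (r + 1) * num_threads)
                    ok = false;
            }
        });
    }
    for (auto& t : workers) t.join();
    return ok.load();
}
```

Calling check_rounds(3, 100) corresponds to the three threads from the question passing the same barrier object 100 times.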

Answer 2

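With atomic variables, using three of them for a barrier is simply overkill that only complicates the issue. You know the number of threads, so you can atomically increment a single counter every time a thread enters the barrier and then spin until the counter becomes greater than or equal to N. Something like this: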

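#include <atomic>
#include <thread>

void barrier(int N) {
    // direct-initialization: copy-initializing an atomic from 0 does not compile before C++17
    static std::atomic<unsigned int> gCounter(0);
    gCounter++;
    // comparing the signed difference tolerates wraparound of the counter
    while((int)(gCounter - N) < 0) std::this_thread::yield();
}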

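If you do not have more threads than CPU cores and the expected waiting time is short, you might want to remove the call to std::this_thread::yield(). This call is likely to be really expensive (more than a microsecond, I would bet, but I have not measured it). Depending on the size of your tasks, this may be significant.

If you want to do repeated barriers, just increment N as you go: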

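// lastBarrier is local to each thread; processCount is the number of
// threads taking part in the barrier (NUM_THREAD in the question)
unsigned int lastBarrier = 0;
while(1) {
    switch(task.id()) {
        case TaskID::Barrier:
            barrier(lastBarrier += processCount);
            break;
    }
}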
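
For completeness, here is a self-contained sketch of the repeated spinning barrier. As an assumption for this example, the counter is moved to file scope so that the hypothetical helper run_spin_rounds can reset it between runs, and the task queue is replaced by a plain loop over rounds.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// File-scope counter (assumption: moved out of barrier() so it can be reset).
std::atomic<unsigned int> gCounter(0);

// Spins until the shared counter has reached N.
void barrier(unsigned int N) {
    gCounter++;
    // the signed difference keeps working if the counter wraps around
    while ((int)(gCounter.load() - N) < 0) std::this_thread::yield();
}

// Hypothetical helper: every thread passes `rounds` barriers and checks that
// all threads had finished the current round when it was released.
bool run_spin_rounds(unsigned int processCount, unsigned int rounds) {
    gCounter = 0;  // safe here: no worker threads are running yet
    std::atomic<unsigned int> done(0);
    std::atomic<bool> ok(true);
    std::vector<std::thread> workers;
    for (unsigned int i = 0; i < processCount; ++i) {
        workers.emplace_back([&]() {
            unsigned int lastBarrier = 0;  // per-thread target value
            for (unsigned int r = 1; r <= rounds; ++r) {
                ++done;  // stands in for the task of round r
                barrier(lastBarrier += processCount);
                unsigned int const seen = done.load();
                if (seen < r * processCount || seen >= (r + 1) * processCount)
                    ok = false;
            }
        });
    }
    for (auto& t : workers) t.join();
    return ok.load();
}
```

Each thread keeps its own lastBarrier, so the only shared state is the single counter. The price is that waiting threads keep a core busy instead of sleeping.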

Answer 3

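I would like to point out that in the solution given by @ComicSansMS, wait_count should be reset to 0 before executing cond_var.notify_all();

This is because when the barrier is used a second time, wait_count can never equal target_wait_count again unless it has been reset to 0, so the if condition sends every thread into the wait.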

How to implement a reusable thread barrier using std::atomic
