Question

I have a virtual machine on Google Cloud with 1 CPU socket, 16 cores per socket, and 2 threads per core (hyper-threading).

This is the output of lscpu:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    16
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Stepping:              0
CPU MHz:               2300.000
BogoMIPS:              4600.00
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              46080K
NUMA node0 CPU(s):     0-31

I am running my process and trying to distribute my threads across the different logical CPUs.

unsigned num_cpus = std::thread::hardware_concurrency();
LOG(INFO) << "Going to assign threads to " << num_cpus << " logical cpus";
cpu_set_t cpuset;
for (unsigned i = 0; i + 5 < num_cpus; i++) {
    worker_threads.push_back(std::thread(&CalculationWorker::work, &(workers[i]), i));
    // Create a cpu_set_t object representing a set of CPUs. Clear it and mark
    // only CPU i as set.
    CPU_ZERO(&cpuset);
    CPU_SET(i, &cpuset);
    int rc = pthread_setaffinity_np(worker_threads[i].native_handle(),
            sizeof(cpu_set_t), &cpuset);
    if (rc != 0) {
        LOG(ERROR) << "Error calling pthread_setaffinity_np: " << rc << "\n";
    }
    LOG(INFO) << "Set affinity for worker " << i << " to " << i;
}

The problem is that num_cpus is indeed 32, but when I run the following line of code in each running thread:

LOG(INFO) << "Worker thread " << worker_number << " on CPU " << sched_getcpu();

sched_getcpu() returns 0 for all threads. Does this have something to do with the virtual machine?

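Update
I found that pthread_setaffinity_np does work; apparently some daemons were running in the background, which is why I saw other cores being utilized. However, sched_getcpu still does not work and returns 0 on all threads, even though I can clearly see them running on different cores.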

Answer 1

You can try running this smaller program on the VM:

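#include <iostream>
#include <thread>
#include <pthread.h> // pthread_self, pthread_setaffinity_np, pthread_getaffinity_np
#include <sched.h>   // cpu_set_t, CPU_* macros, sched_getcpu
using namespace std;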

int main(int argc, char *argv[])
{
    int rc, i;
    cpu_set_t cpuset;
    pthread_t thread;

    thread = pthread_self();

    //Check no. of cores on the machine
    cout << thread::hardware_concurrency() << endl;

    /* Set affinity mask */
    CPU_ZERO(&cpuset);
    for (i = 0; i < 8; i++) // I have 4 cores with 2 threads per core, so this loops 8 times; adjust it to match your lscpu output
        CPU_SET(i, &cpuset);

    rc = pthread_setaffinity_np(thread, sizeof(cpu_set_t), &cpuset);
    if (rc != 0)
        cout << "Error calling pthread_setaffinity_np !!! ";

    /* Read back the affinity mask of the thread */
    rc = pthread_getaffinity_np(thread, sizeof(cpu_set_t), &cpuset);
    if (rc != 0)
        cout << "Error calling pthread_getaffinity_np !!!";

    cout << "pthread_getaffinity_np() returns:\n";
    for (i = 0; i < CPU_SETSIZE; i++)
    {
        if (CPU_ISSET(i, &cpuset))
        {
            cout << " CPU " << i << endl;
            cout << "This program (main thread) is on CPU " << sched_getcpu() << endl;
        }
    }
    return 0;
}

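This will tell you whether pthread_setaffinity_np works on the VM. There is no such restriction specific to VMs; rather, it may be caused by something the kernel enforces for certain processes on the cloud. You can read more about it here.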

Alternatively, try sched_setaffinity() to confirm whether you can actually set cpusets on the VM.
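For the sched_setaffinity() check, something along these lines might do. It is only a minimal sketch for Linux with glibc: the helper names first_allowed_cpu and pin_self_and_report are made up for illustration, and passing 0 as the pid means "the calling thread".

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   // needed for cpu_set_t and sched_getcpu (g++ usually defines it already)
#endif
#include <sched.h>    // sched_setaffinity, sched_getaffinity, sched_getcpu
#include <cstdio>     // std::perror

// Hypothetical helper: lowest-numbered CPU the calling thread is currently
// allowed to run on, or -1 if the affinity mask could not be read.
int first_allowed_cpu() {
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) {
        std::perror("sched_getaffinity");
        return -1;
    }
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &allowed)) return cpu;
    }
    return -1;
}

// Hypothetical helper: pins the calling thread to `cpu`, then returns the CPU
// that sched_getcpu() reports, or -1 if the pinning call failed.
int pin_self_and_report(int cpu) {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
        std::perror("sched_setaffinity");
        return -1;
    }
    return sched_getcpu();
}
```

Calling pin_self_and_report(first_allowed_cpu()) from main and comparing the result with the CPU you asked for should show whether the VM honours the mask. If the call fails, the perror output should give a hint as to why.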

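I find your comment (when I set the affinity of all threads to a single core, the threads still run on different cores) and the note in the original post (sched_getcpu() returns 0 for all threads) confusing when taken together. Perhaps this 0 is being returned for your main thread (process) in every case.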
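One way to rule out the "it is really the main thread reporting" possibility is to have each worker pin itself and call sched_getcpu() from inside its own thread function. The following is just a sketch under that assumption; report_cpus_from_workers is a name I made up, and it is not taken from your code.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE    // needed for pthread_setaffinity_np and sched_getcpu
#endif
#include <pthread.h>   // pthread_self, pthread_setaffinity_np
#include <sched.h>     // cpu_set_t, CPU_ZERO, CPU_SET, sched_getcpu
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: starts one thread per entry in `cpus`. Each thread pins
// ITSELF to its CPU and then records what sched_getcpu() returns when called
// from inside that thread. Entries stay -1 where pinning failed.
std::vector<int> report_cpus_from_workers(const std::vector<int>& cpus) {
    std::vector<int> reported(cpus.size(), -1);
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < cpus.size(); ++i) {
        threads.emplace_back([&reported, &cpus, i] {
            cpu_set_t mask;
            CPU_ZERO(&mask);
            CPU_SET(cpus[i], &mask);
            // pthread_setaffinity_np returns an error number, not -1/errno.
            if (pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask) == 0) {
                reported[i] = sched_getcpu();  // runs in the worker, not in main
            }
        });
    }
    for (auto& t : threads) t.join();
    return reported;
}
```

If report_cpus_from_workers({0, 1, 2, 3}) comes back with distinct values matching the requested CPUs, then sched_getcpu() itself is fine on the VM, and the constant 0 you saw more likely comes from where or when the call was made (for example, before the affinity had been applied to that thread).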
