C++ OpenMP: How to use private/protected member variables in a parallel region inside a member function?

Time: 2018-01-24 00:48:15

Tags: c++ parallel-processing openmp

TL;DR: I have a class member function containing parallel code that uses other private or protected members of the class.

My class is structured roughly like this:

class ChildClass : public GeneralClass
{
private:
    std::vector<Eigen::MatrixXd> edgePotentials;   
protected:
    // Graph structure
    size_t numberOfNodes;
    size_t numberOfEdges;
    vector< vector<size_t> > edges;
    vector< vector<size_t> > containing_edges;   // containing_edges[i] is the list of edges that contain the node i

    // Some intermediate quantities
    std::vector<Eigen::MatrixXd> P;
    std::vector<Eigen::MatrixXd> RC;

    // Caching decision variables for the above quantities
    std::vector< std::vector<bool> > hasChangedP;
    std::vector< std::vector<bool> > hasChangedRC;

    // A component of the main algorithm (that needs to be parallelized)
    virtual void computeP(size_t d, const std::vector<Eigen::MatrixXd> &X);
public:
    ChildClass();

    // Main algorithm
    virtual double MainAlgorithm() override;
};

In the member function MainAlgorithm, I call some functions that need to be parallelized:

double ChildClass::MainAlgorithm()
{
    /// Initialization
    vector<MatrixXd> X(D);
    ...

    // Main algorithm
    for(size_t d = 0; d < D; d++){
        // Step 1: compute the caching decision variables hasChangedP and hasChangedRC (to see if P and RC need to be re-computed or not)

        // Step 2: Call this function 
        computeP(d, X);

        // Step 3: Update X
    }     
}

The function in question has the following structure:

void ChildClass::computeP(size_t d, const vector<MatrixXd> &X)
{
    // for each edge
    #pragma omp parallel for
    for(size_t e = 0; e < numberOfEdges; e++){
        size_t i = edges[e][0];
        size_t j = edges[e][1];
        if(hasChangedRC[e][d]){
            RC[e].col(d) = edgePotentials[e]*X[1-d].col(j);
        }
    }

    // now for each node
    #pragma omp parallel for
    for(size_t i = 0; i < numberOfNodes; i++){
        if(hasChangedP[d][i]){
            ... compute P[d].col(i) based on RC, edgePotentials, containing_edges...
        }
    }

}

Currently, #pragma omp parallel for gives no speedup at all. I suspect this is because the class members (numberOfNodes, numberOfEdges, RC, edgePotentials, containing_edges, ...) cannot be shared inside the parallel region?

Could you help me solve this? Thank you very much!
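For reference, here is a minimal sketch for testing the hypothesis above in isolation. It assumes a hypothetical class EdgeScaler (not part of the original code) that uses plain doubles in place of Eigen matrices: weights stands in for edgePotentials and result for RC. As far as I understand the OpenMP data-sharing rules, non-static members accessed inside a member function are reached through this and are therefore shared by all threads, so each thread reads and writes the same containers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ChildClass: plain doubles instead of Eigen matrices.
class EdgeScaler {
private:
    std::vector<double> weights;   // plays the role of edgePotentials
protected:
    std::size_t numberOfEdges;     // same name as in the question
    std::vector<double> result;    // plays the role of RC
public:
    explicit EdgeScaler(std::size_t n)
        : weights(n), numberOfEdges(n), result(n, 0.0) {
        for (std::size_t e = 0; e < n; e++) {
            weights[e] = static_cast<double>(e);
        }
    }

    void compute(double x) {
        assert(result.size() == numberOfEdges);
        // Members are reached through `this`, so every thread sees the same
        // weights/result objects; only the loop index e is private per thread.
        #pragma omp parallel for
        for (std::size_t e = 0; e < numberOfEdges; e++) {
            result[e] = weights[e] * x;   // each iteration writes a distinct element
        }
    }

    const std::vector<double>& values() const { return result; }
};
```

Calling EdgeScaler s(1000); s.compute(2.0); should fill s.values() with 0, 2, 4, ... whether or not OpenMP is enabled. If a sketch like this also shows no speedup, it may be worth checking that the compiler's OpenMP switch (for example -fopenmp with GCC) is actually turned on, since the pragma is otherwise ignored.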

Update

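  • numberOfNodes can range from a few thousand to a few hundred thousand, and numberOfEdges is a few times numberOfNodes.
  • As suggested by @zzxyz, I tried splitting the loop into N blocks (where N is the number of threads). Instead of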

    #pragma omp parallel for
    for(size_t e = 0; e < numberOfEdges; e++){
         // Code for each edge here
    }
    

I used:

    size_t threads = 8;
    size_t p = numberOfEdges / threads;   // integer division already rounds down

    #pragma omp parallel for
    for(size_t b = 0; b < threads; b++){
        size_t first = b*p;
        size_t last = (b+1)*p - 1;
        if(b >= threads - 1){
            last = numberOfEdges - 1;
        }
        for(size_t e = first; e <= last; e++){
            // Code for each edge here
        }
    }

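I did the same for the loop over nodes. However, this did not help either. (As @zzxyz pointed out later, this is something OpenMP already does for us automatically.)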
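To illustrate why the manual split changes nothing, here is a small sketch comparing it with schedule(static), which divides the iterations into roughly equal contiguous chunks, at most one per thread. The helpers countVisitsManual and countVisitsStatic are hypothetical and only count how often each edge index is visited. The manual version uses an exclusive upper bound, an assumption made here so that numberOfEdges == 0 does not underflow the unsigned index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Manual split into `threads` contiguous blocks, as in the update above.
std::vector<int> countVisitsManual(std::size_t numberOfEdges, std::size_t threads) {
    assert(threads > 0);
    std::vector<int> visits(numberOfEdges, 0);
    const std::size_t p = numberOfEdges / threads;   // block size, rounded down
    #pragma omp parallel for
    for (std::size_t b = 0; b < threads; b++) {
        const std::size_t first = b * p;
        // Exclusive end; the last block also takes the remainder.
        const std::size_t last = (b + 1 == threads) ? numberOfEdges : (b + 1) * p;
        for (std::size_t e = first; e < last; e++) {
            visits[e] += 1;
        }
    }
    return visits;
}

// The same work, with the partitioning left to the OpenMP runtime.
std::vector<int> countVisitsStatic(std::size_t numberOfEdges) {
    std::vector<int> visits(numberOfEdges, 0);
    #pragma omp parallel for schedule(static)
    for (std::size_t e = 0; e < numberOfEdges; e++) {
        visits[e] += 1;
    }
    return visits;
}
```

Both functions visit every index exactly once, so the manual blocks only reproduce a partition the runtime can already provide. The lack of speedup therefore probably has another cause.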

0 answers:

No answers