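Parallel algorithm is slower than the sequential one

Time: 2014-05-14 17:44:23

Tags: c# parallel-processing

I have spent the last few days creating a parallel version of a piece of code (university work), but I have reached a dead end (at least for me): the parallel version is almost twice as slow as the sequential one, and I have no idea why. Here is the code: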

Variables.GetMatrix();
int ThreadNumber = Environment.ProcessorCount/2;
int SS = Variables.PopSize / ThreadNumber;
//GeneticAlgorithm GA = new GeneticAlgorithm();
Stopwatch stopwatch = new Stopwatch(), st = new Stopwatch(), st1 = new Stopwatch();
List<Thread> ThreadList = new List<Thread>();
//List<Task> TaskList = new List<Task>();
GeneticAlgorithm[] SubPop = new GeneticAlgorithm[ThreadNumber];
Thread t;
//Task t;
ThreadVariables Instance = new ThreadVariables();

stopwatch.Start();
st.Start();
PopSettings();
InitialPopulation();
st.Stop();

//Lots of attributions...
int SPos = 0, EPos = SS;

for (int i = 0; i < ThreadNumber; i++)
{
    int temp = i, StartPos = SPos, EndPos = EPos;
    t = new Thread(() =>
    {
        SubPop[temp] = new GeneticAlgorithm(Population, NumSeq, SeqSize, MaxOffset, PopFit, Child, Instance, StartPos, EndPos);
        SubPop[temp].RunGA();
        SubPop[temp].ShowPopulation();
    });
    t.Start();
    ThreadList.Add(t);
    SPos = EPos;
    EPos += SS;
}

foreach (Thread a in ThreadList)
    a.Join();

double BestFit = SubPop[0].BestSol;
string BestAlign = SubPop[0].TV.Debug;

for (int i = 1; i < ThreadNumber; i++)
{
    if (BestFit < SubPop[i].BestSol)
    {
        BestFit = SubPop[i].BestSol;
        BestAlign = SubPop[i].TV.Debug;
        Variables.ResSave = SubPop[i].TV.ResSave;
        Variables.NumSeq = SubPop[i].TV.NumSeq;
    }
}

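Basically, the code creates an array of objects, instantiates one object in each position of the array and runs the algorithm on it, and at the end collects the best value from the array. This kind of algorithm works on a three-dimensional data array, and in the parallel version I assign each thread a range of the array to process, which avoids concurrent access to the data. Still, my timings are slow... Any ideas?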

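I am using a Core i5, which has four logical cores (two physical cores plus two from Hyper-Threading), but any thread count greater than the one I use makes the code run even slower.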

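What I can explain about the code that runs in parallel is this:

The second method called in the code I posted performs about 10,000 iterations, and in each iteration it calls a function. That function may or may not call further functions (spread across two different objects per thread) and does a lot of computation, depending on a series of factors specific to the algorithm. All of these methods for a given thread work on one region of the data array, which the other threads cannot access.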

3 answers:

Answer 0: (score: 1)

This can be simplified with System.Linq (PLINQ):

int ThreadNumber = Environment.ProcessorCount/2;
int SS = Variables.PopSize / ThreadNumber;
// Assumption: one work item per sub-population, so that index * SS covers the
// whole population. The original answer left this value open.
int numberOfTotalIterations = ThreadNumber;

var doneAlgorithms = Enumerable.Range(0, numberOfTotalIterations)
                               .AsParallel() // Runs the whole query in parallel
                               .WithDegreeOfParallelism(ThreadNumber) // Drop this line to let the runtime choose the degree of parallelism.
                               .Select(index => _runAlgorithmAndReturn(index, SS))
                               .ToArray(); // Not needed if the results are only used once to find the best one;
                                           // otherwise keep it to prevent multiple enumerations.

// Sort the algorithms by BestSol descending and take the first one, which matches
// the "highest BestSol wins" comparison in the question.
// OrderByDescending enumerates everything, which is why ToArray() above is optional.
GeneticAlgorithm best = doneAlgorithms.OrderByDescending(algo => algo.BestSol).First();

double BestFit = best.BestSol;
string BestAlign = best.TV.Debug;
Variables.ResSave = best.TV.ResSave;
Variables.NumSeq = best.TV.NumSeq;

And declare a method to make it more readable:

/// <summary>
/// Runs a single algorithm and returns it
/// </summary>
private GeneticAlgorithm _runAlgorithmAndReturn(int index, int SS)
{
    int startPos = index * SS;
    int endPos = startPos + SS;
    var algo = new GeneticAlgorithm(Population, NumSeq, SeqSize, MaxOffset, PopFit, Child, Instance, startPos, endPos);
    algo.RunGA();
    algo.ShowPopulation();
    return algo;
}
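
For anyone who wants to try the PLINQ shape without the original GeneticAlgorithm class, here is a minimal self-contained sketch. RunPartition is a hypothetical stand-in for the real per-range work, and BestOf assumes that the population size divides evenly by the number of partitions; neither name comes from the question.

```csharp
using System;
using System.Linq;

public static class PlinqSketch
{
    // Hypothetical stand-in for GeneticAlgorithm.RunGA(): burns CPU on one
    // index range and returns a deterministic "fitness" for that range.
    public static double RunPartition(int startPos, int endPos)
    {
        double acc = 0;
        for (int i = startPos; i < endPos; i++)
            for (int k = 1; k <= 2000; k++)
                acc += Math.Sqrt(i + k);
        return acc;
    }

    // Runs every partition through PLINQ and returns the highest fitness.
    public static double BestOf(int popSize, int partitions)
    {
        int chunk = popSize / partitions; // assumes popSize divides evenly
        return Enumerable.Range(0, partitions)
                         .AsParallel()
                         .Select(p => RunPartition(p * chunk, (p + 1) * chunk))
                         .Max();
    }

    public static void Main()
    {
        Console.WriteLine(BestOf(400, 4));
    }
}
```

Max() replaces the sort here because only the single best value is needed; if the winning object itself is required, ordering by fitness and taking the first element works as well.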

Answer 1: (score: 0)

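You create the threads yourself, so there is some extreme overhead there. Use Parallel, as the comments suggested. Also make sure that a single unit of work takes long enough: a single thread/work unit should live for at least ~20 ms.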

This is very basic stuff. I suggest you really learn how multithreading works in .NET.
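
One way to check whether a work unit is long enough is simply to time it. The following is a rough sketch, not a proper benchmark: Work is a hypothetical CPU-bound stand-in for one RunGA() call, and the iteration count is an arbitrary assumption that you would replace with your real workload.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class WorkUnitTiming
{
    // Hypothetical CPU-bound work item standing in for one RunGA() call.
    public static long Work(int iterations)
    {
        long sum = 0;
        for (int i = 0; i < iterations; i++)
            sum += i % 7;
        return sum;
    }

    // Returns the elapsed milliseconds for running the action once.
    public static double TimeMs(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        const int units = 4;
        const int iterations = 50000000; // arbitrary size, tune for your machine
        var results = new long[units];

        // How long does one unit take on its own?
        double oneUnit = TimeMs(() => Work(iterations));

        // All units one after another on the calling thread.
        double sequential = TimeMs(() =>
        {
            for (int i = 0; i < units; i++) results[i] = Work(iterations);
        });

        // The same units scheduled by Parallel.For.
        double parallel = TimeMs(() =>
            Parallel.For(0, units, i => { results[i] = Work(iterations); }));

        Console.WriteLine("one unit:   {0:F1} ms", oneUnit);
        Console.WriteLine("sequential: {0:F1} ms", sequential);
        Console.WriteLine("parallel:   {0:F1} ms", parallel);
    }
}
```

If a single unit takes well under the ~20 ms mentioned above, scheduling overhead is a plausible suspect. If the units are long and the parallel run is still slower, shared state between the threads is more likely the culprit than thread creation.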

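I see that you are not creating too many threads. But the optimal number of threads cannot be determined from the processor count alone. The built-in Parallel class has advanced algorithms for reducing the overall time.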

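Partitioning and threading are quite complex topics that take a lot of knowledge to get right, so unless you really know what you are doing, rely on the Parallel class to handle them for you.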
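
As an illustration of letting the library do the partitioning, here is a small sketch that uses Parallel.ForEach with Partitioner.Create to split an index range into chunks. Fitness is a hypothetical per-individual score and the range size is an arbitrary assumption; the real GA would do its own work inside the loop body.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelRangeSketch
{
    // Hypothetical fitness of one individual; stands in for the real GA work.
    public static double Fitness(int index)
    {
        return Math.Sin(index) * index;
    }

    // Splits [0, popSize) into ranges of at most rangeSize elements, finds the
    // best fitness inside each range in parallel, then takes the overall best.
    public static double BestFitness(int popSize, int rangeSize)
    {
        var bestPerRange = new ConcurrentBag<double>();

        Parallel.ForEach(Partitioner.Create(0, popSize, rangeSize), range =>
        {
            // range.Item1 is inclusive, range.Item2 is exclusive.
            double localBest = double.NegativeInfinity;
            for (int i = range.Item1; i < range.Item2; i++)
                localBest = Math.Max(localBest, Fitness(i));
            bestPerRange.Add(localBest); // one synchronized write per range
        });

        return bestPerRange.Max();
    }

    public static void Main()
    {
        Console.WriteLine(BestFitness(1000, 128));
    }
}
```

Unlike the manual PopSize / ThreadNumber split, the ranges produced by Partitioner.Create cover the whole interval even when the size does not divide evenly; the last range is just shorter.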

Answer 2: (score: 0)

Creating threads has significant overhead.

Use the ThreadPool instead of creating new threads, like this:

Variables.GetMatrix();
int ThreadNumber = Environment.ProcessorCount / 2;
int SS = Variables.PopSize / ThreadNumber;
//GeneticAlgorithm GA = new GeneticAlgorithm();
Stopwatch stopwatch = new Stopwatch(), st = new Stopwatch(), st1 = new Stopwatch();
List<WaitHandle> WaitList = new List<WaitHandle>();
//List<Task> TaskList = new List<Task>();
GeneticAlgorithm[] SubPop = new GeneticAlgorithm[ThreadNumber];
//Task t;
ThreadVariables Instance = new ThreadVariables();

stopwatch.Start();
st.Start();
PopSettings();
InitialPopulation();
st.Stop();
//lots of attributions...
int SPos = 0, EPos = SS;

for (int i = 0; i < ThreadNumber; i++)
{
    int temp = i, StartPos = SPos, EndPos = EPos;
    ManualResetEvent wg = new ManualResetEvent(false);
    WaitList.Add(wg);
    ThreadPool.QueueUserWorkItem((unused) =>
    {
        try
        {
            SubPop[temp] = new GeneticAlgorithm(Population, NumSeq, SeqSize, MaxOffset, PopFit, Child, Instance, StartPos, EndPos);
            SubPop[temp].RunGA();
            SubPop[temp].ShowPopulation();
        }
        finally
        {
            wg.Set(); // Always signal, even if the work throws, so the wait below cannot hang.
        }
    });

    SPos = EPos;
    EPos += SS;
}

// WaitAll is a static member of WaitHandle and accepts at most 64 handles.
WaitHandle.WaitAll(WaitList.ToArray());

double BestFit = SubPop[0].BestSol;
string BestAlign = SubPop[0].TV.Debug;

for (int i = 1; i < ThreadNumber; i++)
{
    if (BestFit < SubPop[i].BestSol)
    {
        BestFit = SubPop[i].BestSol;
        BestAlign = SubPop[i].TV.Debug;
        Variables.ResSave = SubPop[i].TV.ResSave;
        Variables.NumSeq = SubPop[i].TV.NumSeq;
    }
}

Note that instead of using Join to wait for the threads to finish, I use WaitHandles.
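
If you are on .NET 4.5 or later, a variation on the same idea is to queue the work with Task.Run, which also uses the thread pool, and wait with Task.WaitAll instead of managing wait handles by hand. This is only a sketch under assumptions: RunPartition is a hypothetical stand-in for constructing and running one GeneticAlgorithm, and the last partition is given the remainder of the population.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class TaskSketch
{
    // Hypothetical stand-in for one sub-population run: returns the sum of the
    // indices in [startPos, endPos) as its "fitness".
    public static double RunPartition(int startPos, int endPos)
    {
        double sum = 0;
        for (int i = startPos; i < endPos; i++)
            sum += i;
        return sum;
    }

    // Queues one pool task per partition and returns the best fitness.
    public static double BestOf(int popSize, int partitions)
    {
        int chunk = popSize / partitions;
        var tasks = new Task<double>[partitions];

        for (int p = 0; p < partitions; p++)
        {
            // Copy the bounds into locals so each lambda captures its own values.
            int start = p * chunk;
            int end = (p == partitions - 1) ? popSize : start + chunk; // last one takes the remainder
            tasks[p] = Task.Run(() => RunPartition(start, end));
        }

        Task.WaitAll(tasks); // rethrows worker exceptions as an AggregateException
        return tasks.Max(t => t.Result);
    }

    public static void Main()
    {
        Console.WriteLine(BestOf(10, 3));
    }
}
```

With tasks, each result travels back through Task<double>.Result, so no shared result array is needed, and a failure in one worker surfaces at WaitAll rather than being lost.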