Question

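I'm running the following code, and over time (an hour or two) I've noticed that iterating over the items takes longer and longer. Is something I'm doing causing this? If so, how can I fix it?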

        int totalProcessed = 0;
        int totalRecords = MyList.Count();

        Parallel.ForEach(Partitioner.Create(0, totalRecords), (range, loopState) =>
        {
            for (int index = range.Item1; index < range.Item2; index++)
            {
                DoStuff(MyList.ElementAt(index));
                Interlocked.Increment(ref totalProcessed);
                if (totalProcessed % 1000 == 0)
                    Log(String.Format("Processed {0} of {1} records", totalProcessed, totalRecords));
            }
        });

        public void DoStuff(IEntity entity)
        {
            foreach (var client in Clients)
            {
                // Add entity to a db using EF
                client.Add(entity);
            }
        }

Thanks for your help.

Answer 1

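ElementAt is a very slow extension method. For a sequence that does not implement IList<T>, it behaves essentially like the following implementation: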

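public static T ElementAt<T>(this IEnumerable<T> collection, int index)
{
    // Walk the sequence from the start until the requested position is reached.
    int i = 0;
    foreach (T e in collection)
    {
        if (i == index)
        {
            return e;
        }
        i++;
    }
    // The sequence ended before the requested index was reached.
    throw new ArgumentOutOfRangeException("index");
}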

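Clearly, the larger the index, the longer each call takes. You should use the indexer MyList[index] instead of ElementAt. If MyList is not already a list or an array, materialize it once (for example with ToList()) so that an indexer is available.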
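As a rough illustration of that advice, here is a minimal sketch of the loop from the question rewritten around an indexer. It assumes the source is an arbitrary IEnumerable<T> that can be copied into memory once; the class name IndexedParallelSketch, the method ProcessAll, and the doStuff callback are hypothetical names used only for this example, and Console.WriteLine stands in for the question's Log method.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class IndexedParallelSketch
{
    // Hypothetical helper: runs doStuff on every item in parallel and
    // returns the number of items that were processed.
    public static int ProcessAll<T>(IEnumerable<T> source, Action<T> doStuff)
    {
        // Materialize once so that items[index] is a constant-time lookup.
        // Concurrent reads of a List<T> are safe as long as nothing writes to it.
        List<T> items = source.ToList();

        // Partitioner.Create rejects an empty range, so handle that case up front.
        if (items.Count == 0)
            return 0;

        int totalProcessed = 0;

        Parallel.ForEach(Partitioner.Create(0, items.Count), range =>
        {
            for (int index = range.Item1; index < range.Item2; index++)
            {
                doStuff(items[index]);

                // Use the value returned by Interlocked.Increment rather than
                // re-reading the shared counter, which another thread may have changed.
                int processed = Interlocked.Increment(ref totalProcessed);
                if (processed % 1000 == 0)
                    Console.WriteLine("Processed {0} of {1} records", processed, items.Count);
            }
        });

        return totalProcessed;
    }
}
```

With this shape, each lookup costs the same no matter how far into the list the loop has progressed. Whether copying the whole sequence into memory is acceptable depends on how large MyList is.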

Answer 2

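As @mace pointed out, using ElementAt will cause performance problems. Each time you call it, the iterator starts from the beginning of MyList and skips n elements until it reaches the desired index. This gets progressively worse as the index gets higher.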
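The growth described above can be made concrete with a small counting experiment. The sketch below assumes a lazy sequence that does not implement IList<T> (so ElementAt has to enumerate it); the class ElementAtCostSketch and its method are hypothetical names for this illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ElementAtCostSketch
{
    // Returns how many elements a lazy sequence of length n has to produce
    // in total when every index from 0 to n - 1 is fetched with ElementAt.
    public static long CountStepsWithElementAt(int n)
    {
        long steps = 0;

        // A lazy iterator (not an IList<T>) that counts every element it yields.
        IEnumerable<int> Lazy()
        {
            for (int i = 0; i < n; i++)
            {
                steps++;
                yield return i;
            }
        }

        IEnumerable<int> seq = Lazy();
        for (int index = 0; index < n; index++)
        {
            // Each call restarts the iterator and walks index + 1 elements.
            seq.ElementAt(index);
        }

        return steps;
    }
}
```

Fetching all n elements this way makes the iterator produce n(n + 1)/2 elements in total, compared with n for a single pass, which is why the per-item cost keeps rising the further the loop gets.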

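If you still need streaming access to MyList, you can use Skip and Take to mitigate the performance problem. There is still some cost while seeking to the position in MyList, but Take ensures that once you get there you receive a whole batch of elements, rather than repeating the seek for every element.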

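I also noticed that you are using the partition-style foreach, but you are doing it over the whole range. In the example below I have implemented the partition style together with batching.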

int totalRecords = MyList.Count();
int batchSize = 250;

Parallel.ForEach(Partitioner.Create(0, totalRecords, batchSize), range =>
{
    foreach (var thing in MyList.Skip(range.Item1).Take(batchSize))
    {
        DoStuff(thing);
        // logging and stuff...
    }
});

Update

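After reading the question again, you may also be running into the problem of too many threads being used for what is probably an IO-bound workload, i.e. the network and then the DB/disk. I say this because you mention that CPU utilization is low, which makes me think you are blocked on IO and that this is getting progressively worse.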

If it were purely down to ElementAt, you would still see high CPU usage.

Configure MaxDegreeOfParallelism to adjust the maximum number of threads used:

const int BatchSize = 250;
int totalRecords = MyList.Count();

var partitioner = Partitioner.Create(0, totalRecords, BatchSize);
// Cap the number of concurrent workers; 2 is a starting point to tune from.
var options = new ParallelOptions { MaxDegreeOfParallelism = 2 };

Parallel.ForEach(partitioner, options, range =>
{
    foreach (var thing in MyList.Skip(range.Item1).Take(BatchSize))
    {
        DoStuff(thing);
        // logging and stuff...
    }
});

Parallel ForEach using very little processing power as time goes on
