An inner .par collection in Scala breaks the outer ForkJoinTaskSupport

Date: 2018-07-19 00:34:19

Tags: scala

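I want to create a parallel collection that uses a fixed number of threads. The standard advice is to set the collection's task support to a ForkJoinTaskSupport backed by a ForkJoinPool with a fixed number of threads. This works well until the processing done inside the parallel collection itself uses a parallel collection. In that case, the limit set on the ForkJoinPool seems to disappear.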

A simple test looks like this:

import java.util.concurrent.atomic.AtomicInteger
import java.util.concurrent.ForkJoinPool
import scala.collection.parallel.ForkJoinTaskSupport

object InnerPar {

  def forkJoinPoolIsSuccess(useInnerPar:Boolean): Boolean = {
    val numTasks = 100
    val numThreads = 10

    // every thread in the outer collection will increment
    // and decrement this counter as it starts and exits
    val threadCounter = new AtomicInteger(0)

    // function that returns the thread count when we first
    // started running and creates an inner parallel collection
    def incrementAndCountThreads(idx:Int):Int = {
      val otherThreadsRunning:Int = threadCounter.getAndAdd(1)
      if (useInnerPar) {
        (0 until 20).toSeq.par.map { elem => elem + 1 }
      }
      Thread.sleep(10)
      threadCounter.getAndAdd(-1)
      otherThreadsRunning + 1
    }

    // create parallel collection using a ForkJoinPool with numThreads
    val parCollection = (0 until numTasks).toVector.par
    parCollection.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(numThreads))
    val threadCountLogList = parCollection.map { idx =>
      incrementAndCountThreads(idx)
    }

    // the total number of threads running should not have exceeded
    // numThreads at any point; similarly, we hope that the number of
    // simultaneously executing threads was close to numThreads at some point
    val respectsNumThreadsCapSuccess = threadCountLogList.max <= numThreads

    respectsNumThreadsCapSuccess
  } 


  def main(args:Array[String]):Unit = {
    val testConfigs = Seq(true, false, true, false)
    testConfigs.foreach { useInnerPar =>
      val isSuccess =  forkJoinPoolIsSuccess(useInnerPar)
      println(f"useInnerPar $useInnerPar%6s, success is $isSuccess%6s") 
    }
  }
}

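We then get the following output, which shows that if we create a parallel collection inside incrementAndCountThreads(), more than numThreads (10 in this example) threads run at the same time.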

useInnerPar   true, success is  false
useInnerPar  false, success is   true
useInnerPar   true, success is  false
useInnerPar  false, success is   true

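Also note that using a ForkJoinTaskSupport on the inner collection does not fix the problem. In other words, the result is the same if the inner collection uses the following code: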

  if (useInnerPar) {
    val innerParCollection = (0 until 20).toVector.par
    innerParCollection.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(3))
    innerParCollection.map { elem => elem + 1 }
  }

I am using Scala 2.12.5 and Java OpenJDK 1.8.0_161-b14 on a Linux 3.10.0 x86_64 kernel.

Am I missing something? If not, is there a way to work around this?

Thanks!

1 Answer:

Answer 0 (score: 1)

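The core problem is that in Java 8 the thread count passed to ForkJoinPool (its parallelism argument, numThreads in the test) is only a guideline, not a hard limit. In Java 9 you can set the maximumPoolSize constructor parameter, which should put a hard limit on the number of threads in the pool and address this problem directly. I don't know of a good way to solve this with Java 8.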
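As a hedged illustration of the Java 9 option, here is a minimal sketch that builds a pool through the ten-argument ForkJoinPool constructor, which accepts a maximumPoolSize. The names BoundedPoolSketch and boundedPool, and the values chosen for corePoolSize, minimumRunnable, the saturate predicate, and the 60-second keep-alive, are assumptions made for illustration; they do not come from the question or the linked issue. It needs Java 9 or later to compile and run.

```scala
import java.util.concurrent.{ForkJoinPool, TimeUnit}
import java.util.function.Predicate

object BoundedPoolSketch {

  // Hypothetical helper: builds a ForkJoinPool whose thread count is
  // capped at maxThreads, using the constructor added in Java 9.
  def boundedPool(maxThreads: Int): ForkJoinPool = {
    // Assumption: when the cap is reached, keep running with fewer
    // runnable threads. With a null predicate the pool would instead
    // throw RejectedExecutionException when it cannot replace a
    // blocked worker.
    val saturate = new Predicate[ForkJoinPool] {
      override def test(pool: ForkJoinPool): Boolean = true
    }
    new ForkJoinPool(
      maxThreads,                                       // parallelism
      ForkJoinPool.defaultForkJoinWorkerThreadFactory,  // factory
      null,                                             // handler (none)
      false,                                            // asyncMode
      maxThreads,                                       // corePoolSize
      maxThreads,                                       // maximumPoolSize
      1,                                                // minimumRunnable
      saturate,                                         // saturate
      60L,                                              // keepAliveTime
      TimeUnit.SECONDS                                  // unit
    )
  }

  def main(args: Array[String]): Unit = {
    val pool = boundedPool(10)
    println(s"parallelism = ${pool.getParallelism}")
    pool.shutdown()
  }
}
```

Such a pool could then be passed to new ForkJoinTaskSupport(...) in place of new ForkJoinPool(numThreads) in the test from the question. Whether that restores the cap for nested parallel collections has not been verified here. The ForkJoinPool documentation also notes that the maximum may be transiently exceeded while threads are being created and terminated.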

For more details, see: https://github.com/scala/bug/issues/11036
