Spark - failed: Set() explanation

Posted: 2018-02-09 09:20:53

Tags: apache-spark

Our Spark job sometimes hangs for no apparent reason.

The only clue we can find in the logs is the repeated failed: Set() statement shown below.

Any ideas as to why this message is shown would be much appreciated.

18/02/08 22:12:14 INFO Executor: Finished task 0.0 in stage 51.0 (TID 38). 2008 bytes result sent to driver
18/02/08 22:12:14 INFO TaskSetManager: Finished task 0.0 in stage 51.0 (TID 38) in 312094 ms on localhost (executor driver) (1/1)
18/02/08 22:12:14 INFO TaskSchedulerImpl: Removed TaskSet 51.0, whose tasks have all completed, from pool 
18/02/08 22:12:14 INFO DAGScheduler: ShuffleMapStage 51 (rdd at EsSparkSQL.scala:97) finished in 602.298 s
18/02/08 22:12:14 INFO DAGScheduler: looking for newly runnable stages
18/02/08 22:12:14 INFO DAGScheduler: running: Set(ShuffleMapStage 60, ShuffleMapStage 1, ShuffleMapStage 19, ShuffleMapStage 6)
18/02/08 22:12:14 INFO DAGScheduler: waiting: Set(ShuffleMapStage 45, ShuffleMapStage 16, ShuffleMapStage 3, ShuffleMapStage 18, ShuffleMapStage 39, ShuffleMapStage 10, ShuffleMapStage 55, ShuffleMapStage 62, ShuffleMapStage 41, ResultStage 63, ShuffleMapStage 49, ShuffleMapStage 5, ShuffleMapStage 35, ShuffleMapStage 42, ShuffleMapStage 21, ShuffleMapStage 57, ShuffleMapStage 14, ShuffleMapStage 29)
18/02/08 22:12:14 INFO DAGScheduler: failed: Set()
18/02/08 22:13:33 INFO JDBCRDD: closed connection
18/02/08 22:13:33 INFO Executor: Finished task 0.0 in stage 60.0 (TID 44). 2008 bytes result sent to driver
18/02/08 22:13:33 INFO TaskSetManager: Finished task 0.0 in stage 60.0 (TID 44) in 196274 ms on localhost (executor driver) (1/1)
18/02/08 22:13:33 INFO TaskSchedulerImpl: Removed TaskSet 60.0, whose tasks have all completed, from pool 
18/02/08 22:13:33 INFO DAGScheduler: ShuffleMapStage 60 (rdd at EsSparkSQL.scala:97) finished in 681.143 s
18/02/08 22:13:33 INFO DAGScheduler: looking for newly runnable stages
18/02/08 22:13:33 INFO DAGScheduler: running: Set(ShuffleMapStage 1, ShuffleMapStage 19, ShuffleMapStage 6)
18/02/08 22:13:33 INFO DAGScheduler: waiting: Set(ShuffleMapStage 45, ShuffleMapStage 16, ShuffleMapStage 3, ShuffleMapStage 18, ShuffleMapStage 39, ShuffleMapStage 10, ShuffleMapStage 55, ShuffleMapStage 62, ShuffleMapStage 41, ResultStage 63, ShuffleMapStage 49, ShuffleMapStage 5, ShuffleMapStage 35, ShuffleMapStage 42, ShuffleMapStage 21, ShuffleMapStage 57, ShuffleMapStage 14, ShuffleMapStage 29)
18/02/08 22:13:33 INFO DAGScheduler: failed: Set()
18/02/08 22:28:02 INFO JDBCRDD: closed connection

1 Answer:

Answer 0 (score: 2)

As per Spark 1.0.2 (also 1.1.0) hangs on a partition, the line DAGScheduler: failed: Set() means that the set of failed stages is empty. As the log above shows, the DAGScheduler prints its running, waiting, and failed stage sets each time a stage completes, so an empty Set() is routine bookkeeping output rather than an error: no stage has failed, and the cause of the hang must lie elsewhere.
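For illustration, here is a minimal Scala sketch (a hypothetical stand-in, not the actual Spark source) showing why an empty set of failed stages renders as Set() in the log line:

    object FailedSetDemo extends App {
      // A stand-in for the DAGScheduler's bookkeeping: failed stages are
      // kept in a set, and an empty set's toString renders as "Set()".
      val failedStages = Set.empty[String]
      println("failed: " + failedStages) // prints: failed: Set()
    }

So the repeated failed: Set() lines merely confirm that nothing has failed; to diagnose the hang itself, look instead at what the running stages are doing (for example, the long-running JDBC reads visible in the log above).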
