Unable to run Spark jobs on a YARN cluster

Date: 2017-04-07 05:53:56

Tags: scala hadoop apache-spark java-8

I have a simple Hadoop cluster with Spark running on it (i.e., Spark uses YARN as its cluster manager).

I am using Hadoop 2.7, Scala 2.11, Spark 2.1.0, and JDK 8.

Now, when I submit a job, it fails with the following message:

17/04/06 23:57:55 INFO yarn.Client: Application report for application_1491534363989_0004 (state: ACCEPTED)
17/04/06 23:57:56 INFO yarn.Client: Application report for application_1491534363989_0004 (state: FAILED)
17/04/06 23:57:56 INFO yarn.Client:
     client token: N/A
     diagnostics: Application application_1491534363989_0004 failed 2 times due to AM Container for appattempt_1491534363989_0004_000002 exited with  exitCode: 15
For more detailed output, check application tracking page:http://rm100.hadoop.cluster:8088/cluster/app/application_1491534363989_0004Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1491534363989_0004_02_000001
Exit code: 15
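
Exit code 15 from the ApplicationMaster container is a generic application failure; the actual error is usually in the container logs. Assuming log aggregation is enabled on the cluster, they can be pulled with the standard YARN CLI:

    # fetch the aggregated container logs for the failed application
    yarn logs -applicationId application_1491534363989_0004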

Is there some problem with JDK 8?

UPDATE

When I run the same program with JDK 7, it runs fine. So my question is: do Spark, Scala, or Hadoop have any issues with JDK 8?
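
For reference, if the cluster nodes default to JDK 7, one way to test JDK 8 without touching the cluster-wide configuration is Spark's per-application environment overrides for YARN. This is only a sketch: the JDK path, class name, and jar are placeholders, not from the original post.

    # spark.yarn.appMasterEnv.* and spark.executorEnv.* are standard
    # Spark-on-YARN settings; adjust the JAVA_HOME path to your nodes
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --conf spark.yarn.appMasterEnv.JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \
      --conf spark.executorEnv.JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \
      --class com.example.MyApp myapp.jar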

1 Answer:

Answer 0 (score: 0)

I have been running Spark on a YARN cluster with Java 8 and everything runs smoothly. As far as I know, more recent versions of Spark and Scala require Java 8 or above. Here are a few things you need to check; a shell sketch of these checks follows the list.

  1. Check the JAVA_HOME path in hadoop-env.sh.
  2. When launching the YARN cluster, make sure all the required nodes are up.
  3. Check the Hadoop daemon logs.
  4. Run jps to see which daemons are actually running.
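
A minimal shell sketch of those checks (paths assume a standard layout under $HADOOP_HOME; adjust for your install):

    # 1. verify which JDK the Hadoop daemons are configured to use
    grep JAVA_HOME $HADOOP_HOME/etc/hadoop/hadoop-env.sh
    # expect something like: export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

    # 2. confirm the required daemons are up on each node
    jps
    # the master should list ResourceManager (plus NameNode if co-located);
    # each worker should list NodeManager (plus DataNode)

    # 3. inspect the daemon logs for startup errors
    tail -n 100 $HADOOP_HOME/logs/yarn-*-resourcemanager-*.log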