Spark-submit fails with "requirement failed" error from scala.Predef when using YARN master

Date: 2017-03-15 23:12:12

Tags: apache-spark mapr sparkcore

My Spark job fails with the exception below, and I cannot figure out which requirement is not being met:

Exception in thread "main" java.lang.IllegalArgumentException: requirement failed
        at scala.Predef$.require(Predef.scala:221)
        at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6$$anonfun$apply$3.apply(Client.scala:472)
        at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6$$anonfun$apply$3.apply(Client.scala:470)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
        at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6.apply(Client.scala:470)
        at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6.apply(Client.scala:468)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:468)
        at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:727)
        at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:1021)
        at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
        at org.apache.spark.deploy.yarn.Client.main(Client.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:742)

Spark-submit command:

spark-submit --conf spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/xyz/conf/log4j.xml \
-DHOME=/xyz/transformation -DENV=e1 \
-DJOB=xformation --conf spark.local.dir=/warehouse/tmp/spark1489619325 \
--queue dev --master yarn --deploy-mode cluster \
--properties-file /xyz/conf/job.conf \
--files /xyz/conf/e1.properties --class TransformationJob /xyz/job.jar
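One thing worth checking in the command above: as pasted, the -DHOME, -DENV, and -DJOB flags are standalone spark-submit arguments rather than part of spark.driver.extraJavaOptions (the quotes around the option value were most likely lost in formatting). The usual quoted form, using the same paths from the question, would look like this:

```shell
# Quoted form of the same command (paths taken from the question above).
# Without the quotes, spark-submit parses -DHOME/-DENV/-DJOB as its own
# options instead of passing them to the driver JVM.
spark-submit \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/xyz/conf/log4j.xml -DHOME=/xyz/transformation -DENV=e1 -DJOB=xformation" \
  --conf spark.local.dir=/warehouse/tmp/spark1489619325 \
  --queue dev --master yarn --deploy-mode cluster \
  --properties-file /xyz/conf/job.conf \
  --files /xyz/conf/e1.properties \
  --class TransformationJob /xyz/job.jar
```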

The same program works fine when the master is set to local.

Any suggestions would be much appreciated. Thanks in advance.

3 answers:

Answer 0 (score: 0)

I had a huge list of jars on the classpath, passed via the '--jars' option, and one of those jars was the culprit. Once I removed it from '--jars' the problem was resolved. I am still not sure why spark-submit failed because of that jar.

Answer 1 (score: 0)

In my case the error was caused by a data file: I was pointing at the wrong path for the training data.

The training data and the test data did not match.

Training data: 0,1 0 0 1 0 0 1 0 0 1

Test data: 1, 2, 0, 0, 10

I corrected the training data path and the problem was solved.

Answer 2 (score: -1)

I got a similar error:

Exception in thread "main" java.lang.IllegalArgumentException: requirement failed
        at scala.Predef$.require(Predef.scala:221)
        at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$8$$anonfun$apply$5.apply(Client.scala:501)
        at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$8$$anonfun$apply$5.apply(Client.scala:499)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
        at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$8.apply(Client.scala:499)
        at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$8.apply(Client.scala:497)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:497)
        at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:763)
        at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:143)
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:1109)
        at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1169)
        at org.apache.spark.deploy.yarn.Client.main(Client.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

The fix:

In the terminal or in the logs you will see a WARN line like this:

WARN Client: Resource file:... added multiple times to distributed cache.

Just remove that duplicated jar from your spark-submit script. Hope this helps everyone.
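For context, the bare "requirement failed" in both stack traces comes from a require call inside Client.prepareLocalResources: when a resource passed via --jars or --files resolves to a path that has already been added to the distributed cache, the distribute step logs the WARN above, yields no localized path, and the subsequent require (which carries no message, hence the unhelpful error) aborts the submission. A simplified sketch of that pattern, written here for illustration and not copied from Spark's actual code:

```scala
// Hypothetical sketch of the duplicate-resource check that produces the
// bare "requirement failed" error; not Spark's actual implementation.
object DistributedCacheSketch {
  private val cached = scala.collection.mutable.Set[String]()

  // Returns the localized path, or null when the path was already added
  // (mirroring the WARN line seen in the logs).
  def distribute(path: String): String =
    if (cached.add(path)) path
    else {
      Console.err.println(
        s"WARN Client: Resource $path added multiple times to distributed cache.")
      null
    }

  def prepareLocalResources(jars: Seq[String]): Unit =
    jars.foreach { jar =>
      val localizedPath = distribute(jar)
      // A duplicated jar makes this fail with a message-less
      // IllegalArgumentException: requirement failed.
      require(localizedPath != null)
    }
}
```

This is why removing the duplicated jar (or the duplicated --files entry) from the command makes the error disappear.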
