"Bad substitution" when submitting spark job to yarn-cluster

Time: 2015-09-01 22:14:48

Tags: apache-spark yarn

I am doing a smoke test against a yarn cluster using yarn-cluster as the master with the SparkPi example program. Here is the command line:

  $SPARK_HOME/bin/spark-submit --master yarn-cluster \
    --executor-memory 8G --executor-cores 240 \
    --class org.apache.spark.examples.SparkPi \
    examples/target/scala-2.11/spark-examples-1.4.1-hadoop2.7.1.jar

YARN accepts the job but then fails with a "bad substitution" error. Could it be related to hdp.version?

15/09/01 21:54:05 INFO yarn.Client: Application report for application_1441066518301_0013 (state: ACCEPTED)
15/09/01 21:54:05 INFO yarn.Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1441144443866
     final status: UNDEFINED
     tracking URL: http://yarnmaster-8245.lvs01.dev.ebayc3.com:8088/proxy/application_1441066518301_0013/
     user: stack
15/09/01 21:54:06 INFO yarn.Client: Application report for application_1441066518301_0013 (state: ACCEPTED)
15/09/01 21:54:10 INFO yarn.Client: Application report for application_1441066518301_0013 (state: FAILED)
15/09/01 21:54:10 INFO yarn.Client:
     client token: N/A
     diagnostics: Application application_1441066518301_0013 failed 2 times due to AM Container for appattempt_1441066518301_0013_000002 exited with  exitCode: 1
For more detailed output, check application tracking page:http://yarnmaster-8245.lvs01.dev.ebayc3.com:8088/cluster/app/application_1441066518301_0013Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e03_1441066518301_0013_02_000001
Exit code: 1
Exception message: /mnt/yarn/nm/local/usercache/stack/appcache/
application_1441066518301_0013/container_e03_1441066518301_0013_02_000001/
launch_container.sh: line 24: $PWD:$PWD/__hadoop_conf__:$PWD/__spark__.jar:$HADOOP_CONF_DIR:
/usr/hdp/current/hadoop-client/*::$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:
/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:
/etc/hadoop/conf/secure: bad substitution

Stack trace: ExitCodeException exitCode=1: /mnt/yarn/nm/local/usercache/stack/appcache/application_1441066518301_0013/container_e03_1441066518301_0013_02_000001/launch_container.sh: line 24: $PWD:$PWD/__hadoop_conf__:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution

    at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
    at org.apache.hadoop.util.Shell.run(Shell.java:456)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Of note here is:

/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:
/etc/hadoop/conf/secure: bad substitution

On this machine /bin/sh is a symlink to bash:

$ ll /bin/sh
lrwxrwxrwx 1 root root 4 Sep  1 05:48 /bin/sh -> bash

6 answers:

Answer 0 (score: 20)

This is caused by hdp.version not being substituted correctly. You have to set hdp.version in the java-opts file under $SPARK_HOME/conf.

You also have to set

spark.driver.extraJavaOptions -Dhdp.version=XXX 
spark.yarn.am.extraJavaOptions -Dhdp.version=XXX
in spark-defaults.conf under $SPARK_HOME/conf, where XXX is the HDP version.
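As a rough illustration of why the unsubstituted placeholder breaks the launch script: a dot is not a valid character in a shell variable name, so bash cannot expand ${hdp.version} and rejects the whole line. The sketch below reproduces that outside YARN. The variable hdp_version (with an underscore) is made up purely for contrast; neither Spark nor YARN reads it.

```shell
# Try to expand the unsubstituted placeholder the way launch_container.sh would.
status=0
out=$(bash -c 'echo "/usr/hdp/${hdp.version}/hadoop"' 2>&1) || status=$?
echo "exit status: $status"   # non-zero
echo "message: $out"          # contains "bad substitution"

# For contrast, a name without a dot expands normally (hypothetical variable).
hdp_version="2.2.0.0-2041"
expanded="/usr/hdp/${hdp_version}/hadoop"
echo "$expanded"
```

This is why the fix is to make sure the placeholder is replaced with a real version string before the script reaches the shell, rather than to change anything in the shell itself.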

Answer 1 (score: 10)

If you are using Spark with HDP, you have to do the following:

Add these entries to $SPARK_HOME/conf/spark-defaults.conf:

spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041 (your installed HDP version)

spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041 (your installed HDP version)

Create a file named java-opts in $SPARK_HOME/conf and add the installed HDP version to it, like this:

-Dhdp.version=2.2.0.0-2041 (your installed HDP version)

To find out which HDP version is installed, run this command on the cluster:

hdp-select status hadoop-client
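The steps above could be scripted roughly as follows. This is only a sketch under assumptions: the sample line "hadoop-client - 2.2.0.0-2041" is what I would expect hdp-select to print, not output captured from a real cluster, and a temporary directory stands in for $SPARK_HOME/conf so that nothing real gets overwritten.

```shell
# Assumed sample output; on a real cluster you would use instead:
#   status_line=$(hdp-select status hadoop-client)
status_line="hadoop-client - 2.2.0.0-2041"

# Keep only the text after the last space, i.e. the version string.
hdp_version=${status_line##* }

# Stand-in for $SPARK_HOME/conf (assumption, to keep the sketch harmless).
conf_dir=$(mktemp -d)

# Append the two entries to spark-defaults.conf.
printf '%s\n' "spark.driver.extraJavaOptions -Dhdp.version=$hdp_version" >> "$conf_dir/spark-defaults.conf"
printf '%s\n' "spark.yarn.am.extraJavaOptions -Dhdp.version=$hdp_version" >> "$conf_dir/spark-defaults.conf"

# Create the java-opts file with the same option.
printf '%s\n' "-Dhdp.version=$hdp_version" > "$conf_dir/java-opts"

cat "$conf_dir/spark-defaults.conf" "$conf_dir/java-opts"
```

Once the printed lines look right, point conf_dir at the real $SPARK_HOME/conf, and check first that spark-defaults.conf does not already define the two options.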

Answer 2 (score: 5)

I had the same problem:

launch_container.sh: line 24: $PWD:$PWD/__hadoop_conf__:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*::$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution

Since I could not find any /usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo* files, I just edited mapred-site.xml and removed "/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:".
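To show what that edit amounts to, here is a small sketch that drops the lzo entry from a classpath string. The classpath value is shortened from the error message above; it is not the full value from any real mapred-site.xml, and the variable names are only illustrative.

```shell
# Shortened classpath taken from the error message (single quotes keep ${...} literal).
classpath='$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure'
entry='/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:'

# Split around the entry and glue the two halves back together.
# Quoting "$entry" makes the shell match it literally, not as a glob pattern.
prefix=${classpath%%"$entry"*}
suffix=${classpath#*"$entry"}
cleaned="$prefix$suffix"

echo "$cleaned"
```

Keep in mind this only removes the one entry that contained the placeholder. If other entries in your configuration also use ${hdp.version}, setting the property as described in the other answers is probably the more complete fix.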

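Answer 3 (score: 1)

I also ran into this problem, using BigInsights 4.2.0.0 with YARN, Spark and MapReduce 2. In my case the cause was iop.version. To fix it, you have to add the iop.version variable to mapred-site, which can be done with the following steps:

In Ambari Server, go to:

  • MAPREDUCE2
  • Configs (tab)
  • Advanced (tab)
  • Click Custom mapred-site
  • Add Property...
  • Enter iop.version as the key and your BigInsights version as the value.
  • Restart all services.

This solved it.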

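Answer 4 (score: 1)

  1. Go to YARN in Ambari.
  2. Click Configs -> Advanced -> Custom yarn-site -> Add Property...

    Add hdp.version as the key and your HDP version as the value. You can get the HDP version with the command below:

    hdp-select versions

    e.g. 2.5.3.0-37

    Now add your property as

    hdp.version = 2.5.3.0-37

  3. Otherwise, replace ${hdp.version} with your HDP version (2.5.3.0-37) in yarn-site.xml and yarn-env.sh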
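The manual replacement in the last step could be done with sed, roughly like this. It is a sketch only: it works on a one-line temporary file rather than the real yarn-site.xml, and the sample value line is invented for illustration.

```shell
hdp_version="2.5.3.0-37"

# Invented sample line standing in for a real config file.
work=$(mktemp)
printf '%s\n' '<value>/usr/hdp/${hdp.version}/hadoop/lib/*</value>' > "$work"

# Replace every literal ${hdp.version}; \$ and \. keep sed from treating
# the dollar sign and the dot as regex metacharacters.
sed 's/\${hdp\.version}/'"$hdp_version"'/g' "$work" > "$work.new"

cat "$work.new"
```

Writing to a separate file first makes it easy to diff the result against the original before replacing anything on a live node.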

Answer 5 (score: 0)

This can be caused by /bin/sh being linked to dash instead of bash, which is common on Debian-based systems.

To fix it, run sudo dpkg-reconfigure dash and select No.
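Before reconfiguring anything, a quick check like the one below shows what /bin/sh actually resolves to on a node. It assumes a readlink that supports -f (as in GNU coreutils).

```shell
# Follow the /bin/sh symlink chain to the real binary.
target=$(readlink -f /bin/sh)
shell_name=${target##*/}   # strip the directory part

case "$shell_name" in
  dash) echo "/bin/sh is dash ($target)" ;;
  bash) echo "/bin/sh is bash ($target)" ;;
  *)    echo "/bin/sh resolves to $target" ;;
esac
```

If it reports bash, as in the question, this answer is probably not the cause and the hdp.version fixes above are the ones to try.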