Missing application resource when running a script with pyspark

Time: 2017-05-11 10:01:51

Tags: python cassandra cron pyspark ipython

I have been trying to run a .py script through pyspark, but I keep getting this error:

11:55 $ ./bin/spark-submit --jars spark-cassandra-connector-2.0.0-M2-s_2.11.jar --py-files example.py
Exception in thread "main" java.lang.IllegalArgumentException: Missing application resource.
    at org.apache.spark.launcher.CommandBuilderUtils.checkArgument(CommandBuilderUtils.java:241)
    at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitArgs(SparkSubmitCommandBuilder.java:160)
    at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitCommand(SparkSubmitCommandBuilder.java:276)
    at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildCommand(SparkSubmitCommandBuilder.java:151)
    at org.apache.spark.launcher.Main.main(Main.java:86)

I can run it easily by starting the shell like this:

 11:57 $  pyspark --jars spark-cassandra-connector-2.0.0-M2-s_2.11.jar

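and then pasting the code block by block into IPython (the interactive shell). But I want to put the script in a cron job so that it runs automatically. I need a command I can put in the cron job, and spark-submit is not working. Any ideas?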

1 answer:

Answer 0: (score: 1)

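You need to pass the Python file again at the end. --py-files only lists additional Python files to ship with the job; the script to run is the application resource, which spark-submit expects as a positional argument after the options.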

./bin/spark-submit --jars spark-cassandra-connector-2.0.0-M2-s_2.11.jar --py-files example.py example.py
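
If the job is going to be launched from cron, one option is a small Python launcher that assembles the same command. The following is only a sketch under assumptions: the names build_submit_command and submit are hypothetical helpers invented for illustration, and the default ./bin/spark-submit path is taken from the command above. It leaves out --py-files for the main script, on the assumption that example.py has no extra Python dependencies to ship.

```python
import subprocess


def build_submit_command(app, jars=(), py_files=(), spark_submit="./bin/spark-submit"):
    """Assemble a spark-submit argument list (hypothetical helper).

    Options come first; the application resource (the script to run)
    follows them as a positional argument.
    """
    cmd = [spark_submit]
    if jars:
        # --jars takes a comma-separated list
        cmd += ["--jars", ",".join(jars)]
    if py_files:
        # --py-files takes a comma-separated list of extra dependencies
        cmd += ["--py-files", ",".join(py_files)]
    # Without this positional argument spark-submit fails with
    # "Missing application resource."
    cmd.append(app)
    return cmd


def submit(app, **kwargs):
    """Run spark-submit and raise CalledProcessError on a non-zero exit."""
    return subprocess.run(build_submit_command(app, **kwargs), check=True)


cmd = build_submit_command(
    "example.py",
    jars=["spark-cassandra-connector-2.0.0-M2-s_2.11.jar"],
)
print(" ".join(cmd))
```

Calling submit("example.py", jars=[...]) from a script scheduled by cron would then start the job. Cron runs with a minimal environment and a different working directory, so absolute paths for spark-submit, the jar, and the script are probably the safer choice there.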