I have been trying to run a .py script through pyspark, but I keep getting this error:
11:55 $ ./bin/spark-submit --jars spark-cassandra-connector-2.0.0-M2-s_2.11.jar --py-files example.py
Exception in thread "main" java.lang.IllegalArgumentException: Missing application resource.
    at org.apache.spark.launcher.CommandBuilderUtils.checkArgument(CommandBuilderUtils.java:241)
    at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitArgs(SparkSubmitCommandBuilder.java:160)
    at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitCommand(SparkSubmitCommandBuilder.java:276)
    at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildCommand(SparkSubmitCommandBuilder.java:151)
    at org.apache.spark.launcher.Main.main(Main.java:86)
I can run it easily by doing this:
11:57 $ pyspark --jars spark-cassandra-connector-2.0.0-M2-s_2.11.jar
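and then pasting the code block by block into IPython (the interactive shell). But I want to put the script in a cron job so that it runs automatically. I need a command I can put in the cron job, and spark-submit is not working. Any ideas?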
Answer 0 (score: 1)
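You need to pass the Python file again at the end. The --py-files option takes the example.py that follows it as its own value, so spark-submit is left with no application resource, which is exactly what the error reports.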
./bin/spark-submit --jars spark-cassandra-connector-2.0.0-M2-s_2.11.jar --py-files example.py example.py
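If the command ends up being launched from a small wrapper script rather than typed directly into the crontab, something like the following sketch may help keep the argument order right. It only builds the argument list, always placing the application script last. The helper name build_spark_submit_cmd and its parameters are made up for illustration and are not part of Spark. The sketch assumes --jars and --py-files each take a comma-separated list.

```python
def build_spark_submit_cmd(script, jars=(), py_files=(),
                           spark_submit="./bin/spark-submit"):
    """Build a spark-submit argument list (hypothetical helper, not a Spark API).

    Options come first and the application script is appended last, so
    spark-submit always receives an application resource.
    """
    cmd = [spark_submit]
    if jars:
        # --jars expects a single comma-separated value
        cmd += ["--jars", ",".join(jars)]
    if py_files:
        # --py-files also expects a single comma-separated value
        cmd += ["--py-files", ",".join(py_files)]
    # The script itself is a positional argument and must come after the options
    cmd.append(script)
    return cmd


if __name__ == "__main__":
    cmd = build_spark_submit_cmd(
        "example.py",
        jars=["spark-cassandra-connector-2.0.0-M2-s_2.11.jar"],
    )
    # Print the command; a wrapper could pass this list to subprocess.run instead
    print(" ".join(cmd))
```

Cron usually runs with a minimal environment and a different working directory, so in practice the paths to spark-submit, the jar and the script would probably need to be absolute.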