Spark + Scala: How to add external dependencies in build.sbt

Asked: 2020-04-05 20:45:05

Tags: scala apache-spark sbt spark-streaming

I'm new to Spark (using v2.4.5) and am still trying to figure out the right way to add external dependencies. When I try to add Kafka streaming to my project, my build.sbt looks like this:

name := "Stream Handler"

version := "1.0"

scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-core" % "2.4.5" % "provided",  
    "org.apache.spark" % "spark-streaming_2.11" % "2.4.5" % "provided",
    "org.apache.spark" % "spark-streaming-kafka-0-10_2.11" % "2.4.5"
)

The build completes successfully, but when I run the application with spark-submit I get a java.lang.NoClassDefFoundError for KafkaUtils.

I can get the code working by passing in the dependency with the --packages option, like this:

$ spark-submit [other_args] --packages "org.apache.spark:spark-streaming-kafka-0-10_2.11:2.4.5"
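For context, a fuller version of that invocation might look like the sketch below. The main class, master URL, and jar path are hypothetical placeholders, not values from the original post:

# Hypothetical invocation; --packages resolves the artifact and its transitive
# dependencies from Maven Central at submit time and adds them to the
# driver and executor classpaths.
$ spark-submit \
    --class StreamHandler \
    --master "local[2]" \
    --packages "org.apache.spark:spark-streaming-kafka-0-10_2.11:2.4.5" \
    target/scala-2.11/stream-handler_2.11-1.0.jar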

Ideally I'd like to have all of the dependencies set up in build.sbt, but I'm not sure what I'm doing wrong. Any advice would be appreciated!

1 Answer:

Answer 0 (score: 1)

Your "org.apache.spark" % "spark-streaming-kafka-0-10_2.11" % "2.4.5" dependency is wrong.

Change it to the one below, as listed on mvnrepository: https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kafka-0-10

libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.4.5"
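For completeness, here is a sketch of the full build.sbt with that fix applied. The %% operator appends the Scala binary-version suffix from scalaVersion (_2.11 here) automatically, which is why the artifact names no longer spell it out:

name := "Stream Handler"

version := "1.0"

scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
    // "provided": supplied by the Spark runtime, excluded from the packaged jar
    "org.apache.spark" %% "spark-core" % "2.4.5" % "provided",
    "org.apache.spark" %% "spark-streaming" % "2.4.5" % "provided",
    // not "provided": the Kafka connector is not part of the Spark distribution
    "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.4.5"
)

Because spark-streaming-kafka-0-10 is not provided by the cluster, it still has to reach the executors at runtime: either keep passing --packages, or bundle it into a fat jar. One common way to do the latter (a suggestion beyond the original answer; the sbt-assembly plugin version below is an assumption, so check the plugin's releases for a current one) is:

// project/plugins.sbt -- plugin version is an assumption, verify before use
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

Running sbt assembly then produces a single jar under target/scala-2.11/ that can be passed to spark-submit without --packages.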