Spark提交无法从jar中选择classpath

时间:2017-08-19 18:02:05

标签: scala apache-spark gradle

我创建了一个spark作业,它将从一个cassandra表中获取数据并插入到另一个表中,我正在使用gradle来构建jar文件,我可以创建一个包含所有依赖项的jar,我正在使用下面的命令触发火花作业

spark-submit --class DataMigration OrderAnalytics.jar

OrderAnalytics.jar中存在所有必需的罐子,即lib / **我仍然得到NoClassDefFoundError,如下所示

17/08/19 22:56:38 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)
java.lang.NoClassDefFoundError: com/twitter/jsr166e/LongAdder
        at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsSupport$class.$init$(OutputMetricsUpdater.scala:107)

META-INF看起来像这样

Manifest-Version: 1.0
Main-Class: DataMigration
Class-Path: lib/spark-sql_2.11-2.2.0.jar lib/spark-cassandra-connector
 _2.11-2.0.3.jar lib/univocity-parsers-2.2.1.jar lib/spark-sketch_2.11
 -2.2.0.jar lib/spark-core_2.11-2.2.0.jar lib/spark-catalyst_2.11-2.2.
 0.jar lib/spark-tags_2.11-2.2.0.jar lib/parquet-column-1.8.2.jar lib/
 parquet-hadoop-1.8.2.jar lib/jackson-databind-2.6.5.jar lib/xbean-asm
 5-shaded-4.4.jar lib/unused-1.0.0.jar lib/jsr166e-1.1.0.jar lib/commo
 ns-beanutils-1.9.3.jar lib/joda-time-2.3.jar lib/joda-convert-1.2.jar
  lib/scala-reflect-2.11.8.jar lib/avro-1.7.7.jar lib/avro-mapred-1.7.
 7-hadoop2.jar lib/chill_2.11-0.8.0.jar lib/chill-java-0.8.0.jar lib/h
 adoop-client-2.6.5.jar lib/spark-launcher_2.11-2.2.0.jar lib/spark-ne
 twork-common_2.11-2.2.0.jar lib/spark-network-shuffle_2.11-2.2.0.jar 
 lib/spark-unsafe_2.11-2.2.0.jar lib/jets3t-0.9.3.jar lib/curator-reci
 pes-2.6.0.jar lib/javax.servlet-api-3.1.0.jar lib/commons-lang3-3.5.j
 ar lib/commons-math3-3.4.1.jar lib/jsr305-1.3.9.jar lib/jul-to-slf4j-
 1.7.16.jar lib/jcl-over-slf4j-1.7.16.jar lib/log4j-1.2.17.jar lib/slf
 4j-log4j12-1.7.16.jar lib/compress-lzf-1.0.3.jar lib/snappy-java-1.1.
 2.6.jar lib/lz4-1.3.0.jar lib/RoaringBitmap-0.5.11.jar lib/json4s-jac
 kson_2.11-3.2.11.jar lib/jersey-client-2.22.2.jar lib/jersey-common-2
 .22.2.jar lib/jersey-server-2.22.2.jar lib/jersey-container-servlet-2
 .22.2.jar lib/jersey-container-servlet-core-2.22.2.jar lib/netty-3.9.
 9.Final.jar lib/stream-2.7.0.jar lib/metrics-core-3.1.2.jar lib/metri
 cs-jvm-3.1.2.jar lib/metrics-json-3.1.2.jar lib/metrics-graphite-3.1.
 2.jar lib/jackson-module-scala_2.11-2.6.5.jar lib/ivy-2.4.0.jar lib/o
 ro-2.0.8.jar lib/pyrolite-4.13.jar lib/py4j-0.10.4.jar lib/commons-cr
 ypto-1.0.0.jar lib/janino-3.0.0.jar lib/commons-compiler-3.0.0.jar li
 b/antlr4-runtime-4.5.3.jar lib/commons-codec-1.10.jar lib/parquet-com
 mon-1.8.2.jar lib/parquet-encoding-1.8.2.jar lib/parquet-format-2.3.1
 .jar lib/parquet-jackson-1.8.2.jar lib/jackson-core-2.6.5.jar lib/com
 mons-collections-3.2.2.jar lib/commons-compress-1.4.1.jar lib/avro-ip
 c-1.7.7.jar lib/avro-ipc-1.7.7-tests.jar lib/kryo-shaded-3.0.3.jar li
 b/hadoop-common-2.6.5.jar lib/hadoop-hdfs-2.6.5.jar lib/hadoop-mapred
 uce-client-app-2.6.5.jar lib/hadoop-yarn-api-2.6.5.jar lib/hadoop-map
 reduce-client-core-2.6.5.jar lib/hadoop-mapreduce-client-jobclient-2.
 6.5.jar lib/hadoop-annotations-2.6.5.jar lib/leveldbjni-all-1.8.jar l
 ib/httpcore-4.3.3.jar lib/httpclient-4.3.6.jar lib/activation-1.1.1.j
 ar lib/mx4j-3.0.2.jar lib/mail-1.4.7.jar lib/bcprov-jdk15on-1.51.jar 
 lib/java-xmlbuilder-1.0.jar lib/curator-framework-2.6.0.jar lib/zooke
 eper-3.4.6.jar lib/guava-16.0.1.jar lib/json4s-core_2.11-3.2.11.jar l
 ib/javax.ws.rs-api-2.0.1.jar lib/hk2-api-2.4.0-b34.jar lib/javax.inje
 ct-2.4.0-b34.jar lib/hk2-locator-2.4.0-b34.jar lib/javax.annotation-a
 pi-1.2.jar lib/jersey-guava-2.22.2.jar lib/osgi-resource-locator-1.0.
 1.jar lib/jersey-media-jaxb-2.22.2.jar lib/validation-api-1.1.0.Final
 .jar lib/jackson-module-paranamer-2.6.5.jar lib/xz-1.0.jar lib/minlog
 -1.3.0.jar lib/objenesis-2.1.jar lib/commons-cli-1.2.jar lib/xmlenc-0
 .52.jar lib/commons-httpclient-3.1.jar lib/commons-io-2.4.jar lib/com
 mons-lang-2.6.jar lib/commons-configuration-1.6.jar lib/protobuf-java
 -2.5.0.jar lib/gson-2.2.4.jar lib/hadoop-auth-2.6.5.jar lib/curator-c
 lient-2.6.0.jar lib/htrace-core-3.0.4.jar lib/jetty-util-6.1.26.jar l
 ib/xercesImpl-2.9.1.jar lib/hadoop-mapreduce-client-common-2.6.5.jar 
 lib/hadoop-mapreduce-client-shuffle-2.6.5.jar lib/hadoop-yarn-common-
 2.6.5.jar lib/base64-2.3.8.jar lib/json4s-ast_2.11-3.2.11.jar lib/sca
 lap-2.11.0.jar lib/hk2-utils-2.4.0-b34.jar lib/aopalliance-repackaged
 -2.4.0-b34.jar lib/javassist-3.18.1-GA.jar lib/commons-digester-1.8.j
 ar lib/commons-beanutils-core-1.8.0.jar lib/apacheds-kerberos-codec-2
 .0.0-M15.jar lib/xml-apis-1.3.04.jar lib/hadoop-yarn-client-2.6.5.jar
  lib/hadoop-yarn-server-common-2.6.5.jar lib/hadoop-yarn-server-nodem
 anager-2.6.5.jar lib/jaxb-api-2.2.2.jar lib/jackson-jaxrs-1.9.13.jar 
 lib/jackson-xc-1.9.13.jar lib/guice-3.0.jar lib/scala-compiler-2.11.0
 .jar lib/javax.inject-1.jar lib/jline-0.9.94.jar lib/apacheds-i18n-2.
 0.0-M15.jar lib/api-asn1-api-1.0.0-M20.jar lib/api-util-1.0.0-M20.jar
  lib/jettison-1.1.jar lib/stax-api-1.0-2.jar lib/aopalliance-1.0.jar 
 lib/cglib-2.2.1-v20090111.jar lib/scala-xml_2.11-1.0.1.jar lib/scala-
 parser-combinators_2.11-1.0.1.jar lib/scala-library-2.11.8.jar lib/sl
 f4j-api-1.7.16.jar lib/netty-all-4.0.43.Final.jar lib/jackson-core-as
 l-1.9.13.jar lib/jackson-mapper-asl-1.9.13.jar lib/jackson-annotation
 s-2.6.5.jar lib/commons-net-3.1.jar lib/paranamer-2.6.jar

更新
Allison Berman提出的评论和答案很少,我建议我尝试如下

C:\Dev-Tra\OrderAnalytics\build\libs>spark-submit --jars OrderAnalytics.jar \ --class example.DataMigration
Error: Cannot load main class from JAR file:/C:/
Run with --help for usage help or --verbose for debug output

C:\Dev-Tra\OrderAnalytics\build\libs>spark-submit --jars OrderAnalytics.jar --class example.DataMigration
Exception in thread "main" java.lang.IllegalArgumentException: Missing application resource.
        at org.apache.spark.launcher.CommandBuilderUtils.checkArgument(CommandBuilderUtils.java:241)
        at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitArgs(SparkSubmitCommandBuilder.java:160)
        at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitCommand(SparkSubmitCommandBuilder.java:274)
        at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildCommand(SparkSubmitCommandBuilder.java:151)
        at org.apache.spark.launcher.Main.main(Main.java:86)

但是根据Spark Documentation,它应该如下所示,它能够开始工作但不能获得所有依赖的罐子

C:\Dev-Tra\OrderAnalytics\build\libs>spark-submit  --class example.DataMigration OrderAnalytics.jar
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/08/21 21:59:36 INFO SparkContext: Running Spark version 2.2.0
17/08/21 21:59:36 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/08/21 21:59:36 INFO SparkContext: Submitted application: DataMigration
17/08/21 21:59:36 INFO SecurityManager: Changing view acls to: ram
17/08/21 21:59:36 INFO SecurityManager: Changing modify acls to: ram
17/08/21 21:59:36 INFO SecurityManager: Changing view acls groups to:
17/08/21 21:59:36 INFO SecurityManager: Changing modify acls groups to:
17/08/21 21:59:36 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(ram); groups with vi
ew permissions: Set(); users  with modify permissions: Set(ram); groups with modify permissions: Set()
17/08/21 21:59:37 INFO Utils: Successfully started service 'sparkDriver' on port 62239.
17/08/21 21:59:37 INFO SparkEnv: Registering MapOutputTracker
17/08/21 21:59:37 INFO SparkEnv: Registering BlockManagerMaster
17/08/21 21:59:37 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/08/21 21:59:37 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/08/21 21:59:37 INFO DiskBlockManager: Created local directory at C:\Users\ram\AppData\Local\Temp\blockmgr-38ef35e6-219e-450c-b7da-c8075464a232
17/08/21 21:59:37 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
17/08/21 21:59:37 INFO SparkEnv: Registering OutputCommitCoordinator
17/08/21 21:59:37 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/08/21 21:59:37 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.1.101:4040
17/08/21 21:59:38 INFO SparkContext: Added JAR file:/C:/Dev-Tra/OrderAnalytics/build/libs/OrderAnalytics.jar at spark://192.168.1.101:62239/jars/OrderAnalytics
.jar with timestamp 1503332978023
17/08/21 21:59:38 INFO Executor: Starting executor ID driver on host localhost
17/08/21 21:59:38 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 62248.
17/08/21 21:59:38 INFO NettyBlockTransferService: Server created on 192.168.1.101:62248
17/08/21 21:59:38 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/08/21 21:59:38 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.1.101, 62248, None)
17/08/21 21:59:38 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.1.101:62248 with 366.3 MB RAM, BlockManagerId(driver, 192.168.1.101, 62248
, None)
17/08/21 21:59:38 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.1.101, 62248, None)
17/08/21 21:59:38 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.1.101, 62248, None)
17/08/21 21:59:40 INFO Native: Could not load JNR C Library, native system calls through this library will not be available (set this logger level to DEBUG to
see the full stack trace).
17/08/21 21:59:40 INFO ClockFactory: Using java.lang.System clock to generate timestamps.
17/08/21 21:59:41 WARN NettyUtil: Found Netty's native epoll transport, but not running on linux-based operating system. Using NIO instead.
17/08/21 21:59:41 INFO Cluster: New Cassandra host localhost/127.0.0.1:9042 added
17/08/21 21:59:41 INFO CassandraConnector: Connected to Cassandra cluster: Test Cluster
17/08/21 21:59:42 INFO SparkContext: Starting job: runJob at RDDFunctions.scala:36
17/08/21 21:59:42 INFO DAGScheduler: Got job 0 (runJob at RDDFunctions.scala:36) with 4 output partitions
17/08/21 21:59:42 INFO DAGScheduler: Final stage: ResultStage 0 (runJob at RDDFunctions.scala:36)
17/08/21 21:59:42 INFO DAGScheduler: Parents of final stage: List()
17/08/21 21:59:42 INFO DAGScheduler: Missing parents: List()
17/08/21 21:59:42 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at DataMigration.scala:18), which has no missing parents
17/08/21 21:59:42 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 12.3 KB, free 366.3 MB)
17/08/21 21:59:42 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 5.8 KB, free 366.3 MB)
17/08/21 21:59:42 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.1.101:62248 (size: 5.8 KB, free: 366.3 MB)
17/08/21 21:59:42 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
17/08/21 21:59:42 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at DataMigration.scala:18) (first 15 tasks are f
or partitions Vector(0, 1, 2, 3))
17/08/21 21:59:42 INFO TaskSchedulerImpl: Adding task set 0.0 with 4 tasks
17/08/21 21:59:42 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, NODE_LOCAL, 17002 bytes)
17/08/21 21:59:42 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
17/08/21 21:59:42 INFO Executor: Fetching spark://192.168.1.101:62239/jars/OrderAnalytics.jar with timestamp 1503332978023
17/08/21 21:59:42 INFO TransportClientFactory: Successfully created connection to /192.168.1.101:62239 after 18 ms (0 ms spent in bootstraps)
17/08/21 21:59:42 INFO Utils: Fetching spark://192.168.1.101:62239/jars/OrderAnalytics.jar to C:\Users\ram\AppData\Local\Temp\spark-73cbbbe8-9e06-4a11-976
a-a766305d4148\userFiles-3e4c9dea-6273-4d9e-a17b-c807aa0e3da5\fetchFileTemp7196411614488839489.tmp
17/08/21 21:59:43 INFO Executor: Adding file:/C:/Users/ram/AppData/Local/Temp/spark-73cbbbe8-9e06-4a11-976a-a766305d4148/userFiles-3e4c9dea-6273-4d9e-a17b
-c807aa0e3da5/OrderAnalytics.jar to class loader
17/08/21 21:59:44 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.NoClassDefFoundError: com/twitter/jsr166e/LongAdder
        at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsSupport$class.$init$(OutputMetricsUpdater.scala:107)
        at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsUpdater.<init>(OutputMetricsUpdater.scala:152)
        at org.apache.spark.metrics.OutputMetricsUpdater$.apply(OutputMetricsUpdater.scala:75)
        at com.datastax.spark.connector.writer.TableWriter.writeInternal(TableWriter.scala:174)
        at com.datastax.spark.connector.writer.TableWriter.insert(TableWriter.scala:162)
        at com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:149)
        at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
        at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.twitter.jsr166e.LongAdder
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 14 more
17/08/21 21:59:44 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, NODE_LOCAL, 15334 bytes)
17/08/21 21:59:44 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
17/08/21 21:59:44 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): java.lang.NoClassDefFoundError: com/twitter/jsr166e/Long
Adder
        at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsSupport$class.$init$(OutputMetricsUpdater.scala:107)
        at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsUpdater.<init>(OutputMetricsUpdater.scala:152)
        at org.apache.spark.metrics.OutputMetricsUpdater$.apply(OutputMetricsUpdater.scala:75)
        at com.datastax.spark.connector.writer.TableWriter.writeInternal(TableWriter.scala:174)
        at com.datastax.spark.connector.writer.TableWriter.insert(TableWriter.scala:162)
        at com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:149)
        at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
        at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.twitter.jsr166e.LongAdder
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 14 more

17/08/21 21:59:44 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
17/08/21 21:59:44 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)
java.lang.NoClassDefFoundError: com/twitter/jsr166e/LongAdder
        at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsSupport$class.$init$(OutputMetricsUpdater.scala:107)
        at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsUpdater.<init>(OutputMetricsUpdater.scala:152)
        at org.apache.spark.metrics.OutputMetricsUpdater$.apply(OutputMetricsUpdater.scala:75)
        at com.datastax.spark.connector.writer.TableWriter.writeInternal(TableWriter.scala:174)
        at com.datastax.spark.connector.writer.TableWriter.insert(TableWriter.scala:162)
        at com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:149)
        at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
        at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
17/08/21 21:59:44 INFO TaskSchedulerImpl: Cancelling stage 0
17/08/21 21:59:44 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/08/21 21:59:44 INFO TaskSchedulerImpl: Stage 0 was cancelled
17/08/21 21:59:44 INFO TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1) on localhost, executor driver: java.lang.NoClassDefFoundError (com/twitter/jsr166e/Lo
ngAdder) [duplicate 1]
17/08/21 21:59:44 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/08/21 21:59:44 INFO DAGScheduler: ResultStage 0 (runJob at RDDFunctions.scala:36) failed in 1.894 s due to Job aborted due to stage failure: Task 0 in stage
 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): java.lang.NoClassDefFoundError: com/twitter/jsr166e/L
ongAdder
        at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsSupport$class.$init$(OutputMetricsUpdater.scala:107)
        at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsUpdater.<init>(OutputMetricsUpdater.scala:152)
        at org.apache.spark.metrics.OutputMetricsUpdater$.apply(OutputMetricsUpdater.scala:75)
        at com.datastax.spark.connector.writer.TableWriter.writeInternal(TableWriter.scala:174)
        at com.datastax.spark.connector.writer.TableWriter.insert(TableWriter.scala:162)
        at com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:149)
        at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
        at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.twitter.jsr166e.LongAdder
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 14 more

Driver stacktrace:
17/08/21 21:59:44 INFO DAGScheduler: Job 0 failed: runJob at RDDFunctions.scala:36, took 2.152376 s
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost tas
k 0.0 in stage 0.0 (TID 0, localhost, executor driver): java.lang.NoClassDefFoundError: com/twitter/jsr166e/LongAdder
        at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsSupport$class.$init$(OutputMetricsUpdater.scala:107)
        at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsUpdater.<init>(OutputMetricsUpdater.scala:152)
        at org.apache.spark.metrics.OutputMetricsUpdater$.apply(OutputMetricsUpdater.scala:75)
        at com.datastax.spark.connector.writer.TableWriter.writeInternal(TableWriter.scala:174)
        at com.datastax.spark.connector.writer.TableWriter.insert(TableWriter.scala:162)
        at com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:149)
        at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
        at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.twitter.jsr166e.LongAdder
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 14 more

Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
        at scala.Option.foreach(Option.scala:257)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2043)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2075)
        at com.datastax.spark.connector.RDDFunctions.saveToCassandra(RDDFunctions.scala:36)
        at example.DataMigration$.main(DataMigration.scala:20)
        at example.DataMigration.main(DataMigration.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.NoClassDefFoundError: com/twitter/jsr166e/LongAdder
        at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsSupport$class.$init$(OutputMetricsUpdater.scala:107)
        at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsUpdater.<init>(OutputMetricsUpdater.scala:152)
        at org.apache.spark.metrics.OutputMetricsUpdater$.apply(OutputMetricsUpdater.scala:75)
        at com.datastax.spark.connector.writer.TableWriter.writeInternal(TableWriter.scala:174)
        at com.datastax.spark.connector.writer.TableWriter.insert(TableWriter.scala:162)
        at com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:149)
        at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
        at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.twitter.jsr166e.LongAdder
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 14 more
17/08/21 21:59:51 INFO CassandraConnector: Disconnected from Cassandra cluster: Test Cluster
17/08/21 21:59:52 INFO SerialShutdownHooks: Successfully executed shutdown hook: Clearing session cache for C* connector
17/08/21 21:59:52 INFO SparkContext: Invoking stop() from shutdown hook
17/08/21 21:59:52 INFO SparkUI: Stopped Spark web UI at http://192.168.1.101:4040
17/08/21 21:59:52 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/08/21 21:59:52 INFO MemoryStore: MemoryStore cleared
17/08/21 21:59:52 INFO BlockManager: BlockManager stopped
17/08/21 21:59:52 INFO BlockManagerMaster: BlockManagerMaster stopped
17/08/21 21:59:52 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/08/21 21:59:52 INFO SparkContext: Successfully stopped SparkContext
17/08/21 21:59:52 INFO ShutdownHookManager: Shutdown hook called
17/08/21 21:59:52 INFO ShutdownHookManager: Deleting directory C:\Users\ram\AppData\Local\Temp\spark-73cbbbe8-9e06-4a11-976a-a766305d4148

任何人都可以告诉我为什么火花无法选择jar的类路径或者如何解决这个问题?

由于

1 个答案:

答案 0 :(得分:1)

Indrajit是正确的,您需要包含该包。当我将文件保留在默认包中时,我遇到了类似的问题。使您的文件夹结构与此http://www.scala-sbt.org/0.13/docs/Directories.html

相同

在src / main / scala或src / main / java中添加一个新文件夹YOUR_PACKAGE,并将DataMigration放在YOUR_PACKAGE中。确保DataMigration的第一行是:

package YOUR_PACKAGE

你的火花提交将是:

spark-submit --jars OrderAnalytics.jar \
--class YOUR_PACKAGE.DataMigration 
相关问题