NoSuchMethodException: com.google.common.io.ByteStreams.limit

Time: 2015-10-23 11:12:15

Tags: apache-spark

I am running a Spark job that writes data into HBase, but it fails with a NoSuchMethodError:

15/10/23 18:45:21 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1, dn18-formal.i.nease.net): java.lang.NoSuchMethodError: com.google.common.io.ByteStreams.limit(Ljava/io/InputStream;J)Ljava/io/InputStream;

I found a guava.jar under the hadoop/hbase directories; its version is 12.0, but com.google.common.io.ByteStreams.limit only exists since Guava 14.0, hence the NoSuchMethodError.
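To double-check which guava the JVM actually picks up, I used something like the following throwaway diagnostic (the GuavaCheck object is mine, not part of the job):

    // Hypothetical diagnostic: print the jar that the current classloader
    // actually resolves com.google.common.io.ByteStreams from.
    // getCodeSource can be null for bootstrap classes, hence the Option.
    import com.google.common.io.ByteStreams

    object GuavaCheck {
      def main(args: Array[String]): Unit = {
        val location = Option(classOf[ByteStreams].getProtectionDomain.getCodeSource)
          .map(_.getLocation)
        println(s"ByteStreams loaded from: $location")
      }
    }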

I tried passing the jar via --jars when running spark-submit, but got the same error. I also tried adding

        configuration.set("spark.executor.extraClassPath", "/home/ljh")
        configuration.set("spark.driver.userClassPathFirst","true");

to my code, and still got the same error.
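These look like Spark properties rather than HBase ones, so presumably they belong on the SparkConf (or as --conf flags to spark-submit) before the context is created, not on the HBase Configuration object inside the job. A minimal sketch under that assumption (the app name is hypothetical):

    // Sketch: userClassPathFirst is a Spark property (experimental in
    // Spark 1.x), set on the SparkConf before creating the context,
    // not on the HBase Configuration used by the tasks.
    import org.apache.spark.{SparkConf, SparkContext}

    val sparkConf = new SparkConf()
      .setAppName("hbase-write")   // hypothetical app name
      .set("spark.driver.userClassPathFirst", "true")
      .set("spark.executor.userClassPathFirst", "true")
    val sc = new SparkContext(sparkConf)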

How can I solve this? How can I remove the hadoop/hbase guava.jar from the classpath? And why doesn't it use the guava.jar in the Spark dir?

Here is my code:

  rdd.foreach({ res =>
        val configuration = HBaseConfiguration.create();

        configuration.set("hbase.zookeeper.property.clientPort", "2181");
        configuration.set("hbase.zookeeper.quorum", “ip.66");
        configuration.set("hbase.master", “ip:60000");
        configuration.set("spark.executor.extraClassPath", "/home/ljh")
        configuration.set("spark.driver.userClassPathFirst","true");
        val hadmin = new HBaseAdmin(configuration);
        configuration.clear();
        configuration.addResource("/home/hadoop/conf/core-default.xml")
        configuration.addResource("/home/hadoop/conf/core-site.xml")
        configuration.addResource("/home/hadoop/conf/mapred-default.xml")
        configuration.addResource("/home/hadoop/conf/mapred-site.xml")
        configuration.addResource("/home/hadoop/conf/yarn-default.xml")
        configuration.addResource("/home/hadoop/conf/yarn-site.xml")
        configuration.addResource("/home/hadoop/conf/hdfs-default.xml")
        configuration.addResource("/home/hadoop/conf/hdfs-site.xml")
        configuration.addResource("/home/hadoop/conf/hbase-default.xml")
        configuration.addResource("/home/ljhn1829/hbase-site.xml")
        val table = new HTable(configuration, "ljh_test2");
        var put = new Put(Bytes.toBytes(res.toKey()));
        put.add(Bytes.toBytes("basic"), Bytes.toBytes("name"), Bytes.toBytes(res.totalCount + "\t" + res.positiveCount));
        table.put(put);
        table.flushCommits()
      })
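(Aside: the same write can also be done with the connection set up once per partition instead of once per element; a minimal sketch under the same old HBase client API, with the hostnames and table name carried over from above:)

    // Sketch: identical write, but HBaseConfiguration/HTable are created
    // once per partition (foreachPartition) rather than once per element.
    rdd.foreachPartition { iter =>
      val configuration = HBaseConfiguration.create()
      configuration.set("hbase.zookeeper.property.clientPort", "2181")
      configuration.set("hbase.zookeeper.quorum", "ip.66")   // redacted host, as above
      val table = new HTable(configuration, "ljh_test2")
      iter.foreach { res =>
        val put = new Put(Bytes.toBytes(res.toKey()))
        put.add(Bytes.toBytes("basic"), Bytes.toBytes("name"),
          Bytes.toBytes(res.totalCount + "\t" + res.positiveCount))
        table.put(put)
      }
      table.flushCommits()
      table.close()
    }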

Error message:

15/10/23 19:06:42 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1, gdc-dn126-formal.i.nease.net): java.lang.NoSuchMethodError: com.google.common.io.ByteStreams.limit(Ljava/io/InputStream;J)Ljava/io/InputStream;
        at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.nextBatchStream(ExternalAppendOnlyMap.scala:420)
        at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.<init>(ExternalAppendOnlyMap.scala:392)
        at org.apache.spark.util.collection.ExternalAppendOnlyMap.spill(ExternalAppendOnlyMap.scala:207)
        at org.apache.spark.util.collection.ExternalAppendOnlyMap.spill(ExternalAppendOnlyMap.scala:63)
        at org.apache.spark.util.collection.Spillable$class.maybeSpill(Spillable.scala:83)
        at org.apache.spark.util.collection.ExternalAppendOnlyMap.maybeSpill(ExternalAppendOnlyMap.scala:63)
        at org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:129)
        at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:60)
        at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:46)
        at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:90)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:70)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

15/10/23 19:06:42 INFO TaskSetManager: Starting task 0.1 in stage 1.0 (TID 2, gdc-dn166-formal.i.nease.net, PROCESS_LOCAL, 1277 bytes)
15/10/23 19:06:42 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on gdc-dn166-formal.i.nease.net:3838 (size: 3.2 KB, free: 1060.3 MB)
15/10/23 19:06:42 ERROR YarnScheduler: Lost executor 1 on gdc-dn126-formal.i.nease.net: remote Rpc client disassociated
15/10/23 19:06:42 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkExecutor@gdc-dn126-formal.i.nease.net:1656] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
15/10/23 19:06:42 INFO TaskSetManager: Re-queueing tasks for 1 from TaskSet 1.0
15/10/23 19:06:42 INFO DAGScheduler: Executor lost: 1 (epoch 1)
15/10/23 19:06:42 INFO BlockManagerMasterEndpoint: Trying to remove executor 1 from BlockManagerMaster.
15/10/23 19:06:42 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(1, gdc-dn126-formal.i.nease.net, 44635)
15/10/23 19:06:42 INFO BlockManagerMaster: Removed 1 successfully in removeExecutor
15/10/23 19:06:42 INFO ShuffleMapStage: ShuffleMapStage 0 is now unavailable on executor 1 (0/1, false)
15/10/23 19:06:42 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to gdc-dn166-formal.i.nease.net:28595
15/10/23 19:06:42 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 1 is 84 bytes
15/10/23 19:06:42 WARN TaskSetManager: Lost task 0.1 in stage 1.0 (TID 2, gdc-dn166-formal.i.nease.net): FetchFailed(null, shuffleId=1, mapId=-1, reduceId=0, message=
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 1
        at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:389)
        at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:386)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
        at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
        at org.apache.spark.MapOutputTracker$.org$apache$spark$MapOutputTracker$$convertMapStatuses(MapOutputTracker.scala:385)
        at org.apache.spark.MapOutputTracker.getServerStatuses(MapOutputTracker.scala:172)
        at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$.fetch(BlockStoreShuffleFetcher.scala:42)
        at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:40)
        at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:90)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:70)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

0 Answers:

No answers