Running SparkR on YARN produces an "Rscript execution" error

Asked: 2015-09-02 08:34:03

Tags: r yarn sparkr

I have installed Spark 1.4.1 on a Hadoop 2.7 cluster.

  1. I started the SparkR shell without errors:

    bin/sparkR --master yarn-client
    
  2. I ran an R command (the introductory example from spark.apache.org) without errors:

    df <- createDataFrame(sqlContext, faithful)
    
  3. When I run the command:

    head(select(df, df$eruptions))
    
  4. At 15/09/02 10:08:29 I get the following error on the executor node:

    "Rscript execution error: No such file or directory"

    Any hints would be appreciated. Spark jobs other than SparkR run fine on my YARN cluster. R 3.2.1 is installed and works fine on the driver node. The full executor log follows, with a sketch for checking the worker nodes after it.

    15/09/02 10:04:06 INFO executor.CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
    15/09/02 10:04:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    15/09/02 10:04:10 INFO spark.SecurityManager: Changing view acls to: yarn,root
    15/09/02 10:04:10 INFO spark.SecurityManager: Changing modify acls to: yarn,root
    15/09/02 10:04:10 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, root); users with modify permissions: Set(yarn, root)
    15/09/02 10:04:11 INFO slf4j.Slf4jLogger: Slf4jLogger started
    15/09/02 10:04:12 INFO Remoting: Starting remoting
    15/09/02 10:04:12 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@datanode1.hp.com:46167]
    15/09/02 10:04:12 INFO util.Utils: Successfully started service 'driverPropsFetcher' on port 46167.
    15/09/02 10:04:12 INFO spark.SecurityManager: Changing view acls to: yarn,root
    15/09/02 10:04:12 INFO spark.SecurityManager: Changing modify acls to: yarn,root
    15/09/02 10:04:12 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, root); users with modify permissions: Set(yarn, root)
    15/09/02 10:04:12 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
    15/09/02 10:04:12 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
    15/09/02 10:04:12 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
    15/09/02 10:04:12 INFO slf4j.Slf4jLogger: Slf4jLogger started
    15/09/02 10:04:12 INFO Remoting: Starting remoting
    15/09/02 10:04:13 INFO util.Utils: Successfully started service 'sparkExecutor' on port 47919.
    15/09/02 10:04:13 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkExecutor@datanode1.hp.com:47919]
    15/09/02 10:04:13 INFO storage.DiskBlockManager: Created local directory at /data2/hadoop/yarn/local/usercache/root/appcache/application_1441180800595_0001/blockmgr-5e435e40-bd36-4746-9acd-8cf1619033ae
    15/09/02 10:04:13 INFO storage.DiskBlockManager: Created local directory at /data3/hadoop/yarn/local/usercache/root/appcache/application_1441180800595_0001/blockmgr-28dfabe6-8e0d-4e49-bc95-27b3428c10a0
    15/09/02 10:04:13 INFO storage.MemoryStore: MemoryStore started with capacity 534.5 MB
    15/09/02 10:04:13 INFO executor.CoarseGrainedExecutorBackend: Connecting to driver: akka.tcp://sparkDriver@192.1.1.1:45596/user/CoarseGrainedScheduler
    15/09/02 10:04:13 INFO executor.CoarseGrainedExecutorBackend: Successfully registered with driver
    15/09/02 10:04:13 INFO executor.Executor: Starting executor ID 2 on host datanode1.hp.com
    15/09/02 10:04:14 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34166.
    15/09/02 10:04:14 INFO netty.NettyBlockTransferService: Server created on 34166
    15/09/02 10:04:14 INFO storage.BlockManagerMaster: Trying to register BlockManager
    15/09/02 10:04:14 INFO storage.BlockManagerMaster: Registered BlockManager
    15/09/02 10:06:35 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 0
    15/09/02 10:06:35 INFO executor.Executor: Running task 0.0 in stage 0.0 (TID 0)
    15/09/02 10:06:35 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 0
    15/09/02 10:06:35 INFO storage.MemoryStore: ensureFreeSpace(854) called with curMem=0, maxMem=560497950
    15/09/02 10:06:35 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 854.0 B, free 534.5 MB)
    15/09/02 10:06:35 INFO broadcast.TorrentBroadcast: Reading broadcast variable 0 took 159 ms
    15/09/02 10:06:35 INFO storage.MemoryStore: ensureFreeSpace(1280) called with curMem=854, maxMem=560497950
    15/09/02 10:06:35 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1280.0 B, free 534.5 MB)
    15/09/02 10:06:35 INFO executor.Executor: Finished task 0.0 in stage 0.0 (TID 0). 11589 bytes result sent to driver
    15/09/02 10:08:28 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 1
    15/09/02 10:08:28 INFO executor.Executor: Running task 0.0 in stage 1.0 (TID 1)
    15/09/02 10:08:28 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 1
    15/09/02 10:08:28 INFO storage.MemoryStore: ensureFreeSpace(4022) called with curMem=0, maxMem=560497950
    15/09/02 10:08:28 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 3.9 KB, free 534.5 MB)
    15/09/02 10:08:28 INFO broadcast.TorrentBroadcast: Reading broadcast variable 1 took 13 ms
    15/09/02 10:08:28 INFO storage.MemoryStore: ensureFreeSpace(9536) called with curMem=4022, maxMem=560497950
    15/09/02 10:08:28 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 9.3 KB, free 534.5 MB)
    15/09/02 10:08:29 INFO r.BufferedStreamThread: Rscript execution error: No such file or directory
    15/09/02 10:08:39 ERROR executor.Executor: Exception in task 0.0 in stage 1.0 (TID 1)
    java.net.SocketTimeoutException: Accept timed out
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.spark.api.r.RRDD$.createRWorker(RRDD.scala:425)
        at org.apache.spark.api.r.BaseRRDD.compute(RRDD.scala:63)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
        at org.apache.spark.scheduler.Task.run(Task.scala:70)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
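
The stack trace points at org.apache.spark.api.r.RRDD$.createRWorker, which launches an Rscript process on each executor, so the error presumably means the Rscript binary cannot be found by the YARN containers on the worker nodes. A minimal sketch for checking this, assuming passwordless ssh from the driver node (datanode1.hp.com appears in the log above; datanode2.hp.com is a placeholder for your other workers):

    for host in datanode1.hp.com datanode2.hp.com; do
        # 'command -v' checks the PATH a non-interactive shell sees,
        # which is closer to what a YARN container inherits than a
        # login shell's PATH.
        ssh "$host" 'command -v Rscript && Rscript --version' \
            || echo "Rscript missing or not on PATH on $host"
    done

Even if this check passes, YARN containers do not necessarily inherit an interactive shell's PATH, so Rscript may still need to live in (or be symlinked into) a standard location such as /usr/bin for the NodeManager-launched executors to find it.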
    

0 Answers