MapReduce job runs in local mode instead of on the cluster

Date: 2018-12-06 11:04:16

Tags: hadoop mapreduce cluster-computing yarn

I have finished configuring the cluster to run MapReduce jobs in cluster mode on top of YARN, but jobs are running in local mode instead. I cannot figure out what is wrong.

Below is yarn-site.xml (on the master node):

    <property>
            <name>yarn.resourcemanager.resource-tracker.address</name>
            <value>namenode:8031</value>
    </property>
     <property>
            <name>yarn.nodemanager.aux-services</name>    <!-- NodeManager auxiliary services -->
            <value>mapreduce_shuffle</value>    <!-- registers the MapReduce shuffle handler -->
    </property>

    <property>
            <name>yarn.resourcemanager.scheduler.address</name>
            <value>namenode:8030</value>
    </property>

    <property>
            <name>yarn.resourcemanager.address</name>
            <value>namenode:8032</value>
    </property>

    <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>namenode</value>
    </property>

    <property>
            <name>yarn.nodemanager.resource.memory-mb</name>
            <value>2042</value>
    </property>

    <property>
            <name>yarn.nodemanager.vmem-check-enabled</name>
            <value>false</value>
    </property>

yarn-site.xml (on the slave nodes):

    <property>
            <name>yarn.nodemanager.aux-services</name>    <!-- NodeManager auxiliary services -->
            <value>mapreduce_shuffle</value>    <!-- registers the MapReduce shuffle handler -->
    </property>

    <property>
            <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
            <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>

    <property>
            <name>yarn.resourcemanager.resource-tracker.address</name>
            <value>namenode:8031</value>    <!-- address of the ResourceManager's resource tracker -->
    </property>

mapred-site.xml (on both master and slave nodes):

    <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
    </property>
    <property>
            <name>yarn.app.mapreduce.am.resource.mb</name>
            <value>2048</value>
    </property>
    <property>
            <name>mapreduce.map.memory.mb</name>
            <value>2048</value>
    </property>

    <property>
            <name>mapreduce.reduce.memory.mb</name>
            <value>2048</value>
    </property>

After submitting the job, the output is as follows:

18/12/06 16:20:43 INFO input.FileInputFormat: Total input paths to process : 1
18/12/06 16:20:43 INFO mapreduce.JobSubmitter: number of splits:2
18/12/06 16:20:43 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1556004420_0001
18/12/06 16:20:43 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
18/12/06 16:20:43 INFO mapreduce.Job: Running job: job_local1556004420_0001
18/12/06 16:20:43 INFO mapred.LocalJobRunner: OutputCommitter set in config null
18/12/06 16:20:43 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
18/12/06 16:20:43 INFO mapred.LocalJobRunner: Waiting for map tasks
18/12/06 16:20:43 INFO mapred.LocalJobRunner: Starting task: attempt_local1556004420_0001_m_000000_0
18/12/06 16:20:43 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
18/12/06 16:20:43 INFO mapred.MapTask: Processing split: hdfs://namenode:9001/all-the-news/articles1.csv:0+134217728
18/12/06 16:20:43 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
18/12/06 16:20:43 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
18/12/06 16:20:43 INFO mapred.MapTask: soft limit at 83886080
18/12/06 16:20:43 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
18/12/06 16:20:43 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
18/12/06 16:20:43 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
18/12/06 16:20:44 INFO mapreduce.Job: Job job_local1556004420_0001 running in uber mode : false
18/12/06 16:20:44 INFO mapreduce.Job:  map 0% reduce 0%
18/12/06 16:20:49 INFO mapred.LocalJobRunner: map > map
18/12/06 16:20:50 INFO mapreduce.Job:  map 1% reduce 0%
18/12/06 16:20:52 INFO mapred.LocalJobRunner: map > map
18/12/06 16:20:55 INFO mapred.LocalJobRunner: map > map
18/12/06 16:20:56 INFO mapreduce.Job:  map 2% reduce 0%
18/12/06 16:20:58 INFO mapred.LocalJobRunner: map > map
18/12/06 16:21:01 INFO mapred.LocalJobRunner: map > map
18/12/06 16:21:02 INFO mapreduce.Job:  map 3% reduce 0%
18/12/06 16:21:04 INFO mapred.LocalJobRunner: map > map

Why is it running in local mode? I am running this job on a 200 MB file, on a cluster of 3 nodes: 1 namenode and 2 datanodes.

The /etc/hosts file looks like this:

127.0.0.1       localhost
127.0.1.1       anil-Lenovo-Product

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

192.168.8.98 namenode
192.168.8.99 datanode
192.168.8.100 datanode2

[YARN UI screenshot]

1 Answer:

Answer 0 (score: 0)

  1. First, check whether your configuration actually took effect, via the ResourceManager web UI: by default http://{your-resource-manager-host}:8088/conf, or at the UI address you configured: http://namenode:8088/conf

  2. Then make sure the following properties are configured:

In mapred-site.xml:

    <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
    </property>

    <property>
            <name>yarn.app.mapreduce.am.env</name>
            <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
    </property>

    <property>
            <name>mapreduce.map.env</name>
            <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
    </property>

    <property>
            <name>mapreduce.reduce.env</name>
            <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
    </property>

In yarn-site.xml:

    <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
    </property>

Restart the YARN services and check whether it works.
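
If the job ID still carries the job_local prefix after the restart, a quick way to isolate a client-side configuration problem is to force the framework name in the driver itself. This is a minimal sketch, assuming a hypothetical test driver (the class name is illustrative; input/output paths come from the command line):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class ForceYarnDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Force YARN submission regardless of which mapred-site.xml
            // the client happens to pick up.
            conf.set("mapreduce.framework.name", "yarn");
            // ResourceManager address taken from the yarn-site.xml above.
            conf.set("yarn.resourcemanager.address", "namenode:8032");

            Job job = Job.getInstance(conf, "force-yarn-test");
            job.setJarByClass(ForceYarnDriver.class);
            // The default identity mapper/reducer are enough for this test;
            // we only care about where the job runs, not what it computes.
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

If the job ID then starts with job_ instead of job_local, the cluster side is fine, and the real problem is which mapred-site.xml your submitting client actually reads.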

Jobs are submitted through the ClientProtocol interface; when the service starts, one of its two implementations is created:

  • LocalClientProtocolProvider: job IDs are prefixed with job_local
  • YarnClientProtocolProvider: job IDs are prefixed with job_

The implementation is chosen according to the MRConfig.FRAMEWORK_NAME configuration (its value is "mapreduce.framework.name"), whose valid options are classic, yarn, and local.
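
To see which provider a given client machine will pick, you can print the effective value from that client's classpath. A small sketch, assuming the standard Hadoop client libraries are available (MRConfig is the constants class mentioned above):

    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapreduce.MRConfig;

    public class PrintFramework {
        public static void main(String[] args) {
            // JobConf's static initializer loads mapred-default.xml and
            // mapred-site.xml from the classpath, so this is the value the
            // job submitter will actually see.
            JobConf conf = new JobConf();
            String framework = conf.get(MRConfig.FRAMEWORK_NAME,
                                        MRConfig.LOCAL_FRAMEWORK_NAME);
            System.out.println(MRConfig.FRAMEWORK_NAME + " = " + framework);
        }
    }

If this prints local on the machine you submit from, the client is not reading the mapred-site.xml you edited, which matches the job_local IDs in the log above.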

Good luck!