Error when running a MapReduce program

Time: 2018-05-30 01:10:56

Tags: java hadoop mapreduce

I am running a MapReduce WordCount program on Ubuntu with Hadoop 3.1.0, but I always get the output shown below.

I found a similar question that someone asked earlier, but the solution there did not work for me.

I would like to know which file I should modify, or what I am missing.

My Java program is from here

master@kevin-VirtualBox:~/MapReduceTutorial$ $HADOOP_HOME/bin/hadoop jar ProductSalePerCountry.jar /inputMapReduce /mapreduce_output_sales

2018-05-20 00:58:37,856 INFO client.RMProxy: Connecting to ResourceManager at kevin-VirtualBox/127.0.1.1:8032
2018-05-20 00:58:38,511 INFO client.RMProxy: Connecting to ResourceManager at kevin-VirtualBox/127.0.1.1:8032
2018-05-20 00:58:38,980 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2018-05-20 00:58:39,058 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/master/.staging/job_1526748071526_0004
2018-05-20 00:58:39,579 INFO mapred.FileInputFormat: Total input files to process : 1
2018-05-20 00:58:39,773 INFO mapreduce.JobSubmitter: number of splits:2
2018-05-20 00:58:39,926 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2018-05-20 00:58:40,251 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1526748071526_0004
2018-05-20 00:58:40,254 INFO mapreduce.JobSubmitter: Executing with tokens: []
2018-05-20 00:58:40,742 INFO conf.Configuration: resource-types.xml not found
2018-05-20 00:58:40,744 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2018-05-20 00:58:40,930 INFO impl.YarnClientImpl: Submitted application application_1526748071526_0004
2018-05-20 00:58:41,095 INFO mapreduce.Job: The url to track the job: http://kevin-VirtualBox:8088/proxy/application_1526748071526_0004/
2018-05-20 00:58:41,097 INFO mapreduce.Job: Running job: job_1526748071526_0004

core-site.xml

<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>Parent directory for other temporary directories.</description>
</property>
<property>
<name>fs.defaultFS </name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. </description>
</property>
</configuration>

hdfs-site.xml

<configuration>
<property>
   <name>dfs.namenode.name.dir</name>
   <value>/home/master/hdfs/name</value>
</property>
<property>
   <name>dfs.datanode.data.dir</name>
   <value>/home/master/hdfs/data</value>
</property>
<property>
   <name>dfs.replication</name>
   <value>1</value>
</property>
<property>
   <name>dfs.permissions</name>
   <value>true</value>
</property>
</configuration>

yarn-site.xml

<configuration>
<property>
   <name>yarn.nodemanager.aux-services</name>
   <value>mapreduce_shuffle</value>
</property>
<property>
   <name>yarn.resourcemanager.hostname</name>
   <value>kevin-VirtualBox</value>
</property>
</configuration>

mapred-site.xml

<configuration>
   <property>
       <name>mapreduce.framework.name</name>
       <value>yarn</value>
   </property>
   <property>
       <name>mapreduce.application.classpath</name>
       <!-- value not shown in the original post -->
   </property>
</configuration>

jps

4948 Jps
2856 NodeManager
2088 NameNode
2731 ResourceManager
2207 DataNode

The job tracking URL (three screenshots, images not available)

Thanks in advance.

1 answer:

Answer 0 (score: 1)

Thanks, @cricket_007. My problem was that I had not left any memory for YARN.

In yarn-site.xml, set the maximum amount of memory that YARN is allowed to use:

<property>
   <name>yarn.nodemanager.resource.memory-mb</name>
   <value>40960</value>
</property>

Specify the minimum unit of RAM to allocate:

<property>
   <name>yarn.scheduler.minimum-allocation-mb</name>
   <value>2048</value>
</property>
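
For reference, here is a sketch of how the whole yarn-site.xml might look once these two settings are combined with the properties from the question. Each setting goes in its own <property> element inside <configuration>. The values 40960 and 2048 are simply the ones quoted in this answer. They assume the machine really has that much RAM to give to YARN, so on a smaller VM they would need to be lowered.

```xml
<configuration>
  <!-- Existing properties, copied from the question -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>kevin-VirtualBox</value>
  </property>

  <!-- Total memory (in MB) this NodeManager may hand out to containers.
       Value taken from the answer; assumes the host has this much RAM. -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>40960</value>
  </property>

  <!-- Smallest container size (in MB) the scheduler will allocate. -->
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>2048</value>
  </property>
</configuration>
```

Changes to yarn-site.xml are only picked up when the YARN daemons start, so the ResourceManager and NodeManager need to be restarted afterwards (for example with stop-yarn.sh and start-yarn.sh from $HADOOP_HOME/sbin).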