Hadoop word count example

Time: 2016-04-29 23:25:24

Tags: hadoop word-count

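I am new to Hadoop and have just installed Hadoop 2.6.

The system seems to start up fine. I am trying to run the word count example, and everything appears to run; the output folder is created with 2 files: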

-rw-r--r-- 1 yoni supergroup 0 2016-04-30 02:11 /user/yoni/output100/_SUCCESS
-rw-r--r-- 1 yoni supergroup 0 2016-04-30 02:11 /user/yoni/output100/part-r-00000

But the file part-r-00000 is empty, and I don't know where to look for the cause.

Here is the job log:

16/04/30 20:30:33 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/04/30 20:30:34 WARN mapreduce.JobSubmitter: No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
16/04/30 20:30:34 INFO input.FileInputFormat: Total input paths to process : 1
16/04/30 20:30:34 INFO mapreduce.JobSubmitter: number of splits:1
16/04/30 20:30:34 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1461971181442_0005
16/04/30 20:30:34 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
16/04/30 20:30:34 INFO impl.YarnClientImpl: Submitted application application_1461971181442_0005
16/04/30 20:30:34 INFO mapreduce.Job: The url to track the job: http://yoni-Lenovo-Z40-70:8088/proxy/application_1461971181442_0005/
16/04/30 20:30:34 INFO mapreduce.Job: Running job: job_1461971181442_0005
16/04/30 20:30:41 INFO mapreduce.Job: Job job_1461971181442_0005 running in uber mode : false
16/04/30 20:30:41 INFO mapreduce.Job:  map 0% reduce 0%
16/04/30 20:30:46 INFO mapreduce.Job:  map 100% reduce 0%
16/04/30 20:30:51 INFO mapreduce.Job:  map 100% reduce 100%
16/04/30 20:30:52 INFO mapreduce.Job: Job job_1461971181442_0005 completed successfully
16/04/30 20:30:52 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=6
        FILE: Number of bytes written=211511
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=170
        HDFS: Number of bytes written=86
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters 
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=2923
        Total time spent by all reduces in occupied slots (ms)=2526
        Total time spent by all map tasks (ms)=2923
        Total time spent by all reduce tasks (ms)=2526
        Total vcore-seconds taken by all map tasks=2923
        Total vcore-seconds taken by all reduce tasks=2526
        Total megabyte-seconds taken by all map tasks=2993152
        Total megabyte-seconds taken by all reduce tasks=2586624
    Map-Reduce Framework
        Map input records=1
        Map output records=0
        Map output bytes=0
        Map output materialized bytes=6
        Input split bytes=116
        Combine input records=0
        Combine output records=0
        Reduce input groups=0
        Reduce shuffle bytes=6
        Reduce input records=0
        Reduce output records=0
        Spilled Records=0
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=166
        CPU time spent (ms)=1620
        Physical memory (bytes) snapshot=426713088
        Virtual memory (bytes) snapshot=3818450944
        Total committed heap usage (bytes)=324009984
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters 
        Bytes Read=54
    File Output Format Counters 
        Bytes Written=86
16/04/30 20:30:52 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/04/30 20:30:52 WARN mapreduce.JobSubmitter: No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
16/04/30 20:30:52 INFO input.FileInputFormat: Total input paths to process : 1
16/04/30 20:30:52 INFO mapreduce.JobSubmitter: number of splits:1
16/04/30 20:30:52 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1461971181442_0006
16/04/30 20:30:52 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
16/04/30 20:30:52 INFO impl.YarnClientImpl: Submitted application application_1461971181442_0006
16/04/30 20:30:52 INFO mapreduce.Job: The url to track the job: http://yoni-Lenovo-Z40-70:8088/proxy/application_1461971181442_0006/
16/04/30 20:30:52 INFO mapreduce.Job: Running job: job_1461971181442_0006
16/04/30 20:31:01 INFO mapreduce.Job: Job job_1461971181442_0006 running in uber mode : false
16/04/30 20:31:01 INFO mapreduce.Job:  map 0% reduce 0%
16/04/30 20:31:07 INFO mapreduce.Job:  map 100% reduce 0%
16/04/30 20:31:12 INFO mapreduce.Job:  map 100% reduce 100%
16/04/30 20:31:13 INFO mapreduce.Job: Job job_1461971181442_0006 completed successfully
16/04/30 20:31:13 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=6
        FILE: Number of bytes written=210495
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=216
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=7
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters 
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=3739
        Total time spent by all reduces in occupied slots (ms)=3133
        Total time spent by all map tasks (ms)=3739
        Total time spent by all reduce tasks (ms)=3133
        Total vcore-seconds taken by all map tasks=3739
        Total vcore-seconds taken by all reduce tasks=3133
        Total megabyte-seconds taken by all map tasks=3828736
        Total megabyte-seconds taken by all reduce tasks=3208192
    Map-Reduce Framework
        Map input records=0
        Map output records=0
        Map output bytes=0
        Map output materialized bytes=6
        Input split bytes=130
        Combine input records=0
        Combine output records=0
        Reduce input groups=0
        Reduce shuffle bytes=6
        Reduce input records=0
        Reduce output records=0
        Spilled Records=0
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=125
        CPU time spent (ms)=1010
        Physical memory (bytes) snapshot=427823104
        Virtual memory (bytes) snapshot=3819626496
        Total committed heap usage (bytes)=324534272
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters 
        Bytes Read=86
    File Output Format Counters 
        Bytes Written=0
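
The record counters are the quickest part of this log to read: they show at which stage records stop flowing. A minimal sketch, assuming the client output has been saved as text; the function name `record_counters` is hypothetical, and the sample lines are copied from the first job above:

```shell
# Print only the record counters from a MapReduce client log read on stdin.
# record_counters is a hypothetical helper name, not part of Hadoop.
record_counters() {
  grep -E '^[[:space:]]*(Map input records|Map output records|Reduce input records|Reduce output records)=' \
    | sed -E 's/^[[:space:]]+//'
}

# Sample lines copied from the first job in the log above.
record_counters <<'EOF'
        Map input records=1
        Map output records=0
        Combine input records=0
        Reduce input records=0
        Reduce output records=0
EOF
# prints:
# Map input records=1
# Map output records=0
# Reduce input records=0
# Reduce output records=0
```

In the first job the map stage read one record and emitted none, so the reducer had nothing to write.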

I am running the wordcount example that ships with the Hadoop installation:

hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep /user/yoni/input /user/yoni/output101 'dfs[a-z.]+'
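
Note that this command invokes the `grep` example, which counts matches of a regular expression in the input rather than counting every word. As a rough local illustration of what the pattern `dfs[a-z.]+` matches, here is a sketch using the system `grep` instead of Hadoop. The helper name `show_matches` and both sample lines are made up for illustration; Hadoop itself uses Java regular expressions, which should behave the same for a pattern this simple:

```shell
# Print every substring of stdin that matches 'dfs[a-z.]+', one per line.
# show_matches is a hypothetical helper; LC_ALL=C keeps [a-z] to ASCII lowercase.
show_matches() {
  LC_ALL=C grep -oE 'dfs[a-z.]+' || true   # grep exits 1 on no match; ignore that
}

# A line like those found in Hadoop's XML configuration files: one match.
printf '%s\n' '<name>dfs.replication</name>' | show_matches
# prints: dfs.replication

# A line of ordinary text: no match, so nothing is printed.
printf '%s\n' 'hello hadoop world' | show_matches
```

If the input contains no text matching the pattern, an empty result file is the expected outcome.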

The setup is pseudo-distributed mode, as in all the basic tests.

1 Answer:

Answer 0: (score: 0)

For this example, you should put all the xml files under hadoop-2.6.4/etc/hadoop into an HDFS folder named 'input' in the correct user's home directory, which is 'yoni' here.
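
The staging step could look roughly like the sketch below. It assumes the commands are run from the Hadoop installation directory with HDFS already running; the function name `stage_input` and the `HDFS_CMD` variable are hypothetical, added only so the commands can be previewed without a cluster:

```shell
# Assumption: the current directory is the Hadoop installation directory.
# HDFS_CMD is a hypothetical override; set HDFS_CMD=echo to preview the commands.
HDFS_CMD="${HDFS_CMD:-bin/hdfs}"

stage_input() {
  # Create the input directory, including missing parents, in HDFS.
  "$HDFS_CMD" dfs -mkdir -p /user/yoni/input
  # Copy the local XML configuration files into that HDFS directory.
  "$HDFS_CMD" dfs -put etc/hadoop/*.xml /user/yoni/input
}
```

Calling `stage_input` runs both commands; with `HDFS_CMD=echo` it only prints their arguments.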

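First, check the status of the daemon processes by browsing HDFS at http://localhost:50070 (the default address).

Second, check the state of the files with bin/hdfs dfs -ls /user/yoni/input and bin/hdfs fsck / -files -blocks.

If all of that looks fine, it should work.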

Hadoop MapReduce Next Generation - Setting up a Single Node Cluster