Hadoop MapReduce WordCount program

Date: 2015-12-23 10:48:44

Tags: eclipse hadoop

I am trying to run a word-count program from Eclipse. The output directory is created (I can see it in the browser), but I am not getting any output, and I receive the error below. Please help me; I am stuck at the last step.

I ran the program with the following command:

hduser@niki-B85M-D3H:/home/niki/workspace$ hadoop jar wordcount.jar WordCount /user/hadoop/dir1/file.txt wordcountoutput

The output directory named wordcountoutput is created, but the error shown below is displayed.

    I tried to run a word-count program in Eclipse. The output directories are being created (visible in the browser), but I am not getting any output; I get the following error. Please help me out. I am stuck at the last point.

    I gave the following command to execute the program:

    hduser@niki-B85M-D3H:/home/niki/workspace$ hadoop jar wrd.jar WordCount /user/hduser/outputwc /user/hduser/outputwc/

    The output file named wordcountoutput is created, but the error is displayed as follows.

    15/12/24 15:15:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 2
        at WordCount.run(WordCount.java:29)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at WordCount.main(WordCount.java:48)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

    This is the exception I am getting now at the final step.

WordCount.java

import java.io.IOException;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

import org.apache.hadoop.io.Text;

public class WordCount extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("plz give input and output directories");
            return -1;
        }

        JobConf conf = new JobConf();

        //conf.setJarByClass(WordCount.class);
        conf.setJar("wrd.jar");

        FileInputFormat.setInputPaths(conf, new Path(args[1]));
        FileOutputFormat.setOutputPath(conf, new Path(args[2]));

        conf.setMapperClass(WordMapper.class);
        conf.setReducerClass(WordReducer.class);

        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(IntWritable.class);

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        JobClient.runJob(conf);

        return 0;
    }

    public static void main(String args[]) throws Exception {
        int exitCode = ToolRunner.run(new WordCount(), args);
        System.exit(exitCode);
    }

}
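A detail worth noting in the listing above: `run()` reads `args[1]` and `args[2]`, but `hadoop jar` passes the two path arguments indexed from 0, so with exactly two arguments `args[2]` overruns the array — which matches the `ArrayIndexOutOfBoundsException: 2` in the stack trace. A minimal plain-Java sketch of the indexing (hypothetical class and method names, no Hadoop dependency):

```java
public class ArgsIndexDemo {

    // Mimics the argument handling in run(): with two command-line
    // arguments, args[0] and args[1] are valid, while args[2] throws
    // ArrayIndexOutOfBoundsException, matching the stack trace above.
    static String pickOutputPath(String[] args, boolean zeroBased) {
        return zeroBased ? args[1] : args[2]; // one-based indexing overruns
    }

    public static void main(String[] argv) {
        String[] args = {"/user/hduser/input", "/user/hduser/outputwc"};
        System.out.println(pickOutputPath(args, true)); // the output path
        try {
            pickOutputPath(args, false);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("index 2 is out of bounds for 2 arguments");
        }
    }
}
```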

WordReducer.java

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class WordReducer extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter r)
            throws IOException {
        int count = 0;
        while (values.hasNext()) {
            IntWritable i = values.next();
            count += i.get();
        }

        output.collect(key, new IntWritable(count));
    }

}
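The reduce step itself is just a summation over the grouped counts for one key. Stripped of the Hadoop types, the same logic can be exercised in isolation (a hypothetical plain-Java stand-in, not the class above):

```java
import java.util.Iterator;
import java.util.List;

public class SumDemo {

    // Plain-Java equivalent of WordReducer.reduce(): sum all counts
    // delivered for one key through an Iterator.
    static int sumCounts(Iterator<Integer> values) {
        int count = 0;
        while (values.hasNext()) {
            count += values.next();
        }
        return count;
    }

    public static void main(String[] argv) {
        // e.g. a word emitted three times by the mapper, each with count 1
        System.out.println(sumCounts(List.of(1, 1, 1).iterator())); // 3
    }
}
```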

WordMapper.java

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class WordMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter r)
            throws IOException {
        String s = value.toString();
        for (String word : s.split(" ")) {
            if (word.length() > 0) {
                output.collect(new Text(word), new IntWritable(1));
            }
        }
    }

}
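The mapper's tokenization relies on `String.split(" ")` plus a length check to drop the empty strings that consecutive spaces produce. A hypothetical stand-alone version of that logic (plain Java, no Hadoop types):

```java
import java.util.ArrayList;
import java.util.List;

public class TokenizeDemo {

    // Plain-Java equivalent of WordMapper.map(): split on single spaces
    // and skip the empty tokens produced by repeated spaces.
    static List<String> tokenize(String line) {
        List<String> words = new ArrayList<>();
        for (String word : line.split(" ")) {
            if (word.length() > 0) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] argv) {
        // the double space yields an empty token, which is filtered out
        System.out.println(tokenize("hadoop  map reduce")); // [hadoop, map, reduce]
    }
}
```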

1 Answer:

Answer 0 (score: 0)

Try this:

public static void main(String args[]) throws Exception {
    int exitCode = ToolRunner.run(new Configuration(), new WordCount(), args);
    System.exit(exitCode);
}

(This overload also requires `import org.apache.hadoop.conf.Configuration;`.)

According to this.