Question

使用job.waitForCompletion（true）方法和通过ToolRunner.run（new MyClass（），args）在Apache Hadoop中启动map reduce作业有什么区别？

我有以下两种方式执行MapReduce作业：

首先如下：

public class MaxTemperature extends Configured implements Tool {
  public static void main(String[] args) throws Exception {
      int exitCode = ToolRunner.run(new MaxTemperature(), args);
      System.exit(exitCode);
  }

  @Override
    public int run(String[] args) throws Exception {
        if (args.length != 2) {
              System.err.println("Usage: MaxTemperature <input path> <output path>");
              System.exit(-1);
            }
        System.out.println("Starting job");
        Job job = new Job();
        job.setJarByClass(MaxTemperature.class);
        job.setJobName("Max temperature");

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(MaxTemperatureMapper.class);
        job.setReducerClass(MaxTemperatureReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        int returnValue = job.waitForCompletion(true) ? 0:1;

        if(job.isSuccessful()) {
            System.out.println("Job was successful");
        } else if(!job.isSuccessful()) {
            System.out.println("Job was not successful");           
        }
        return returnValue;
    }
}

其次是：

public class MaxTemperature{

    public static void main(String[] args) throws Exception {

        if (args.length != 2) {
              System.err.println("Usage: MaxTemperature <input path> <output path>");
              System.exit(-1);
            }
        System.out.println("Starting job");
        Job job = new Job();
        job.setJarByClass(MaxTemperature.class);
        job.setJobName("Max temperature");

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(MaxTemperatureMapper.class);
        job.setReducerClass(MaxTemperatureReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        int returnValue = job.waitForCompletion(true) ? 0:1;

        if(job.isSuccessful()) {
            System.out.println("Job was successful");
        } else if(!job.isSuccessful()) {
            System.out.println("Job was not successful");   

    }
}

两种方式的输出相同。但我没有得到两者之间的区别？哪一个比另一个更优选？

Answer 1

这篇文章在解释ToolRunner的使用方面做得非常不错：ToolRunner

启动MapReduce作业的不同方法

1 个答案: