[Question title]: Different ways of starting a MapReduce job
[Posted]: 2017-06-01 12:31:16
[Question description]:

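In Apache Hadoop, what is the difference between starting a MapReduce job by calling job.waitForCompletion(true) directly and starting it through ToolRunner.run(new MyClass(), args)?

I have a MapReduce job that I run in the following two ways.

The first is: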

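import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MaxTemperature extends Configured implements Tool {
    public static void main(String[] args) throws Exception {
        // ToolRunner parses generic Hadoop options (e.g. -D key=value) before calling run()
        int exitCode = ToolRunner.run(new MaxTemperature(), args);
        System.exit(exitCode);
    }

    @Override
    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MaxTemperature <input path> <output path>");
            return -1; // let main() turn this into the process exit code
        }
        System.out.println("Starting job");
        // Build the job from getConf() so the options parsed by ToolRunner take effect
        Job job = Job.getInstance(getConf(), "Max temperature");
        job.setJarByClass(MaxTemperature.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(MaxTemperatureMapper.class);
        job.setReducerClass(MaxTemperatureReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // waitForCompletion(true) blocks, prints progress, and returns true on success
        boolean success = job.waitForCompletion(true);
        System.out.println(success ? "Job was successful" : "Job was not successful");
        return success ? 0 : 1;
    }
}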

The second is:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxTemperature {

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MaxTemperature <input path> <output path>");
            System.exit(-1);
        }
        System.out.println("Starting job");
        // No ToolRunner here: args are used as-is and generic options are not parsed
        Job job = Job.getInstance();
        job.setJarByClass(MaxTemperature.class);
        job.setJobName("Max temperature");

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(MaxTemperatureMapper.class);
        job.setReducerClass(MaxTemperatureReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // waitForCompletion(true) blocks, prints progress, and returns true on success
        boolean success = job.waitForCompletion(true);
        System.out.println(success ? "Job was successful" : "Job was not successful");
        System.exit(success ? 0 : 1);
    }
}

Both ways produce the same output, but I don't understand what the difference between them is. Which one is preferred?

[Comments]:

    Tags: hadoop java-8 mapreduce bigdata


    [Solution 1]:

    This article explains the use of ToolRunner well: ToolRunner
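
    Since the link may not survive, here is the gist as I understand it. ToolRunner.run() passes the command line through Hadoop's GenericOptionsParser, puts generic options such as -D key=value into the Tool's Configuration, and calls run() with only the remaining arguments. A plain main() receives everything unparsed. The sketch below is not Hadoop code: GenericOptionsSketch, Parsed and parse() are hypothetical names, and it handles only the spaced "-D key=value" form. It is a plain-Java stand-in that shows how the arguments get split.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for the argument handling that
// ToolRunner delegates to GenericOptionsParser. Not the Hadoop implementation.
public class GenericOptionsSketch {

    /** Holds the parsed "-D" properties and the arguments left for run(). */
    static final class Parsed {
        final Map<String, String> conf = new LinkedHashMap<>();
        final List<String> remainingArgs = new ArrayList<>();
    }

    /** Separates "-D key=value" pairs from the application arguments. */
    static Parsed parse(String[] args) {
        Parsed parsed = new Parsed();
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("-D") && i + 1 < args.length) {
                String pair = args[++i];          // consume the key=value token
                int eq = pair.indexOf('=');
                if (eq > 0) {
                    parsed.conf.put(pair.substring(0, eq), pair.substring(eq + 1));
                }
                // A malformed pair without '=' is silently ignored in this sketch.
            } else {
                parsed.remainingArgs.add(args[i]); // passed through to run()
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        String[] cmd = {"-D", "mapreduce.job.reduces=4", "input", "output"};
        Parsed p = parse(cmd);
        System.out.println("plain main() sees " + cmd.length + " arguments");
        System.out.println("run() would see " + p.remainingArgs + " with conf " + p.conf);
    }
}
```

    In practice this means the Tool version can be launched as something like hadoop jar job.jar MaxTemperature -D mapreduce.job.reduces=4 in out (job.jar is a placeholder name), and the setting takes effect as long as the job is created from getConf(). The plain main() version would see four arguments and fail its args.length != 2 check.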

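    [Discussion]:

    • Please avoid link-only answers. Instead, give the essentials of the answer here and then provide the link for further detail. If the link goes dead, so does your answer.
    • @vefthym You are right. The link is dead. I have the same question as the author. Could someone explain? Thanks!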