【Question title】: MapReduce Avro Output is Creating Text File Instead
【Posted】: 2015-06-12 16:42:38
【Question】:

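I have a MapReduce job that reads Avro data and is supposed to write Avro data. However, when I check the output files after the job succeeds, they do not have the .avro extension, and I can read them in a plain text editor.

My driver is configured to output Avro, so I am not sure where the problem is. Any help would be greatly appreciated.

Here is my driver class: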

public class Driver extends Configured implements Tool{

public static void main(String[] args) throws Exception {
    int res =
            ToolRunner.run(new Configuration(), new Driver(), args);
    System.exit(res);
}

@Override
public int run(String[] args) throws Exception {
    Job job = new Job(getConf());
    job.setJarByClass(Driver.class);
    job.setJobName("nearestpatient");


    AvroJob.setOutputKeySchema(job, Pair.getPairSchema(Schema.create(Schema.Type.LONG), Schema.create(Schema.Type.STRING)));
    job.setOutputValueClass(NullWritable.class);

    job.setMapperClass(PatientMapper.class);
    job.setReducerClass(PatientReducer.class);

    job.setInputFormatClass(AvroKeyInputFormat.class);
    AvroJob.setInputKeySchema(job, PatientAvro.getClassSchema());

    job.setMapOutputKeyClass(LongWritable.class);
    job.setMapOutputValueClass(LongWritable.class);


    FileInputFormat.setInputPaths(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    job.waitForCompletion(true);

    return 0;
}
}

Here is my Reducer class:

public class PatientReducer extends Reducer<LongWritable, LongWritable, AvroWrapper<Pair<Long, String>>, NullWritable> {

    @Override
    public void reduce(LongWritable providerKey, Iterable<LongWritable> patients, Context context) throws IOException, InterruptedException {

        String outputList = "[";
        List<Long> patientList = new ArrayList<>();
        for (LongWritable patientKey : patients) {
            outputList += new LongWritable(patientKey.get()) + ", ";
        }
        outputList = outputList.substring(0, outputList.length() - 2);
        outputList += "]";
        context.write(new AvroWrapper<Pair<Long, String>>(new Pair<Long, String> (providerKey.get(), outputList)), NullWritable.get());
    }
}

【Discussion】:

    Tags: java hadoop mapreduce avro


    【Solution 1】:

    In your run() method, you need to add the following:

    job.setOutputFormatClass(AvroKeyValueOutputFormat.class);
    

    【Discussion】:

      【Solution 2】:

      In your code, replace the line

      FileOutputFormat.setOutputPath(job, new Path(args[1]));

      with

      job.setOutputFormatClass(AvroKeyOutputFormat.class);
      AvroKeyOutputFormat.setOutputPath(job, new Path(args[1]));
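
After changing the output format, it may help to confirm that the part files really are Avro rather than judging by how they look in a text editor. An Avro object container file begins with the four bytes 'O', 'b', 'j', 1, so a small check on a local copy of the output (fetched, for example, with hadoop fs -get) is enough. The sketch below uses only the JDK; the class name AvroMagicCheck and the method looksLikeAvroContainer are hypothetical names chosen for this illustration and are not part of Avro or Hadoop.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Avro or Hadoop.
class AvroMagicCheck {
    // Every Avro object container file starts with these four bytes.
    private static final byte[] AVRO_MAGIC = {'O', 'b', 'j', 1};

    /** Returns true if the local file starts with the Avro container magic bytes. */
    static boolean looksLikeAvroContainer(Path file) throws IOException {
        byte[] header = new byte[AVRO_MAGIC.length];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // read() may return fewer bytes than requested, so loop until full or EOF.
            while (filled < header.length) {
                int n = in.read(header, filled, header.length - filled);
                if (n < 0) {
                    break; // file is shorter than the magic header
                }
                filled += n;
            }
        }
        return filled == header.length && Arrays.equals(header, AVRO_MAGIC);
    }

    public static void main(String[] args) throws IOException {
        // Each argument is the path of a local copy of a job output file.
        for (String arg : args) {
            Path file = Paths.get(arg);
            String verdict = looksLikeAvroContainer(file) ? "Avro container" : "not Avro";
            System.out.println(file + ": " + verdict);
        }
    }
}
```

This only inspects the header. It does not validate the schema or the data blocks, so it is a quick sanity check rather than a full verification.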
      

      【Discussion】:
