【问题标题】:Hadoop Mapreduce job stuck at map 100% reduce 51%Hadoop Mapreduce 作业卡在地图上 100% 减少 51%
【发布时间】:2014-06-06 19:30:01
【问题描述】:

所以,我正在某处寻找无限循环,我不知道是否还有其他可能导致这种情况的原因。我正在使用四个集群节点,所以我很确定不会缺少 RAM,正如在其他同类问题中所建议的那样。

我的代码:

package org.myorg;

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

import util.hashing.*;



public class LatLong {


 public static class Map extends Mapper<Object, Text, Text, Text> {
    //private final static IntWritable one = new IntWritable(1);


    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        String line = value.toString();
        String[] longLatArray = line.split(",");
        double longi = Double.parseDouble(longLatArray[0]);
        double lat = Double.parseDouble(longLatArray[1]);
        //List<Double> origLatLong = new ArrayList<Double>(2);
        //origLatLong.add(lat);
        //origLatLong.add(longi);
        Geohash inst = Geohash.getInstance();
        //encode is the library's encoding function
        String hash = inst.encode(lat,longi);
        //Using the first 5 characters just for testing purposes
        //Need to find the right one later
        int accuracy = 4;
        //hash of the thing is shortened to whatever I figure out
        //to be the right size of each tile
        Text shortenedHash = new Text(hash.substring(0,accuracy));
        Text origHash = new Text(hash);

        context.write(shortenedHash, origHash);
    }
 } 

 public static class Reduce extends Reducer<Text, Text, Text, Text> {

     private IntWritable totalTileElementCount = new IntWritable();
     private Text latlongimag = new Text();
     private Text dataSeparator = new Text();

     @Override
     public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
      int elementCount = 0;
      boolean first = true;
      Iterator<Text> it = values.iterator();
      String lat = new String();
      String longi = new String();
      Geohash inst = Geohash.getInstance();

      while (it.hasNext()) {
       elementCount = elementCount+1;
       if(first)
       {
           double[] doubleArray = (inst.decode(it.next().toString()));
           lat = Double.toString(doubleArray[0]);
           longi = Double.toString(doubleArray[1]);
           first = false;

       }



      }
      totalTileElementCount.set(elementCount);
      //Geohash inst = Geohash.getInstance();

      String mag = totalTileElementCount.toString();

      latlongimag.set(lat+","+ longi +","+mag+",");
      dataSeparator.set("");
      context.write(latlongimag, dataSeparator );
     }
 }

 public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "wordcount");
    job.setJarByClass(LatLong.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);

    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);

    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    job.waitForCompletion(true);
 }

}       

【问题讨论】:

  • Java MR 有很多样板,你会发现 Scalding 会为你节省很多时间:)

标签: java hadoop


【解决方案1】:

里面

while (it.hasNext()) {
       elementCount = elementCount+1;
       if(first)
       {
           double[] doubleArray = (inst.decode(it.next().toString()));
           lat = Double.toString(doubleArray[0]);
           longi = Double.toString(doubleArray[1]);
           first = false;
       }
  }

你设置了first = false;,所以在下一个while (it.hasNext())循环迭代中,if(first)不会被输入,it.next()也不会再次被调用,所以如果it有多个元素it.hasNext()将总是返回true并且你永远不会离开这个while 循环。

【讨论】:

  • 好的,这绝对是这里的问题.. 非常感谢!但是,作为后续,没关系,如果我只有一个 else {String blah = it.next().toString();},对吧?
猜你喜欢
  • 2014-01-13
  • 1970-01-01
  • 1970-01-01
  • 1970-01-01
  • 1970-01-01
  • 1970-01-01
  • 1970-01-01
  • 1970-01-01
  • 1970-01-01
相关资源
最近更新 更多