Map Reduce Array Out of Bounds Exception: Answers

【Question Title】: Map Reduce Array Out of Bounds Exception
【Posted】: 2019-07-18 12:01:25
【Question Description】:

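I'm confused about why this is happening. I've been working on it for a while and I just don't understand it.

My map code works correctly; I was able to verify its output in the output directory.

Here is the method: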

public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {

        String stateKeyword = value.toString();
        String[] pieces = new String[] {stateKeyword};

        for (String element : pieces) {
            String name = element.split(":")[0].trim();
            String id = element.split(":")[1].trim();
            Integer rank = Integer.parseInt(element.split(":")[2].trim());
            context.write(new Text(name), new Text(id + ":" + rank));
        }   
    }

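So my output concatenates the id and rank fields. If I print the value as-is, I can see it in the output file.

However, any split operation I perform throws an ArrayIndexOutOfBoundsException, and I can't see why. I even check whether the value contains ":"; with that check the value prints but is not split, and without the check I get the exception.

Here is my reduce: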

public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {

        List<String> elements = new ArrayList<String>();
        Text word = new Text();
        for (Text val : values) {
            if (val.toString().contains(":")) {
                String state = val.toString().split(":")[0];
                word.set(state);
            }
            context.write(key, word);
        }
    }

The output in my file looks like this:

Name   id:rank
Name   id:rank
Name   id:rank

...
...
...

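But why can't I split id and rank?

【Question Discussion】:

Tags: java apache hadoop mapreduce

【Solution 1】:

To avoid the ArrayIndexOutOfBoundsException, check the array length before reading values from the array. Something like this would be more appropriate: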

    String[] temp = element.split(":");
    // Check the array length before indexing (arrays have length, not size()).
    if (temp.length == 2) {
        String name = temp[0].trim();
        String id = temp[1].trim();
    }
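
As a hedged, self-contained sketch of the same length check (plain Java, no Hadoop dependencies): the class `IdRankParser` and its `parse` method are hypothetical names, and the `id:rank` value format is assumed from the mapper output shown in the question. It splits once, verifies that exactly two non-empty fields came back, and returns `null` for malformed values instead of indexing blindly.

```java
import java.util.Arrays;

final class IdRankParser {

    // Hypothetical helper: returns {id, rank}, or null when the value is malformed.
    static String[] parse(String value) {
        if (value == null) {
            return null;
        }
        // A negative limit keeps trailing empty strings, so "42:" yields
        // ["42", ""] instead of ["42"].
        String[] parts = value.split(":", -1);
        if (parts.length != 2) {
            return null;
        }
        String id = parts[0].trim();
        String rank = parts[1].trim();
        if (id.isEmpty() || rank.isEmpty()) {
            return null;
        }
        return new String[] {id, rank};
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        for (String sample : new String[] {"42:7", "42:", "no-delimiter", "a:b:c"}) {
            System.out.println(sample + " -> " + Arrays.toString(parse(sample)));
        }
    }
}
```

In a reducer, a caller could skip the record (or increment a counter) when `parse` returns `null`, which keeps one bad value from failing the whole task.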

【Comments】:

However, my code appears to be failing in the reduce function.
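
One possible explanation, offered as an assumption because the failing input is not shown: `String.split` with a single argument drops trailing empty strings, so a value that contains ":" can still produce an array with fewer elements than expected. The sketch below (plain Java; `SplitEdgeCases` and `fieldCount` are hypothetical names, and the sample values are made up) prints how many fields a few inputs yield.

```java
final class SplitEdgeCases {

    // Number of fields that String.split(":") produces for the given value.
    static int fieldCount(String value) {
        return value.split(":").length;
    }

    public static void main(String[] args) {
        String[] samples = {"id:rank", "id:", ":", "id", ""};
        for (String sample : samples) {
            System.out.println("\"" + sample + "\" -> " + fieldCount(sample) + " field(s)");
        }
    }
}
```

A value of ":" passes a `contains(":")` check yet splits into an empty array, so even index `[0]` would throw. Passing a negative limit, as in `value.split(":", -1)`, keeps the trailing empty strings and makes the field count predictable.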