Counting words in a file in Java 8 using the reduce method: answers

【Question title】: Count word in file Java8 using reduce method
【Posted】: 2018-08-09 03:41:50
【Question description】:

I want to count the occurrences of each word in a file using reduce in Java 8.

filecontent.flatMap(line->Stream.of(line.split("\\s+")))
                       .map(String::toLowerCase)
                       .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));   
  // here I want to replace collect() with reduce()

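【Question comments】:

Why use reduce() when collect() works fine?
@daniu I just want to use it the way Google's MapReduce does.
I don't have that motivation myself. But yes, looking at the answers, it seems people have found ways to do it.

Tags: java java-stream reduce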

【Solution 1】:

First, you should provide a complete, compilable example in your question, for instance:

Stream<String> filecontent = Stream.of("foo in bar is foo", "bar in bar is not foo");
Map<String, Long> result = filecontent.flatMap(line -> Stream.of(line.split("\\s+")))
    .map(String::toLowerCase)
    .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

Then, to use reduce, you need to reduce over Map objects (a Java HashMap is used here for brevity; it is not the most efficient data structure for this case):

Stream<String> filecontent = Stream.of("foo in bar is foo", "bar in bar is not foo");
Map<String, Long> result = filecontent.flatMap(line -> Stream.of(line.split("\\s+")))
    .map(word -> Collections.singletonMap(word.toLowerCase(), 1L))
    .reduce(new HashMap<>(), (map1, map2) -> {
        for (Map.Entry<String, Long> entry: map2.entrySet()) {
            map1.put(entry.getKey(), map1.getOrDefault(entry.getKey(), 0L) + entry.getValue());
        }
        return map1;
    });

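This first creates a new, empty HashMap, then creates a singleton map for each word, and at each reduce step merges that singleton map into the original HashMap. If you want to do this with a parallel stream, you need to create a new map in the reduce step instead of modifying map1: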

        Map<String, Long> tempResult = new HashMap<>(map1);
        for (Map.Entry<String, Long> entry: map2.entrySet()) {
            tempResult.put(entry.getKey(), map1.getOrDefault(entry.getKey(), 0L) + entry.getValue());
        }
        return tempResult;
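To show how the fragment above fits into a whole program, here is a minimal sketch of a parallel word count in which the reduce step never modifies its arguments. The class name `ParallelWordCount` and the helper `mergeCounts` are hypothetical names chosen for illustration, and `Collections.emptyMap()` as the identity value is an assumption of this sketch rather than part of the original answer.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

class ParallelWordCount {

    // Returns a new map with the summed counts of both inputs.
    // Neither input is modified, so the function is safe to use as an accumulator in a parallel reduce.
    static Map<String, Long> mergeCounts(Map<String, Long> left, Map<String, Long> right) {
        Map<String, Long> merged = new HashMap<>(left);
        right.forEach((word, count) -> merged.merge(word, count, Long::sum));
        return merged;
    }

    static Map<String, Long> countWords(Stream<String> lines) {
        return lines.parallel()
            .flatMap(line -> Stream.of(line.split("\\s+")))
            // One single-entry map per word, as in the answer above.
            .map(word -> Collections.singletonMap(word.toLowerCase(), 1L))
            // The identity is an immutable empty map; every step returns a fresh map.
            .reduce(Collections.emptyMap(), ParallelWordCount::mergeCounts);
    }

    public static void main(String[] args) {
        Stream<String> filecontent = Stream.of("foo in bar is foo", "bar in bar is not foo");
        System.out.println(countWords(filecontent));
    }
}
```

Copying the map at every step keeps the reduction free of shared mutable state, but it costs an allocation per word, so this is likely slower than the collect() version in the question.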

【Comments】:

Instead of Arrays.asList(...).stream(), use Stream.of(...) directly.

【Solution 2】:

Inspired by @tkruse's answer, I came up with the following snippet:

Stream<String> fileContent = Stream.of("foo in bar is foo", "bar in bar is not foo");
Pattern pattern = Pattern.compile("\\s+");
Map<String, Long> result = fileContent
    .flatMap(pattern::splitAsStream)
    .reduce(new HashMap<String, Long>(), (map, word) -> {
        map.merge(word, 1L, Long::sum);
        return map;
    }, (left, right) -> {
        right.forEach((key , count ) -> left.merge(key, count, Long::sum));
        return left;
    });

Note the second line: it compiles a Pattern once, and that pattern is then used inside the stream to split each line.
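The splitting step can also be looked at on its own. The following is a small sketch of how `Pattern.splitAsStream` turns lines into a stream of words. The class name `SplitAsStreamDemo` and the method `words` are hypothetical, and both the empty-token filter and the lower-casing are additions assumed here for illustration; the snippet above does neither.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SplitAsStreamDemo {

    // Compiled once and reused for every line, instead of recompiling the regex per split.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static List<String> words(Stream<String> lines) {
        return lines
            .flatMap(WHITESPACE::splitAsStream)
            // A line with leading whitespace yields an empty first token; drop it.
            .filter(word -> !word.isEmpty())
            .map(String::toLowerCase)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(words(Stream.of("Foo in BAR", "  bar  baz ")));
    }
}
```

The resulting stream of words can be fed to either the reduce-based or the collect-based counting shown earlier.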

【Comments】: