【Question title】: Count how many times a pair of words appears in Java
【Posted】: 2016-02-25 04:33:42
【Question】:

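How can I get each pair of adjacent words in a given string? For example:

the quick, quick brown, brown fox, fox jumps, jumps over, and so on...

And then count how many times each pair appears?

The code below can only count single words.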

    import java.util.HashMap;
    import java.util.Map;
    import java.util.StringTokenizer;

    public class Tokenizer

    {
        public static void main(String[] args)
        {
            int tokenCount; int i = 0;
            Map<String,Integer> wordCount = new HashMap<String,Integer>();
            Map<Integer,Integer> letterCount = new HashMap<Integer,Integer>();
            String message="The Quick brown fox jumps over the lazy brown dog the quick";

            StringTokenizer string = new StringTokenizer(message);


            tokenCount = string.countTokens();
            System.out.println("Number of tokens = " + tokenCount);
            while (string.hasMoreTokens()) {
                String word = string.nextToken().toLowerCase();
                // Current count for this word, or null if it has not been seen yet.
                Integer count = wordCount.get(word);

                if(count == null) {
                    wordCount.put(word, 1);
                }
                else {
                    wordCount.put(word, count + 1);
                }
            }
            for (String words : wordCount.keySet()) {
                System.out.println("Word : " + words + " has count :" + wordCount.get(words));
            }
            int first ,second;
            first = second = Integer.MIN_VALUE;
            String firstword ="";
            String secondword="";


            for(Map.Entry<String, Integer> entry : wordCount.entrySet())
            {

                int count = entry.getValue();
                String word = entry.getKey();
                if(count>first){
                    second = first;
                    secondword = firstword;
                    first = count;
                    firstword = word;

                }
                else if(count>second){
                    second = count;
                    secondword = word;
                }
            }
            System.out.println(firstword + " " + first);
            System.out.println(secondword + " " + second);

            for(i = 0; i < message.length(); i++){
                char c = message.charAt(i);
                if (c != ' ') {

                    int value = letterCount.getOrDefault((int) c, 0);
                    letterCount.put((int) c, value + 1);
                }
            }

            for(int key : letterCount.keySet()) {
                System.out.println((char) key + ": " + letterCount.get(key));
            }
        }

    }

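【Comments】:

  • The counts look correct to me. What is the problem?
  • It should count pairs of words rather than single words. Anyway, I have got it working now, thanks.

Tags: java tokenize words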


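【Solution 1】:

OK, so from the question I understand that you need to count how many times each pair of adjacent words occurs in the whole string. I looked at your code and think it is more complicated than it needs to be. Please see the snippet below.

  1. Split the source string using a space as the delimiter.
  2. Concatenate each word with the word that follows it, separated by a space.
  3. Search for the concatenated string in the source string.
  4. If the pair is not yet in the Map, add it with the word pair as the key and 1 as the value.
  5. If the pair is already in the Map, get its value, increment it, and put it back.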

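    // Needs: import java.util.HashMap; import java.util.Map;
    String message = "The Quick brown fox jumps over the lazy brown dog the quick";
    // Lower-case once so that "The Quick" and "the quick" count as the same pair.
    String lower = message.toLowerCase();
    String[] split = lower.split(" ");
    Map<String, Integer> map = new HashMap<>();
    for (int i = 0; i < split.length - 1; i++) {
        // Join each word with its right-hand neighbour.
        String temp = split[i] + " " + split[i + 1];
        // Always true for adjacent words from a single-space split; kept to mirror step 3.
        if (lower.contains(temp)) {
            if (map.containsKey(temp))
                map.put(temp, map.get(temp) + 1);
            else
                map.put(temp, 1);
        }
    }
    System.out.println(map);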
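
For reference, here is a minimal, self-contained sketch of the same adjacent-pair idea. The class name `WordPairCounter` and the method `countPairs` are hypothetical names chosen for illustration. The sketch assumes Java 8 or later (for `Map.merge`), splits on any run of whitespace rather than a single space, and uses a `LinkedHashMap` so that pairs print in order of first appearance.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of the original answer.
class WordPairCounter {

    // Counts every pair of adjacent words, ignoring case.
    static Map<String, Integer> countPairs(String text) {
        Map<String, Integer> pairs = new LinkedHashMap<>();
        String normalized = text.trim().toLowerCase();
        if (normalized.isEmpty()) {
            return pairs; // no words, so no pairs
        }
        String[] words = normalized.split("\\s+");
        for (int i = 0; i < words.length - 1; i++) {
            // merge() inserts 1 for a new pair, otherwise adds 1 to the stored count.
            pairs.merge(words[i] + " " + words[i + 1], 1, Integer::sum);
        }
        return pairs;
    }

    public static void main(String[] args) {
        String message = "The Quick brown fox jumps over the lazy brown dog the quick";
        countPairs(message).forEach((pair, count) -> System.out.println(pair + " : " + count));
    }
}
```

Using `merge` replaces the explicit `containsKey` / `put` branches from steps 4 and 5 with a single call; the counting logic is otherwise the same.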
    

【Comments】:

  • Glad to help. I have edited the answer so that it is also clear in words :)

【Solution 2】:

Here is the complete main method code. Let me know if you have any questions.

    public static void main(String[] args)
    {
        int tokenCount;
        Map<String,Integer> wordCount = new HashMap<String,Integer>();
        String message="The Quick brown fox jumps over the lazy brown dog the quick";

        StringTokenizer string = new StringTokenizer(message);

        tokenCount = string.countTokens();
        System.out.println("Number of tokens = " + tokenCount);

        // Count each token, ignoring case.
        while (string.hasMoreTokens()) {
            String word = string.nextToken().toLowerCase();
            Integer count = wordCount.get(word);
            // Prints null the first time a word is seen.
            System.out.println("Count : " + count);
            if(count == null) {
                wordCount.put(word, 1);
            }
            else {
                wordCount.put(word, count + 1);
            }
        }
        for (String words : wordCount.keySet())
        {
            System.out.println("Word : " +  words + " has count :" +wordCount.get(words));
        }
    }

【Comments】:

【Solution 3】:

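    // Fragment: relies on the `string` tokenizer and `wordCount` map from the question's code.
    // Remember the previous token so that consecutive pairs overlap
    // ("the quick", "quick brown", ...) instead of consuming two tokens per pass.
    String previous = string.hasMoreTokens() ? string.nextToken().toLowerCase() : null;

    while (string.hasMoreTokens()) {

      String current = string.nextToken().toLowerCase();
      String word = previous + " " + current;

      Integer count = wordCount.get(word);

      if(count == null) {
        wordCount.put(word,  1);
      }
      else {
        wordCount.put(word,  count + 1);
      }
      previous = current;
    }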
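
The tokenizer-based loop can also be wrapped into a runnable program. The following is a sketch under stated assumptions: the class name `TokenizerPairCounter` and the method `countPairs` are hypothetical, Java 8 or later is assumed (for `Map.merge` and `Map.Entry.comparingByValue`), and the previous token is carried forward so that pairs overlap as the question asks. The `main` method additionally prints the most frequent pair, mirroring what the question's code does for single words.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical wrapper class, not part of the original answer.
class TokenizerPairCounter {

    // Counts overlapping pairs of adjacent tokens, ignoring case.
    static Map<String, Integer> countPairs(String text) {
        Map<String, Integer> pairCount = new HashMap<>();
        StringTokenizer tokens = new StringTokenizer(text);
        if (!tokens.hasMoreTokens()) {
            return pairCount; // empty input: nothing to pair
        }
        String previous = tokens.nextToken().toLowerCase();
        while (tokens.hasMoreTokens()) {
            String current = tokens.nextToken().toLowerCase();
            pairCount.merge(previous + " " + current, 1, Integer::sum);
            previous = current; // slide the window forward by one token
        }
        return pairCount;
    }

    public static void main(String[] args) {
        Map<String, Integer> pairCount =
                countPairs("The Quick brown fox jumps over the lazy brown dog the quick");
        // Pick the entry with the largest count (ties are broken arbitrarily).
        Map.Entry<String, Integer> top =
                Collections.max(pairCount.entrySet(), Map.Entry.comparingByValue());
        System.out.println(top.getKey() + " " + top.getValue());
    }
}
```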
    

【Comments】:

  • Click the link above my edit and you will see it. It was only a formatting change to get the first line into the code box.