【Question title】: Find specific word in text file and count it
【Posted】: 2012-10-13 08:30:10
【Question】:

Can someone help me write the code? How can I search a text file for a given word and count how many times it is repeated?

For example, test.txt:

hi
hola
hey
hi
bye
hoola
hi

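If I want to know how many times the word "Hi" is repeated in test.txt, the program must say "repeated 3 times".

I hope you understand what I want. Thanks for your answers.

【Comments】:

Tags: java string file search count


【Solution 1】: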
public class Wordcount 
{
   public static void main(String[] args)
   {       
       int count=0;

       String str="hi this is is is line";

       String []s1=str.split(" ");

       for(int i=0;i<=s1.length-1;i++)
       {
          if(s1[i].equals("is"))
           {
               count++; 
           }
       }

       System.out.println(count);
   }
}

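【Comments】:

  • Hi, welcome to SO. Posting new, updated solutions to old questions is always good, but please try to make these answers as informative and clear as possible. Try adding a description to your code and make sure it is formatted properly. Also, please try to avoid useless comments.
  • How does this answer add value to a 3-year-old post? There are other similar answers here.
【Solution 2】: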
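public int occurrencesOfHi(String text)
{
    // remove every "Hi"; the text shrinks by the word's length for each occurrence
    String newText = text.replace("Hi","");
    return (text.length() - newText.length()) / "Hi".length();
}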
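
The same length-difference trick can be generalized to any search string. The following is a minimal sketch, assuming the text is already in memory; the class name `ReplaceCount` and the method `countOccurrences` are hypothetical and not part of the original answer. Note that it counts substring matches, so "hi" would also be found inside "this".

```java
public class ReplaceCount {
    // Hypothetical helper: counts non-overlapping occurrences of word in text
    // by removing them and comparing the lengths before and after.
    static int countOccurrences(String text, String word) {
        if (word.isEmpty()) {
            return 0; // avoid dividing by zero for an empty search string
        }
        String without = text.replace(word, "");
        return (text.length() - without.length()) / word.length();
    }

    public static void main(String[] args) {
        String text = "hi hola hey hi bye hoola hi";
        System.out.println(countOccurrences(text, "hi")); // prints 3
    }
}
```

Because `String.replace` is case-sensitive, searching for "Hi" in this sample would return 0; lower-casing both arguments first is one way around that.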

【Comments】:

  • Consider adding some comments to your answer.
【Solution 3】:
package somePackage;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;

public class WordCounter {
    public static void main(String[] args) {

        String path = ""; //ADD YOUR PATH HERE
        String fileName = "test2.txt";
        String testWord = "Macbeth"; //CHANGE THIS IF YOU WANT
        int tLen = testWord.length();
        int wordCntr = 0;
        String file = path + fileName;

        //try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(file)))) {
            String strLine;
            //Read File Line By Line
            while ((strLine = br.readLine()) != null) {
                //skip lines in which testWord does not occur at all
                if (!strLine.toLowerCase().contains(testWord.toLowerCase())) {
                    continue;
                }
                //parse the line's words into a String array
                for (String w : strLine.split("\\s+")) {
                    /*
                    compare the first tLen characters of each word with testWord,
                    ignoring case, so trailing punctuation ("Macbeth,") still counts;
                    longer words that start with testWord are counted as well
                    */
                    if (w.length() >= tLen
                            && w.substring(0, tLen).equalsIgnoreCase(testWord)) {
                        wordCntr++;
                    }
                }
            }
            System.out.println("total is: " + wordCntr);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

【Comments】:

  • I grabbed the text of Macbeth and stored it in a file named text2.txt
【Solution 4】:
package com.test;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.Scanner;

public  class Test {

    public static void main(String[] args)  throws Exception{

        BufferedReader bf= new BufferedReader(new FileReader("src/test.txt"));
        Scanner sc = new Scanner(System.in);
        String W=sc.next();
        int count=0;

        String line;
        String s[];
        // a while loop (rather than do-while) avoids a NullPointerException on an empty file
        while((line=bf.readLine())!=null)
        {
            s=line.split(" ");
            for(String a:s)
            {
                // equals counts whole words only; contains would also count "this" for "hi"
                if(a.equals(W))
                    count++;
            }
        }
        bf.close();
        System.out.println(count);
    }



}

【Comments】:

    【Solution 5】:
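    // requires java.io.File, java.io.FileNotFoundException and java.util.Scanner
    public int countWord(String word, File file) throws FileNotFoundException {
        int count = 0;
        Scanner scanner = new Scanner(file);
        // note: hasNextLine() is paired with next() here; the comments below describe inputs on which this fails
        while (scanner.hasNextLine()) {
            String nextToken = scanner.next();
            if (nextToken.equalsIgnoreCase(word))
                count++;
        }
        scanner.close();
        return count;
    }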
    

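    【Comments】:

    • If the file contains whitespace at the end of a line, this throws an exception.
    • This fails when the word comes right after a period. For example: "Italy is country.Italy is good place." The period attached to Italy makes it the single token ".Italy", so the count for Italy comes out as 1.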
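
Both problems raised in these comments can be avoided with two small changes: test `hasNext()` instead of `hasNextLine()`, and let every run of non-letter characters act as a delimiter. The sketch below is one possible variant, not the original author's code; the class name `ScannerWordCount` is hypothetical, and the method takes a `Scanner` so that it can be tried on a string (for a file, pass `new Scanner(file)`).

```java
import java.util.Scanner;

public class ScannerWordCount {
    // Hypothetical helper: counts whole-word, case-insensitive matches of word
    // in whatever the given Scanner reads.
    static int countWord(String word, Scanner scanner) {
        // Every run of non-letter characters (spaces, periods, newlines) separates tokens.
        scanner.useDelimiter("[^\\p{L}]+");
        int count = 0;
        // hasNext() matches next(): it returns false once only separators remain.
        while (scanner.hasNext()) {
            if (scanner.next().equalsIgnoreCase(word)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Sample from the comment above, with trailing whitespace added.
        String text = "Italy is country.Italy is good place. \n";
        System.out.println(countWord("italy", new Scanner(text))); // prints 2
    }
}
```

A side effect of this delimiter is that words containing digits or apostrophes are split into several tokens, which may or may not be what is wanted.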
    【Solution 6】:
    import java.io.*;
    import java.util.*;
    
    class filedemo
    {
        public static void main(String ar[]) throws Exception
        {
            System.out.println("enter the string which you search");
            Scanner ob = new Scanner(System.in);
            String str = ob.next();
            StringBuilder text = new StringBuilder();
            String line;
            int count = 0;
            // read the whole file, keeping a separator so words on adjacent lines do not merge
            try (BufferedReader br = new BufferedReader(new FileReader("c:/file.txt"))) {
                while ((line = br.readLine()) != null)
                {
                    text.append(line).append('\n');
                }
            }

            // count every occurrence of str (substring matches included)
            int index = text.indexOf(str);
            while (index != -1) {
                count++;
                index = text.indexOf(str, index + 1);
            }

            System.out.println("Number of occurrences=" + count);
        }
    }
    

    【Comments】:

      【Solution 7】:
      package File1;
      
      import java.io.BufferedReader;
      import java.io.FileReader;
      
      public class CountLineWordsDuplicateWords {
      
      public static void main(String[] args) {
          FileReader fr = null;
          BufferedReader br =null;
      
          String [] stringArray;
          int counLine = 0;
          int arrayLength ;
          String s="";
          String stringLine="";
          try{
              fr = new FileReader("F:/Line.txt");
              br = new BufferedReader(fr);
              while((s = br.readLine()) != null){
                  stringLine = stringLine + s;
                  stringLine = stringLine + " ";/*Add space*/
                  counLine ++;
              }
              System.out.println(stringLine);
      
              stringArray = stringLine.split(" ");
              arrayLength = stringArray.length;
                           System.out.println("The number of Words is "+arrayLength);
              /*Duplicate String count code */
              for (int i = 0; i < arrayLength; i++) {
                  int c = 1 ;
                  for (int j = i+1; j < arrayLength; j++) {
                      if(stringArray[i].equalsIgnoreCase(stringArray[j])){
                          c++;
                          /*Remove the duplicate by shifting the rest of the array left*/
                          for (int j2 = j; j2 < arrayLength - 1; j2++) {
                              stringArray[j2] = stringArray[j2+1];
                          }
                          arrayLength = arrayLength - 1;
                          j--; /*Re-check the element shifted into position j*/
                      }//End of If block
                  }//End of Inner for block
              System.out.println("The "+stringArray[i]+" present "+c+" times .");
              }//End of Outer for block
              System.out.println("The number of Line is "+counLine);
              System.out.println();
              fr.close();
              br.close();
          }catch (Exception e) {
              e.printStackTrace();
          }
      }//End of main() method 
      }//End of class CountLineWordsDuplicateWords
      

      【Comments】:

        【Solution 8】:

        Try this approach using Pattern and Matcher.

        import java.io.BufferedReader;
        import java.io.File;
        import java.io.FileNotFoundException;
        import java.io.FileReader;
        import java.util.regex.Matcher;
        import java.util.regex.Pattern;
        
        public class Dem {
        
            public static void main(String[] args){
        
                try {
                    File f = new File("d://My.txt");
                    FileReader fr = new FileReader(f);
                    BufferedReader br = new BufferedReader(fr);
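                    StringBuilder text = new StringBuilder();
                    String line;

                    // accumulate the whole file in a separate variable so the text is not overwritten
                    while((line=br.readLine())!=null){
                        text.append(line).append('\n');
                    }
                    br.close();

                    int count = 0;
                    // the \b anchors make the pattern match "it" only as a whole word
                    Pattern pat = Pattern.compile("\\bit\\b");
                    Matcher mat = pat.matcher(text);

                    // call find() exactly once per iteration so no match is skipped
                    while(mat.find()){
                        count++;
                    }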
        
                    System.out.println(count);
                } catch (Exception e) {
        
                    e.printStackTrace();
                }
            }
        
        }
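
To search for an arbitrary word rather than a hard-coded pattern, the word can be quoted and wrapped in word boundaries. This is a minimal sketch under the assumption that the text has already been read into a string; `RegexWordCount` and `countWord` are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexWordCount {
    // Hypothetical helper: counts whole-word, case-insensitive matches of word in text.
    static int countWord(String text, String word) {
        // Pattern.quote makes the word literal, so characters such as '.' or '*'
        // are not interpreted as regex syntax.
        Pattern pattern = Pattern.compile("\\b" + Pattern.quote(word) + "\\b",
                Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Contents of test.txt from the question, one word per line.
        String text = "hi\nhola\nhey\nhi\nbye\nhoola\nhi\n";
        System.out.println(countWord(text, "Hi")); // prints 3
    }
}
```

The `\b` anchors only behave as expected when the search word starts and ends with a letter, digit, or underscore.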
        

        【Comments】:

          【Solution 9】:

          Use the Multiset collection from the Google Guava library.

          Multiset<String> wordsMultiset = HashMultiset.create();
          // fileName is assumed to be a String path; wrapping it in a File makes Scanner read the file, not the path text
          Scanner scanner = new Scanner(new File(fileName));
          while (scanner.hasNext()) {
              wordsMultiset.add(scanner.next());
          }
          scanner.close();
          for(Multiset.Entry<String> entry : wordsMultiset.entrySet()){
               System.out.println("Word : "+entry.getElement()+" count -> "+entry.getCount());
          }
          

          【Comments】:

            【Solution 10】:

            Try using java.util.Scanner.

            public int countWords(String w, String fileName) throws FileNotFoundException {
                int count = 0;
                try (Scanner scanner = new Scanner(new File(fileName))) {
                    scanner.useDelimiter("[^a-zA-Z]+"); // runs of non-letters act as delimiters
                    // visit every token, not just the first one
                    while (scanner.hasNext()) {
                        if (scanner.next().equalsIgnoreCase(w))
                            count++;
                    }
                }
                return count;
            }
            

            【Comments】:

              【Solution 11】:

              You can read the text file line by line. I assume each line can contain several words. For each line, run:

              String[] words = line.split(" "); 
              for(int i=0; i<words.length; i++){
                 if(words[i].equalsIgnoreCase(searchedWord))
                       count++;
              }
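
Put together with the file-reading part, the approach above might look like the following. This is a sketch rather than the answerer's own code: the class name `LineByLineCount` is hypothetical, a temporary file stands in for test.txt, and the split uses `\\s+` so that repeated spaces do not produce empty words.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class LineByLineCount {
    // Hypothetical helper: reads the file line by line and counts
    // case-insensitive whole-word matches of searchedWord.
    static int countWord(Path file, String searchedWord) throws IOException {
        int count = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // trim first so leading whitespace does not yield an empty first element
                for (String word : line.trim().split("\\s+")) {
                    if (word.equalsIgnoreCase(searchedWord)) {
                        count++;
                    }
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for test.txt from the question.
        Path file = Files.createTempFile("test", ".txt");
        Files.write(file, Arrays.asList("hi", "hola", "hey", "hi", "bye", "hoola", "hi"));
        System.out.println("repeated " + countWord(file, "Hi") + " times"); // prints "repeated 3 times"
        Files.delete(file);
    }
}
```

Punctuation attached to a word (for example "hi,") is not stripped here, so such tokens would not be counted.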
              

              【Comments】:

                【Solution 12】:

                Apache Commons - StringUtils.countMatches()

                【Comments】:

                  【Solution 13】:
                  // requires java.io.* and java.util.*
                  Map<String, Integer> h = new HashMap<>();
                  FileInputStream fin=new FileInputStream("d:\\file.txt");
                  BufferedReader br=new BufferedReader(new InputStreamReader(fin));
                  String n;
                  while((n=br.readLine())!=null)
                  {
                      // each line of the file is treated as one word;
                      // merge stores 1 for a new word and adds 1 to an existing count
                      h.merge(n, 1, Integer::sum);
                  }
                  br.close();
                  

                  Now iterate over this map to get the count of each word: each word is a key, and its count is the mapped value.
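
The iteration step could be sketched as follows. The class name `PrintCounts` and the helper `countWords` are hypothetical, the words come from an in-memory list rather than a file, and a `TreeMap` is used only so that the output order is predictable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PrintCounts {
    // Hypothetical helper: builds a word -> count map from a list of words.
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted keys, chosen for stable output
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Contents of test.txt from the question, one word per line.
        List<String> words = Arrays.asList("hi", "hola", "hey", "hi", "bye", "hoola", "hi");
        Map<String, Integer> counts = countWords(words);

        // Walk the entries: the key is the word, the value is its count.
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }

        // Look up a single word; getOrDefault covers words that never occurred.
        System.out.println(counts.getOrDefault("hi", 0)); // prints 3
    }
}
```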

                  【Comments】:
