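【Question title】: Converting ArrayList to HashSet in Java
【Posted】: 2018-01-24 11:02:07
【Question】:

I have this code that reads a txt file and counts every word in it, but I want each word to be counted only once per line, so I'm trying to create a HashSet. However, I'm having trouble converting the ArrayList to a HashSet. Here is my code: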

try {
    List<String> list = new ArrayList<String>();
    int totalWords = 0;
    int uniqueWords = 0;
    File fr = new File("filename.txt");
    Scanner sc = new Scanner(fr);
    while (sc.hasNext()) {
        String words = sc.next();
        String[] space = words.split(" ");
        Set<String> set = new HashSet<String>(Arrays.asList(space));
        for (int i = 0; i < set.length; i++) {
            list.add(set[i]);
        }
        totalWords++;
    }
    System.out.println("Words with their frequency..");
    Set<String> uniqueSet = new HashSet<String>(list);
    for (String word : uniqueSet) {
        System.out.println(word + ": " + Collections.frequency(list,word));
    }
} catch (Exception e) {
    System.out.println("File not found");
}

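Could anyone help explain why length "cannot be resolved or is not a field", and why I get an error on "set[i]" telling me it must resolve to a String? Thank you.

【Comments】:

  • Remember that Java doesn't support operator overloading. You can't use [] on any non-array object.
  • Use a for-each loop to iterate over each element of the set.
  • If the file contains the same word several times on different lines, how often should it be counted?
  • @XtremeBaumer For example, if the word "dog" appears twice on line 1 and once on line 2, it should only be counted twice, because it appears on two lines.
  • So you don't care about the third occurrence at all and simply ignore it (it isn't counted anywhere)?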
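
The first two comments can be illustrated with a minimal sketch. The class name `SetVersusArray` and the helper `distinctWords` are made up for this example, and `LinkedHashSet` is used instead of the question's `HashSet` only so that the printed order is predictable:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class SetVersusArray {
    // Hypothetical helper: returns the distinct words of one line, in first-seen order.
    static Set<String> distinctWords(String line) {
        String[] parts = line.split(" ");   // an array: parts.length and parts[i] are valid
        return new LinkedHashSet<>(Arrays.asList(parts));
    }

    public static void main(String[] args) {
        Set<String> set = distinctWords("dog cat dog");
        System.out.println(set.size());     // a Set has size(), not a length field; prints 2
        for (String word : set) {           // a Set has no index, so iterate with for-each
            System.out.println(word);       // prints dog, then cat
        }
    }
}
```

In short, `length` and `[i]` exist only on arrays; on a `Set` the equivalents are `size()` and iteration.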

Tags: java arraylist hashset


【Solution 1】:

As you were told in the comments, you can't use [] or length, because any Set is a Collection, not an array.

You can try it this way:

try {
    List<String> list = new ArrayList<String>();
    int totalWords = 0;
    int uniqueWords = 0;
    File fr = new File("filename.txt");
    Scanner sc = new Scanner(fr);
    while (sc.hasNext()) {
        String words = sc.next();
        String[] space = words.split(" ");
        Set<String> set = new HashSet<String>(Arrays.asList(space));
        for (String element : set) {
            list.add(element);
        }
        totalWords++;
    }
    System.out.println("Words with their frequency..");
    Set<String> uniqueSet = new HashSet<String>(list);
    for (String word : uniqueSet) {
        System.out.println(word + ": " + Collections.frequency(list, word));
    }
} catch (Exception e) {
    System.out.println("File not found");
} 

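【Comments】:

  • Hi, thanks for the reply. This method still counts a word every time it appears in the document, rather than the number of lines it appears on. I've been trying to use a HashSet because it removes duplicates, but that is where I ran into trouble.
  • I didn't review the algorithm itself, only the compiler complaints.

【Solution 2】: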
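
A likely reason for the behaviour described in the first comment: `Scanner.next()` returns one whitespace-delimited token per call, so `words.split(" ")` only ever sees a single word and the per-line `HashSet` has nothing to deduplicate. The sketch below contrasts `next()` with `nextLine()`; the class and method names are invented for illustration, and a string is scanned instead of a file to keep it self-contained:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class TokenVersusLine {
    // Collects what Scanner.next() returns: one whitespace-delimited token per call.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNext()) {
                out.add(sc.next());
            }
        }
        return out;
    }

    // Collects what Scanner.nextLine() returns: one whole line per call.
    static List<String> lines(String text) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNextLine()) {
                out.add(sc.nextLine());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "dog dog cat\ndog";
        System.out.println(tokens(text)); // [dog, dog, cat, dog]
        System.out.println(lines(text));  // [dog dog cat, dog]
    }
}
```

Deduplicating per line therefore requires reading with `hasNextLine()` and `nextLine()` and splitting each line afterwards.
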

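I used a map data structure to store and update the words and their respective frequencies.

As per your requirement, each word should be counted only once per line, even if it appears several times on that line.

For each line: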

 Store all the words in the set.

 Now just iterate over this set and update the map data structure.

So, in the end, the value stored in the map for each word is the required frequency.

You can have a look at my code below:

import java.io.File;
import java.util.*;

public class sol {
    public static void main(String args[]) {
        try {
            File fr = new File("file.txt");
            Scanner sc = new Scanner(fr);

            // to maintain frequency of each word after reading each line..
            HashMap<String, Integer> word_frequency = new HashMap<String, Integer>();

            while(sc.hasNextLine()) {
                // input the line..
                String line = sc.nextLine();
                String words[] = line.split(" ");

                // just store which unique words are there in this line..
                HashSet<String> unique_words = new HashSet<String>();

                for(int i=0;i<words.length;i++) {
                    unique_words.add(words[i]);     // add it to set..
                }

                // Iterate on the set now to update the frequency..
                Iterator<String> itr = unique_words.iterator();

                while(itr.hasNext()) {
                    String word = itr.next();

                    // If this word is already there then just increment it..
                    if(word_frequency.containsKey(word)) {
                        int old_frequency = word_frequency.get(word);
                        int new_frequency = old_frequency + 1;
                        word_frequency.put(word, new_frequency);
                    }
                    else {
                        // If this word is not there then put this 
                        // new word in the map with frequency 1..

                        word_frequency.put(word, 1);
                    }
                }
            }

            // Now, you have all the words with their respective frequencies..
            // Just print the words and their frequencies..
            for(Map.Entry<String, Integer> obj : word_frequency.entrySet()) {
                String word = obj.getKey();
                int frequency = obj.getValue();

                System.out.println(word+": "+frequency);
            }
        }
        catch(Exception e) {
            // report the failure instead of silently swallowing it..
            System.out.println("Could not read file: " + e.getMessage());
        }
    }
}
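
The same idea can be written more compactly. This is only a sketch under a few assumptions: the class `LineFrequency` and method `countLinesPerWord` are hypothetical names, the lines are passed in as a list rather than read from a file, and splitting on `\\s+` with blank lines skipped is a choice made here, not something the answer above does:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class LineFrequency {
    // For each word, counts the number of lines on which it appears at least once.
    static Map<String, Integer> countLinesPerWord(List<String> lines) {
        Map<String, Integer> frequency = new HashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;                    // skip blank lines (assumption of this sketch)
            }
            // The per-line set removes repeats within the same line.
            Set<String> unique = new HashSet<>(Arrays.asList(trimmed.split("\\s+")));
            for (String word : unique) {
                // merge() inserts 1 for a new word, otherwise adds 1 to the old value.
                frequency.merge(word, 1, Integer::sum);
            }
        }
        return frequency;
    }

    public static void main(String[] args) {
        Map<String, Integer> result =
                countLinesPerWord(Arrays.asList("dog dog cat", "", "dog"));
        System.out.println(result.get("dog")); // 2
        System.out.println(result.get("cat")); // 1
    }
}
```

`Map.merge` (Java 8 and later) replaces the `containsKey` / `get` / `put` sequence with a single call.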
