【Posted】:2012-04-03 01:00:33
【Question】:
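For a homework assignment, we are to turn the basicCompare method into one that compares two text documents and decides whether they cover similar topics. The program strips out every word shorter than five characters and leaves us with two lists of words. We are supposed to compare those lists, and if enough words are shared between the two documents (say, 80% similarity), the method should return true and report a "match".
However, I am stuck at the point where all the comments sit at the bottom of the method. I cannot think of, or find, a way to compare the two lists and work out what percentage of the words appear in both. Maybe I am thinking about this the wrong way and need to filter out the words that are not in both lists, then simply count how many are left. The parameters that define whether the input documents match are entirely up to us, so they can be set however I need. If you kind ladies and gentlemen could point me in the right direction, even just to the Java doc page for a particular function, I am sure I can work out the rest. I just need to know where to start.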
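The idea of keeping only the words found in both lists and counting what remains can be sketched with `Set.retainAll`. This is only a minimal sketch under stated assumptions: each word is counted once (duplicates are ignored), the overlap is measured against the smaller of the two vocabularies, and the names `OverlapSketch`, `similarity`, and `isMatch` are hypothetical and not part of the lab code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class OverlapSketch {

    // Fraction of distinct words shared by both lists, relative to the
    // smaller of the two vocabularies. Returns 0.0 if either list is empty.
    public static double similarity(List<String> first, List<String> second) {
        Set<String> a = new HashSet<>(first);
        Set<String> b = new HashSet<>(second);
        if (a.isEmpty() || b.isEmpty()) {
            return 0.0;
        }
        int smaller = Math.min(a.size(), b.size());
        a.retainAll(b); // a now holds only the words present in both lists
        return (double) a.size() / smaller;
    }

    // True when the shared fraction reaches the chosen threshold (e.g. 0.8).
    public static boolean isMatch(List<String> first, List<String> second, double threshold) {
        return similarity(first, second) >= threshold;
    }

    public static void main(String[] args) {
        // Made-up sample words, for illustration only.
        List<String> doc1 = Arrays.asList("server", "network", "router", "packet", "network");
        List<String> doc2 = Arrays.asList("network", "router", "packet", "server", "switch");
        System.out.println(similarity(doc1, doc2));
        System.out.println(isMatch(doc1, doc2, 0.8));
    }
}
```

Dividing by the smaller vocabulary is a design choice, not a requirement of the assignment; dividing by the size of the union of both sets (Jaccard similarity) would give a stricter score.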
import java.util.Collections;
import java.util.List;

public class MyComparator implements DocumentComparator {

    public static void main(String args[]){
        MyComparator mc = new MyComparator();
        if(mc.basicCompare("C:\\Users\\Quinncuatro\\Desktop\\MatchLabJava\\LabCode\\match1.txt", "C:\\Users\\Quinncuatro\\Desktop\\MatchLabJava\\LabCode\\match2.txt")){
            System.out.println("match1.txt and match2.txt are similar!");
        } else {
            System.out.println("match1.txt and match2.txt are NOT similar!");
        }
    }

    //In the basicCompare method, since the bottom returns false, it results in the else statement in the calling above, saying they're not similar
    //Need to implement a thing that if so many of the words are shared, it returns as true
    public boolean basicCompare(String f1, String f2) {
        List<String> wordsFromFirstArticle = LabUtils.getWordsFromFile(f1);
        List<String> wordsFromSecondArticle = LabUtils.getWordsFromFile(f2);
        Collections.sort(wordsFromFirstArticle);
        Collections.sort(wordsFromSecondArticle);//sort list alphabetically
        for(String word : wordsFromFirstArticle){
            System.out.println(word);
        }
        for(String word2 : wordsFromSecondArticle){
            System.out.println(word2);
        }
        //Find a way to use common_words to strip out the "noise" in the two lists, so you're ONLY left with unique words
        //Get rid of words not in both lists, if above a certain number, return true
        //If word1 = word2 more than 80%, return true
        //Then just write more whatever.basicCompare modules to compare 2 to 3, 1 to 3, 1 to no, 2 to no, and 3 to no
        //Once you get it working, you don't need to print the words, just say whether or not they "match"
        return false;
    }

    public boolean mapCompare(String f1, String f2) {
        return false;
    }
}
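The first comment in basicCompare mentions using common_words to strip the "noise" from both lists. How that list is provided by the lab is not shown here, so the following is only a sketch under assumptions: `COMMON_WORDS` is a hypothetical stand-in with made-up contents, `NoiseFilterSketch` and `stripNoise` are illustrative names, and matching is exact (case-sensitive).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NoiseFilterSketch {

    // Hypothetical stand-in for the lab's common_words list; the real
    // contents and the way it is loaded are not shown in the question.
    private static final Set<String> COMMON_WORDS =
            new HashSet<>(Arrays.asList("about", "there", "their", "would", "which"));

    // Returns a new list without the common words; the input list is not modified.
    public static List<String> stripNoise(List<String> words) {
        List<String> kept = new ArrayList<>(words);
        kept.removeAll(COMMON_WORDS); // drops every occurrence of each common word
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample words, for illustration only.
        List<String> words = Arrays.asList("about", "router", "there", "packet");
        System.out.println(stripNoise(words));
    }
}
```

Copying into a new `ArrayList` first matters because a list returned by `Arrays.asList` is fixed-size and would throw `UnsupportedOperationException` on `removeAll`.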
【Comments】:
-
Although you did post code, it would be better to show your effort on the core of the problem rather than only the scaffolding code around it.
Tags: java list collections comparison