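【Posted】: 2011-06-28 21:26:26
【Question】:
I am using the Java WordNet Library (JWNL) for my project. I need to find the base form of a word before processing it. For example, given "sent", the base form should be "send"; likewise, for "dispatched" the base form should be "dispatch". I have read the JWNL documentation, but I find it confusing. Could someone provide a piece of code that finds the base form? Thanks in advance.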
【Question comments】:
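I use JAWS because I find it better than JWNL. Try this code, which looks up a word (or a possible base form of it) and prints the matching word forms and glosses: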
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import edu.smu.tspell.wordnet.Synset;
import edu.smu.tspell.wordnet.WordNetDatabase;

/**
 * Displays word forms and definitions for synsets containing a word form
 * read from standard input. The lookup also matches possible base forms
 * of the text that was entered. To use this application, run:
 * <br>
 * java start
 */
public class start
{
    /**
     * Main entry point. Prompts for a word form, prints the matching synsets,
     * and repeats until an empty line or end of input is read.
     */
    public static void main(String[] args)
    {
        // Path to the WordNet "dict" directory; change it to match your installation.
        System.setProperty("wordnet.database.dir", "/home/dell/workspace/wordnet/WordNet-3.0/dict");
        WordNetDatabase database = WordNetDatabase.getFileInstance();
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        while (true)
        {
            System.out.print("\nEnter your query: ");
            String wordForm = null;
            try {
                wordForm = br.readLine();
            } catch (IOException e) {
                System.out.println("Error!");
                System.exit(1);
            }
            // readLine() returns null at end of input; stop then, or on an empty line.
            if (wordForm == null || wordForm.trim().isEmpty())
            {
                break;
            }
            wordForm = wordForm.trim();
            System.out.println("You're looking for: " + wordForm);
            Synset[] synsets = database.getSynsets(wordForm);
            // Display the word forms and definitions for synsets retrieved
            if (synsets.length > 0)
            {
                System.out.println("The following synsets contain '" +
                        wordForm + "' or a possible base form " +
                        "of that text:");
                for (int i = 0; i < synsets.length; i++)
                {
                    System.out.println("");
                    String[] wordForms = synsets[i].getWordForms();
                    for (int j = 0; j < wordForms.length; j++)
                    {
                        System.out.print((j > 0 ? ", " : "") +
                                wordForms[j]);
                    }
                    System.out.println(": " + synsets[i].getDefinition());
                }
            }
            else
            {
                System.err.println("No synsets exist that contain " +
                        "the word form '" + wordForm + "'");
            }
        }
    }
}
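For readers wondering how a lookup can match "a possible base form" at all, here is a minimal, self-contained sketch of the general idea behind WordNet-style base-form lookup: check a list of irregular forms first, then try suffix detachment rules and keep only candidates that exist in the dictionary. It does not call JAWS or JWNL. The class name `MorphySketch`, the method `baseForms`, and the tiny in-memory word lists are hypothetical stand-ins for WordNet's exception and index files, and only verbs are covered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class MorphySketch {
    // Hypothetical stand-in for a verb exception list (irregular form -> base form).
    static final Map<String, String> VERB_EXCEPTIONS = new HashMap<>();
    // Hypothetical stand-in for the set of verb lemmas known to the dictionary.
    static final Set<String> VERB_LEMMAS =
            new HashSet<>(Arrays.asList("send", "dispatch", "go", "hope"));
    // Verb detachment rules as {suffix, replacement} pairs.
    static final String[][] VERB_RULES = {
        {"s", ""}, {"ies", "y"}, {"es", "e"}, {"es", ""},
        {"ed", "e"}, {"ed", ""}, {"ing", "e"}, {"ing", ""}
    };

    static {
        VERB_EXCEPTIONS.put("sent", "send");
        VERB_EXCEPTIONS.put("went", "go");
    }

    /** Returns candidate verb base forms for the given word, in discovery order. */
    static List<String> baseForms(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        Set<String> found = new LinkedHashSet<>();
        // 1. The word may already be a base form.
        if (VERB_LEMMAS.contains(w)) {
            found.add(w);
        }
        // 2. Irregular forms come from the exception list.
        String irregular = VERB_EXCEPTIONS.get(w);
        if (irregular != null) {
            found.add(irregular);
        }
        // 3. Regular forms: strip a suffix and keep candidates the dictionary knows.
        for (String[] rule : VERB_RULES) {
            if (w.endsWith(rule[0])) {
                String candidate = w.substring(0, w.length() - rule[0].length()) + rule[1];
                if (VERB_LEMMAS.contains(candidate)) {
                    found.add(candidate);
                }
            }
        }
        return new ArrayList<>(found);
    }

    public static void main(String[] args) {
        System.out.println(baseForms("sent"));        // [send]
        System.out.println(baseForms("dispatched"));  // [dispatch]
    }
}
```

The dictionary check is what separates this from a plain stemmer: a candidate such as "dispatche" is produced by the "ed" to "e" rule but discarded because it is not a known lemma.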
【Comments】:
I suggest trying the Porter stemmer algorithm instead of WordNet. You can find implementations in most languages, including Java.
This should get you what you want.
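To give a feel for what a Porter-style stemmer does with the two words from the question, here is a minimal sketch of only steps 1a and 1b of the Porter algorithm (plural and "-ed"/"-ing" removal). It is not a full Porter stemmer, and the class and method names (`PorterStep1Sketch`, `stem`, and the helpers) are made up for illustration. Note that suffix stripping handles "dispatched" but leaves irregular forms such as "sent" unchanged.

```java
import java.util.Locale;

public class PorterStep1Sketch {
    // True if the letter at index i is a consonant in Porter's sense
    // ('y' counts as a consonant only at the start or after a vowel).
    static boolean isConsonant(String w, int i) {
        char c = w.charAt(i);
        if ("aeiou".indexOf(c) >= 0) return false;
        if (c == 'y') return i == 0 || !isConsonant(w, i - 1);
        return true;
    }

    // Porter's measure m: the number of vowel-to-consonant transitions.
    static int measure(String s) {
        int m = 0;
        boolean prevVowel = false;
        for (int i = 0; i < s.length(); i++) {
            boolean cons = isConsonant(s, i);
            if (cons && prevVowel) m++;
            prevVowel = !cons;
        }
        return m;
    }

    static boolean containsVowel(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!isConsonant(s, i)) return true;
        }
        return false;
    }

    static boolean endsDoubleConsonant(String s) {
        int n = s.length();
        return n >= 2 && s.charAt(n - 1) == s.charAt(n - 2) && isConsonant(s, n - 1);
    }

    // Ends consonant-vowel-consonant, where the last consonant is not w, x or y.
    static boolean endsCvc(String s) {
        int n = s.length();
        if (n < 3) return false;
        char last = s.charAt(n - 1);
        return isConsonant(s, n - 3) && !isConsonant(s, n - 2) && isConsonant(s, n - 1)
                && last != 'w' && last != 'x' && last != 'y';
    }

    // Step 1a: plural endings.
    static String step1a(String w) {
        if (w.endsWith("sses") || w.endsWith("ies")) return w.substring(0, w.length() - 2);
        if (w.endsWith("ss")) return w;
        if (w.endsWith("s")) return w.substring(0, w.length() - 1);
        return w;
    }

    // Step 1b: "-eed", "-ed" and "-ing", plus the tidy-up rules that follow a removal.
    static String step1b(String w) {
        if (w.endsWith("eed")) {
            String stem = w.substring(0, w.length() - 3);
            return measure(stem) > 0 ? stem + "ee" : w;
        }
        String stem = null;
        if (w.endsWith("ed")) stem = w.substring(0, w.length() - 2);
        else if (w.endsWith("ing")) stem = w.substring(0, w.length() - 3);
        if (stem == null || !containsVowel(stem)) return w;
        if (stem.endsWith("at") || stem.endsWith("bl") || stem.endsWith("iz")) return stem + "e";
        if (endsDoubleConsonant(stem)) {
            char last = stem.charAt(stem.length() - 1);
            boolean keepDouble = last == 'l' || last == 's' || last == 'z';
            return keepDouble ? stem : stem.substring(0, stem.length() - 1);
        }
        if (measure(stem) == 1 && endsCvc(stem)) return stem + "e";
        return stem;
    }

    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        if (w.length() <= 2) return w; // very short words are left alone
        return step1b(step1a(w));
    }

    public static void main(String[] args) {
        System.out.println(stem("dispatched")); // dispatch
        System.out.println(stem("hopping"));    // hop
        System.out.println(stem("hoping"));     // hope
        System.out.println(stem("sent"));       // sent (irregular form, not handled)
    }
}
```

A stemmer also returns stems rather than dictionary words (for example, step 1a turns "ponies" into "poni"), so if real base forms such as "send" are required, a dictionary-backed lookup is the safer choice.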
【Comments】: