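[Posted]: 2012-01-31 22:18:20
[Question]:
I am trying to use the word alignment from the BerkeleyAligner.jar file at http://code.google.com/p/berkeleyaligner/ in my own Java class.
I have already added the .jar file to my build path.
What parameters does edu.berkeley.nlp.wordAlignment.combine.CombinedAligner take?
What does the output of edu.berkeley.nlp.wordAlignment.combine.CombinedAligner mean?
I have 2 input files that are already sentence-aligned; i.e. the sentence on line X of sourceFile is the same sentence as the one on line X of targetFile (but in a different language).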
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Arrays;
import edu.berkeley.nlp.wa.mt.Alignment;
import edu.berkeley.nlp.wa.mt.SentencePair;

public class TestAlign {
    public static void main(String[] args) throws IOException {
        try (BufferedReader brSrc = new BufferedReader(new FileReader("sourceFile"));
             BufferedReader brTrg = new BufferedReader(new FileReader("targetFile"))) {
            int sentCounter = 0;
            String currentSrcLine;
            while ((currentSrcLine = brSrc.readLine()) != null) {
                String currentTrgLine = brTrg.readLine();
                // Reads into BerkeleyAligner SentencePair format.
                // (params is not shown in this snippet.)
                SentencePair src2trg = new SentencePair(sentCounter++, params.get("source"),
                        Arrays.asList(currentSrcLine.split(" ")),
                        Arrays.asList(currentTrgLine.split(" ")));
                // How do I call the BerkeleyAligner?
                // - What parameters does the CombinedAligner take?
                // - What does the function/class return?
                //   I assume it returns a list of strings.
                // - Is there a class in BerkeleyAligner to read the output?
                // Please provide an example, thank you!
                // Placeholder: I do not know the actual call to make here.
                Alignment output = edu.berkeley.nlp.wordAlignment.combine.CombinedAligner
                        .something.something(currentSrcLine, currentTrgLine);
            }
        }
    }
}
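To separate the file handling from the aligner question, here is a minimal sketch of just the lockstep reading, with no BerkeleyAligner dependency. The class name `ParallelLines` and the method `readPairs` are hypothetical helpers made up for illustration; the sketch only assumes that line X of one input corresponds to line X of the other, and it fails loudly if the two inputs have different line counts.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of BerkeleyAligner.
class ParallelLines {

    /** Reads both inputs in lockstep; element i holds {sourceLine, targetLine} for line i. */
    static List<String[]> readPairs(Reader source, Reader target) throws IOException {
        List<String[]> pairs = new ArrayList<>();
        try (BufferedReader src = new BufferedReader(source);
             BufferedReader trg = new BufferedReader(target)) {
            String srcLine = src.readLine();
            String trgLine = trg.readLine();
            while (srcLine != null && trgLine != null) {
                pairs.add(new String[] {srcLine, trgLine});
                srcLine = src.readLine();
                trgLine = trg.readLine();
            }
            // One input ended before the other: the inputs are not sentence-aligned.
            if (srcLine != null || trgLine != null) {
                throw new IllegalArgumentException(
                        "Line counts differ after " + pairs.size() + " pairs");
            }
        }
        return pairs;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins for sourceFile and targetFile.
        Reader src = new StringReader("that is the second line.\nfoo bar likes to eat bar foo.\n");
        Reader trg = new StringReader("das ist die zweite Zeile.\nfoo bar gerne bar foo essen.\n");
        for (String[] pair : readPairs(src, trg)) {
            System.out.println(pair[0] + " ||| " + pair[1]);
        }
    }
}
```

With real files, `new FileReader("sourceFile")` and `new FileReader("targetFile")` can be passed in place of the `StringReader` instances.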
Example source file:
this is the first line in the textfile.
that is the second line.
foo bar likes to eat bar foo.
Example target file:
Dies ist die erste Textzeile in der Datei.
das ist die zweite Zeile.
foo bar gerne bar foo essen.
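For reference, this sketch shows what whitespace tokenization produces for lines like the ones above, since those token lists are what would be passed to SentencePair. The class `TokenIndexDemo` and the method `tokenize` are hypothetical names for illustration. Note that punctuation stays attached to the last word (for example `textfile.`), so if the aligner reports links as token positions (an assumption, not something confirmed for this jar), the positions would refer to these tokens.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not part of BerkeleyAligner.
class TokenIndexDemo {

    /** Splits on runs of whitespace; roughly what split(" ") does for single-spaced text. */
    static List<String> tokenize(String sentence) {
        String trimmed = sentence.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    /** Prints each token together with its zero-based position. */
    static void printIndexed(String label, List<String> tokens) {
        System.out.println(label);
        for (int i = 0; i < tokens.size(); i++) {
            System.out.println("  " + i + ": " + tokens.get(i));
        }
    }

    public static void main(String[] args) {
        printIndexed("source", tokenize("this is the first line in the textfile."));
        printIndexed("target", tokenize("Dies ist die erste Textzeile in der Datei."));
    }
}
```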
[Comments]:
Tags: java text word translation alignment