Stanford NLP annotation ranking or score

【Question title】: Stanford NLP annotation ranking or score
【Posted】: 2018-03-12 20:19:54
【Question description】:

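I'm using the Stanford CoreNLP pipeline and getting the TreeAnnotation and BasicDependenciesAnnotation from the SentencesAnnotation.
I'm looking for a way to tell how certain the parser is about the POS tags and the dependency structure.

I remember that when I was tinkering with the Stanford NLP library a while back, I saw somewhere that multiple trees with different rankings were returned for the same sentence. I can't find any information on how to get this from the parser or the pipeline.

As far as I can tell, the DependencyScoring class operates on TypedDependency objects, not on anything the pipeline produces as part of the annotation process.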

Edit: code details:

Annotation document = new Annotation("This is my sentence");
pipeline.annotate(document); 
List<CoreMap> sentences = document.get(SentencesAnnotation.class);
...
Tree tree = sentence1.get(TreeAnnotation.class);
SemanticGraph dependencies = sentence1.get(CollapsedCCProcessedDependenciesAnnotation.class);

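【Comments】:

How are you producing the dependency parses? Are you getting them from the parse annotator? If so, the dependencies are actually produced by a deterministic conversion; your only probabilistic measure would come from the PCFG parse that the conversion starts from. If that's the case, I can give more details.
Basically I do "Annotation document = new Annotation("This is my sentence"); pipeline.annotate(document); List<CoreMap> sentences = document.get(SentencesAnnotation.class);" and then get the TreeAnnotation and the dependency graph. Yes, please elaborate on the PCFG approach.
@JonGauthier Also, is it possible to see the likelihood/probability of word pairs in the dependencies? E.g., how likely is it to encounter an "MD" -> "JJ" or "will" -> "able" relation? I can post this as a separate question if you prefer.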

Tags: nlp stanford-nlp

【Solution 1】:

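If you're using the default CoreNLP pipeline (i.e., with the parse annotator rather than depparse), the dependency parses you see come from a deterministic conversion of the sentence's constituency parse. The best "score" you can get here is the score of the candidate constituency parse that ultimately yielded the dependency parse (after conversion).

However, you'll need to step outside the CoreNLP pipeline for this particular job. If you have a LexicalizedParser instance, you can get the k-best parses (with attached scores) like so: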

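import java.util.List;

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.parser.common.ParserQuery;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.trees.GrammaticalStructure;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.TypedDependency;
import edu.stanford.nlp.util.ScoredObject;

List<CoreLabel> mySentence = ...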

LexicalizedParser parser = LexicalizedParser.loadModel("edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");
ParserQuery pq = parser.parserQuery();
if (pq.parse(mySentence)) {
  // Get best parse and associated score
  Tree parse = pq.getBestPCFGParse();
  double score = pq.getPCFGScore();

  // Print parse
  parse.pennPrint();

  // ----
  // Get every parse tied for the best score; for a ranked k-best list,
  // ParserQuery also offers getKBestPCFGParses(k)
  List<ScoredObject<Tree>> bestParses = pq.getBestPCFGParses();

  // ----
  // Convert a constituency parse to dependency representation
  GrammaticalStructure gs = parser.treebankLanguagePack()
      .grammaticalStructureFactory().newGrammaticalStructure(parse);
  List<TypedDependency> dependencies = gs.typedDependenciesCCprocessed();
  System.out.println(dependencies);
}
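
To turn those scores into something closer to "how sure is the parser", one option is to normalize the scores of the candidate parses against each other. The sketch below is not part of CoreNLP: KBestConfidence and normalize are hypothetical names, and the input values are made up to stand in for the ScoredObject.score() values of a k-best list. It assumes the PCFG scores are finite log probabilities, and it only gives each parse's weight relative to the other candidates in the list, not a true posterior over all possible parses.

```java
import java.util.Arrays;

// Hypothetical helper, not part of CoreNLP: converts k-best log scores
// into relative weights that sum to 1 over the candidate list.
final class KBestConfidence {

  // Softmax over log scores, computed with the log-sum-exp trick so that
  // very negative log probabilities do not underflow to zero.
  // Assumes every score is finite.
  static double[] normalize(double[] logScores) {
    if (logScores.length == 0) {
      return new double[0];
    }
    double max = Arrays.stream(logScores).max().getAsDouble();
    double sum = 0.0;
    for (double s : logScores) {
      sum += Math.exp(s - max);
    }
    double logZ = max + Math.log(sum);
    double[] weights = new double[logScores.length];
    for (int i = 0; i < logScores.length; i++) {
      weights[i] = Math.exp(logScores[i] - logZ);
    }
    return weights;
  }

  public static void main(String[] args) {
    // Made-up values standing in for ScoredObject.score() of each parse.
    double[] logScores = {-42.1, -43.0, -45.7};
    System.out.println(Arrays.toString(normalize(logScores)));
  }
}
```

If the top parse takes nearly all of the weight, the parser saw little competition among its candidates; if the weight is spread out, the sentence is ambiguous under the grammar. Because the dependency conversion is deterministic, the same weight carries over to the dependency parse derived from each tree.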