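Getting numbered dependency triples in Python from the Stanford dependency parser

【Question Title】: Getting numbered dependency triples in python from stanford dependency parser
【Posted】: 2017-09-19 16:28:35
【Question】:

This problem has had me thinking for quite a while, but I have not found a satisfying solution.

I am currently using the Stanford dependency parser in Python, and the code below gives me this output.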

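from nltk.parse.stanford import StanfordDependencyParser

# path_to_jar and path_to_models_jar must point to the local Stanford parser jar files
phrase = "If there is a moose in the oven, is there also an elephant?"
dependency_parser = StanfordDependencyParser(path_to_jar=path_to_jar, path_to_models_jar=path_to_models_jar)
test = dependency_parser.raw_parse(phrase)
dep = next(test)  # raw_parse returns an iterator of dependency graphs; next() works in Python 2 and 3

list(dep.triples())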

((u'is', u'VBZ'), u'advcl', (u'is', u'VBZ'))

((u'is', u'VBZ'), u'mark', (u'If', u'IN'))

((u'is', u'VBZ'), u'expl', (u'there', u'EX'))

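and so on...

But what I actually need is a representation that includes each word's position in the original sentence, because the final application will handle long sentences in which the same word occurs several times. For example:

mark(is-3, If-1)

Thanks in advance for any ideas on how to generate this kind of output!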

【Comments】:

Tags: python dependencies nlp stanford-nlp

【Solution 1】:

If you run the Java server and access it through a Python client, you can get the token indices in the returned JSON.

Here is information on starting the Java Stanford CoreNLP server:

https://stanfordnlp.github.io/CoreNLP/corenlp-server.html

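I suggest installing the stanza Python module and using the client it provides for the Stanford CoreNLP server.

Information on installing and using stanza can be found here: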

https://github.com/stanfordnlp/stanza

The returned JSON with the dependencies will include the indices of the tokens.

【Comments】:

Thank you very much! :)
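As a rough sketch of the last step, the snippet below turns one sentence of the server's JSON into strings such as mark(is-3, If-1). It assumes the JSON has already been obtained (for example through stanza's CoreNLPClient with JSON output) and that each dependency entry carries the "dep", "governor", "governorGloss", "dependent" and "dependentGloss" fields under "basicDependencies". The helper name numbered_triples is hypothetical, and sample_sentence is hand-written for illustration, not real parser output.

```python
def numbered_triples(sentence_json, key="basicDependencies"):
    """Format one sentence's dependency edges as rel(governor-index, dependent-index).

    sentence_json is assumed to be a single element of the "sentences" list
    in the CoreNLP server's JSON response.
    """
    triples = []
    for edge in sentence_json[key]:
        # Governor index 0 is the artificial ROOT node; skip that edge.
        if edge["governor"] == 0:
            continue
        triples.append("{}({}-{}, {}-{})".format(
            edge["dep"],
            edge["governorGloss"], edge["governor"],
            edge["dependentGloss"], edge["dependent"],
        ))
    return triples


# Hand-written sample in the assumed JSON shape (not actual parser output).
sample_sentence = {
    "basicDependencies": [
        {"dep": "ROOT", "governor": 0, "governorGloss": "ROOT",
         "dependent": 3, "dependentGloss": "is"},
        {"dep": "mark", "governor": 3, "governorGloss": "is",
         "dependent": 1, "dependentGloss": "If"},
        {"dep": "expl", "governor": 3, "governorGloss": "is",
         "dependent": 2, "dependentGloss": "there"},
    ]
}

for triple in numbered_triples(sample_sentence):
    print(triple)
```

Because the indices are 1-based token positions, two occurrences of the same word (such as the two instances of "is" in the example sentence) stay distinguishable.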