CoreNLP: Can it tell whether a noun refers to a person?

【Question title】: CoreNLP: Can it tell whether a noun refers to a person?
【Posted】: 2019-09-01 11:38:08
【Question】:

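Can CoreNLP determine, out of the box, whether a common noun (as opposed to a proper noun or proper name) refers to a person? Or, if I need to train a model for this task, how should I go about it?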

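First, I'm not looking for coreference resolution, but rather for a building block of it. Coreference by definition depends on context, whereas I'm trying to evaluate whether a word in isolation is a subset of "person" or "human". For example: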

is_human('effort') # False
is_human('dog') # False
is_human('engineer') # True

My naive attempts with pretrained word vectors from Gensim and spaCy failed to rank "engineer" above the other two words.

import gensim.downloader as api
word_vectors = api.load("glove-wiki-gigaword-100") 
for word in ('effort', 'dog', 'engineer'):
    print(word, word_vectors.similarity(word, 'person'))

# effort 0.42303842
# dog 0.46886832
# engineer 0.32456854

I found the following lists from CoreNLP, which look promising.

dcoref.demonym                   // The path for a file that includes a list of demonyms 
dcoref.animate                   // The list of animate/inanimate mentions (Ji and Lin, 2009)
dcoref.inanimate 
dcoref.male                      // The list of male/neutral/female mentions (Bergsma and Lin, 2006) 
dcoref.neutral                   // Neutral means a mention that is usually referred by 'it'
dcoref.female 
dcoref.plural                    // The list of plural/singular mentions (Bergsma and Lin, 2006)
dcoref.singular

Would these be useful for my task? If so, how would I access them from a Python wrapper? Thanks.

【Comments】:

Tags: nlp stanford-nlp pycorenlp

【Solution 1】:

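I'd suggest looking at WordNet instead. It should work:

if WordNet covers enough of your terms, and
if the terms you want are hyponyms of person.n.01.

You'll have to expand this a bit to cover multiple word senses, but the gist is: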

from nltk.corpus import wordnet as wn

# True
wn.synset('person.n.01') in wn.synset('engineer.n.01').lowest_common_hypernyms(wn.synset('person.n.01'))

# False
wn.synset('person.n.01') in wn.synset('dog.n.01').lowest_common_hypernyms(wn.synset('person.n.01'))

See the NLTK documentation for lowest_common_hypernyms: http://www.nltk.org/howto/wordnet_lch.html
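
To cover multiple senses, one possible extension is sketched below. It is only an illustration under assumptions, not part of NLTK: the helper names has_hypernym and is_human and the max_senses parameter are made up for this example. The idea is to walk upward through the hypernym links of each noun sense and report whether any of them reaches person.n.01. It assumes NLTK is installed and the WordNet corpus has been downloaded (for example with nltk.download('wordnet')).

```python
from collections import deque


def has_hypernym(synset, target_name):
    """Return True if `synset` is, or descends from, the synset named `target_name`.

    Works on any object exposing name(), hypernyms() and instance_hypernyms(),
    which NLTK's Synset does.
    """
    seen = set()
    queue = deque([synset])
    while queue:
        current = queue.popleft()
        name = current.name()
        if name == target_name:
            return True
        if name in seen:
            continue
        seen.add(name)
        # Follow both ordinary and instance hypernym links upward.
        queue.extend(list(current.hypernyms()) + list(current.instance_hypernyms()))
    return False


def is_human(word, max_senses=None, target_name="person.n.01"):
    """Hypothetical helper: True if a noun sense of `word` falls under `target_name`.

    max_senses=None checks every noun sense; a small integer restricts the
    check to the first senses that WordNet lists for the word.
    """
    # Imported lazily so has_hypernym stays usable without NLTK installed.
    from nltk.corpus import wordnet as wn

    senses = wn.synsets(word, pos=wn.NOUN)
    if max_senses is not None:
        senses = senses[:max_senses]
    return any(has_hypernym(sense, target_name) for sense in senses)
```

Checking every sense is permissive: WordNet also records informal senses of "dog" that denote people, so is_human('dog') may well come back True unless the check is limited, for example with is_human('dog', max_senses=1). Which cutoff suits your vocabulary is something to tune against your own word list.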

【Comments】:

This is exactly what I needed. Thanks.