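【Question title】: Word sense disambiguation with WordNet. How to select the words related to the same meaning?
【Posted】: 2017-09-25 03:33:26
【Question description】:

I am using WordNet and NLTK for word sense disambiguation. I am interested in all words that are related to sound. I have a list of such words, and "roll" is one of them. I check whether any of my sentences contains this word (I also check it against the POS). If it does, I want to keep only the sentences in which the word is used in its sound-related sense. In the example below, that would be the second sentence. My current idea is simply to select the senses whose definition contains the word "sound", as in "the sound of a drum (especially a snare drum) beaten rapidly and continuously". But I suspect there is a more elegant way. Any ideas would be much appreciated!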

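from nltk import word_tokenize  # this import was missing from the snippet
from nltk.wsd import lesk
from nltk.corpus import wordnet as wn

# Each sample is (sentence, WordNet POS tag passed to lesk).
samples = [('The van rolled along the highway.', 'n'),
           ('The thunder rolled and the lightning striked.', 'n')]

word = 'roll'
for sentence, pos_tag in samples:
    # lesk returns the synset whose definition overlaps most with the context tokens.
    word_syn = lesk(word_tokenize(sentence.lower()), word, pos_tag)
    print('Sentence:', sentence)
    print('Word synset:', word_syn)
    print('Corresponding definition:', word_syn.definition())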

Output:

Sentence: The van rolled along the highway.
Word synset: Synset('scroll.n.02')
Corresponding definition: a document that can be rolled up (as for storage)
Sentence: The thunder rolled and the lightning striked.
Word synset: Synset('paradiddle.n.01')
Corresponding definition: the sound of a drum (especially a snare drum) beaten rapidly and continuously
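
The definition-based filter described in the question could look roughly like the sketch below. The helper name `definition_mentions` is hypothetical; the only thing it assumes about its argument is a `definition()` method, which NLTK synsets provide. It matches the keyword as a whole word, so "sounds" or "soundly" would not count.

```python
import re


def definition_mentions(synset, keyword="sound"):
    """Return True if the synset's definition contains `keyword` as a whole word.

    `synset` is expected to offer a definition() method, as NLTK synsets do.
    """
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, synset.definition(), flags=re.IGNORECASE) is not None


# Possible usage with the loop above (needs NLTK and the WordNet corpus):
#   if word_syn is not None and definition_mentions(word_syn):
#       print('Sound-related:', sentence)
```

This is a plain string match, so it misses sound-related senses whose gloss happens not to use the word "sound".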

【Comments】:

  • adapted lesk

Tags: python nltk wordnet word-sense-disambiguation


【Solution 1】:

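You can use WordNet hypernyms (synsets with a more general meaning). My first idea would be to walk upwards from the current synset (using synset.hypernyms()) and keep checking whether I reach the "sound" synset. I would stop when I hit the root, i.e. a synset with no hypernyms, for which synset.hypernyms() returns an empty list.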
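
A minimal sketch of that upward walk, not necessarily the code that produced the output in this answer. The function name `hypernym_levels` is hypothetical, and following only the first hypernym at each level is a simplifying assumption: a synset may have several hypernyms, and the other branches are ignored here.

```python
def hypernym_levels(synset):
    """Climb from `synset` to a root, collecting the hypernym list seen at each step.

    `synset` is expected to offer a hypernyms() method, as NLTK synsets do.
    """
    levels = []
    current = synset
    while True:
        parents = current.hypernyms()
        if not parents:  # a root has no hypernyms, so the walk ends here
            return levels
        levels.append(parents)
        current = parents[0]  # assumption: follow only the first hypernym


# Possible usage (needs NLTK and the WordNet corpus):
#   from nltk.corpus import wordnet as wn
#   for level in hypernym_levels(wn.synset('paradiddle.n.01')):
#       print(level)
```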

For your two examples, this yields the following sequences of synsets:

Sentence:The van rolled along the highway .
Word synset:Synset('scroll.n.02')
[Synset('manuscript.n.02')]
[Synset('autograph.n.01')]
[Synset('writing.n.02')]
[Synset('written_communication.n.01')]
[Synset('communication.n.02')]
[Synset('abstraction.n.06')]
[Synset('entity.n.01')]

Sentence:The thunder rolled and the lightning striked .
Word synset:Synset('paradiddle.n.01')
[Synset('sound.n.04')]
[Synset('happening.n.01')]
[Synset('event.n.01')]
[Synset('psychological_feature.n.01')]
[Synset('abstraction.n.06')]
[Synset('entity.n.01')]

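So one of the synsets you might want to look for is sound.n.04. There may be others, though; I think you can experiment with more examples and try to come up with a list.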
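
Once such a list exists, the check might be sketched as below. The names `SOUND_SYNSETS` and `is_sound_related` are hypothetical, and the target set contains only sound.n.04 because that is the one synset identified above; anything else would have to be added after trying more examples. Unlike a walk along a single chain, this version explores every hypernym branch breadth-first.

```python
from collections import deque

# Assumption: a hand-maintained set of target synset names.
# Only sound.n.04 comes from the discussion above.
SOUND_SYNSETS = {"sound.n.04"}


def is_sound_related(synset, targets=SOUND_SYNSETS):
    """Return True if `synset` or any of its hypernym ancestors is named in `targets`.

    `synset` is expected to offer name() and hypernyms(), as NLTK synsets do.
    """
    queue = deque([synset])
    seen = set()
    while queue:
        current = queue.popleft()
        name = current.name()
        if name in seen:  # skip synsets already reached via another branch
            continue
        seen.add(name)
        if name in targets:
            return True
        queue.extend(current.hypernyms())
    return False


# Possible usage (needs NLTK and the WordNet corpus):
#   from nltk.corpus import wordnet as wn
#   print(is_sound_related(wn.synset('paradiddle.n.01')))
```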

【Comments】:

  • That seems like a good idea! Thanks!