作为对 OP 请求的实际回答,这里是几个 WSD 方法的 python 实现,它以 NLTK 的 synset(s) 的形式返回意义,https://github.com/alvations/pywsd
包括
-
Lesk算法(包括原始Lesk、改编的Lesk和简单的Lesk)
-
基线算法(随机感觉、第一感觉、最常见感觉)
可以这样使用:
#!/usr/bin/env python -*- coding: utf-8 -*-
bank_sents = ['I went to the bank to deposit my money',
'The river bank was full of dead fishes']
plant_sents = ['The workers at the industrial plant were overworked',
'The plant was no longer bearing flowers']
print "======== TESTING simple_lesk ===========\n"
from lesk import simple_lesk
print "#TESTING simple_lesk() ..."
print "Context:", bank_sents[0]
answer = simple_lesk(bank_sents[0],'bank')
print "Sense:", answer
print "Definition:",answer.definition
print
print "#TESTING simple_lesk() with POS ..."
print "Context:", bank_sents[1]
answer = simple_lesk(bank_sents[1],'bank','n')
print "Sense:", answer
print "Definition:",answer.definition
print
print "#TESTING simple_lesk() with POS and stems ..."
print "Context:", plant_sents[0]
answer = simple_lesk(plant_sents[0],'plant','n', True)
print "Sense:", answer
print "Definition:",answer.definition
print
print "======== TESTING baseline ===========\n"
from baseline import random_sense, first_sense
from baseline import max_lemma_count as most_frequent_sense
print "#TESTING random_sense() ..."
print "Context:", bank_sents[0]
answer = random_sense('bank')
print "Sense:", answer
print "Definition:",answer.definition
print
print "#TESTING first_sense() ..."
print "Context:", bank_sents[0]
answer = first_sense('bank')
print "Sense:", answer
print "Definition:",answer.definition
print
print "#TESTING most_frequent_sense() ..."
print "Context:", bank_sents[0]
answer = most_frequent_sense('bank')
print "Sense:", answer
print "Definition:",answer.definition
print
[出]:
======== TESTING simple_lesk ===========
#TESTING simple_lesk() ...
Context: I went to the bank to deposit my money
Sense: Synset('depository_financial_institution.n.01')
Definition: a financial institution that accepts deposits and channels the money into lending activities
#TESTING simple_lesk() with POS ...
Context: The river bank was full of dead fishes
Sense: Synset('bank.n.01')
Definition: sloping land (especially the slope beside a body of water)
#TESTING simple_lesk() with POS and stems ...
Context: The workers at the industrial plant were overworked
Sense: Synset('plant.n.01')
Definition: buildings for carrying on industrial labor
======== TESTING baseline ===========
#TESTING random_sense() ...
Context: I went to the bank to deposit my money
Sense: Synset('deposit.v.02')
Definition: put into a bank account
#TESTING first_sense() ...
Context: I went to the bank to deposit my money
Sense: Synset('bank.n.01')
Definition: sloping land (especially the slope beside a body of water)
#TESTING most_frequent_sense() ...
Context: I went to the bank to deposit my money
Sense: Synset('bank.n.01')
Definition: sloping land (especially the slope beside a body of water)