【Posted】: 2012-04-01 15:47:15
【Question】:
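I am trying to do natural language processing with the Python library NLTK.
My problem: I am trying to perform stemming, i.e. reduce words to their normalized form, but it is not producing correct words. Am I using the stemming classes correctly? And how can I get the results I am after?
I want to normalize the following words: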
words = ["forgot","forgotten","there's","myself","remuneration"]
...into this:
words = ["forgot","forgot","there","myself","remunerate"]
My code:
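from nltk import stem

# Assumption: the post never defines `stemmer`. PorterStemmer is assumed here
# because it is consistent with the output shown below.
stemmer = stem.PorterStemmer()
words = ["forgot","forgotten","there's","myself","remuneration"]
for word in words:
    print(stemmer.stem(word))
# output is:
# forgot forgotten there' myself remuner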
【Discussion】:
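A rough illustration of why the output in the question is not a list of dictionary words: a stemmer strips suffixes by rule without consulting a vocabulary, while the mapping the question asks for (forgotten to forgot, remuneration to remunerate) needs word-specific knowledge. The sketch below uses only the standard library. `toy_stem`, `SUFFIX_RULES`, `EXCEPTIONS`, and `normalize` are hypothetical names made up for this example. They are not part of NLTK, and the two suffix rules are not the Porter algorithm; they were picked only so that this toy reproduces the output reported in the question.

```python
# Toy illustration only. These rules and this table are hypothetical;
# they are NOT NLTK's API and NOT the Porter stemming algorithm.

WORDS = ["forgot", "forgotten", "there's", "myself", "remuneration"]

# Rule-based stripping: the first matching suffix is removed, with no check
# that the result is a real word.
SUFFIX_RULES = ["ation", "s"]


def toy_stem(word):
    """Strip the first matching suffix from `word`, if any."""
    for suffix in SUFFIX_RULES:
        if word.endswith(suffix):
            return word[:-len(suffix)]
    return word


# Dictionary-style normalization: irregular forms are looked up explicitly.
# The entries are the mappings the question asks for.
EXCEPTIONS = {
    "forgotten": "forgot",
    "there's": "there",
    "remuneration": "remunerate",
}


def normalize(word):
    """Return the listed base form, or the word unchanged if it is not listed."""
    return EXCEPTIONS.get(word, word)


print([toy_stem(w) for w in WORDS])
print([normalize(w) for w in WORDS])
```

The first line printed matches the stems reported in the question, and the second matches the desired list. Getting the second behavior from a real library generally means using something vocabulary-aware (a lemmatizer or an exception list) rather than a suffix-stripping stemmer alone.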