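【Posted】: 2018-07-06 21:43:27
【Question】:
I'm fairly new to Python/NLTK, so forgive me if this is a basic question.
The classifier seems to train and classify fine, but when I try to retrieve the accuracy via nltk.classify.accuracy, I get a ValueError.
Is this related to the training set being wrapped as [({xxx})] while the test set is wrapped as [xxx]?
The error states: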
results = classifier.classify_many([fs for (fs, l) in gold])
ValueError: too many values to unpack (expected 2)
Code:
import nltk
from nltk.tokenize import word_tokenize

train = [('train', 'train'),
('next train in', 'train'),
('When is the next train', 'train'),
('How long until the next train', 'train'),
("Where is the next train", 'train'),
('dart', 'train'),
('next dart in', 'train'),
('When is the next dart', 'train'),
('How long until the next dart', 'train'),
("Where is the next dart", 'train'),
("Show me where", 'map'),
("Directions to", 'map'),
('map', 'map')]
all_words = set(word.lower() for passage in train for word in word_tokenize(passage[0]))
t = [({word: (word in word_tokenize(x[0])) for word in all_words}, x[1]) for x in train]
classifier = nltk.NaiveBayesClassifier.train(t)
classifier.show_most_informative_features()
test_sentence = 'Whatever my message is, hopefully something about trains'
test_sent_features = {word.lower(): (word in word_tokenize(test_sentence.lower())) for word in all_words}
print(classifier.classify(test_sent_features))
print(nltk.classify.accuracy(classifier, test_sent_features))
I'm sure I'm overlooking something simple, but I can't seem to spot it. Any input on this would be appreciated, thanks.
【Discussion】:
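The traceback line `[fs for (fs, l) in gold]` suggests that nltk.classify.accuracy expects its second argument to be a list of (featureset, label) pairs, the same shape as the training list `t`. In the question, a bare feature dict is passed instead, so iterating over it yields its string keys, and unpacking a key longer than two characters into `(fs, l)` fails. The snippet below is a minimal pure-Python sketch of that behaviour, not NLTK's own code: `featuresets_of` is a hypothetical helper that only mirrors the comprehension from the traceback, the small dict stands in for `test_sent_features`, and the label "train" is an assumed expected answer for the test sentence.

```python
# Hand-written stand-in for the question's test_sent_features dict.
test_sent_features = {"train": False, "next": False, "map": False}


def featuresets_of(gold):
    """Hypothetical helper mirroring the comprehension in the traceback."""
    return [fs for (fs, label) in gold]


# Passing the bare dict: iteration yields the keys ("train", ...), and a
# 5-character string cannot be unpacked into two names.
try:
    featuresets_of(test_sent_features)
    message = None
except ValueError as err:
    message = str(err)

# Passing a list of (featureset, label) pairs unpacks cleanly.
# "train" is an assumed gold label for the test sentence.
gold = [(test_sent_features, "train")]
extracted = featuresets_of(gold)

print(message)
print(extracted)
```

Under that reading, the last line of the question's code would become something like `nltk.classify.accuracy(classifier, [(test_sent_features, 'train')])`. With a single test item the score can only be 0 or 1, so a larger labelled test list would give a more meaningful number.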