Classification report for multilabel text classification? (Answers)

【Question title】: Classification report for multilabel text classification?
【Posted】: 2021-08-17 10:09:57
【Question】:

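I am working on multilabel text classification. I tried to print a classification report for my machine learning models, but it prints a separate report for each class. How can I get one classification report that covers all classes together? Here are the relevant parts of the code.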

Label code:

categories = list(data_raw.columns.values)
categories = categories[1:]

Evaluation:

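from sklearn import metrics
from sklearn.metrics import accuracy_score, multilabel_confusion_matrix

def modelEvaluation(predictions, y_test_set):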
    print("\nAccuracy on validation set: {:.4f}".format(accuracy_score(y_test_set, predictions)))
    print("\nClassification report : \n", metrics.classification_report(y_test_set, predictions))
    print("\nConfusion Matrix : \n", multilabel_confusion_matrix(y_test_set, predictions))

This is the machine learning part:

from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC


SVC_pipeline = Pipeline([
                    ('clf', OneVsRestClassifier(LinearSVC(), n_jobs=1)),
            ])


for category in categories:
    printmd('**Processing {} comments...**'.format(category))
    
    # Train the LinearSVC model on the training data for this category
    SVC_pipeline.fit(x_train, train[category])
    
    # calculating test accuracy
    prediction = SVC_pipeline.predict(x_test)
    print('Test accuracy is {}'.format(accuracy_score(test[category], prediction)))
    print("\n")
    
    modelEvaluation(prediction, test[category])

If I try to print the classification report separately, as in the code below, it only gives me the result for the last class:

from sklearn.metrics import classification_report
print("\nClassification report : \n", classification_report(test[category], prediction))

【Question comments】:

If an answer helps, please accept it and upvote.

Tags: python machine-learning scikit-learn classification multilabel-classification

【Solution 1】:

Instead of using test[category], pass the labels for the entire test set, covering all the classes you built models for.

print("\nClassification report : \n", metrics.classification_report(y_test, predictions))

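Here y_test holds the true labels (the ground-truth outputs) for the test set X_test.

You are passing the test set (X_test) rather than the labels of that test set (y_test).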
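To illustrate the suggestion above, here is a minimal sketch that fits OneVsRestClassifier once on the whole label matrix and prints a single report for all classes. The synthetic data from make_multilabel_classification and the category names are assumptions made only so the example runs on its own; they are not the asker's data.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Synthetic stand-in for the real data (assumption): X is the feature matrix,
# Y is a binary indicator matrix with one column per category.
X, Y = make_multilabel_classification(
    n_samples=300, n_features=20, n_classes=4, random_state=0
)
categories = ["cat_a", "cat_b", "cat_c", "cat_d"]  # hypothetical names

X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.25, random_state=0
)

# Fit once on the full label matrix; OneVsRestClassifier trains one
# LinearSVC per label column internally.
clf = OneVsRestClassifier(LinearSVC(max_iter=5000))
clf.fit(X_train, y_train)

# predictions has the same 2-D shape as y_test: (n_samples, n_categories)
predictions = clf.predict(X_test)

# One report with a row per category plus micro/macro/weighted/samples averages
report = classification_report(
    y_test, predictions, target_names=categories, zero_division=0
)
print(report)
```

With the asker's variables, this would presumably correspond to fitting on train[categories] and evaluating against test[categories], rather than looping over one column at a time.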

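【Comments】:

Please check the edit to the question to see how the class labels are defined. By the way, I tried your solution, but it did not work and gave me an error: TypeError: '<' not supported between instances of 'int' and 'str'
You should comment on whether the solution works, so that we know what the problem is.
Regarding the TypeError you reported: please add the full traceback so we can see which line is causing the problem. Again, to get a classification report for all classes, you need to pass the test set that contains all of those classes, not just a single class/category.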
ValueError Traceback (most recent call last)
<ipython-input-66-056fdfcdc434> in <module>
      2 test = test.drop(labels = ['story'], axis=1)
      3
----> 4 print("\nClassification report : \n", metrics.classification_report(test, prediction))
      5 print(test)

~\Anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
     71 FutureWarning)
@Anurag Dhadse
y_type, y_true, y_pred = _check_targets(y_true, y_pred)
   1930
   1931 labels_given = True

~\Anaconda3\lib\site-packages\sklearn\metrics\_classification.py in _check_targets(y_true, y_pred)
     89 if len(y_type) > 1:
     90 raise ValueError("Classification metrics can't handle a mix of {0} "
---> 91 "and {1} targets".format(type_true, type_pred))
     92
     93 # We can't have more than one value on y_type => The set is no more needed

ValueError: Classification metrics can't handle a mix of multilabel-indicator and binary targets
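The ValueError above says that the true labels are a 2-D indicator matrix (test with every category column) while prediction is still the 1-D output of the last loop iteration. If the per-category loop is kept, one possible fix is to collect each category's predictions and stack them column-wise before requesting the report. The sketch below assumes toy random data and made-up category names; only the variable names x_train, x_test, train, test and categories mirror the question.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import classification_report
from sklearn.svm import LinearSVC
from sklearn.utils.multiclass import type_of_target

# Hypothetical toy data standing in for the asker's features and label frames.
rng = np.random.RandomState(0)
x_train, x_test = rng.rand(80, 5), rng.rand(20, 5)
categories = ["sports", "politics", "tech"]  # made-up category names
train = pd.DataFrame(
    {c: (x_train[:, i] > 0.5).astype(int) for i, c in enumerate(categories)}
)
test = pd.DataFrame(
    {c: (x_test[:, i] > 0.5).astype(int) for i, c in enumerate(categories)}
)

per_category = []
for category in categories:
    model = LinearSVC()
    model.fit(x_train, train[category])
    per_category.append(model.predict(x_test))  # 1-D array of 0/1 for this category

# One column per category, matching the column order of test[categories]
all_predictions = np.column_stack(per_category)

print(type_of_target(per_category[-1]))  # binary
print(type_of_target(all_predictions))   # multilabel-indicator

report = classification_report(
    test[categories].to_numpy(), all_predictions,
    target_names=categories, zero_division=0,
)
print(report)
```

Because both arguments are now multilabel indicator matrices of the same shape, the target types match and the report covers every category in one table.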