How to identify the exact instances that are misclassified in Weka (answers)

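【Question title】: How to identify the exact instances that are wrongly classified in Weka
【Posted】: 2015-06-22 16:40:49
【Question description】:

This is my code; I am using the Weka API. I want to print the instances that are misclassified and the instances that are classified correctly. Please help me, or point me to any other Java text-classification API that can do what I want.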

    public void evaluation() throws Exception{
    BufferedReader reader=null;
    reader= new BufferedReader(new FileReader("SparseDTM.arff"));

    Instances train= new Instances(reader);
    train.setClassIndex(0);
    train.toSummaryString();
    reader.close();
    SMO svm=new SMO();
    svm.buildClassifier(train);

    NaiveBayes nB = new NaiveBayes();
    nB.buildClassifier(train);

    weka.classifiers.Evaluation eval= new weka.classifiers.Evaluation(train);
    eval.crossValidateModel(nB, train,10,new Random(1));
    //eval.crossValidateModel(nB, train,10,new Random(1), new Object[] { });

    System.out.println("\n\t************Results by Naive Bayes Classifier************\n");
    System.out.println(eval.toSummaryString("", true));
    System.out.println(eval.toClassDetailsString());
//  System.out.println("F Measure: "+eval.fMeasure(1) + " " + "Precision: "+eval.precision(1) + " " + "Recall: "+eval.recall(1));
//  System.out.println("Correct :" + eval.correct());
//  System.out.println("Weighted True Negative Rate: " + eval.weightedTrueNegativeRate());
//  System.out.println("Weighted False Positive Rate:" + eval.weightedFalsePositiveRate());
//  System.out.println("Weighted False Negative Rate:" + eval.weightedFalseNegativeRate());
//  System.out.println("Weighted True Positive Rate:" + eval.weightedTruePositiveRate());
    System.out.println(eval.toMatrixString());
    }

【Question discussion】:

Tags: weka text-classification

【Solution 1】:

The following method may help you solve the problem; you can adapt it to reach your goal.

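import java.io.BufferedReader;
import java.io.FileReader;
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.Instances;

public void showPredictions() {
    try {
        BufferedReader reader = new BufferedReader(new FileReader("SparseDTM.arff"));
        Instances data = new Instances(reader);
        reader.close();

        // The class attribute must be set before training. This assumes it is the
        // last attribute; the question's code uses index 0 instead.
        int classIndex = data.numAttributes() - 1;
        data.setClassIndex(classIndex);

        NaiveBayes classifier = new NaiveBayes();
        classifier.buildClassifier(data);

        // Here the training data doubles as the test set; assign a separate set if you have one.
        Instances testData = data;

        // evaluateModel returns the predicted class value for each test instance
        Evaluation eval = new Evaluation(data);
        double[] predictions = eval.evaluateModel(classifier, testData);

        System.out.println("predictions: ");
        for (int i = 0; i < testData.numInstances(); i++) {
            double realValue = testData.instance(i).classValue();
            System.out.print("Real Value: " + testData.instance(i).stringValue(classIndex));
            System.out.println("\tClassification predicted value: " + predictions[i]);

            if (realValue != predictions[i]) {
                System.out.println("misclassified instance: " + testData.instance(i).toString());
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}

As written, "testData" points at "data", so the method reports the instances that are misclassified on the training set. To evaluate on unseen data instead, load a separate test set and assign it to "testData".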

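【Discussion】:

This solution is for a test data set, after the model has been built, but what about cross-validation? And thank you for your answer.
Your solution showed me the way to do it, thank you very much. I've got it working... :)
For cross-validation, you have to apply the solution in each fold to observe the misclassified instances. In that case, take a look at this source code [link] (weka.wikispaces.com/…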
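
The per-fold approach described in the last comment could look roughly like the sketch below. It is not the code behind the truncated link; it is a minimal illustration that assumes Weka 3.7 or later (for `AbstractClassifier.makeCopy`), a nominal class attribute whose index has already been set, and a hypothetical helper class named `FoldMisclassifications`. Each fold trains a fresh copy of the classifier on `trainCV` and compares `classifyInstance` against the actual class value on `testCV`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

import weka.classifiers.AbstractClassifier;
import weka.classifiers.Classifier;
import weka.core.Instance;
import weka.core.Instances;

// Hypothetical helper, not part of Weka.
public class FoldMisclassifications {

    /**
     * Runs a manual k-fold cross-validation and returns every instance whose
     * predicted class differs from its actual class in the fold where it was tested.
     * Assumes data.setClassIndex(...) was called and the class is nominal.
     */
    public static List<Instance> misclassified(Classifier base, Instances data,
                                               int folds, long seed) throws Exception {
        Random rand = new Random(seed);
        Instances randData = new Instances(data); // copy, so the caller's order is untouched
        randData.randomize(rand);
        if (randData.classAttribute().isNominal()) {
            randData.stratify(folds); // keep class proportions similar across folds
        }

        List<Instance> wrong = new ArrayList<>();
        for (int n = 0; n < folds; n++) {
            Instances train = randData.trainCV(folds, n);
            Instances test = randData.testCV(folds, n);

            // Train an untrained copy so no fold sees a model built on its test data
            Classifier copy = AbstractClassifier.makeCopy(base);
            copy.buildClassifier(train);

            for (int i = 0; i < test.numInstances(); i++) {
                Instance inst = test.instance(i);
                double predicted = copy.classifyInstance(inst); // index of the predicted class
                if (predicted != inst.classValue()) {
                    wrong.add(inst);
                }
            }
        }
        return wrong;
    }
}
```

With the question's data this might be called as `FoldMisclassifications.misclassified(new NaiveBayes(), train, 10, 1)` after `train.setClassIndex(0)`, printing each returned instance with `toString()`. Collecting the correctly classified instances is the same loop with the comparison inverted.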