How to see failed machine learning records

【Question title】: How to see failed machine learning records
【Posted】: 2019-08-20 09:43:01
【Question description】:

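I am using the code below to create my machine learning model. The model's accuracy is 0.76. I would like to know which records in my test data failed, that is, which ones the model predicted incorrectly. Is there a way for me to see that data?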

// 1. Load the datasets for training and testing
var trainData = ctx.Data.LoadFromTextFile<SentimentData>(trainDataPath, hasHeader: true);
var testData = ctx.Data.LoadFromTextFile<SentimentData>(testDataPath, hasHeader: true);

// 2. Build a transformer/estimator that turns the input text into features the learning algorithm can use
IEstimator<ITransformer> estimator = ctx.Transforms.Text.FeaturizeText("Features", nameof(SentimentData.Text));

// 3. Set the training algorithm and create the training pipeline
var trainer = ctx.BinaryClassification.Trainers.SdcaLogisticRegression();
var trainingPipeline = estimator.Append(trainer);

// 4. Train the model
var trainedModel = trainingPipeline.Fit(trainData);

// 5. Run predictions on the test data
var predictions = trainedModel.Transform(testData);

// 6. Evaluate the model
var metrics = ctx.BinaryClassification.Evaluate(data: predictions);

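【Comments】:

Do you want to see what your model predicted versus the actual ground truth? I am familiar with a Python solution, if that would suffice.
Yes. Could you provide more details about the solution?

Tags: machine-learning ml.net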

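【Solution 1】:

By using the GetColumn and CreateEnumerable methods, you can find the data that the model did not predict correctly.

Once you have the metrics, call GetColumn on the predictions from the test dataset to get the original label values. Then use CreateEnumerable to get the prediction objects, which hold the predicted values. Optionally, you can also get the sentiment text.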

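// Requires: using System.Linq; using Microsoft.ML.Data;
// "ctx" is the MLContext from the question; "Label" is assumed to be the label column name.
var originalLabels = predictions.GetColumn<bool>("Label").ToArray();
// The question's pipeline reads the text from SentimentData.Text, so the same column is used here.
var sentimentText = predictions.GetColumn<string>(nameof(SentimentData.Text)).ToArray();
var predictedLabels = ctx.Data.CreateEnumerable<SentimentPrediction>(predictions, reuseRowObject: false).ToArray();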
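The snippet above depends on a SentimentPrediction class that the thread never shows. A minimal sketch of what the two data classes might look like follows. Everything in it is an assumption rather than the poster's actual code: the column indices, the `Label` and `Probability` properties, and the choice to repeat `Label` and `Text` on the prediction class are illustrative. Only `Text` (from the question) and `Prediction` (from the answer) are names that appear in the thread.

```csharp
using Microsoft.ML.Data;

// Hypothetical input schema. The column order in the file is an assumption.
public class SentimentData
{
    [LoadColumn(0)]
    public bool Label { get; set; }

    [LoadColumn(1)]
    public string Text { get; set; }
}

// Hypothetical output schema. Property names map to columns of the
// transformed IDataView; ColumnName overrides the mapping where they differ.
public class SentimentPrediction
{
    // Original label and text, carried through the pipeline unchanged.
    public bool Label { get; set; }
    public string Text { get; set; }

    // Binary classification trainers write the predicted class to "PredictedLabel".
    [ColumnName("PredictedLabel")]
    public bool Prediction { get; set; }

    // Calibrated probability of the positive class.
    public float Probability { get; set; }
}
```

With a prediction class shaped like this, each row returned by CreateEnumerable would already carry the actual label, the text, and the predicted label, so the separate GetColumn calls become optional.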

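Once you have the data, loop over one of the arrays (I used the length of the original labels) and access the data at each iteration. From there, check whether the actual label differs from the predicted value, so that only the rows the model got wrong are printed.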

for (int i = 0; i < originalLabels.Length; i++)
{
    // Only report rows where the prediction disagrees with the actual label.
    if (originalLabels[i] != predictedLabels[i].Prediction)
    {
        string outputText = $"Text - {sentimentText[i]} | ";
        outputText += $"Original - {originalLabels[i]} | ";
        outputText += $"Predicted - {predictedLabels[i].Prediction}";

        Console.WriteLine(outputText);
    }
}
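The same comparison can also be written as a small reusable helper that returns the failed rows instead of printing them. This is only a sketch in plain C# with no ML.NET dependency; the `Misclassification` type and the `MisclassificationFinder.Find` method are hypothetical names introduced here for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record of one test row the model got wrong.
public sealed class Misclassification
{
    public int Index { get; set; }
    public string Text { get; set; }
    public bool Actual { get; set; }
    public bool Predicted { get; set; }
}

public static class MisclassificationFinder
{
    // Returns every row whose predicted label differs from its actual label.
    public static List<Misclassification> Find(
        IReadOnlyList<string> texts,
        IReadOnlyList<bool> actual,
        IReadOnlyList<bool> predicted)
    {
        if (texts.Count != actual.Count || actual.Count != predicted.Count)
            throw new ArgumentException("All inputs must have the same length.");

        return Enumerable.Range(0, actual.Count)
            .Where(i => actual[i] != predicted[i])
            .Select(i => new Misclassification
            {
                Index = i,
                Text = texts[i],
                Actual = actual[i],
                Predicted = predicted[i]
            })
            .ToList();
    }
}
```

Assuming the arrays from the answer, it could be called as `MisclassificationFinder.Find(sentimentText, originalLabels, predictedLabels.Select(p => p.Prediction).ToArray())`. Filtering the result on `Predicted && !Actual` would then give the false positives, and on `!Predicted && Actual` the false negatives.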

That gives you the data you need. :)

Hope that helps!

【Discussion】:

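【Solution 2】:

Based on your comment, I believe the method you are looking for can be found in the keras library. The method should be keras.models.predict_classes, as described on their documentation page.

It returns an array of predicted outputs, which you can then compare against the ground truth. See the documentation for the parameters.

Hope this helps!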

【Discussion】:

That is for Keras, though. He wants to know how to do this in ML.NET.
Correct. I have not found any relevant library or method in ML.NET.