Example of 10-fold SVM classification in MATLAB: answer

【Question title】: Example of 10-fold SVM classification in MATLAB
【Posted】: 2011-03-05 11:43:52
【Question】:

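I need a descriptive example showing how to do 10-fold SVM classification on a two-class data set. There is just one example in the MATLAB documentation, but it is not 10-fold. Can someone help me?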

Tags: matlab machine-learning svm

【Answer 1】:

Here is a complete example, using the following functions from the Bioinformatics Toolbox: SVMTRAIN, SVMCLASSIFY, CLASSPERF, CROSSVALIND.

load fisheriris                              %# load iris dataset
groups = ismember(species,'setosa');         %# create a two-class problem

%# number of cross-validation folds:
%# If you have 50 samples, divide them into 10 groups of 5 samples each,
%# then train with 9 groups (45 samples) and test with 1 group (5 samples).
%# This is repeated ten times, with each group used exactly once as a test set.
%# Finally the 10 results from the folds are averaged to produce a single 
%# performance estimation.
k=10;

cvFolds = crossvalind('Kfold', groups, k);   %# get indices of 10-fold CV
cp = classperf(groups);                      %# init performance tracker

for i = 1:k                                  %# for each fold
    testIdx = (cvFolds == i);                %# get indices of test instances
    trainIdx = ~testIdx;                     %# get indices of training instances

    %# train an SVM model over training instances
    svmModel = svmtrain(meas(trainIdx,:), groups(trainIdx), ...
                 'Autoscale',true, 'Showplot',false, 'Method','QP', ...
                 'BoxConstraint',2e-1, 'Kernel_Function','rbf', 'RBF_Sigma',1);

    %# test using test instances
    pred = svmclassify(svmModel, meas(testIdx,:), 'Showplot',false);

    %# evaluate and update performance object
    cp = classperf(cp, pred, testIdx);
end

%# get accuracy
cp.CorrectRate

%# get confusion matrix
%# columns:actual, rows:predicted, last-row: unclassified instances
cp.CountingMatrix

Output:

We obtained 99.33% accuracy, with only one 'setosa' instance misclassified as 'non-setosa'.
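
The fold bookkeeping described in the code comments (50 samples, 10 groups of 5, 45 for training and 5 for testing in each round) can be checked without any toolbox. The following is a minimal sketch under that 50-sample assumption. It uses a plain random assignment with randperm and mod as a stand-in for CROSSVALIND, so it does not balance the classes within each fold, and the variable names testSizes and trainSizes are illustrative only.

```matlab
n = 50;                              % assumed number of samples
k = 10;                              % number of folds

% Assign each sample a fold label in 1..k; because n is a multiple of k,
% every label is used exactly n/k times.
folds = mod(randperm(n), k) + 1;

testSizes  = zeros(1, k);
trainSizes = zeros(1, k);
for i = 1:k
    testIdx  = (folds == i);         % samples held out in this round
    trainIdx = ~testIdx;             % samples from the other k-1 folds
    testSizes(i)  = sum(testIdx);
    trainSizes(i) = sum(trainIdx);
end

disp(testSizes)                      % every entry is n/k
disp(trainSizes)                     % every entry is n - n/k
```

Since each sample carries exactly one fold label, it appears in the test set of exactly one round and in the training set of the other nine.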

Update: the SVM functions were moved to the Statistics Toolbox in R2013a
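
For readers on a recent release, the same experiment can probably be written with the newer Statistics and Machine Learning Toolbox interface. The sketch below is not part of the original answer; it assumes a MATLAB version that provides fitcsvm, crossval, kfoldPredict, kfoldLoss and confusionmat. The parameter values are copied from the SVMTRAIN call above, but the two interfaces scale the RBF kernel differently, so the results are not guaranteed to match.

```matlab
load fisheriris                               % iris data: meas, species
groups = ismember(species, 'setosa');         % same two-class problem as above

% Train an RBF-kernel SVM; the settings loosely mirror the SVMTRAIN call
svmModel = fitcsvm(meas, groups, ...
    'KernelFunction', 'rbf', 'KernelScale', 1, ...
    'BoxConstraint', 2e-1, 'Standardize', true);

cvModel = crossval(svmModel, 'KFold', 10);    % 10-fold cross-validated model
pred    = kfoldPredict(cvModel);              % out-of-fold prediction per sample

accuracy = 1 - kfoldLoss(cvModel);            % default loss: misclassification rate
confMat  = confusionmat(groups, pred);        % rows: actual, columns: predicted
```

Here the fold loop is handled by crossval, so there is no explicit performance object to update. Note that confusionmat puts actual classes in rows, the opposite of the cp.CountingMatrix layout described above.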

【Comments】:

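Thanks for the nice example. I am a little confused. Suppose I have 50 entries in total. The code above splits them into 10 groups of 5 entries each, then in each iteration uses 9 groups for training and 1 for testing. But the usual procedure may be different, i.e. 1. train, 2. cross-validate, repeat the above, and then test? Or is there no difference?
@MaxSteel: SVM is at its core a binary classification algorithm, so you cannot have more than two classes (I arbitrarily chose setosa vs. non-setosa). Fortunately, there are ways to extend SVM to the multi-class case. See here for an example: stackoverflow.com/a/4980055/97160
@TARIQ: a bit off-topic, but you can simply use bar3 to plot the confusion matrix. If you have the Neural Network Toolbox, there is the plotconfusion function; otherwise you can do it manually: stackoverflow.com/a/7081430/97160
@Pegah: you should read the CLASSPERF doc page; my use of the function is the same as the example shown in the documentation. First we initialize the cp object before the loop. Then, inside the loop, we update the cp object with the predictions for the current validation fold. Each call to the function accumulates the results, so when the loop finishes, the returned result is the average over the K folds. By the way, the name is Amro, not Arno :)
@Pegah: cp.CorrectRate returns the current running (i.e. rolling) average of the classification accuracy, not the accuracy of the current fold. If you want the latter, use cp.LastCorrectRate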
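
The difference between the running rate and the last-fold rate described in the last two comments can be illustrated with plain arithmetic. This is a sketch only: the per-fold counts are made-up numbers (not results from the iris run), and the code mimics the idea behind cp.CorrectRate and cp.LastCorrectRate rather than calling CLASSPERF.

```matlab
% Hypothetical per-fold results: test samples per fold and how many were correct
foldSizes   = [15 15 15];
foldCorrect = [15 14 15];

% Running rate: all correct predictions so far over all samples tested so far
runningRate = cumsum(foldCorrect) ./ cumsum(foldSizes);

% Per-fold rate: each fold judged on its own
perFoldRate = foldCorrect ./ foldSizes;

for i = 1:numel(foldSizes)
    fprintf('fold %d: running = %.4f, this fold = %.4f\n', ...
            i, runningRate(i), perFoldRate(i));
end
```

With these numbers the running rate after the second fold is 29/30, while the second fold on its own scores 14/15. That is the kind of gap to expect between the accumulated value and the last-fold value inside the loop.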