SMO, Sequential Minimal Optimization in WEKA

【Question title】: SMO, Sequential Minimal Optimization in WEKA
【Posted】: 2012-03-16 03:17:08
【Question】:

I am new to Weka. I want to use Sequential Minimal Optimization (SMO) in WEKA. Can anyone tell me how to proceed? Here is my Java code, but it does not work:

import java.io.File;
import java.io.FileOutputStream;
import java.io.ObjectOutputStream;
import java.util.Random;

import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.functions.SMO;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class SVMTest {
    public void test(File input) throws Exception {
        File tmp = new File("tmp-file-duplicate-pairs.arff");
        String path = input.getParent();
        //tmp.deleteOnExit();
        ////removeFeatures(input,tmp,useType,useNames, useActivities, useOccupation,useFriends,useMailAndSite,useLocations);
        Instances data = new DataSource(tmp.getAbsolutePath()).getDataSet();
        // The last attribute is treated as the class label.
        data.setClassIndex(data.numAttributes() - 1);
        String ctype = "SMO";
        boolean newmodel = false;

        // Declared as SMO (not Classifier) so that setOptions() is available on the variable.
        SMO c = new SMO();
        String[] options = {"-M"}; // -M: fit logistic models to the SVM outputs
        c.setOptions(options);
        c.buildClassifier(data);
        newmodel = true;
        //c = loadClassifier(input.getParentFile().getParentFile(),ctype);
        if (newmodel)
            saveModel(c, ctype, input.getParentFile().getParentFile());

        // 10-fold cross-validation; this trains fresh copies of c on each fold.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(c, data, 10, new Random(1));

        System.out.println(c);
        System.out.println(eval.toSummaryString());
        System.out.println(eval.toClassDetailsString());
        System.out.println(eval.toMatrixString());

        tmp.delete();
    }

    private static void saveModel(Classifier c, String name, File path) throws Exception {
        File modelFile = new File(path, name + ".model");
        // try-with-resources closes the stream even if writing fails,
        // and avoids calling writeObject on a stream that failed to open.
        try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(modelFile))) {
            oos.writeObject(c);
        }
    }
}

I would also like to know how to supply the .arff file. My dataset is in the form of XML files.

【Comments】:

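You create an instance of SMO and use it for cross-validation. If that is what you want (which is not actually classification), then your SMO is fine and the title is wrong. Otherwise, please state your problem more clearly: is the trouble with classification, file conversion, reading from XML, or something else? Also describe what the tmp and input files are and why you think it does not work: do you get an exception, wrong behavior, or does your code fail to compile?
My question is about classification with SMO, not cross-validation with SMO. I thought SMO stands for Sequential Minimal Optimization. Doesn't it?
SMO is what you need, but you are not classifying instances at all; you are evaluating the classifier. To classify an instance you need the classifyInstance() method. For details, see the [documentation](weka.wikispaces.com/… instances). And read more about classification itself; right now you are doing this blindly.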

Tags: java weka svm

【Answer 1】:

You can read the input file with the following lines:

Instances training_data = new Instances(new BufferedReader(
        new FileReader("tmp-file-duplicate-pairs.arff")));
training_data.setClassIndex(training_data.numAttributes() - 1);
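
The lines above assume the ARFF file already exists, while the question says the data is in XML. Below is a minimal sketch of one way to produce ARFF text from XML using only the JDK. The XML layout (a list of `<record>` elements whose child elements hold the values), the class name `XmlToArff`, and its parameters are all assumptions made for illustration; the real XML schema is not shown in the question. It also assumes numeric feature values and class labels without spaces or commas, which would otherwise need quoting.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class XmlToArff {

    /**
     * Converts XML of the assumed form
     * <records><record><x>1.0</x><label>yes</label></record>...</records>
     * into ARFF text, with the class attribute written last.
     */
    public static String convert(String xml, String[] numericFields, String classField)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList records = doc.getElementsByTagName("record");

        List<String> rows = new ArrayList<>();
        TreeSet<String> labels = new TreeSet<>(); // sorted set of distinct class values
        for (int i = 0; i < records.getLength(); i++) {
            Element rec = (Element) records.item(i);
            StringBuilder row = new StringBuilder();
            for (String field : numericFields) {
                row.append(text(rec, field)).append(',');
            }
            String label = text(rec, classField);
            labels.add(label);
            rows.add(row.append(label).toString());
        }

        // ARFF header: one numeric attribute per field, then the nominal class.
        StringBuilder arff = new StringBuilder("@relation converted\n\n");
        for (String field : numericFields) {
            arff.append("@attribute ").append(field).append(" numeric\n");
        }
        arff.append("@attribute ").append(classField)
            .append(" {").append(String.join(",", labels)).append("}\n\n@data\n");
        for (String row : rows) {
            arff.append(row).append('\n');
        }
        return arff.toString();
    }

    // Returns the trimmed text of the first child element with the given tag.
    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }
}
```

The returned string can be written to a file such as tmp-file-duplicate-pairs.arff and then loaded as shown above. Because the class attribute is written last, it lines up with setClassIndex(numAttributes() - 1).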

【Comments】:

【Answer 2】:

I guess you have figured it out by now, but in case it helps someone else, there is a wiki page about it:

http://weka.wikispaces.com/Text+categorization+with+WEKA

To use SMO, suppose you have some training instances "trainset" and a test set "testset". Build the classifier:

    // train SMO and output model
    SMO classifier = new SMO();
    classifier.buildClassifier(trainset);

Then evaluate it, for example using cross-validation:

    Evaluation eval = new Evaluation(testset);
    Random rand = new Random(1); // using seed = 1
    int folds = 10;
    eval.crossValidateModel(classifier, testset, folds, rand);

After that, eval holds all the statistics and so on.
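
One caveat: crossValidateModel trains fresh copies of the classifier on folds of whatever data it is given, so the model built from trainset is not the one being scored. Below is a minimal sketch of the alternative: score the already-trained model on the held-out testset with evaluateModel, and predict individual instances with classifyInstance, as the comment on the question suggests. The class name `SmoHoldout` and the method `trainAndTest` are made up for illustration. Both sets are assumed to share the same attributes and to have their class index already set.

```java
import weka.classifiers.Evaluation;
import weka.classifiers.functions.SMO;
import weka.core.Instance;
import weka.core.Instances;

public class SmoHoldout {

    /** Trains SMO on trainset, scores it on testset, and prints one prediction per test instance. */
    public static Evaluation trainAndTest(Instances trainset, Instances testset) throws Exception {
        SMO classifier = new SMO();
        classifier.buildClassifier(trainset);

        // Evaluate the trained model on data it has not seen.
        Evaluation eval = new Evaluation(trainset);
        eval.evaluateModel(classifier, testset);
        System.out.println(eval.toSummaryString());

        // classifyInstance returns the index of the predicted class value.
        for (int i = 0; i < testset.numInstances(); i++) {
            Instance inst = testset.instance(i);
            double predicted = classifier.classifyInstance(inst);
            System.out.println(i + ": " + testset.classAttribute().value((int) predicted));
        }
        return eval;
    }
}
```

Cross-validation remains the better choice when there is only one dataset and no separate test set; in that case the single dataset is passed to crossValidateModel and no prior buildClassifier call is needed.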

【Comments】: