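【Posted】: 2019-03-31 22:51:11
【Question】:
Background:
If I open the Weka Explorer GUI, train a J48 tree, and test it with the NSL-KDD training and test datasets, a pruned tree is produced. The Weka Explorer GUI expresses the algorithm's reasoning, showing how tests on attributes such as src_bytes determine whether something is classified as an anomaly.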
Screenshot of Weka Explorer GUI showing pruned tree
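Question:
Using the pruned tree generated by the Weka Explorer GUI as a reference, how can I programmatically have Weka express the reasoning behind each instance's classification in Java?
i.e. Instance A was classified as an anomaly because src_bytes
So far I have been able to:
Train and test a J48 tree on the NSL-KDD dataset.
Output a description of the J48 tree within Java.
Return the J48 tree as an if-then statement.
However, I have no idea how to express the reasoning behind each classification while iterating over the instances during the testing phase, short of manually exporting the J48 tree as if-then statements every time and adding numerous println calls to report when each branch fires (I would really rather not do this, as it would greatly increase the amount of human intervention required in the long run).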
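To make the desired output concrete, here is a minimal sketch of the general technique: walk from the root of a decision tree to a leaf and record every test the instance satisfies along the way. It deliberately does not use Weka. The `Node` class, the `explain` method, and the thresholds are all hypothetical stand-ins for illustration, and how to reach the equivalent nodes inside a trained J48 model is exactly the open part of the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a trained tree; this is NOT Weka's API.
class DecisionPathSketch {

    /** A node is either a leaf (label != null) or a numeric test "attribute <= threshold". */
    static final class Node {
        final String attribute; // attribute tested at this node (null for a leaf)
        final double threshold; // split point for the test
        final Node left;        // branch taken when value <= threshold
        final Node right;       // branch taken when value > threshold
        final String label;     // class label (null for an internal node)

        Node(String attribute, double threshold, Node left, Node right) {
            this.attribute = attribute;
            this.threshold = threshold;
            this.left = left;
            this.right = right;
            this.label = null;
        }

        Node(String label) {
            this.attribute = null;
            this.threshold = 0;
            this.left = null;
            this.right = null;
            this.label = label;
        }

        boolean isLeaf() {
            return label != null;
        }
    }

    /** Walks from the root to a leaf and records every test the instance satisfied. */
    static String explain(Node root, Map<String, Double> instance) {
        List<String> conditions = new ArrayList<>();
        Node node = root;
        while (!node.isLeaf()) {
            double value = instance.get(node.attribute);
            boolean goLeft = value <= node.threshold;
            conditions.add(node.attribute + (goLeft ? " <= " : " > ") + node.threshold);
            node = goLeft ? node.left : node.right;
        }
        return node.label + " because " + String.join(" AND ", conditions);
    }

    public static void main(String[] args) {
        // Made-up thresholds purely for illustration; not taken from NSL-KDD.
        Node tree = new Node("src_bytes", 28.0,
                new Node("dst_bytes", 5.0, new Node("anomaly"), new Node("normal")),
                new Node("normal"));
        // prints "anomaly because src_bytes <= 28.0 AND dst_bytes <= 5.0"
        System.out.println(explain(tree, Map.of("src_bytes", 10.0, "dst_bytes", 2.0)));
    }
}
```

The point of the sketch is that the explanation falls out of the same traversal that produces the prediction, so no per-branch println calls need to be maintained by hand.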
Additional screenshots:
Screenshot of the 'description of the J48 tree within Java'
Screenshot of the 'J48 tree as an if-then statement'
Code:
import weka.classifiers.meta.FilteredClassifier;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.unsupervised.attribute.Remove;

public class Junction_Tree {
    String train_path = "KDDTrain+.arff";
    String test_path = "KDDTest+.arff";
    double accuracy;
    double recall;
    double precision;
    int correctPredictions;
    int incorrectPredictions;
    int numAnomaliesDetected;
    int numNetworkRecords;

    public void run() {
        try {
            // Load both datasets and use the last attribute as the class.
            Instances train = DataSource.read(train_path);
            Instances test = DataSource.read(test_path);
            train.setClassIndex(train.numAttributes() - 1);
            test.setClassIndex(test.numAttributes() - 1);
            if (!train.equalHeaders(test))
                throw new IllegalArgumentException("datasets are not compatible..");

            // Drop the first attribute before it reaches the classifier.
            Remove rm = new Remove();
            rm.setAttributeIndices("1");
            J48 j48 = new J48();
            j48.setUnpruned(true); // note: this disables pruning
            FilteredClassifier fc = new FilteredClassifier();
            fc.setFilter(rm);
            fc.setClassifier(j48);
            fc.buildClassifier(train);

            numAnomaliesDetected = 0; // counts instances whose actual class is "anomaly"
            numNetworkRecords = 0;
            int n_ana_p = 0; // false positives: predicted anomaly, actually not
            int ana_p = 0;   // true positives: predicted anomaly, actually anomaly
            correctPredictions = 0;
            incorrectPredictions = 0;
            String a = "anomaly";
            for (int i = 0; i < test.numInstances(); i++) {
                double pred = fc.classifyInstance(test.instance(i));
                String actual = test.classAttribute().value((int) test.instance(i).classValue());
                String predicted = test.classAttribute().value((int) pred);
                if (actual.equalsIgnoreCase(a))
                    numAnomaliesDetected++;
                if (actual.equalsIgnoreCase(predicted))
                    correctPredictions++;
                else
                    incorrectPredictions++;
                if (actual.equalsIgnoreCase(a) && predicted.equalsIgnoreCase(a))
                    ana_p++;
                if (!actual.equalsIgnoreCase(a) && predicted.equalsIgnoreCase(a))
                    n_ana_p++;
                numNetworkRecords++;
            }

            // Use 100.0 so the division is done in floating point, not truncated as int.
            accuracy = (correctPredictions * 100.0) / (correctPredictions + incorrectPredictions);
            recall = ana_p * 100.0 / numAnomaliesDetected;
            precision = ana_p * 100.0 / (ana_p + n_ana_p);
            System.out.println("\n\naccuracy: " + accuracy + ", Correct Predictions: " + correctPredictions
                    + ", Incorrect Predictions: " + incorrectPredictions);

            // writeFile is a helper that is not shown in the post.
            // toSource expects a class name that is a valid Java identifier;
            // the original passed the unquoted token J48_if-then.java, which does not compile.
            writeFile(j48.toSource("J48_if_then"));
            writeFile(j48.toString());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        Junction_Tree JT1 = new Junction_Tree();
        JT1.run();
    }
}
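A side note on the percentage metrics: in Java, an expression of the form `count * 100 / total` with int operands is evaluated in integer arithmetic, so the fractional part is discarded before the result is ever assigned to a double. The sketch below contrasts the two forms. The `percent` helper and the counts are made up for illustration and are not part of the original program.

```java
// Illustrative only: contrasts integer and floating-point percentage arithmetic.
class MetricsSketch {

    /** Percentage computed in floating point; returns NaN when the denominator is zero. */
    static double percent(int numerator, int denominator) {
        return denominator == 0 ? Double.NaN : 100.0 * numerator / denominator;
    }

    public static void main(String[] args) {
        // Made-up counts, not results from NSL-KDD.
        int correct = 5;
        int incorrect = 2;
        int truePositives = 2;
        int falsePositives = 1;
        int actualAnomalies = 3;

        // Integer division truncates: (5 * 100) / 7 evaluates to 71.
        double truncated = (correct * 100) / (correct + incorrect);
        System.out.println("int arithmetic accuracy:    " + truncated);

        // Floating-point division keeps the fractional part.
        System.out.println("double arithmetic accuracy: " + percent(correct, correct + incorrect));
        System.out.println("recall:    " + percent(truePositives, actualAnomalies));
        System.out.println("precision: " + percent(truePositives, truePositives + falsePositives));
    }
}
```

Returning NaN for a zero denominator is one possible design choice; it avoids the ArithmeticException that integer division by zero would throw when, for example, no anomalies are predicted at all.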
【Discussion】:
Tags: java machine-learning tree artificial-intelligence weka