Getting predictions after training using darknet

【Question title】: Getting predictions after training using darknet
【Posted】: 2017-08-16 23:08:44
【Question】:

I am new to CNNs and am trying to train a classifier on the CIFAR-10 dataset. I followed pjreddie's tutorial to train a simple classifier on this 10-class dataset.

I trained the model with the command below and got cifar_small.weights, which I later used for detection:

./darknet classifier train cfg/cifar.data cfg/cifar_small.cfg

After training the small network, I tried to run detection with cifar_small.cfg and cifar_small.weights:

./darknet detect cfg/cifar_small.cfg cifar_small.weights data/dog.jpg

layer     filters    size              input                output
    0 conv     32  3 x 3 / 1    28 x  28 x   3   ->    28 x  28 x  32
    1 max          2 x 2 / 2    28 x  28 x  32   ->    14 x  14 x  32
    2 conv     64  3 x 3 / 1    14 x  14 x  32   ->    14 x  14 x  64
    3 max          2 x 2 / 2    14 x  14 x  64   ->     7 x   7 x  64
    4 conv    128  3 x 3 / 1     7 x   7 x  64   ->     7 x   7 x 128
    5 conv     10  1 x 1 / 1     7 x   7 x 128   ->     7 x   7 x  10
    6 avg                        7 x   7 x  10   ->    10
    7 softmax                                          10
    8 cost                                             10
Loading weights from cifar_small.weights...Done!
data/dog.jpg: Predicted in 0.007035 seconds.
Not compiled with OpenCV, saving to predictions.png instead

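It neither prints any predicted values in the terminal nor draws bounding boxes on the output image. The output image is identical to the input.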

When I run prediction on the same image with yolo.cfg and the pretrained yolo.weights, it works, as shown below.

layer     filters    size              input                output
    0 conv     32  3 x 3 / 1   416 x 416 x   3   ->   416 x 416 x  32
    1 max          2 x 2 / 2   416 x 416 x  32   ->   208 x 208 x  32
    2 conv     64  3 x 3 / 1   208 x 208 x  32   ->   208 x 208 x  64
    3 max          2 x 2 / 2   208 x 208 x  64   ->   104 x 104 x  64
    4 conv    128  3 x 3 / 1   104 x 104 x  64   ->   104 x 104 x 128
    5 conv     64  1 x 1 / 1   104 x 104 x 128   ->   104 x 104 x  64
    6 conv    128  3 x 3 / 1   104 x 104 x  64   ->   104 x 104 x 128
    7 max          2 x 2 / 2   104 x 104 x 128   ->    52 x  52 x 128
    8 conv    256  3 x 3 / 1    52 x  52 x 128   ->    52 x  52 x 256
    9 conv    128  1 x 1 / 1    52 x  52 x 256   ->    52 x  52 x 128
   10 conv    256  3 x 3 / 1    52 x  52 x 128   ->    52 x  52 x 256
   11 max          2 x 2 / 2    52 x  52 x 256   ->    26 x  26 x 256
   12 conv    512  3 x 3 / 1    26 x  26 x 256   ->    26 x  26 x 512
   13 conv    256  1 x 1 / 1    26 x  26 x 512   ->    26 x  26 x 256
   14 conv    512  3 x 3 / 1    26 x  26 x 256   ->    26 x  26 x 512
   15 conv    256  1 x 1 / 1    26 x  26 x 512   ->    26 x  26 x 256
   16 conv    512  3 x 3 / 1    26 x  26 x 256   ->    26 x  26 x 512
   17 max          2 x 2 / 2    26 x  26 x 512   ->    13 x  13 x 512
   18 conv   1024  3 x 3 / 1    13 x  13 x 512   ->    13 x  13 x1024
   19 conv    512  1 x 1 / 1    13 x  13 x1024   ->    13 x  13 x 512
   20 conv   1024  3 x 3 / 1    13 x  13 x 512   ->    13 x  13 x1024
   21 conv    512  1 x 1 / 1    13 x  13 x1024   ->    13 x  13 x 512
   22 conv   1024  3 x 3 / 1    13 x  13 x 512   ->    13 x  13 x1024
   23 conv   1024  3 x 3 / 1    13 x  13 x1024   ->    13 x  13 x1024
   24 conv   1024  3 x 3 / 1    13 x  13 x1024   ->    13 x  13 x1024
   25 route  16
   26 reorg              / 2    26 x  26 x 512   ->    13 x  13 x2048
   27 route  26 24
   28 conv   1024  3 x 3 / 1    13 x  13 x3072   ->    13 x  13 x1024
   29 conv    425  1 x 1 / 1    13 x  13 x1024   ->    13 x  13 x 425
   30 detection
Loading weights from yolo.weights...Done!
data/dog.jpg: Predicted in 11.057513 seconds.
car: 54%
bicycle: 51%
dog: 56%

It predicts as expected and draws the bounding boxes on the output image.

【Comments】:

Tags: opencv neural-network deep-learning conv-neural-network

【Answer 1】:

I think you should use this command:

./darknet classify cfg/cifar_small.cfg cifar_small.weights data/dog.jpg

See here: https://pjreddie.com/darknet/tiny-darknet/

【Comments】:

【Answer 2】:

In /examples/darknet.c (lines 422~500) you can see what the darknet framework does for each command given after './darknet'. In this case, './darknet classify ~' causes the function 'predict_classifier' in examples/classifier.c to run.
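To illustrate the kind of dispatch described above, here is a minimal sketch of choosing a code path by comparing the first command-line argument with strcmp. It is not darknet's actual source: the enum, select_mode, and mode_description are hypothetical names made up for this example, and only the 'classify' to 'predict_classifier' mapping comes from the answer itself.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical modes for illustration; darknet itself calls handler
 * functions directly instead of returning an enum. */
typedef enum { MODE_UNKNOWN, MODE_DETECT, MODE_CLASSIFY } run_mode;

/* Pick a mode from the first argument after the program name. */
run_mode select_mode(int argc, char **argv)
{
    if (argc < 2) return MODE_UNKNOWN;
    if (0 == strcmp(argv[1], "detect"))   return MODE_DETECT;
    if (0 == strcmp(argv[1], "classify")) return MODE_CLASSIFY;
    return MODE_UNKNOWN;
}

/* Describe what each mode would do in this sketch. */
const char *mode_description(run_mode m)
{
    switch (m) {
    case MODE_DETECT:
        return "detector path: expects a network ending in a detection layer";
    case MODE_CLASSIFY:
        return "classifier path: predict_classifier prints the top classes";
    default:
        return "unknown command";
    }
}
```

Calling select_mode with the arguments from the question ("detect" followed by a classifier cfg) selects the detector path, which would explain why no class scores were printed for a network that ends in softmax rather than a detection layer.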

The prediction results are printed by lines 601~606 (in classifier.c):

for(i = 0; i < top; ++i)
{
    int index = indexes[i];
    printf("%5.2f%%: %s\n", predictions[index] * 100, names[index]);
}
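The loop above assumes that 'indexes' already holds the positions of the highest-scoring classes. As a rough, self-contained sketch of that idea (not darknet's own implementation), the helper below selects the k largest scores and prints them in the same format. The names top_k_indexes and print_top_k and the fixed buffer size of 16 are assumptions made for this example.

```c
#include <stdio.h>

/* Fill indexes[0..k-1] with the positions of the k largest values in
 * scores[0..n-1], best first. Requires k <= n. Simple O(n*k) selection. */
void top_k_indexes(const float *scores, int n, int k, int *indexes)
{
    for (int i = 0; i < k; ++i) {
        int best = -1;
        for (int j = 0; j < n; ++j) {
            int taken = 0;
            /* Skip positions that were already selected. */
            for (int t = 0; t < i; ++t)
                if (indexes[t] == j) taken = 1;
            if (taken) continue;
            if (best < 0 || scores[j] > scores[best]) best = j;
        }
        indexes[i] = best;
    }
}

/* Print the top-k classes as percentages, one per line. */
void print_top_k(const float *predictions, char **names, int n, int k)
{
    int indexes[16];           /* hypothetical upper bound on k */
    if (k > n)  k = n;
    if (k > 16) k = 16;
    top_k_indexes(predictions, n, k, indexes);
    for (int i = 0; i < k; ++i) {
        int index = indexes[i];
        printf("%5.2f%%: %s\n", predictions[index] * 100, names[index]);
    }
}
```

For a 10-class CIFAR network, 'predictions' would be the 10 softmax outputs and 'names' the 10 label strings, so print_top_k(predictions, names, 10, 5) would list the five most likely classes.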

【Comments】: