Regression in Caffe: Prediction is highly erroneous

【Question title】: Regression in Caffe: Prediction is highly erroneous
【Posted】: 2017-02-16 16:09:51
【Question description】:

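I have been working on a single-label regression problem in Caffe. The input consists of 5 hdf5 files that I generated independently from different images. I first tested my network with a single hdf5 file, running 10000 iterations on about 800 training images (batch size 64). Finally, when I predicted on the same training images, I got the following result: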

But on the test images it was:

As I understand it, this is because the amount of training data is small and the test data is not very similar to the training data.

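So I tried increasing the training data to about 5500 images, split across 5 hdf5 files. The model created with 14,000 iterations gave this prediction output on the training data: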

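I don't understand why the prediction got worse. How does Caffe select batches? (My batch size is 64.) Does it pick a batch at random from across the 5 hdf5 files? What could be the reason behind my erroneous predictions? What can I do to train my model effectively? Should I add more convolutional layers? Any suggestion would be a lifesaver. This is my first attempt at neural networks and Caffe. My network is: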

name: "Regression"
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "train_hdf5file.txt"
    batch_size: 64
    shuffle: true
  }
  include: { phase: TRAIN }
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "test_hdf5file.txt"
    batch_size: 30
  }
  include: { phase: TEST }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "dropout1"
  type: "Dropout"
  bottom: "pool1"
  top: "pool1"
  dropout_param {
    dropout_ratio: 0.1
  }
}

layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "pool1"
  top: "fc1"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "dropout2"
  type: "Dropout"
  bottom: "fc1"
  top: "fc1"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc2"
  type: "InnerProduct"
  bottom: "fc1"
  top: "fc2"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "fc2"
  bottom: "label"
  top: "loss"
}
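
To get a feel for where the capacity of this network sits, the blob sizes can be worked out by hand. The sketch below assumes a hypothetical single-channel 64x64 input (the question does not state the image size) and uses Caffe's output-size rules: floor for convolution, ceil for pooling, with no padding. The helper names `conv_out` and `pool_out` are made up for illustration.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a Caffe Convolution layer (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride, pad=0):
    """Spatial output size of a Caffe Pooling layer (rounds up)."""
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1


# ASSUMPTION: 1-channel, 64x64 input; the real size is not given in the question.
channels, side = 1, 64

c1 = conv_out(side, kernel=5, stride=1)   # conv1: kernel_size 5, stride 1
p1 = pool_out(c1, kernel=2, stride=2)     # pool1: kernel_size 2, stride 2

conv1_weights = 20 * channels * 5 * 5     # num_output 20, 5x5 kernels
fc1_inputs = 20 * p1 * p1                 # flattened pool1 blob
fc1_weights = fc1_inputs * 500            # fc1: num_output 500
fc2_weights = 500 * 1                     # fc2: num_output 1

print("conv1 output side:", c1)
print("pool1 output side:", p1)
print("weights conv1 / fc1 / fc2:", conv1_weights, fc1_weights, fc2_weights)
```

With these assumed dimensions nearly all of the weights sit in fc1, while conv1 holds only a few hundred, which is one way to see how little of the model is convolutional.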

【Comments on the question】:

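I'm not very familiar with Caffe or its network representation, but the only kind of regularization I see comes from the dropout layers. Maybe add some L1/L2 regularization on your weights. I hope the concept of regularization is clear to you, because it is very important in ML. (Without any regularization, a sufficiently powerful/large network will give you a near-perfect training score, but it is mostly memorizing the data, and there is no guarantee about what happens on other data, such as your test data.)
@sascha Thanks for your reply. In my case the result is not overfitting; the predictions on the training data itself are not good enough. Actually, I have doubts about the amount of training data I'm using, about how my data is being used, and about whether a network with only one convolutional layer is good enough. I'm also unsure how Caffe handles multiple hdf5 files and how it picks a batch from them. I'd like to know whether I should increase my data on the same network, or improve my network first before adding more data.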
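
As a rough sanity check on the data-volume question, the two runs can be compared in epochs rather than iterations. The sketch below assumes each solver iteration consumes exactly one batch of 64 (Caffe's default `iter_size` of 1) and uses the approximate image counts from the question; `epochs_seen` is a hypothetical helper, not part of Caffe.

```python
def epochs_seen(num_images, batch_size, iterations):
    """Number of passes over the training set after the given iterations."""
    return iterations * batch_size / float(num_images)


# Figures taken from the question; both image counts are approximate.
first_run = epochs_seen(num_images=800, batch_size=64, iterations=10000)
second_run = epochs_seen(num_images=5500, batch_size=64, iterations=14000)

print("first run:  %.1f epochs" % first_run)
print("second run: %.1f epochs" % second_run)
```

Under these assumptions the second model saw each image roughly five times less often than the first, so the two runs are not directly comparable by iteration count alone.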

Tags: machine-learning neural-network deep-learning caffe conv-neural-network

【Solution 1】:

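Try adding more convolutional layers and removing dropout (you can bring it back if you run into overfitting). Also, you must check the loss that Caffe prints during training; based on that, you may also need to change the learning rate and other settings in the solver file.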
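
One way to judge the loss Caffe prints is to compare it against a trivial baseline: a model that always predicts the mean label. The sketch below assumes Caffe's `EuclideanLoss` definition, the sum of squared differences divided by 2N; the helper names and the label values are hypothetical placeholders, not data from the question.

```python
def euclidean_loss(predictions, labels):
    """Sum of squared errors divided by 2N, as EuclideanLoss reports it."""
    n = len(labels)
    return sum((p - y) ** 2 for p, y in zip(predictions, labels)) / (2.0 * n)


def mean_baseline_loss(labels):
    """Loss of a predictor that always outputs the mean of the labels."""
    mean = sum(labels) / float(len(labels))
    return euclidean_loss([mean] * len(labels), labels)


# HYPOTHETICAL labels, for illustration only; substitute the real training labels.
labels = [1.0, 2.0, 3.0, 4.0]
baseline = mean_baseline_loss(labels)
print("baseline loss:", baseline)
```

If the training loss in the log does not drop well below this baseline, the network has learned little beyond the average label, which points to underfitting or a learning-rate problem more than to a lack of data.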

【Discussion】: