C++ Single Layer Multi Output Perceptron Weird Behaviour: Answers

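【Question title】: C++ Single Layer Multi Output Perceptron Weird Behaviour
【Posted】: 2017-03-20 01:07:21
【Question】:

Some background:

I have written a single-layer, multi-output perceptron class in C++. It uses the typical WX + b discriminant function and lets the user define the activation function. I have tested everything fairly thoroughly and it all seems to work as I expect. I then noticed a small logical error in my code, and when I tried to fix it, the network performed much worse than before. The error is as follows:

I evaluate the value of each output neuron with the following code: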

output[i] =
          activate_(std::inner_product(weights_[i].begin(), weights_[i].end(),
                                       features.begin(), -1 * biases_[i]));

Here I treat the bias input as a fixed -1, but when I apply the learning rule to each bias, I treat the input as +1.

// Bias can be treated as a weight with a constant feature value of 1.
biases_[i] = weight_update(1, error, learning_rate_, biases_[i]);

So I tried to correct my mistake by changing the call to weight_update so that it is consistent with the output evaluation:

biases_[i] = weight_update(-1, error, learning_rate_, biases_[i]);

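But doing so results in a 20% drop in accuracy! I have spent the past few days trying to find some other logical error in my code that might explain this strange behaviour, but have come up empty-handed. Can anyone with more knowledge than I have offer any insight? I have provided the entire class below for reference. Thank you in advance.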

#ifndef SINGLE_LAYER_PERCEPTRON_H
#define SINGLE_LAYER_PERCEPTRON_H

#include <algorithm>  // std::for_each
#include <cassert>
#include <cstddef>    // std::size_t
#include <functional>
#include <numeric>
#include <vector>
#include "functional.h"
#include "random.h"

namespace qp {
namespace rf {

namespace {

template <typename Feature>
double weight_update(const Feature& feature, const double error,
                     const double learning_rate, const double current_weight) {
  return current_weight + (learning_rate * error * feature);
}

template <typename T>
using Matrix = std::vector<std::vector<T>>;

}  // namespace

template <typename Feature, typename Label, typename ActivationFn>
class SingleLayerPerceptron {
 public:
  // For testing only.
  SingleLayerPerceptron(const Matrix<double>& weights,
                        const std::vector<double>& biases, double learning_rate)
      : weights_(weights),
        biases_(biases),
        n_inputs_(weights.front().size()),
        n_outputs_(biases.size()),
        learning_rate_(learning_rate) {}

  // Initialize the layer with random weights and biases in [-1, 1].
  SingleLayerPerceptron(std::size_t n_inputs, std::size_t n_outputs,
                        double learning_rate)
      : n_inputs_(n_inputs),
        n_outputs_(n_outputs),
        learning_rate_(learning_rate) {
    weights_.resize(n_outputs_);
    std::for_each(
        weights_.begin(), weights_.end(), [this](std::vector<double>& wv) {
          generate_back_n(wv, n_inputs_,
                          std::bind(random_real_range<double>, -1, 1));
        });

    generate_back_n(biases_, n_outputs_,
                    std::bind(random_real_range<double>, -1, 1));
  }

  std::vector<double> predict(const std::vector<Feature>& features) const {
    std::vector<double> output(n_outputs_);
    for (auto i = 0ul; i < n_outputs_; ++i) {
      output[i] =
          activate_(std::inner_product(weights_[i].begin(), weights_[i].end(),
                                       features.begin(), -1 * biases_[i]));
    }
    return output;
  }

  void learn(const std::vector<Feature>& features,
             const std::vector<double>& true_output) {
    const auto actual_output = predict(features);
    for (auto i = 0ul; i < n_outputs_; ++i) {
      const auto error = true_output[i] - actual_output[i];
      for (auto weight = 0ul; weight < n_inputs_; ++weight) {
        weights_[i][weight] = weight_update(
            features[weight], error, learning_rate_, weights_[i][weight]);
      }
      // Bias can be treated as a weight with a constant feature value of 1.
      biases_[i] = weight_update(1, error, learning_rate_, biases_[i]);
    }
  }

 private:
  Matrix<double> weights_;      // n_outputs x n_inputs
  std::vector<double> biases_;  // 1 x n_outputs
  std::size_t n_inputs_;
  std::size_t n_outputs_;
  ActivationFn activate_;
  double learning_rate_;
};

struct StepActivation {
  double operator()(const double x) const { return x > 0 ? 1 : -1; }
};

}  // namespace rf
}  // namespace qp

#endif /* SINGLE_LAYER_PERCEPTRON_H */

【Question comments】:

Tags: c++ machine-learning neural-network perceptron

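【Answer 1】:

I finally figured it out...

My fix was indeed correct, and the loss of accuracy was just a consequence of having a lucky (or unlucky) dataset.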
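
To illustrate why a single dataset can mislead here, below is a minimal sketch, not the asker's actual experiment. Because the output is step(w . x - b), the bias behaves like a weight on a constant input of -1, so the consistent update multiplies the error by -1. The names Sample, predict_one, errors_after_training, and_gate and nor_gate are hypothetical helpers introduced only for this example, and the two toy gate datasets and the learning rate of 0.5 are assumptions chosen so the arithmetic stays exact.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical names for illustration; none of these come from the class above.
struct Sample {
  std::vector<double> features;
  double target;  // +1 or -1, matching StepActivation
};

// Same convention as predict() in the question: step(w . x - b).
inline double predict_one(const std::vector<double>& weights, double bias,
                          const std::vector<double>& features) {
  assert(weights.size() == features.size());
  const double sum = std::inner_product(weights.begin(), weights.end(),
                                        features.begin(), -1.0 * bias);
  return sum > 0 ? 1.0 : -1.0;
}

// Trains from zero weights and bias, then returns the number of
// misclassified training samples.
// bias_input is the constant "feature" used in the bias update:
// -1 is consistent with predict_one(), +1 reproduces the inconsistency.
inline int errors_after_training(const std::vector<Sample>& data,
                                 double bias_input, int epochs = 50) {
  std::vector<double> weights(data.front().features.size(), 0.0);
  double bias = 0.0;
  const double learning_rate = 0.5;  // error is 0 or +/-2, so steps are exact
  for (int epoch = 0; epoch < epochs; ++epoch) {
    for (const Sample& s : data) {
      const double error = s.target - predict_one(weights, bias, s.features);
      for (std::size_t j = 0; j < weights.size(); ++j) {
        weights[j] += learning_rate * error * s.features[j];
      }
      bias += learning_rate * error * bias_input;
    }
  }
  int errors = 0;
  for (const Sample& s : data) {
    if (predict_one(weights, bias, s.features) != s.target) ++errors;
  }
  return errors;
}

// Toy datasets (assumed for this sketch): logical AND and NOR on {0, 1} inputs.
inline std::vector<Sample> and_gate() {
  return {{{0.0, 0.0}, -1.0}, {{0.0, 1.0}, -1.0},
          {{1.0, 0.0}, -1.0}, {{1.0, 1.0}, 1.0}};
}

inline std::vector<Sample> nor_gate() {
  return {{{0.0, 0.0}, 1.0}, {{0.0, 1.0}, -1.0},
          {{1.0, 0.0}, -1.0}, {{1.0, 1.0}, -1.0}};
}
```

Calling errors_after_training(and_gate(), 1.0) and errors_after_training(and_gate(), -1.0) both return 0, so on the AND data the inconsistent +1 update happens to look fine. On the NOR data only the consistent -1 update reaches 0 errors; with +1 the all-zero input is never learned, because each bias update pushes the decision threshold the wrong way. In this toy setting, then, which variant looks better depends on the dataset.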

【Comments】: