How to code backpropagation (with SGD) in numpy: answers

【Question title】: How to code backpropagation (with SGD) in numpy
【Posted】: 2021-07-18 13:06:47
【Question description】:

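Here I am trying to understand neural networks by coding one from scratch (in numpy only). I got the forward pass working (using dot products), but I don't know how to do the backward pass (taking the partial derivative with respect to each trainable parameter and updating it with the SGD equation). The loss could be, for example, mean squared error. This is my code so far; the comments at the bottom of the code describe what is left to do.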

'''
I want to design a NN that has :
               input layer I of 4 neurons
               hidden layer H1 of 3 neurons
               hidden layer H2 of 3 neurons
               output layer O of 1 neuron

'''

import numpy as np

inputs = [1, 2, 3, 2.5]


# -------------- Hidden layers ---------------------------
wh1 = [[0.2, 0.8, -0.5, 1],
       [0.5, -0.91, 0.26, -0.5],
       [-0.26, -0.27, 0.17, 0.87]]
bh1 = [2, 3, 0.5]


wh2 = [[0.1, -0.14, 0.5],
       [-0.5, 0.12, -0.33],
       [-0.44, 0.73, -0.13]]
bh2 = [-1, 2, -0.5]

layer1_outputs = np.dot(wh1, np.array(inputs)) + bh1
layer2_outputs = np.dot(wh2, layer1_outputs) + bh2


# ------------ output layer ------------------------------
who = [0.1, -0.14, 0.5]
bho = [4]
layer_out = np.dot(who, layer2_outputs) + bho
# --------------------------------------------------------

print(layer_out)

true_outputs = np.sin(inputs)
# compute RMSE
# compute partial derivatives
# update weights

Architecture of the NN:

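【Comments】:

Take a look at this blog. I don't think you should ask "how do I do something" here. Make an attempt first. If you get an error or run into difficulties, then ask a question.
Stanford CS231n has a very good step-by-step example: cs231n.github.io/neural-networks-case-study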

Tags: python numpy neural-network backpropagation

【Solution 1】:

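Backpropagation in a neural network applies the chain rule of derivatives, so to implement it you need a way to carry that rule through every layer of the network. Here are my suggestions.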
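To make the chain rule concrete, here is one common way to write it out for a fully connected network. The notation is an assumption and does not come from the answer: layer $l$ computes $z^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)}$ and $a^{(l)} = f(z^{(l)})$, the last layer is $L$, the loss on one sample is $E = \tfrac{1}{2}\lVert a^{(L)} - y \rVert^2$, and $\eta$ is the learning rate.

```latex
\begin{aligned}
\delta^{(L)} &= \bigl(a^{(L)} - y\bigr) \odot f'\bigl(z^{(L)}\bigr)
  && \text{error at the output layer} \\
\delta^{(l)} &= \bigl(W^{(l+1)}\bigr)^{\top} \delta^{(l+1)} \odot f'\bigl(z^{(l)}\bigr)
  && \text{error passed back one layer} \\
\frac{\partial E}{\partial W^{(l)}} &= \delta^{(l)} \bigl(a^{(l-1)}\bigr)^{\top},
\qquad
\frac{\partial E}{\partial b^{(l)}} = \delta^{(l)}
  && \text{gradients for layer } l \\
W^{(l)} &\leftarrow W^{(l)} - \eta \, \frac{\partial E}{\partial W^{(l)}},
\qquad
b^{(l)} \leftarrow b^{(l)} - \eta \, \frac{\partial E}{\partial b^{(l)}}
  && \text{SGD update}
\end{aligned}
```

For the purely linear layers in the question, $f$ is the identity, so $f'(z) = 1$ and the $\odot f'(\cdot)$ factors drop out.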

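Create a class for your neural network, so that each task gets its own function.
Use a loop to walk through the layers of your network in reverse, from the output back to the input, and apply the chain rule to compute the partial derivatives at each layer. I am adding sample code from some old work of mine; see the GitHub repository for the full code.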
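As a rough illustration of these two suggestions, here is a minimal sketch of such a class for the 4-3-3-1 layout from the question. It is not the code from the linked repository: the class name `TinyMLP`, the tanh hidden activations, the random initialisation, the scalar target `0.5`, and the learning rate are all assumptions made for the example.

```python
import numpy as np


class TinyMLP:
    """Hypothetical 4-3-3-1 network: tanh hidden layers, linear output."""

    def __init__(self, sizes=(4, 3, 3, 1), seed=0):
        rng = np.random.default_rng(seed)
        # One weight matrix of shape (n_out, n_in) and one bias vector per layer.
        self.weights = [rng.normal(0.0, 0.5, (n_out, n_in))
                        for n_in, n_out in zip(sizes[:-1], sizes[1:])]
        self.biases = [np.zeros(n_out) for n_out in sizes[1:]]

    def forward(self, x):
        """Return the activations of every layer, the input included."""
        activations = [np.asarray(x, dtype=float)]
        last = len(self.weights) - 1
        for i, (w, b) in enumerate(zip(self.weights, self.biases)):
            z = w @ activations[-1] + b
            activations.append(z if i == last else np.tanh(z))
        return activations

    def backward(self, activations, y):
        """Gradients of 0.5 * sum((y_hat - y)**2) for all weights and biases."""
        grads_w = [None] * len(self.weights)
        grads_b = [None] * len(self.biases)
        delta = activations[-1] - y  # linear output layer: dE/dz = y_hat - y
        for i in reversed(range(len(self.weights))):
            grads_w[i] = np.outer(delta, activations[i])
            grads_b[i] = delta
            if i > 0:
                # Chain rule: back through the weights, then through tanh,
                # whose derivative is 1 - a**2.
                delta = (self.weights[i].T @ delta) * (1.0 - activations[i] ** 2)
        return grads_w, grads_b

    def sgd_step(self, x, y, lr=0.01):
        """One SGD update on a single sample; returns the loss before the update."""
        activations = self.forward(x)
        grads_w, grads_b = self.backward(activations, y)
        for i in range(len(self.weights)):
            self.weights[i] -= lr * grads_w[i]
            self.biases[i] -= lr * grads_b[i]
        return 0.5 * float(np.sum((activations[-1] - y) ** 2))


net = TinyMLP()
x = np.array([1, 2, 3, 2.5])  # inputs from the question
y = np.array([0.5])           # made-up scalar target (assumption)
losses = [net.sgd_step(x, y, lr=0.01) for _ in range(200)]
print(losses[0], losses[-1])
```

Keeping `forward`, `backward`, and `sgd_step` separate means the backward pass can be tested on its own, for instance against finite differences, before it is wired into a training loop.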

https://github.com/akash-agni/DeepLearning/blob/main/Neural_Network_From_Scratch_using_Numpy.ipynb

    def backpropogate(self, X, y):
        n_samples = len(y)
        delta_w = [0 for _ in range(len(self.layers))]  # weight gradients, one per layer
        delta_b = [0 for _ in range(len(self.layers))]  # bias gradients, one per layer
        error_o = self.layers[-1].z - y.T  # error at the output layer
        for i in reversed(range(len(self.layers) - 1)):
            # Propagate the error back through layer i+1's weights, then through layer i's activation.
            error_i = np.multiply(self.layers[i+1].weights.T.dot(error_o), self.layers[i].activation_grad())
            delta_w[i+1] = error_o.dot(self.layers[i].a.T) / n_samples  # weight gradient, averaged over the batch
            delta_b[i+1] = np.sum(error_o, axis=1, keepdims=True) / n_samples  # bias gradient, averaged over the batch
            error_o = error_i  # the previous layer's error becomes the current error for the next iteration
        delta_w[0] = error_o.dot(X) / n_samples  # gradients for the first layer, averaged like the others
        delta_b[0] = np.sum(error_o, axis=1, keepdims=True) / n_samples
        return (delta_w, delta_b)  # gradients for every layer
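A hand-written backward pass is easy to get subtly wrong, so it can help to compare one gradient entry against a finite-difference estimate before trusting the update. The sketch below does this for the exact weights and inputs from the question, keeping its purely linear layers. The scalar target `y_true = 1.0`, the step size `eps`, the learning rate `lr`, and the helper names are assumptions, and `who` is written as a 1x3 matrix so that all layers are handled the same way.

```python
import numpy as np

# Weights and inputs copied from the question; every layer is linear there.
x = np.array([1, 2, 3, 2.5])
params = {
    "wh1": np.array([[0.2, 0.8, -0.5, 1],
                     [0.5, -0.91, 0.26, -0.5],
                     [-0.26, -0.27, 0.17, 0.87]]),
    "bh1": np.array([2, 3, 0.5]),
    "wh2": np.array([[0.1, -0.14, 0.5],
                     [-0.5, 0.12, -0.33],
                     [-0.44, 0.73, -0.13]]),
    "bh2": np.array([-1, 2, -0.5]),
    "who": np.array([[0.1, -0.14, 0.5]]),
    "bho": np.array([4.0]),
}
y_true = np.array([1.0])  # made-up scalar target (assumption)


def forward(p, x):
    h1 = p["wh1"] @ x + p["bh1"]
    h2 = p["wh2"] @ h1 + p["bh2"]
    out = p["who"] @ h2 + p["bho"]
    return h1, h2, out


def loss(p, x, y):
    """Mean squared error of the network output."""
    return float(np.mean((forward(p, x)[2] - y) ** 2))


def gradients(p, x, y):
    """Analytic gradients of the MSE loss via the chain rule."""
    h1, h2, out = forward(p, x)
    d_out = 2.0 * (out - y) / out.size  # dMSE/d_out
    d_h2 = p["who"].T @ d_out           # back through the output layer
    d_h1 = p["wh2"].T @ d_h2            # back through hidden layer 2
    return {
        "who": np.outer(d_out, h2), "bho": d_out,
        "wh2": np.outer(d_h2, h1), "bh2": d_h2,
        "wh1": np.outer(d_h1, x), "bh1": d_h1,
    }


grads = gradients(params, x, y_true)

# Finite-difference estimate for a single entry of wh1.
eps = 1e-6
bumped = {name: value.copy() for name, value in params.items()}
bumped["wh1"][0, 0] += eps
numeric = (loss(bumped, x, y_true) - loss(params, x, y_true)) / eps
print(grads["wh1"][0, 0], numeric)

# One SGD step: move every parameter a small step against its gradient.
lr = 1e-3
before = loss(params, x, y_true)
for name in params:
    params[name] = params[name] - lr * grads[name]
after = loss(params, x, y_true)
print(before, after)
```

If the two printed gradient values disagree by more than a small tolerance, the backward pass is the first place to look; the same check can be repeated for any other parameter entry.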

【Discussion】: