Deriving Forward and Backward Propagation for a Neural Network

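$x_{1}$ and $x_{2}$ denote the inputs.
$w_{ij}$ denotes a weight.
$b_{ij}$ denotes a bias.
$\sigma$ denotes the activation function; the sigmoid function is used here.
$out$ denotes the network output.
$y$ denotes the true (target) value.
$\eta$ denotes the learning rate.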

Forward propagation

$$h_{1}=w_{11}x_{1}+w_{13}x_{2}+b_{11}, \qquad \alpha_{1}=\sigma(h_{1})=\frac{1}{1+e^{-h_{1}}}$$

$$h_{2}=w_{12}x_{1}+w_{14}x_{2}+b_{12}, \qquad \alpha_{2}=\sigma(h_{2})=\frac{1}{1+e^{-h_{2}}}$$

$$z=w_{21}\alpha_{1}+w_{22}\alpha_{2}+b_{21}, \qquad out=\sigma(z)=\frac{1}{1+e^{-z}}$$
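The forward pass above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the dictionary layout `p`, the function names, and the numeric parameter values are assumptions made for the example; only the formulas come from the text.

```python
import math


def sigmoid(v):
    """Logistic sigmoid: 1 / (1 + e^(-v))."""
    return 1.0 / (1.0 + math.exp(-v))


def forward(x1, x2, p):
    """Forward pass of the 2-2-1 network; p maps parameter names to floats."""
    h1 = p["w11"] * x1 + p["w13"] * x2 + p["b11"]
    a1 = sigmoid(h1)  # alpha_1
    h2 = p["w12"] * x1 + p["w14"] * x2 + p["b12"]
    a2 = sigmoid(h2)  # alpha_2
    z = p["w21"] * a1 + p["w22"] * a2 + p["b21"]
    out = sigmoid(z)
    return {"h1": h1, "a1": a1, "h2": h2, "a2": a2, "z": z, "out": out}


# Hypothetical parameter values, chosen only for illustration.
params = {
    "w11": 0.15, "w13": 0.20, "b11": 0.35,
    "w12": 0.25, "w14": 0.30, "b12": 0.35,
    "w21": 0.40, "w22": 0.45, "b21": 0.60,
}
result = forward(0.05, 0.10, params)
print(result["out"])
```

Returning the intermediate values ($h_1$, $\alpha_1$, $h_2$, $\alpha_2$, $z$) alongside $out$ is a deliberate choice: the backward pass reuses all of them.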

Loss function

$$E=\frac{1}{2}(out-y)^2$$
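As a small illustration, the loss and its derivative with respect to $out$ can be written as two functions. The function names are assumptions for this sketch.

```python
def squared_error(out, y):
    """E = 1/2 * (out - y)^2 for a single training example."""
    return 0.5 * (out - y) ** 2


def d_squared_error(out, y):
    """dE/d(out) = out - y; the 1/2 cancels the 2 from the power rule."""
    return out - y


print(squared_error(0.5, 1.0), d_squared_error(0.5, 1.0))
```

The factor $\frac{1}{2}$ does not change where the minimum lies; it only keeps the derivative free of a stray factor of 2.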

Backpropagation

Gradients for the output layer. Since $\frac{\partial E}{\partial out}=out-y$ and $\sigma'(z)=\sigma(z)(1-\sigma(z))$, the chain rule gives:

$$\bigtriangleup w_{21}=\frac{\partial E}{\partial w_{21}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial w_{21}}=(out-y)\,\sigma(z)(1-\sigma(z))\,\alpha_{1}$$

$$\bigtriangleup w_{22}=\frac{\partial E}{\partial w_{22}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial w_{22}}=(out-y)\,\sigma(z)(1-\sigma(z))\,\alpha_{2}$$

$$\bigtriangleup b_{21}=\frac{\partial E}{\partial b_{21}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial b_{21}}=(out-y)\,\sigma(z)(1-\sigma(z))$$
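One way to gain confidence in the output-layer gradients is to compare them against a central finite difference of the loss. The sketch below does this for $w_{21}$. The hidden activations `a1`, `a2`, the parameter values, and the step `eps` are hypothetical inputs chosen for the example.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def loss(a1, a2, w21, w22, b21, y):
    """E as a function of the output-layer parameters only."""
    out = sigmoid(w21 * a1 + w22 * a2 + b21)
    return 0.5 * (out - y) ** 2


def output_layer_grads(a1, a2, w21, w22, b21, y):
    """Analytic dE/dw21, dE/dw22, dE/db21 from the chain rule."""
    out = sigmoid(w21 * a1 + w22 * a2 + b21)
    delta = (out - y) * out * (1.0 - out)  # dE/dout * dout/dz
    return delta * a1, delta * a2, delta


# Hypothetical values for illustration.
a1, a2, w21, w22, b21, y = 0.6, 0.4, 0.40, 0.45, 0.60, 0.01
g_w21, g_w22, g_b21 = output_layer_grads(a1, a2, w21, w22, b21, y)

eps = 1e-6
num_w21 = (loss(a1, a2, w21 + eps, w22, b21, y)
           - loss(a1, a2, w21 - eps, w22, b21, y)) / (2 * eps)
print(g_w21, num_w21)  # the two values should agree closely
```

The shared factor `delta` is computed once and reused for all three gradients, which mirrors the common term $(out-y)\sigma(z)(1-\sigma(z))$ in the formulas.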

Update $w_{21}$, $w_{22}$, and $b_{21}$:

$$w_{21}=w_{21}-\eta \bigtriangleup w_{21}$$

$$w_{22}=w_{22}-\eta \bigtriangleup w_{22}$$

$$b_{21}=b_{21}-\eta \bigtriangleup b_{21}$$
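A single gradient-descent update of the output layer can be sketched as follows. The starting parameters (all zero), the hidden activations, the target, and the learning rate are hypothetical values picked so that the arithmetic is easy to follow by hand.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def loss(a1, a2, w21, w22, b21, y):
    out = sigmoid(w21 * a1 + w22 * a2 + b21)
    return 0.5 * (out - y) ** 2


# Hypothetical values for illustration.
a1, a2, y, eta = 0.6, 0.4, 1.0, 0.5
w21 = w22 = b21 = 0.0

loss_before = loss(a1, a2, w21, w22, b21, y)

# With all-zero parameters z = 0, so out = 0.5 and
# delta = (0.5 - 1.0) * 0.5 * 0.5 = -0.125.
out = sigmoid(w21 * a1 + w22 * a2 + b21)
delta = (out - y) * out * (1.0 - out)

# Compute every gradient first, then apply the updates.
g_w21, g_w22, g_b21 = delta * a1, delta * a2, delta
w21 -= eta * g_w21
w22 -= eta * g_w22
b21 -= eta * g_b21

loss_after = loss(a1, a2, w21, w22, b21, y)
print(w21, w22, b21, loss_before, loss_after)
```

Because the output is below the target here, the gradients are negative and the update increases all three parameters, which moves $out$ toward $y$.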

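Gradients for the hidden layer. The chain rule has to be carried all the way back from $E$ through $out$, $z$, and the hidden activation. Each gradient therefore contains the factor $\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}=(out-y)\,\sigma(z)(1-\sigma(z))$ and the weight that connects the hidden unit to the output: $\frac{\partial z}{\partial \alpha_{1}}=w_{21}$ and $\frac{\partial z}{\partial \alpha_{2}}=w_{22}$. The values of $w_{21}$ and $w_{22}$ used here are those from the forward pass, that is, before they are updated.

$$\bigtriangleup w_{12}=\frac{\partial E}{\partial w_{12}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial \alpha_{2}}\frac{\partial \alpha_{2}}{\partial h_{2}}\frac{\partial h_{2}}{\partial w_{12}}=(out-y)\,\sigma(z)(1-\sigma(z))\,w_{22}\,\sigma(h_{2})(1-\sigma(h_{2}))\,x_{1}$$

$$\bigtriangleup w_{14}=\frac{\partial E}{\partial w_{14}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial \alpha_{2}}\frac{\partial \alpha_{2}}{\partial h_{2}}\frac{\partial h_{2}}{\partial w_{14}}=(out-y)\,\sigma(z)(1-\sigma(z))\,w_{22}\,\sigma(h_{2})(1-\sigma(h_{2}))\,x_{2}$$

$$\bigtriangleup b_{12}=\frac{\partial E}{\partial b_{12}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial \alpha_{2}}\frac{\partial \alpha_{2}}{\partial h_{2}}\frac{\partial h_{2}}{\partial b_{12}}=(out-y)\,\sigma(z)(1-\sigma(z))\,w_{22}\,\sigma(h_{2})(1-\sigma(h_{2}))$$

$$\bigtriangleup w_{11}=\frac{\partial E}{\partial w_{11}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial \alpha_{1}}\frac{\partial \alpha_{1}}{\partial h_{1}}\frac{\partial h_{1}}{\partial w_{11}}=(out-y)\,\sigma(z)(1-\sigma(z))\,w_{21}\,\sigma(h_{1})(1-\sigma(h_{1}))\,x_{1}$$

$$\bigtriangleup w_{13}=\frac{\partial E}{\partial w_{13}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial \alpha_{1}}\frac{\partial \alpha_{1}}{\partial h_{1}}\frac{\partial h_{1}}{\partial w_{13}}=(out-y)\,\sigma(z)(1-\sigma(z))\,w_{21}\,\sigma(h_{1})(1-\sigma(h_{1}))\,x_{2}$$

$$\bigtriangleup b_{11}=\frac{\partial E}{\partial b_{11}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial \alpha_{1}}\frac{\partial \alpha_{1}}{\partial h_{1}}\frac{\partial h_{1}}{\partial b_{11}}=(out-y)\,\sigma(z)(1-\sigma(z))\,w_{21}\,\sigma(h_{1})(1-\sigma(h_{1}))$$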
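The hidden-layer gradients of $E$ can be checked numerically in the same spirit. The sketch below assumes the gradient is taken of the loss $E$ itself, so it applies the full chain rule: the output-layer factor $(out-y)\sigma(z)(1-\sigma(z))$, then the connecting weight ($w_{21}$ or $w_{22}$), then the local sigmoid derivative. It compares each result with a central finite difference. The parameter values, inputs, and helper names are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def forward(x1, x2, p):
    h1 = p["w11"] * x1 + p["w13"] * x2 + p["b11"]
    a1 = sigmoid(h1)
    h2 = p["w12"] * x1 + p["w14"] * x2 + p["b12"]
    a2 = sigmoid(h2)
    z = p["w21"] * a1 + p["w22"] * a2 + p["b21"]
    return a1, a2, sigmoid(z)


def loss(x1, x2, y, p):
    return 0.5 * (forward(x1, x2, p)[2] - y) ** 2


def hidden_layer_grads(x1, x2, y, p):
    """Analytic dE/dparam for the six hidden-layer parameters."""
    a1, a2, out = forward(x1, x2, p)
    delta_out = (out - y) * out * (1.0 - out)         # dE/dz
    delta_h1 = delta_out * p["w21"] * a1 * (1.0 - a1)  # dE/dh1
    delta_h2 = delta_out * p["w22"] * a2 * (1.0 - a2)  # dE/dh2
    return {
        "w11": delta_h1 * x1, "w13": delta_h1 * x2, "b11": delta_h1,
        "w12": delta_h2 * x1, "w14": delta_h2 * x2, "b12": delta_h2,
    }


def numeric_grad(name, x1, x2, y, p, eps=1e-6):
    """Central finite difference of the loss with respect to p[name]."""
    plus, minus = dict(p), dict(p)
    plus[name] += eps
    minus[name] -= eps
    return (loss(x1, x2, y, plus) - loss(x1, x2, y, minus)) / (2 * eps)


# Hypothetical values for illustration.
params = {
    "w11": 0.15, "w13": 0.20, "b11": 0.35,
    "w12": 0.25, "w14": 0.30, "b12": 0.35,
    "w21": 0.40, "w22": 0.45, "b21": 0.60,
}
x1, x2, y = 0.05, 0.10, 0.01
analytic = hidden_layer_grads(x1, x2, y, params)
for name, value in analytic.items():
    print(name, value, numeric_grad(name, x1, x2, y, params))
```

If the output-layer factor or the connecting weight were left out, the analytic and numeric columns would disagree, which makes this kind of check a quick way to catch a truncated chain rule.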

Update $w_{12}$, $w_{14}$, and $b_{12}$:

$$w_{12}=w_{12}-\eta \bigtriangleup w_{12}$$

$$w_{14}=w_{14}-\eta \bigtriangleup w_{14}$$

$$b_{12}=b_{12}-\eta \bigtriangleup b_{12}$$

Update $w_{11}$, $w_{13}$, and $b_{11}$:

$$w_{11}=w_{11}-\eta \bigtriangleup w_{11}$$

$$w_{13}=w_{13}-\eta \bigtriangleup w_{13}$$

$$b_{11}=b_{11}-\eta \bigtriangleup b_{11}$$
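Putting the pieces together, one training step consists of a forward pass, a backward pass that computes all nine gradients from the current parameters, and then the updates. The sketch below repeats that step on a single training example. The initial parameters, the example $(x_1, x_2, y)$, the learning rate, and the number of steps are all hypothetical, and `train_step` is a name made up for this illustration.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def train_step(x1, x2, y, p, eta):
    """One forward/backward/update step; returns the loss before the update."""
    # Forward pass.
    a1 = sigmoid(p["w11"] * x1 + p["w13"] * x2 + p["b11"])
    a2 = sigmoid(p["w12"] * x1 + p["w14"] * x2 + p["b12"])
    out = sigmoid(p["w21"] * a1 + p["w22"] * a2 + p["b21"])

    # Backward pass: every gradient uses the parameters from the forward pass.
    delta_out = (out - y) * out * (1.0 - out)
    delta_h1 = delta_out * p["w21"] * a1 * (1.0 - a1)
    delta_h2 = delta_out * p["w22"] * a2 * (1.0 - a2)
    grads = {
        "w21": delta_out * a1, "w22": delta_out * a2, "b21": delta_out,
        "w11": delta_h1 * x1, "w13": delta_h1 * x2, "b11": delta_h1,
        "w12": delta_h2 * x1, "w14": delta_h2 * x2, "b12": delta_h2,
    }

    # Gradient-descent update, applied only after all gradients are known.
    for name, g in grads.items():
        p[name] -= eta * g
    return 0.5 * (out - y) ** 2


# Hypothetical values for illustration.
params = {
    "w11": 0.15, "w13": 0.20, "b11": 0.35,
    "w12": 0.25, "w14": 0.30, "b12": 0.35,
    "w21": 0.40, "w22": 0.45, "b21": 0.60,
}
losses = [train_step(0.05, 0.10, 0.01, params, eta=0.5) for _ in range(1000)]
print(losses[0], losses[-1])
```

Computing all gradients before touching any parameter matters: the hidden-layer gradients depend on $w_{21}$ and $w_{22}$, so updating the output layer first would mix old and new values within one step.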
