Neural Networks: Diagram + Formulas

Define the function $\text{sigmoid}(\cdot)$:
$$ g(z) = \text{sigmoid}(z) = \left( 1 + e^{-z} \right)^{-1} $$
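
The definition above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `sigmoid` and the branch for negative inputs (an algebraically equivalent form that avoids overflow in `math.exp`) are choices made for this sketch, not something the text prescribes.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = (1 + e^{-z})^{-1}."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form e^z / (1 + e^z),
    # so that exp() is never called with a large positive argument.
    ez = math.exp(z)
    return ez / (1.0 + ez)


print(sigmoid(0.0))  # g(0) = 1 / (1 + 1) = 0.5
```

The output always lies between 0 and 1, and $g(z) + g(-z) = 1$ for every $z$.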

Layer 2 in the diagram is the hidden layer. Its neurons are computed as follows:
$$
\begin{aligned}
a_1^{(2)} &= g\left( \boldsymbol{\Theta}_{10}^{(1)} x_0 + \boldsymbol{\Theta}_{11}^{(1)} x_1 + \boldsymbol{\Theta}_{12}^{(1)} x_2 + \boldsymbol{\Theta}_{13}^{(1)} x_3 \right) \\
a_2^{(2)} &= g\left( \boldsymbol{\Theta}_{20}^{(1)} x_0 + \boldsymbol{\Theta}_{21}^{(1)} x_1 + \boldsymbol{\Theta}_{22}^{(1)} x_2 + \boldsymbol{\Theta}_{23}^{(1)} x_3 \right) \\
a_3^{(2)} &= g\left( \boldsymbol{\Theta}_{30}^{(1)} x_0 + \boldsymbol{\Theta}_{31}^{(1)} x_1 + \boldsymbol{\Theta}_{32}^{(1)} x_2 + \boldsymbol{\Theta}_{33}^{(1)} x_3 \right)
\end{aligned}
$$
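
To make the three equations concrete, here is a small sketch that evaluates them one neuron at a time. All numbers in `x` and `theta1` are hypothetical values chosen only for illustration; the names `g`, `x`, `theta1`, and `a2` are likewise assumptions of this example.

```python
import math


def g(z):
    """Sigmoid activation; plain form, not guarded against overflow."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical input vector; x[0] is the bias input x_0 = 1.
x = [1.0, 0.5, -1.0, 2.0]

# Hypothetical weights: row i holds Theta_{i0}, Theta_{i1}, Theta_{i2}, Theta_{i3}.
theta1 = [
    [0.0, 0.0, 0.0, 0.0],   # weights feeding a_1^(2)
    [0.1, 0.2, 0.3, 0.4],   # weights feeding a_2^(2)
    [-1.0, 1.0, 0.5, 0.0],  # weights feeding a_3^(2)
]

a2 = []
for row in theta1:
    # Weighted sum Theta_{i0} x_0 + ... + Theta_{i3} x_3, then the activation.
    z = sum(w * xi for w, xi in zip(row, x))
    a2.append(g(z))

print(a2)
```

Because the first row is all zeros, its weighted sum is 0 and the first neuron outputs $g(0) = 0.5$ regardless of the input.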

Collect the layer-2 neurons into a vector $\boldsymbol{a}^{(2)}$:
$$ \boldsymbol{a}^{(2)} = \left[ \begin{array}{c} a_0^{(2)} = 1 \\ a_1^{(2)} \\ a_2^{(2)} \\ a_3^{(2)} \end{array} \right] $$

Collect the input features into a vector $\boldsymbol{x}$, noting the extra default entry $x_0 = 1$:
$$ \boldsymbol{x} = \left[ \begin{array}{c} x_0 = 1 \\ x_1 \\ x_2 \\ x_3 \end{array} \right] $$
Collect the weights that follow layer $j$ into a matrix $\boldsymbol{\Theta}^{(j)}$. Its dimensions are (number of units in layer $j+1$) $\times$ (number of units in layer $j$, plus 1), where the unit counts do not include the bias unit. For example, $\boldsymbol{\Theta}^{(1)}$ below has dimensions $3 \times 4$:
$$
\boldsymbol{\Theta}^{(1)} = \left[ \begin{matrix}
\boldsymbol{\Theta}_{10}^{(1)} & \boldsymbol{\Theta}_{11}^{(1)} & \boldsymbol{\Theta}_{12}^{(1)} & \boldsymbol{\Theta}_{13}^{(1)} \\
\boldsymbol{\Theta}_{20}^{(1)} & \boldsymbol{\Theta}_{21}^{(1)} & \boldsymbol{\Theta}_{22}^{(1)} & \boldsymbol{\Theta}_{23}^{(1)} \\
\boldsymbol{\Theta}_{30}^{(1)} & \boldsymbol{\Theta}_{31}^{(1)} & \boldsymbol{\Theta}_{32}^{(1)} & \boldsymbol{\Theta}_{33}^{(1)}
\end{matrix} \right]
$$
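
The dimension rule can be checked with a short helper. This is only a sketch: `theta_shape` and `layer_sizes` are hypothetical names, and the sizes assumed for layer 3 and the output layer are illustrative guesses, since the text only fixes the first two layers (3 input features, 3 hidden units).

```python
def theta_shape(units_next, units_current):
    """Shape of Theta^(j): (units in layer j+1) x (units in layer j, plus 1).

    Both counts exclude the bias unit; the +1 column holds the bias weights.
    """
    return (units_next, units_current + 1)


# Unit counts per layer, bias units excluded.
# The last two entries are assumptions made for this example.
layer_sizes = [3, 3, 3, 1]

shapes = [
    theta_shape(layer_sizes[j + 1], layer_sizes[j])
    for j in range(len(layer_sizes) - 1)
]
print(shapes[0])  # (3, 4), matching the 3 x 4 matrix Theta^(1)
```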

Continuing in the same way, we obtain the following formulas, where the 1 is the bias term that is present by default.
$$ \boldsymbol{a}^{(2)} = \left[ \begin{array}{c} 1 \\ g\left( \boldsymbol{\Theta}^{(1)} \boldsymbol{x} \right) \end{array} \right] $$

$$ \boldsymbol{a}^{(3)} = \left[ \begin{array}{c} 1 \\ g\left( \boldsymbol{\Theta}^{(2)} \boldsymbol{a}^{(2)} \right) \end{array} \right] $$

$$ \boldsymbol{y} = g\left( \boldsymbol{\Theta}^{(3)} \boldsymbol{a}^{(3)} \right) $$
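
The layer-by-layer formulas above amount to a simple loop: prepend the bias 1, multiply by the weight matrix, apply $g$ element-wise, and repeat. The following is a minimal sketch in plain Python, assuming weight matrices stored as lists of rows; the helper names `matvec` and `forward` and the all-zero example weights are hypothetical.

```python
import math


def g(z):
    """Sigmoid activation; plain form, not guarded against overflow."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(theta, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    for row in theta:
        if len(row) != len(v):
            raise ValueError("each row of theta must have len(v) entries")
    return [sum(w * vi for w, vi in zip(row, v)) for row in theta]


def forward(thetas, features):
    """Forward propagation; `features` holds x_1..x_n without the bias x_0."""
    a = list(features)
    for theta in thetas:
        a_with_bias = [1.0] + a  # prepend the bias unit before each layer
        a = [g(z) for z in matvec(theta, a_with_bias)]
    return a  # no bias is added to the output layer


# Hypothetical all-zero weights with shapes 3x4, 3x4, 1x4.
thetas = [
    [[0.0] * 4 for _ in range(3)],
    [[0.0] * 4 for _ in range(3)],
    [[0.0] * 4],
]
y = forward(thetas, [0.5, -1.0, 2.0])
print(y)  # every weighted sum is 0, so the output is [g(0)] = [0.5]
```

With real weights the structure stays the same; only the contents of `thetas` change.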

Note that the hidden 1 is added to the current layer only when the next layer is being computed. In other words, $\boldsymbol{a}^{(2)}$ has two meanings. The first is the vector shown in the diagram:
$$ \boldsymbol{a}^{(2)} = \left[ \begin{array}{c} a_1^{(2)} \\ a_2^{(2)} \\ a_3^{(2)} \end{array} \right] = g\left( \boldsymbol{\Theta}^{(1)} \boldsymbol{x} \right) $$

The second is the same vector with the hidden bias 1 prepended, which is the form used to compute the neurons of the next layer:

$$ \boldsymbol{a}^{(2)} = \left[ \begin{array}{c} a_0^{(2)} = 1 \\ a_1^{(2)} \\ a_2^{(2)} \\ a_3^{(2)} \end{array} \right] = \left[ \begin{array}{c} 1 \\ g\left( \boldsymbol{\Theta}^{(1)} \boldsymbol{x} \right) \end{array} \right] $$
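
One way to keep the two meanings apart in code is to store only the activations shown in the diagram and build the bias-extended vector on demand. The sketch below assumes this convention; `with_bias` is a hypothetical helper and the activation values are made up for illustration.

```python
def with_bias(activations):
    """Return a new list with the bias unit 1 prepended; the input is unchanged."""
    return [1.0] + list(activations)


# Hypothetical layer-2 activations as drawn in the diagram (3 entries).
a2 = [0.2, 0.7, 0.5]

# Bias-extended form used only as input to the next layer (4 entries).
a2_ext = with_bias(a2)

print(len(a2), len(a2_ext))  # 3 4
```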
