The architecture of this program is almost identical to that of the SGDvsBatch.m file in the "Comparison of the SGD and the Batch" section of Chapter 2.

clear all

X = [ 0 0 1;
      0 1 1;
      1 0 1;
      1 1 1;
    ];

D = [ 0 0 1 1];

E1 = zeros(1000, 1);
E2 = zeros(1000, 1);

W11 = 2*rand(4, 3) - 1;   % Cross entropy
W12 = 2*rand(1, 4) - 1;   %
W21 = W11;                % Sum of squared error
W22 = W12;                %

for epoch = 1:1000
  [W11 W12] = BackpropCE(W11, W12, X, D);
  [W21 W22] = BackpropXOR(W21, W22, X, D);

  es1 = 0;
  es2 = 0;
  N   = 4;
  for k = 1:N
    x = X(k, :)';
    d = D(k);

    v1  = W11*x;
    y1  = Sigmoid(v1);
    v   = W12*y1;
    y   = Sigmoid(v);
    es1 = es1 + (d - y)^2;

    v1  = W21*x;
    y1  = Sigmoid(v1);
    v   = W22*y1;
    y   = Sigmoid(v);
    es2 = es2 + (d - y)^2;
  end
  E1(epoch) = es1 / N;
  E2(epoch) = es2 / N;
end

plot(E1, 'r')
hold on
plot(E2, 'b:')
xlabel('Epoch')
ylabel('Average of Training error')
legend('Cross Entropy', 'Sum of Squared Error')

This program calls the BackpropCE and BackpropXOR functions and trains each neural network 1,000 times.
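BackpropCE and BackpropXOR are defined in earlier sections of the book and are not reproduced here. As a rough guide to what each one computes, the following is a minimal Python sketch of the two training rules, assuming sigmoid activations, per-sample SGD updates, and the learning rate alpha = 0.9 used elsewhere in the book; the function names `backprop_ce` and `backprop_sse` and the vector form of the output weights are this sketch's own choices, not the book's.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_ce(W1, W2, X, D, alpha=0.9):
    """One SGD epoch with the cross-entropy cost (BackpropCE-style).

    W1 is the 4x3 hidden-layer weight matrix; W2 holds the 1x4 output
    weights as a length-4 vector for convenience.
    """
    for k in range(X.shape[0]):
        x, d = X[k], D[k]
        y1 = sigmoid(W1 @ x)              # hidden-layer output, shape (4,)
        y = sigmoid(W2 @ y1)              # scalar network output
        delta = d - y                     # CE cancels the sigmoid derivative
        delta1 = y1 * (1 - y1) * (W2 * delta)  # back-propagated hidden delta
        W1 = W1 + alpha * np.outer(delta1, x)
        W2 = W2 + alpha * delta * y1
    return W1, W2

def backprop_sse(W1, W2, X, D, alpha=0.9):
    """One SGD epoch with the sum-of-squared-error cost (BackpropXOR-style)."""
    for k in range(X.shape[0]):
        x, d = X[k], D[k]
        y1 = sigmoid(W1 @ x)
        y = sigmoid(W2 @ y1)
        delta = y * (1 - y) * (d - y)     # SSE keeps the sigmoid derivative
        delta1 = y1 * (1 - y1) * (W2 * delta)
        W1 = W1 + alpha * np.outer(delta1, x)
        W2 = W2 + alpha * delta * y1
    return W1, W2
```

The only difference between the two functions is the output delta: the cross-entropy cost yields `d - y` directly, while the sum-of-squared-error cost multiplies in the extra factor `y * (1 - y)`.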

At every epoch, the squared sum of the output errors (es1 and es2) is calculated for each neural network, and their averages (E1 and E2) are computed.

W11, W12, W21, and W22 are the weight matrices of the respective neural networks.

Once the 1,000 training runs have been completed, the average errors are compared over the epochs on the graph.

As Figure 3-14 shows, the cross entropy-driven training reduces the training error at a much faster rate.

Figure 3-14. Cross entropy-driven training reduces the training error at a much faster rate

In other words, the cross entropy-driven learning rule yields a faster learning process.
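The speed difference comes from the output delta. With a sigmoid output, the sum-of-squared-error delta carries the factor y*(1 - y), which approaches zero whenever the output saturates near 0 or 1, even when the output is badly wrong; the cross-entropy delta has no such factor. A small illustrative calculation (the specific values are made up for illustration):

```python
# A badly wrong, saturated output: y is near 1 while the target is 0.
d, y = 0.0, 0.999

delta_ce = d - y                    # cross-entropy output delta
delta_sse = y * (1 - y) * (d - y)   # SSE output delta, crushed by y*(1-y)

print(abs(delta_ce) / abs(delta_sse))  # roughly a 1000x larger update step
```

The more confidently wrong the network is, the larger this ratio becomes, which is why the cross-entropy cost escapes saturated, incorrect outputs so much faster.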

This is the reason most cost functions for Deep Learning employ the cross entropy function.

This completes the contents for the back-propagation algorithm.

— Translated from Matlab Deep Learning by Phil Kim

