This series adds personal study notes and supplementary derivations on top of the original course; corrections and feedback are welcome. After working through Andrew Ng's course, I organized it into text for easier review and reference. Since I have been studying English, the series is primarily in English, and I suggest readers also rely mainly on the English, using the Chinese as support, as preparation for later reading academic papers in related fields. - ZJ
Please credit the author and source when reposting: ZJ, WeChat official account "SelfImprovementLab"
知乎:https://zhuanlan.zhihu.com/c_147249273
CSDN:http://blog.csdn.net/junjun_zhao/article/details/79025300
4.3 Forward propagation in a deep neural network
(Subtitle source: NetEase Cloud Classroom)
In the last video we discussed what a deep multi-layer neural network is, and also talked about the notation we use to describe such networks. In this video you'll see how to perform forward propagation in a deep network. As usual, let's first go over what forward propagation looks like for a single training example x, and later on we'll talk about the vectorized version, where you carry out forward propagation on the entire training set at the same time. Given a single training example x, here's how you compute the activations of the first layer. For this first layer you compute:
Key point: writing the input $x$ as $a^{[0]}$, the first layer computes

$$z^{[1]} = W^{[1]} a^{[0]} + b^{[1]}, \qquad a^{[1]} = g^{[1]}\left(z^{[1]}\right)$$
If you do that, you've now computed the activations of layer 1. How about layer 2? For that layer, you would then compute:
Key point: for layer 2,

$$z^{[2]} = W^{[2]} a^{[1]} + b^{[2]}, \qquad a^{[2]} = g^{[2]}\left(z^{[2]}\right)$$

General rule: for any layer $l$,

$$z^{[l]} = W^{[l]} a^{[l-1]} + b^{[l]}, \qquad a^{[l]} = g^{[l]}\left(z^{[l]}\right)$$
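The layer-by-layer recurrence for a single example can be sketched in a few lines of NumPy. This is a minimal illustration, not the course's assignment code; the layer sizes and function names here are hypothetical.

```python
import numpy as np

def layer_forward(a_prev, W, b, g):
    """One forward-propagation step for a single example.

    a_prev : activations of the previous layer, shape (n_prev, 1)
    W      : this layer's weight matrix, shape (n, n_prev)
    b      : this layer's bias vector, shape (n, 1)
    g      : this layer's activation function
    """
    z = W @ a_prev + b   # z[l] = W[l] a[l-1] + b[l]
    a = g(z)             # a[l] = g[l](z[l])
    return a

# Hypothetical sizes: 3 input features, 4 units in layer 1
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 1))      # a[0] = x
W1 = rng.standard_normal((4, 3))
b1 = np.zeros((4, 1))
relu = lambda z: np.maximum(0, z)

a1 = layer_forward(x, W1, b1, relu)
print(a1.shape)   # (4, 1)
```

Repeating `layer_forward` with each layer's parameters and activation function carries the example through the whole network.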
So we've done all this for a single training example. How about doing it in a vectorized way for the whole training set at the same time? The equations look quite similar to before; for the first layer you would have

$$Z^{[1]} = W^{[1]} X + b^{[1]}, \qquad A^{[1]} = g^{[1]}\left(Z^{[1]}\right)$$

where $X$ stacks the training examples as columns, so $X = A^{[0]}$.
So just to summarize our notation: it allows us to replace lowercase z and a with their uppercase counterparts, and that gives you the vectorized version of forward propagation, which you carry out on the entire training set at a time, where A[0] is X. Now if you look at this implementation of vectorization, it looks like there is going to be a for loop here, right? For l equals 1 through capital L, you compute the activations for layer 1, then layer 2, then layer 3, then layer 4. So it seems there is a for loop here, and I know that when implementing neural networks we usually want to get rid of explicit for loops, but this is one place where I don't think there is any way to implement it other than with an explicit for loop. So when implementing forward propagation, it is perfectly OK to have a for loop that computes the activations for layer 1, then layer 2, then layer 3, then layer 4. I don't think there is any way to do this without a for loop that goes from 1 to capital L, the total number of layers in the neural network, so this is a place where an explicit for loop is perfectly fine.
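The vectorized pass with its explicit for loop over layers can be sketched as follows. The helper name `forward_propagation` and the network sizes are illustrative assumptions, not the lecture's code.

```python
import numpy as np

def forward_propagation(X, params, activations):
    """Vectorized forward pass over the whole training set.

    X           : input matrix, shape (n_x, m), one column per example
    params      : list of (W, b) pairs for layers 1..L
    activations : list of activation functions g[1]..g[L]
    """
    A = X                              # A[0] = X
    # An explicit for loop over layers 1..L is fine here.
    for (W, b), g in zip(params, activations):
        Z = W @ A + b                  # Z[l] = W[l] A[l-1] + b[l]
        A = g(Z)                       # A[l] = g[l](Z[l])
    return A

# Hypothetical 2-layer network, 3 -> 4 -> 1, on m = 5 examples
rng = np.random.default_rng(1)
X = rng.standard_normal((3, 5))
relu = lambda z: np.maximum(0, z)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
params = [
    (rng.standard_normal((4, 3)), np.zeros((4, 1))),
    (rng.standard_normal((1, 4)), np.zeros((1, 1))),
]

AL = forward_propagation(X, params, [relu, sigmoid])
print(AL.shape)   # (1, 5): one prediction per example
```

Note that the bias vectors of shape (n, 1) broadcast across the m columns, which is what lets one set of parameters serve every example at once.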
So that's it for the notation for deep neural networks, as well as how to do forward propagation in these networks. If the pieces we've seen so far look a little bit familiar, that's because what we've done is take a piece very similar to the neural network with a single hidden layer and just repeat it more times. It turns out that when implementing a deep neural network, one way to increase your odds of a bug-free implementation is to think very systematically and carefully about the matrix dimensions you're working with. When I'm trying to debug my own code, I'll often pull out a piece of paper and think carefully through the dimensions of the matrices I'm working with. You'll see how to do that in the next video.
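The dimension bookkeeping mentioned above can be checked mechanically: for a layer $l$ with $n^{[l]}$ units, $W^{[l]}$ is $(n^{[l]}, n^{[l-1]})$, $b^{[l]}$ is $(n^{[l]}, 1)$, and $Z^{[l]}$, $A^{[l]}$ come out as $(n^{[l]}, m)$. A small sketch, with hypothetical layer sizes:

```python
# Hypothetical 3-layer network: n[0]=3 inputs, then 4, 4, 1 units
layer_dims = [3, 4, 4, 1]
m = 7                     # number of training examples

shapes = {}
for l in range(1, len(layer_dims)):
    shapes[l] = {
        "W": (layer_dims[l], layer_dims[l - 1]),  # (n[l], n[l-1])
        "b": (layer_dims[l], 1),                  # (n[l], 1)
        "Z": (layer_dims[l], m),                  # (n[l], m)
    }
    print(f"layer {l}: W{shapes[l]['W']}, b{shapes[l]['b']}, Z{shapes[l]['Z']}")
```

Comparing a table like this against the actual arrays in your code is one concrete way to do the paper-and-pencil dimension check described above.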
Key takeaways:

General rule (single example, $l = 1, \dots, L$, with $a^{[0]} = x$):

$$z^{[l]} = W^{[l]} a^{[l-1]} + b^{[l]}, \qquad a^{[l]} = g^{[l]}\left(z^{[l]}\right)$$

Vectorized over the whole training set (with $A^{[0]} = X$):

$$Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}, \qquad A^{[l]} = g^{[l]}\left(Z^{[l]}\right)$$