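While reading Dive into Deep Learning (《动手学深度学习》), I came across its simplified derivation of the gradients of an RNN. For the gradient with respect to the hidden state, the authors only say "expanding the recurrence above" and then state the result directly. Below I fill in the intermediate steps.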


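$$
\begin{aligned}
\frac{\partial L}{\partial h_t}
&= W_{hh}^\top \frac{\partial L}{\partial h_{t+1}} + W_{qh}^\top \frac{\partial L}{\partial O_t} \\
&= W_{hh}^\top \left( W_{hh}^\top \frac{\partial L}{\partial h_{t+2}} + W_{qh}^\top \frac{\partial L}{\partial O_{t+1}} \right) + W_{qh}^\top \frac{\partial L}{\partial O_t} \\
&= (W_{hh}^\top)^2 \frac{\partial L}{\partial h_{t+2}} + W_{hh}^\top W_{qh}^\top \frac{\partial L}{\partial O_{t+1}} + W_{qh}^\top \frac{\partial L}{\partial O_t} \\
&= W_{hh}^\top \left( W_{hh}^\top \left( W_{hh}^\top \frac{\partial L}{\partial h_{t+3}} + W_{qh}^\top \frac{\partial L}{\partial O_{t+2}} \right) + W_{qh}^\top \frac{\partial L}{\partial O_{t+1}} \right) + W_{qh}^\top \frac{\partial L}{\partial O_t} \\
&= (W_{hh}^\top)^3 \frac{\partial L}{\partial h_{t+3}} + (W_{hh}^\top)^2 W_{qh}^\top \frac{\partial L}{\partial O_{t+2}} + W_{hh}^\top W_{qh}^\top \frac{\partial L}{\partial O_{t+1}} + W_{qh}^\top \frac{\partial L}{\partial O_t} \\
&= \cdots \\
&= (W_{hh}^\top)^{T-t} \frac{\partial L}{\partial h_T} + \sum_{i=t+1}^{T} (W_{hh}^\top)^{T-i} W_{qh}^\top \frac{\partial L}{\partial O_{T+t-i}}
\end{aligned}
$$

At the final time step, $h_T$ affects $L$ only through $O_T$, so

$$
\frac{\partial L}{\partial h_T} = W_{qh}^\top \frac{\partial L}{\partial O_T}
$$

Substituting this into the expression above turns the first term into $(W_{hh}^\top)^{T-t} W_{qh}^\top \frac{\partial L}{\partial O_T}$, which is exactly the $i = t$ term of the sum, and we obtain:

$$
\frac{\partial L}{\partial h_t} = \sum_{i=t}^{T} (W_{hh}^\top)^{T-i} W_{qh}^\top \frac{\partial L}{\partial O_{T+t-i}}
$$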
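As a sanity check on the expansion, the sketch below compares the backward recursion with the closed-form sum on small random inputs. It is only an illustration under assumptions: it uses plain Python lists, the helper names (`matvec`, `transpose`, `add`, `grad_h_recursive`, `grad_h_closed_form`) are made up for this post, the matrix sizes and random values are arbitrary, and `dL_do[t - 1]` stands for $\partial L / \partial O_t$, which is simply taken as given.

```python
import random

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def add(x, y):
    """Element-wise sum of two vectors."""
    return [a + b for a, b in zip(x, y)]

def grad_h_recursive(W_hh, W_qh, dL_do):
    """Recursion: dL/dh_t = W_hh^T dL/dh_{t+1} + W_qh^T dL/dO_t,
    with the boundary condition dL/dh_T = W_qh^T dL/dO_T."""
    T = len(dL_do)
    W_hh_T, W_qh_T = transpose(W_hh), transpose(W_qh)
    grads = [None] * T
    grads[T - 1] = matvec(W_qh_T, dL_do[T - 1])
    for t in range(T - 2, -1, -1):  # 0-based time index, walking backwards
        grads[t] = add(matvec(W_hh_T, grads[t + 1]), matvec(W_qh_T, dL_do[t]))
    return grads

def grad_h_closed_form(W_hh, W_qh, dL_do, t):
    """Closed form: sum_{i=t}^{T} (W_hh^T)^(T-i) W_qh^T dL/dO_{T+t-i}.
    Here t is 1-based, matching the formula in the text."""
    T = len(dL_do)
    W_hh_T, W_qh_T = transpose(W_hh), transpose(W_qh)
    total = [0.0] * len(W_hh)
    for i in range(t, T + 1):
        # dL/dO_{T+t-i}, converted from 1-based to 0-based indexing
        term = matvec(W_qh_T, dL_do[T + t - i - 1])
        for _ in range(T - i):  # apply W_hh^T exactly (T - i) times
            term = matvec(W_hh_T, term)
        total = add(total, term)
    return total

# Arbitrary small sizes: hidden size h, output size q, sequence length T.
random.seed(0)
h, q, T = 3, 2, 5
W_hh = [[random.uniform(-1, 1) for _ in range(h)] for _ in range(h)]
W_qh = [[random.uniform(-1, 1) for _ in range(h)] for _ in range(q)]
dL_do = [[random.uniform(-1, 1) for _ in range(q)] for _ in range(T)]

recursive = grad_h_recursive(W_hh, W_qh, dL_do)
max_diff = max(
    abs(a - b)
    for t in range(1, T + 1)
    for a, b in zip(recursive[t - 1], grad_h_closed_form(W_hh, W_qh, dL_do, t))
)
print(max_diff)  # largest gap between the two computations over all t
```

If the expansion is right, the printed gap should be at the level of floating-point rounding for every time step. A scalar case is also easy to check by hand: with $W_{hh} = 2$, $W_{qh} = 1$, $T = 3$ and every $\partial L / \partial O_t = 1$, both forms give $\partial L / \partial h_1 = 4 + 2 + 1 = 7$.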