Recurrent Neural Networks: Process Sequences

cs231n-notes-Lecture-10:Recurrent Neural Networks

  • one to one: Vanilla Neural Networks, fixed-size input to fixed-size output
  • one to many: e.g. Image Captioning, image -> sequence of words
  • many to one: e.g. Sentiment Classification, sequence of words -> sentiment
  • many to many: e.g. Machine Translation, sequence of words -> sequence of words

(Vanilla) Recurrent Neural Network

h_t = f(h_{t-1}, x_t)
h_t = tanh(W_h h_{t-1} + W_x x_t)
y_t = W_y h_t
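The recurrence above can be sketched in NumPy. The sizes (hidden dim 3, input dim 2, output dim 4) and the initializations are arbitrary choices for illustration, not part of the lecture:

```python
import numpy as np

def rnn_step(h_prev, x, W_h, W_x, b):
    """One vanilla RNN step: h_t = tanh(W_h h_{t-1} + W_x x_t + b)."""
    return np.tanh(W_h @ h_prev + W_x @ x + b)

# Hypothetical sizes: hidden dim 3, input dim 2, output dim 4.
rng = np.random.default_rng(0)
W_h = rng.normal(size=(3, 3)) * 0.1   # same W_h reused at every time step
W_x = rng.normal(size=(3, 2)) * 0.1
W_y = rng.normal(size=(4, 3)) * 0.1   # output projection y_t = W_y h_t
b = np.zeros(3)

h = np.zeros(3)                        # initial hidden state
for x in rng.normal(size=(5, 2)):      # unroll over a length-5 sequence
    h = rnn_step(h, x, W_h, W_x, b)
y = W_y @ h
```

Note the same weight matrices are applied at every time step; only the hidden state changes.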

Truncated backpropagation through time

  • Run forward and backward through chunks of the sequence instead of whole sequence
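A minimal sketch of the chunking idea, using a toy scalar step function (the step, chunk size, and sequence here are assumptions for illustration). The forward pass is identical to running the full sequence; only backpropagation would be cut at chunk boundaries:

```python
import numpy as np

def truncated_forward(seq, h0, step, chunk_size):
    """Process a long sequence in fixed-size chunks, carrying the hidden
    state across chunk boundaries. In a framework with autograd, the
    carried state would be detached at each boundary so backprop only
    runs within one chunk."""
    h = h0
    for start in range(0, len(seq), chunk_size):
        for x in seq[start:start + chunk_size]:
            h = step(h, x)
        # in a real training loop: compute the loss on this chunk,
        # backprop through the chunk only, update weights, then detach h
        h = h.copy()   # stand-in for detaching the gradient
    return h

# Hypothetical toy step function.
step = lambda h, x: np.tanh(0.5 * h + x)
seq = np.linspace(-1, 1, 10)
h_trunc = truncated_forward(seq, np.zeros(1), step, chunk_size=4)
```

Because truncation only changes where gradients stop, the final hidden state matches a full-sequence forward pass exactly.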


Image Captioning with Attention


  • The CNN produces L feature vectors, each D-dimensional, one per spatial location.
  • At each step, the RNN also outputs a distribution over the L locations, giving the attention weight for each of the L positions in the image.
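A sketch of the soft-attention readout implied above: per-location scores are turned into a distribution with a softmax, and the context is the weighted sum of the L feature vectors. The sizes (L=49, D=512) and the random scores standing in for the RNN's output are assumptions:

```python
import numpy as np

def soft_attention(features, scores):
    """Soft attention over L locations: softmax the scores into weights,
    then return the weighted sum of the L feature vectors as the context."""
    a = np.exp(scores - scores.max())   # subtract max for numerical stability
    a /= a.sum()                        # attention distribution over L locations
    z = a @ features                    # context vector of dimension D
    return a, z

rng = np.random.default_rng(0)
features = rng.normal(size=(49, 512))  # hypothetical: L=49 locations, D=512
scores = rng.normal(size=49)           # stand-in for scores from the RNN state
a, z = soft_attention(features, scores)
```

The context vector z is then fed into the next RNN step along with the previous word.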

Long Short Term Memory (LSTM)


Differences that avoid vanishing gradients:

  • Backpropagation from c_t to c_{t-1} involves only an elementwise multiplication by the forget gate f, with no matrix multiply by W.
  • f is different at every time step, whereas the vanilla RNN multiplies by the same matrix W at every step.
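The cell-state update described above can be sketched as follows; the sizes and the stacked-gate weight layout are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(c_prev, h_prev, x, W, b):
    """One LSTM step. W maps the concatenated [h_prev; x] to the four
    stacked gates i, f, o, g. Note c_t = f * c_{t-1} + i * g: the cell
    state is updated with elementwise products only, so there is no
    matrix multiply on the path from c_{t-1} to c_t."""
    H = c_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    i = sigmoid(z[:H])          # input gate
    f = sigmoid(z[H:2*H])       # forget gate (different at every step)
    o = sigmoid(z[2*H:3*H])     # output gate
    g = np.tanh(z[3*H:])        # candidate cell update
    c = f * c_prev + i * g      # elementwise "gradient highway"
    h = o * np.tanh(c)
    return c, h

# Hypothetical sizes: hidden dim 4, input dim 3.
rng = np.random.default_rng(0)
H, D = 4, 3
W = rng.normal(size=(4 * H, H + D)) * 0.1
b = np.zeros(4 * H)
c, h = lstm_step(np.zeros(H), np.zeros(H), rng.normal(size=D), W, b)
```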
