Paper Notes: Sequence to Sequence Learning with Neural Networks

The overall idea is the same as the RNN encoder-decoder; the main difference is that both networks are implemented with LSTMs.

The paper highlights three important points:

1) The encoder and the decoder are two separate LSTM models.

2) A deep LSTM outperforms a shallow one; the paper uses a 4-layer LSTM.

3) In practice, reversing the input sentence before training was found to improve results. So, for example, instead of mapping the sentence a,b,c to the sentence α,β,γ, the LSTM is asked to map c,b,a to α,β,γ, where α,β,γ is the translation of a,b,c. This way, a is in close proximity to α, b is fairly close to β, and so on, a fact that makes it easy for SGD to “establish communication” between the input and the output.
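The input-reversal trick in point 3 is purely a data-preparation step, independent of the model itself. A minimal sketch of what it does to one training pair (the function name is hypothetical, not from the paper):

```python
def make_training_pair(src_tokens, tgt_tokens):
    """Prepare one (source, target) training pair as in the paper:
    the source sentence is reversed, the target is left unchanged."""
    return list(reversed(src_tokens)), list(tgt_tokens)

# Mapping a,b,c -> alpha,beta,gamma becomes c,b,a -> alpha,beta,gamma,
# so the first source word ends up adjacent to the first target word.
src, tgt = make_training_pair(["a", "b", "c"], ["alpha", "beta", "gamma"])
print(src)  # ['c', 'b', 'a']
print(tgt)  # ['alpha', 'beta', 'gamma']
```

Only the source side is reversed; reversing the target as well would cancel out the short dependency between the first source word and the first target word that the trick is meant to create.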
