Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

2019-07-15 22:23:02

 

Paper: https://arxiv.org/pdf/1801.01290.pdf or Updated Version: https://arxiv.org/pdf/1812.05905.pdf

Project: https://sites.google.com/view/soft-actor-critic or https://sites.google.com/view/sac-and-applications/

TensorFlow: https://github.com/haarnoja/sac

PyTorch: https://github.com/vitchyr/rlkit

Demo video: https://www.youtube.com/channel/UCxXt8Br3-wyluz9Q08-fsaA

Good Related Blog: https://zhuanlan.zhihu.com/p/70360272

 

==== Related Video Tutorials (A2C, A3C):

A brief review of Actor-Critic Algorithms:   https://www.youtube.com/watch?v=aODdNpihRwM 

CS885 Lecture 7b: Actor Critic:        https://www.youtube.com/watch?v=5Ke-d1Itk3k 

DRL Lecture 6: Actor-Critic:          https://www.youtube.com/watch?v=j82QLgfhFiY&t=27s

Build an A2C agent that learns to play Sonic with Tensorflow (tutorial):   https://www.youtube.com/watch?v=GCfUdkCL7FQ

Reinforcement Learning 6: Policy Gradients and Actor Critics (DeepMind):    https://www.youtube.com/watch?v=bRfUxQs6xIM&t=27s

Actor Critic (A3C) Tutorial:         https://www.youtube.com/watch?v=O5BlozCJBSE 

Actor Critic Algorithms:            https://www.youtube.com/watch?v=w_3mmm0P0j8&t=2s 
