Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

2019-07-15 22:23:02

Paper: https://arxiv.org/pdf/1801.01290.pdf (updated version: https://arxiv.org/pdf/1812.05905.pdf)

Project: https://sites.google.com/view/soft-actor-critic or https://sites.google.com/view/sac-and-applications/

TensorFlow: https://github.com/haarnoja/sac

PyTorch: https://github.com/vitchyr/rlkit

Demo video: https://www.youtube.com/channel/UCxXt8Br3-wyluz9Q08-fsaA

Related blog post: https://zhuanlan.zhihu.com/p/70360272

==== Related Video Tutorials (A2C, A3C):

A brief review of Actor-Critic Algorithms: https://www.youtube.com/watch?v=aODdNpihRwM

CS885 Lecture 7b: Actor Critic: https://www.youtube.com/watch?v=5Ke-d1Itk3k

DRL Lecture 6: Actor-Critic: https://www.youtube.com/watch?v=j82QLgfhFiY&t=27s

Build an A2C agent that learns to play Sonic with Tensorflow (tutorial): https://www.youtube.com/watch?v=GCfUdkCL7FQ

Reinforcement Learning 6: Policy Gradients and Actor Critics (DeepMind): https://www.youtube.com/watch?v=bRfUxQs6xIM&t=27s

Actor Critic (A3C) Tutorial: https://www.youtube.com/watch?v=O5BlozCJBSE

Actor Critic Algorithms: https://www.youtube.com/watch?v=w_3mmm0P0j8&t=2s