Algorithm introductions:
1. Course tutorial (two sessions): Introduction to Bandits: Algorithms and Theory
http://techtalks.tv/talks/54451/
http://techtalks.tv/talks/54455/
2. Blog post on the multi-armed bandit
https://mpatacchiola.github.io/blog/2017/08/14/dissecting-reinforcement-learning-6.html
Toolboxes:
1. Project details for pymabandits
http://mloss.org/software/view/415/
2. Multi-Armed Bandit project (version 0.2, 2005), C#
http://bandit.sourceforge.net/
3. banditlib (GitHub, C++)
https://github.com/jkomiyama/banditlib
This author also maintains two other bandit algorithm libraries.
The algorithms are not speed-optimized; the library supports a Linux/GNU C++ environment and does not support Windows or macOS.
- Arms:
- Arms with binary (Bernoulli) and normal reward distributions are implemented.
- Policies:
- DMED for binary rewards [1]
- Epsilon-Greedy
- KL-UCB [2]
- MOSS [3]
- Thompson sampling for binary rewards [4]
- UCB [5]
- UCB-V [6]
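To make the policy list above concrete, here is a minimal sketch of one of the listed policies, UCB1, on Bernoulli arms. This is an illustrative stand-alone implementation, not banditlib's C++ API; the arm means, horizon, and seed are made-up example values.

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Minimal UCB1 simulation on Bernoulli arms.

    means: true success probability of each arm (unknown to the policy).
    Returns the number of pulls per arm after `horizon` rounds.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k    # pulls per arm
    values = [0.0] * k  # empirical mean reward per arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # play each arm once to initialize
        else:
            # UCB index = empirical mean + exploration bonus
            arm = max(range(k),
                      key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # running mean
    return counts

counts = ucb1([0.2, 0.5, 0.8], horizon=2000)
```

With a long enough horizon, the arm with the highest true mean (here 0.8) accumulates most of the pulls while the others are sampled only logarithmically often.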
4. Bandits
https://github.com/bgalbraith/bandits
Python library for Multi-Armed Bandits
Implements the following algorithms:
- Epsilon-Greedy
- UCB1
- Softmax
- Thompson Sampling (Bayesian)
- Bernoulli, Binomial <=> Beta Distributions
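The pairing of Bernoulli rewards with Beta distributions noted above is what makes Bayesian Thompson Sampling simple: the Beta prior is conjugate to the Bernoulli likelihood. A minimal sketch (illustrative only, not the bgalbraith/bandits API; arm means and horizon are made-up):

```python
import random

def thompson_bernoulli(means, horizon, seed=0):
    """Beta-Bernoulli Thompson sampling with a uniform Beta(1,1) prior.

    Returns the number of pulls per arm after `horizon` rounds.
    """
    rng = random.Random(seed)
    k = len(means)
    alpha = [1.0] * k  # 1 + successes observed on each arm
    beta = [1.0] * k   # 1 + failures observed on each arm
    counts = [0] * k
    for _ in range(horizon):
        # Sample a plausible mean from each arm's Beta posterior,
        # then play the arm whose sample is largest.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(k)]
        arm = samples.index(max(samples))
        reward = 1 if rng.random() < means[arm] else 0
        alpha[arm] += reward       # conjugate posterior update
        beta[arm] += 1 - reward
        counts[arm] += 1
    return counts

counts = thompson_bernoulli([0.2, 0.5, 0.8], horizon=2000)
```

Because each update only increments one of two counters, the posterior never needs to be represented explicitly; this is the main practical appeal of the Bernoulli/Beta pairing.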
5. libbandit
https://github.com/tor/libbandit
LibBandit is a C++ library designed for efficiently simulating multi-armed bandit algorithms.
Currently the following algorithms are implemented:
- UCB
- Optimally confident UCB
- Almost optimally confident UCB
- Thompson sampling (Gaussian prior)
- MOSS
- Finite-horizon Gittins index (Gaussian/Gaussian model/prior)
- An approximation of the finite-horizon Gittins index
- Bayesian optimal for two arms (Gaussian/Gaussian model/prior)
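Several of libbandit's algorithms assume a Gaussian/Gaussian model, e.g. "Thompson sampling (Gaussian prior)". A minimal sketch of that idea under a standard-normal N(0,1) prior on each arm mean and known unit noise variance (an illustrative assumption, not libbandit's implementation; arm means and horizon are made-up):

```python
import random

def thompson_gaussian(means, horizon, noise_sd=1.0, seed=0):
    """Thompson sampling for Gaussian rewards with a N(0,1) prior per arm.

    With prior N(0,1) and unit-variance observations, the posterior of
    arm a after n observations with reward sum s is N(s/(n+1), 1/(n+1)).
    Returns the number of pulls per arm.
    """
    rng = random.Random(seed)
    k = len(means)
    n = [0] * k        # observation counts
    sums = [0.0] * k   # reward sums
    pulls = [0] * k
    for _ in range(horizon):
        # Sample each arm's mean from its Gaussian posterior, play the argmax.
        samples = [rng.gauss(sums[a] / (n[a] + 1), (1.0 / (n[a] + 1)) ** 0.5)
                   for a in range(k)]
        arm = samples.index(max(samples))
        reward = rng.gauss(means[arm], noise_sd)
        n[arm] += 1
        sums[arm] += reward
        pulls[arm] += 1
    return pulls

pulls = thompson_gaussian([0.0, 0.5, 1.0], horizon=2000)
```

The closed-form posterior update is what keeps simulation cheap here, which matches libbandit's stated focus on efficient simulation.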
Algorithm programs (not toolkits):
Graphical demonstrations of algorithms:
1. Bayesian Bandit Explorer
https://learnforeverlearn.com/bandits/