Algorithm introductions:
1. Course tutorial (two sessions): Introduction to Bandits: Algorithms and Theory
http://techtalks.tv/talks/54451/
http://techtalks.tv/talks/54455/
2. Blog post on the multi-armed bandit
https://mpatacchiola.github.io/blog/2017/08/14/dissecting-reinforcement-learning-6.html
Toolboxes:
1. Project details for pymabandits
http://mloss.org/software/view/415/
2. Multi-Armed Bandit project (version 0.2, 2005), C#
http://bandit.sourceforge.net/
3. banditlib (GitHub, C++)
https://github.com/jkomiyama/banditlib
This author also maintains two other bandit algorithm libraries.
The algorithms are not speed-optimized; the library supports a Linux/GNU C++ environment and does not support Windows or macOS.
- Arms:
- Arms with binary (Bernoulli) and normal reward distributions are implemented.
- Policies:
- DMED for binary rewards [1]
- Epsilon-Greedy
- KL-UCB [2]
- MOSS [3]
- Thompson sampling for binary rewards [4]
- UCB [5]
- UCB-V [6]
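To make the policy list above concrete, here is a minimal sketch of one of the listed policies, UCB1, on Bernoulli arms. This is an illustrative stand-alone implementation, not banditlib's C++ API; the arm means, horizon, and seed are made-up example values.

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Minimal UCB1 simulation on Bernoulli arms.

    means: true success probability of each arm (unknown to the policy).
    Returns the number of pulls per arm after `horizon` rounds.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k    # pulls per arm
    values = [0.0] * k  # empirical mean reward per arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # play each arm once to initialize
        else:
            # UCB index = empirical mean + exploration bonus
            arm = max(range(k),
                      key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # running mean
    return counts

counts = ucb1([0.2, 0.5, 0.8], horizon=2000)
```

With a long enough horizon, the arm with the highest true mean (here 0.8) accumulates most of the pulls while the others are sampled only logarithmically often.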
4. Bandits
https://github.com/bgalbraith/bandits
Python library for Multi-Armed Bandits
Implements the following algorithms:
- Epsilon-Greedy
- UCB1
- Softmax
- Thompson Sampling (Bayesian)
- Bernoulli, Binomial <=> Beta Distributions
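The pairing of Bernoulli rewards with Beta distributions noted above is what makes Bayesian Thompson Sampling simple: the Beta prior is conjugate to the Bernoulli likelihood. A minimal sketch (illustrative only, not the bgalbraith/bandits API; arm means and horizon are made-up):

```python
import random

def thompson_bernoulli(means, horizon, seed=0):
    """Beta-Bernoulli Thompson sampling with a uniform Beta(1,1) prior.

    Returns the number of pulls per arm after `horizon` rounds.
    """
    rng = random.Random(seed)
    k = len(means)
    alpha = [1.0] * k  # 1 + successes observed on each arm
    beta = [1.0] * k   # 1 + failures observed on each arm
    counts = [0] * k
    for _ in range(horizon):
        # Sample a plausible mean from each arm's Beta posterior,
        # then play the arm whose sample is largest.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(k)]
        arm = samples.index(max(samples))
        reward = 1 if rng.random() < means[arm] else 0
        alpha[arm] += reward       # conjugate posterior update
        beta[arm] += 1 - reward
        counts[arm] += 1
    return counts

counts = thompson_bernoulli([0.2, 0.5, 0.8], horizon=2000)
```

Because each update only increments one of two counters, the posterior never needs to be represented explicitly; this is the main practical appeal of the Bernoulli/Beta pairing.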
5. libbandit
https://github.com/tor/libbandit
LibBandit is a C++ library designed for efficiently simulating multi-armed bandit algorithms.
Currently the following algorithms are implemented:
- UCB
- Optimally confident UCB
- Almost optimally confident UCB
- Thompson sampling (Gaussian prior)
- MOSS
- Finite-horizon Gittins index (Gaussian/Gaussian model/prior)
- An approximation of the finite-horizon Gittins index
- Bayesian optimal for two arms (Gaussian/Gaussian model/prior)
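Several of libbandit's algorithms assume a Gaussian/Gaussian model, e.g. "Thompson sampling (Gaussian prior)". A minimal sketch of that idea under a standard-normal N(0,1) prior on each arm mean and known unit noise variance (an illustrative assumption, not libbandit's implementation; arm means and horizon are made-up):

```python
import random

def thompson_gaussian(means, horizon, noise_sd=1.0, seed=0):
    """Thompson sampling for Gaussian rewards with a N(0,1) prior per arm.

    With prior N(0,1) and unit-variance observations, the posterior of
    arm a after n observations with reward sum s is N(s/(n+1), 1/(n+1)).
    Returns the number of pulls per arm.
    """
    rng = random.Random(seed)
    k = len(means)
    n = [0] * k        # observation counts
    sums = [0.0] * k   # reward sums
    pulls = [0] * k
    for _ in range(horizon):
        # Sample each arm's mean from its Gaussian posterior, play the argmax.
        samples = [rng.gauss(sums[a] / (n[a] + 1), (1.0 / (n[a] + 1)) ** 0.5)
                   for a in range(k)]
        arm = samples.index(max(samples))
        reward = rng.gauss(means[arm], noise_sd)
        n[arm] += 1
        sums[arm] += reward
        pulls[arm] += 1
    return pulls

pulls = thompson_gaussian([0.0, 0.5, 1.0], horizon=2000)
```

The closed-form posterior update is what keeps simulation cheap here, which matches libbandit's stated focus on efficient simulation.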
Algorithm programs (not toolkits):
Graphical demonstrations of algorithms:
1. Bayesian Bandit Explorer
https://learnforeverlearn.com/bandits/