Implementations of several popular bandit algorithms, for both the regret minimization setting and the best arm selection setting.
Also includes code to analyze Thompson Sampling in several ways, e.g. computing the probability of ending up in a particular state, the probability of picking a particular arm from a given state or at a given time step, and the expected reward at a given time step. Here, a state is the number of successes and failures recorded for each arm, i.e. the history of arms pulled and rewards obtained.
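As a rough illustration of the state notion described above, here is a minimal sketch of Thompson Sampling for Bernoulli bandits (the function name and signature are hypothetical, not taken from this repository): each arm keeps a (successes, failures) pair, which doubles as its Beta posterior parameters.

```python
import random

def thompson_sampling(true_means, horizon, seed=0):
    """Bernoulli Thompson Sampling with Beta(1, 1) priors.

    The per-arm (successes, failures) counts are exactly the 'state'
    referred to above: the full history of pulls and rewards.
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k
    failures = [0] * k
    total_reward = 0
    for _ in range(horizon):
        # Sample a mean estimate from each arm's Beta posterior
        # and play the arm with the largest sample.
        samples = [rng.betavariate(1 + successes[i], 1 + failures[i])
                   for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        # Draw a Bernoulli reward and update that arm's state.
        reward = 1 if rng.random() < true_means[arm] else 0
        total_reward += reward
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return total_reward, successes, failures
```

For example, `thompson_sampling([0.3, 0.7], 1000)` runs for 1000 steps and returns the cumulative reward along with the final state, from which quantities like the expected reward at a given time can be estimated by averaging over many seeded runs.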