# Bernoulli-Bandits

Implementations of some popular bandit algorithms, both for the regret-minimization setting and for the best-arm selection setting.
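
As an illustration of the regret-minimization setting, here is a minimal Thompson Sampling sketch for Bernoulli arms with uniform Beta(1, 1) priors. This is not the repository's code; the function name, arm means, and horizon are hypothetical:

```python
import numpy as np

def thompson_sampling(true_means, horizon, rng=None):
    """Run Thompson Sampling on Bernoulli arms; return total reward collected."""
    rng = rng or np.random.default_rng()
    k = len(true_means)
    successes = np.zeros(k)  # adds to the alpha parameter of each arm's Beta posterior
    failures = np.zeros(k)   # adds to the beta parameter of each arm's Beta posterior
    total_reward = 0
    for _ in range(horizon):
        # Sample a mean estimate from each arm's posterior and pull the argmax.
        samples = rng.beta(successes + 1, failures + 1)
        arm = int(np.argmax(samples))
        reward = rng.binomial(1, true_means[arm])
        successes[arm] += reward
        failures[arm] += 1 - reward
        total_reward += reward
    return total_reward

# Example: two arms with (hypothetical) means 0.5 and 0.6.
print(thompson_sampling([0.5, 0.6], horizon=1000))
```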

Also, some code to analyze Thompson Sampling in several ways, e.g. computing the probability of ending up in a particular state, the probability of picking a particular arm from a given state or at a given time step, and the expected reward after a given number of steps. Here, a state is the number of successes and failures encountered for each arm, i.e. the history of arms pulled and rewards obtained.
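
For instance, the probability that Thompson Sampling picks a given arm from a given state can be estimated by sampling from each arm's posterior. The following Monte Carlo sketch is an illustration only, not the repository's code; the function and parameter names are hypothetical:

```python
import numpy as np

def prob_pick_arm(state, arm, n_samples=100_000, rng=None):
    """Estimate the probability that Thompson Sampling picks `arm` from `state`.

    `state` is a list of (successes, failures) pairs, one per arm; under
    uniform priors each arm's posterior is Beta(successes + 1, failures + 1).
    Estimates P(theta_arm = max_j theta_j) by Monte Carlo.
    """
    rng = rng or np.random.default_rng()
    alphas = np.array([s + 1 for s, f in state])
    betas = np.array([f + 1 for s, f in state])
    # Draw posterior samples for all arms at once: shape (n_samples, k).
    draws = rng.beta(alphas, betas, size=(n_samples, len(state)))
    return float(np.mean(np.argmax(draws, axis=1) == arm))

# Example state: arm 0 has seen 3 successes / 1 failure, arm 1 has seen 1 / 1.
print(prob_pick_arm([(3, 1), (1, 1)], arm=0))
```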