Multi-armed bandit problem under delayed feedback: framework for the numerical experiments

djo/delayed-bandit

Multi-armed bandit (MAB) problem under delayed feedback: numerical experiments

A framework for numerical experiments that simulate the multi-armed bandit problem in a stochastic stationary environment with delays.

Beta Upper Confidence Bound Policy for the Design of Clinical Trials, 2023

Evaluation of the delay-adapted policies on the publicly available International Stroke Trial dataset. See this notebook for the analysis and simulation.

Bernoulli multi-armed bandit problem under delayed feedback, 2021

Provides a framework for numerical experiments that simulate the multi-armed bandit problem in a stochastic stationary environment with delays. Part of the paper Bernoulli multi-armed bandit problem under delayed feedback (Journal).
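To make the setting concrete, here is a minimal sketch of a Bernoulli bandit loop in which each reward is revealed a fixed number of steps after the pull. It is an illustration under stated assumptions, not the code from this repository: the function name `simulate_delayed_bernoulli`, its parameters, the constant delay, and the uniform-random arm choice are all hypothetical.

```python
import random
from collections import deque


def simulate_delayed_bernoulli(means, horizon, delay, seed=0):
    """Pull arms uniformly at random; feedback from step s is revealed at the end of step s + delay.

    Hypothetical helper for illustration only.
    Returns (observed pulls per arm, observed successes per arm, number of rewards still pending).
    """
    rng = random.Random(seed)
    pending = deque()  # (reveal_time, arm, reward); ordered because the delay is constant
    pulls = [0] * len(means)
    successes = [0] * len(means)

    for t in range(horizon):
        # The decision at step t can only use feedback revealed so far.
        arm = rng.randrange(len(means))
        reward = 1 if rng.random() < means[arm] else 0
        pending.append((t + delay, arm, reward))

        # Reveal every reward whose delay has elapsed by the end of step t.
        while pending and pending[0][0] <= t:
            _, seen_arm, seen_reward = pending.popleft()
            pulls[seen_arm] += 1
            successes[seen_arm] += seen_reward

    return pulls, successes, len(pending)


pulls, successes, outstanding = simulate_delayed_bernoulli([0.2, 0.8], horizon=100, delay=10)
print(pulls, successes, outstanding)
```

With a horizon of 100 and a delay of 10, the last 10 rewards are still pending when the run ends, so a policy never gets to learn from them.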

Structure of the project and currently implemented algorithms:

Files
Environments: Protocol, Bernoulli MAB
Policies: Protocol, Uniform Random, Explore-First, Epsilon-Greedy, Upper Confidence Bound, Thompson Sampling (Beta distribution)
Experiments: Bernoulli MAB under delayed feedback
Tests: Test module
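As a rough illustration of how a policy protocol and one of the listed policies could fit together, the sketch below defines a structural interface and a Thompson Sampling policy with Beta posteriors. The names `Policy`, `choose`, `feed_reward`, and `BetaThompsonSampling` are assumptions made for this example; the actual protocol in the repository may look different.

```python
import random
from typing import Protocol


class Policy(Protocol):
    """Hypothetical policy interface: pick an arm, then absorb feedback whenever it arrives."""

    def choose(self) -> int: ...

    def feed_reward(self, arm: int, reward: int) -> None: ...


class BetaThompsonSampling:
    """Thompson Sampling for Bernoulli rewards with a Beta(1, 1) prior on each arm."""

    def __init__(self, num_arms: int, seed: int = 0) -> None:
        self._rng = random.Random(seed)
        self.alpha = [1] * num_arms  # 1 + observed successes
        self.beta = [1] * num_arms   # 1 + observed failures

    def choose(self) -> int:
        # Draw one sample per arm from its posterior and play the arm with the largest sample.
        samples = [self._rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def feed_reward(self, arm: int, reward: int) -> None:
        # Under delayed feedback this is called only once the reward has been revealed.
        if reward:
            self.alpha[arm] += 1
        else:
            self.beta[arm] += 1


policy: Policy = BetaThompsonSampling(num_arms=2)
for _ in range(50):
    policy.feed_reward(1, 1)  # arm 1 keeps succeeding
    policy.feed_reward(0, 0)  # arm 0 keeps failing
print(policy.choose())
```

Keeping `choose` and `feed_reward` separate is what lets the same policy run with or without delays: the experiment loop decides when feedback is delivered.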

To run experiments on Bernoulli MAB, see the available options:

python delayed_bandit/experiments.py --help

You may want to run a large number of experiments and aggregate the results by removing outliers and averaging. The sampled delays can be kept fixed over the horizon.
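One simple way to aggregate repeated runs is a trimmed mean: sort the values, drop a fraction from each end, and average the rest. The sketch below applies it per round across runs. This is one possible choice of outlier removal, not necessarily the one used for the plots here, and the names `trimmed_mean` and `aggregate_runs` are hypothetical.

```python
def trimmed_mean(values, trim_fraction=0.1):
    """Average `values` after dropping `trim_fraction` of them from each end of the sorted list."""
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number of values dropped per side
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def aggregate_runs(runs, trim_fraction=0.1):
    """Combine equally long per-round curves (one list per run) into a single trimmed-mean curve."""
    return [trimmed_mean(list(column), trim_fraction) for column in zip(*runs)]


# Five runs of two rounds each; the last run is an obvious outlier.
runs = [[1, 2], [2, 3], [3, 4], [4, 5], [100, 200]]
print(aggregate_runs(runs, trim_fraction=0.2))
```

The trim is symmetric, so the lowest run is dropped along with the highest one at each round.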

Bernoulli MAB under delayed feedback with Explore-First algorithm

Comparison of algorithms in Bernoulli MAB with no delays

Comparison of algorithms in Bernoulli MAB under delay t=50

Comparison of algorithms in Bernoulli MAB under delay t=150

Development

python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
./pychecks.sh

MIT License

Copyright (c) 2023 Andrii Dzhoha
