This is a side project to learn more about reinforcement learning. The goal is a relatively simple implementation of Deep Q Networks [1,2] that can learn on some of the Atari games. It is not an exact reproduction of the original paper.
- The architecture from DeepMind's Nature publication [2] is used.
- Standard DQN (without target network) [1] and Double DQN [3] are implemented.
- Loss clipping from DeepMind's Nature paper [2] is used. (The implementation mimics [6].)
- Pre-processing consists of:
  - RGB to grayscale conversion
  - Rescaling to 84 by 84 (this does not preserve the aspect ratio)
- On the Atari games, the replay memory stores frames as uint8 to reduce memory usage.
- The Atari games are accessed through OpenAI Gym [5], but not using the default environments.
  - PongDeterministic-v3 and BreakoutDeterministic-v3 are used. These use deterministic frame skipping and action repeating similar to [2]. Consequently, the agent learns about 4 times faster than on the less deterministic Pong-v0 environment.
- The loss of a life results in a terminal state. This was used by Mnih et al. in [2].
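
The difference between the standard DQN target [1], the Double DQN target [3] and the clipped loss can be illustrated with a minimal NumPy sketch. This is not the code from dqn.py; the function names, the `gamma` default and the `clip` threshold of 1.0 are assumptions chosen for illustration. The loss follows the usual "quadratic near zero, linear further out" form, which keeps the gradient magnitude bounded by `clip`.

```python
import numpy as np


def dqn_targets(rewards, terminals, q_next, gamma=0.99):
    """Standard DQN target: r + gamma * max_a Q(s', a).

    `terminals` is 1.0 for terminal transitions, so no bootstrapping happens there.
    """
    return rewards + gamma * (1.0 - terminals) * q_next.max(axis=1)


def double_dqn_targets(rewards, terminals, q_next_online, q_next_target, gamma=0.99):
    """Double DQN target: the online network picks the action,
    the target network supplies the value of that action."""
    best_actions = q_next_online.argmax(axis=1)
    evaluated = q_next_target[np.arange(len(best_actions)), best_actions]
    return rewards + gamma * (1.0 - terminals) * evaluated


def clipped_loss(td_error, clip=1.0):
    """Quadratic for |error| <= clip, linear beyond it (hypothetical helper)."""
    abs_error = np.abs(td_error)
    quadratic = np.minimum(abs_error, clip)
    linear = abs_error - quadratic
    return 0.5 * quadratic ** 2 + clip * linear
```

When the two networks disagree about the best action, the Double DQN target is never larger than the maximum of the target network's values, which is what reduces the over-estimation described in [3].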
- train_agent.py contains the code to train and save the model. It writes summaries of the training reward per episode, the validation reward, the MSE, the regularisation parameter and the mean target Q-value.
- evaluate_agent.py has code to load a trained model and let it run indefinitely. The script shows a visualisation of the game, the Q-function, and the value history plus reward.
- dqn.py: the deep Q-network implemented in TensorFlow. The code supports standard DQN [1] and Double DQN [3].
- agent.py: class for interacting with the environment.
- replay.py: replay memory implementation.
- config.py: contains the parameter settings for CartPole, Pong and Breakout.
- util.py: some basic helper functions.
- saves/: checkpoints of networks that work reliably.
- log/: directory where the TensorBoard summaries and the checkpoints are written to.
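
To show why storing frames as uint8 matters, here is a small ring-buffer sketch of a replay memory. It is not the implementation in replay.py; the class name, method names and the scaling to [0, 1] on sampling are assumptions made for illustration. One 84 by 84 frame takes 7,056 bytes as uint8 versus 28,224 bytes as float32, so the buffer is four times smaller.

```python
import numpy as np


class ReplayMemory:
    """Fixed-size ring buffer that keeps frames as uint8 (illustrative sketch)."""

    def __init__(self, capacity, state_shape=(84, 84), seed=None):
        self.capacity = capacity
        self.states = np.zeros((capacity,) + state_shape, dtype=np.uint8)
        self.next_states = np.zeros((capacity,) + state_shape, dtype=np.uint8)
        self.actions = np.zeros(capacity, dtype=np.int32)
        self.rewards = np.zeros(capacity, dtype=np.float32)
        self.terminals = np.zeros(capacity, dtype=np.bool_)
        self.size = 0  # number of valid transitions stored
        self.pos = 0   # index that the next transition is written to
        self.rng = np.random.default_rng(seed)

    def add(self, state, action, reward, next_state, terminal):
        """Store one transition, overwriting the oldest one when full."""
        i = self.pos
        self.states[i] = state
        self.next_states[i] = next_state
        self.actions[i] = action
        self.rewards[i] = reward
        self.terminals[i] = terminal
        self.pos = (self.pos + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size):
        """Draw a uniform random batch; frames are converted to float32 only here."""
        idx = self.rng.integers(0, self.size, size=batch_size)
        return (
            self.states[idx].astype(np.float32) / 255.0,
            self.actions[idx],
            self.rewards[idx],
            self.next_states[idx].astype(np.float32) / 255.0,
            self.terminals[idx],
        )
```

Converting to float32 only when a batch is sampled keeps the large buffer compact while the network still receives floating-point input.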
- TensorFlow 1.0
- OpenAI Gym
- Matplotlib
- NumPy
- scikit-image (skimage) for grayscale conversion and resizing
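
The pre-processing described above (grayscale conversion followed by rescaling to 84 by 84) could look roughly like the following with scikit-image. This is a sketch rather than the project's actual code: the function name `preprocess`, the `mode="reflect"` border handling and the rounding back to uint8 are assumptions.

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.transform import resize


def preprocess(frame):
    """Convert an RGB Atari frame (H, W, 3) to an 84x84 uint8 grayscale image."""
    gray = rgb2gray(frame)  # float image in [0, 1], shape (H, W)
    # Resizing straight to 84x84 ignores the original aspect ratio.
    small = resize(gray, (84, 84), mode="reflect")
    # Back to uint8 so the frame can be stored compactly in the replay memory.
    return np.clip(np.rint(small * 255.0), 0, 255).astype(np.uint8)
```

A raw Atari frame of shape (210, 160, 3) is squashed more strongly along its height than its width, which is the aspect-ratio distortion noted in the feature list.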
1. Mnih et al. Playing Atari with Deep Reinforcement Learning
2. Mnih et al. Human-level control through deep reinforcement learning
3. van Hasselt et al. Deep Reinforcement Learning with Double Q-learning
4. Abadi et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
5. OpenAI Gym
6. Nathan Sprague's Theano DQN implementation