An unofficial implementation of DeepMind's Agent57 for Atari.
- Double Q Network
- Dueling Network Architecture
- Distributed Architecture (Multi Actor)
- Update stored recurrent states after each training step
- LSTM Recurrence
- Episodic Novelty Module
- Lifelong Novelty Module
- Separate Nets for extrinsic and intrinsic reward
- Retrace (replaces N-step returns)
- Reward value rescaling
- Batched Inference (SEED RL)
- Adaptive Exploration with meta-controller
- Prioritized Experience Replay
- Find bugs and test for correctness
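
As an illustration of the "Reward value rescaling" item above, here is a minimal sketch of the invertible value rescaling function described in the R2D2 paper, h(x) = sign(x)(sqrt(|x| + 1) - 1) + eps * x, together with its closed-form inverse. The function names `rescale` and `inverse_rescale` and the default `eps = 1e-3` are assumptions made for this example; they are not taken from this repository's code, which may implement the rescaling differently (for instance on tensors rather than scalars).

```python
import math

# eps = 1e-3 follows the R2D2 paper; whether this repo uses the same
# value is an assumption.
EPS = 1e-3


def rescale(x: float, eps: float = EPS) -> float:
    """h(x) = sign(x) * (sqrt(|x| + 1) - 1) + eps * x"""
    return math.copysign(math.sqrt(abs(x) + 1.0) - 1.0, x) + eps * x


def inverse_rescale(y: float, eps: float = EPS) -> float:
    """h^{-1}(y) = sign(y) * (((sqrt(1 + 4*eps*(|y| + 1 + eps)) - 1) / (2*eps))**2 - 1)"""
    root = (math.sqrt(1.0 + 4.0 * eps * (abs(y) + 1.0 + eps)) - 1.0) / (2.0 * eps)
    return math.copysign(root * root - 1.0, y)


if __name__ == "__main__":
    # Round trip: inverse_rescale(rescale(x)) should recover x.
    for x in (-100.0, -1.5, 0.0, 0.25, 3.0, 1000.0):
        print(x, rescale(x), inverse_rescale(rescale(x)))
```

In a one-step setting, a bootstrapped target would typically be formed as `rescale(reward + gamma * inverse_rescale(q_next))`, so that the network learns values in the compressed space; with Retrace the same transform is applied around the multi-step return instead.
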
Recurrent Experience Replay in Distributed Reinforcement Learning (R2D2):
https://openreview.net/pdf?id=r1lyTjAqYX
Never Give Up: Learning Directed Exploration Strategies (NGU):
https://arxiv.org/pdf/2002.06038.pdf
Agent57: Outperforming the Atari Human Benchmark (Agent57):