Simple OpenAI Gym CartPole experiments.
DQN: Playing Atari with Deep Reinforcement Learning
python dqn.py
A2C: Asynchronous Methods for Deep Reinforcement Learning
python a2c.py
PPO: Proximal Policy Optimization Algorithms
python ppo.py
--env ENV
--batch_size BATCH_SIZE
--num_episodes NUM_EPISODES
--update_interval UPDATE_INTERVAL
--learning_rate LEARNING_RATE
--weight_decay WEIGHT_DECAY
--gamma GAMMA
--epsilon EPSILON
--seed SEED
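
The options above look like `argparse` help output, so a parser along the following lines would accept them. This is only a sketch: the flag names are taken from the list, but every type and default value (including `CartPole-v1` as the environment id) is an assumption, and `build_parser` is a hypothetical helper, not necessarily what the scripts define.

```python
import argparse


def build_parser():
    # Flag names mirror the option list; types and defaults are assumed.
    parser = argparse.ArgumentParser(description="CartPole experiment (sketch)")
    parser.add_argument("--env", type=str, default="CartPole-v1")
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--num_episodes", type=int, default=500)
    parser.add_argument("--update_interval", type=int, default=10)
    parser.add_argument("--learning_rate", type=float, default=1e-3)
    parser.add_argument("--weight_decay", type=float, default=0.0)
    parser.add_argument("--gamma", type=float, default=0.99)
    parser.add_argument("--epsilon", type=float, default=0.1)
    parser.add_argument("--seed", type=int, default=0)
    return parser


# Parse an explicit argument list instead of sys.argv for illustration.
args = build_parser().parse_args(["--gamma", "0.95", "--seed", "7"])
print(args.gamma, args.seed, args.env)
```

If the scripts do accept these flags, an invocation would take the form `python dqn.py --gamma 0.95 --seed 7`; options left unset fall back to whatever defaults each script defines.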