This repository is a collection of reinforcement learning algorithms: Policy-Gradient, Actor-Critic, Trust Region Policy Optimization, and Generalized Advantage Estimation. (More algorithms will be added soon...)
The algorithms are run on OpenAI Gym environments such as CartPole-v0, Pendulum-v0, and BipedalWalker-v3. You need to install them before running the code in this repository.
- Find the errors in the Actor-Critic implementation
- Implement PPO
- Search for other environments to run the algorithms on
- An explanation of the TRPO line search: link
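
For readers who want a feel for the TRPO line search before following the link, here is a minimal sketch of a backtracking line search. It is not the code used in this repository. The function name `backtracking_line_search`, the `evaluate` callback, and the toy objective are all hypothetical, and the acceptance rule (KL divergence within the limit and a positive surrogate improvement) is one common variant, assumed here for illustration.

```python
def backtracking_line_search(theta, full_step, evaluate, max_kl,
                             backtrack_coef=0.5, max_backtracks=10):
    """Shrink the proposed step until the candidate parameters are acceptable.

    theta      -- current policy parameters (list of floats)
    full_step  -- proposed update direction, already scaled to the trust region
    evaluate   -- hypothetical callback: candidate -> (surrogate_improvement, kl)
    max_kl     -- maximum allowed KL divergence from the old policy
    """
    for i in range(max_backtracks):
        # Try the full step first, then 1/2, 1/4, ... of it.
        fraction = backtrack_coef ** i
        candidate = [t + fraction * s for t, s in zip(theta, full_step)]
        improvement, kl = evaluate(candidate)
        # Accept the first candidate that stays inside the trust region
        # and improves the surrogate objective.
        if kl <= max_kl and improvement > 0:
            return candidate, True
    # No acceptable step found: keep the old parameters.
    return list(theta), False


# Toy 1-D problem (an assumption for illustration, not a real policy):
# the "KL" grows quadratically with the step, and the surrogate improves
# only for small positive steps.
def toy_evaluate(candidate):
    x = candidate[0]
    return x - x * x, 0.5 * x * x


new_theta, accepted = backtracking_line_search(
    theta=[0.0], full_step=[1.0], evaluate=toy_evaluate, max_kl=0.01
)
print(new_theta, accepted)
```

In a real TRPO update, `evaluate` would run the new policy on the collected batch to compute the surrogate loss and the mean KL divergence. Shrinking the step geometrically keeps the number of such evaluations small.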