
Proximal Policy Optimization (Continuous Version) in PyTorch.


alirezakazemipour/Continuous-PPO


Continuous-PPO

PRs Welcome
A PyTorch implementation of Proximal Policy Optimization (PPO) for MuJoCo continuous-control environments. All hyperparameters follow the PPO paper (see Reference).
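
For readers new to PPO, the clipped surrogate objective from the paper can be sketched for a single sample as below. This is a minimal illustration in plain Python and is not taken from this repository's code: the function names `gaussian_log_prob` and `clipped_surrogate` are hypothetical, a diagonal Gaussian policy is assumed for continuous actions, and `epsilon=0.2` is an assumed default for the clipping range.

```python
import math


def gaussian_log_prob(action, mean, std):
    """Log-density of a diagonal Gaussian, summed over action dimensions.

    Assumption: the policy outputs a mean and a standard deviation per
    action dimension, a common choice for continuous control.
    """
    return sum(
        -0.5 * ((a - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2.0 * math.pi)
        for a, m, s in zip(action, mean, std)
    )


def clipped_surrogate(new_log_prob, old_log_prob, advantage, epsilon=0.2):
    """Per-sample PPO clipped objective (to be maximized).

    ratio = pi_new(a|s) / pi_old(a|s), computed from log-probabilities.
    The objective is min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A).
    """
    ratio = math.exp(new_log_prob - old_log_prob)
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    return min(ratio * advantage, clipped_ratio * advantage)


# Example: the same 1-D action scored under an old and an updated policy.
old_lp = gaussian_log_prob([0.5], mean=[0.0], std=[1.0])
new_lp = gaussian_log_prob([0.5], mean=[0.4], std=[1.0])
objective = clipped_surrogate(new_lp, old_lp, advantage=1.0)
```

With a positive advantage, the objective stops growing once the ratio exceeds `1 + epsilon`, which removes the incentive to move the policy far from the one that collected the data. In a training loop the negated mean of this quantity over a minibatch would typically serve as the policy loss.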

For the Atari domain, look at this.

Demos

Ant-v2 Walker2d-v2 InvertedDoublePendulum-v2

Results

Ant-v2 Walker2d-v2 InvertedDoublePendulum-v2

Dependencies

  • gym == 0.17.2
  • mujoco-py == 2.0.2.13
  • numpy == 1.19.1
  • opencv_contrib_python == 3.4.0.12
  • torch == 1.4.0

Installation

pip3 install -r requirements.txt

Usage

python3 main.py
  • Use the Train_FLAG flag to choose the mode: set it to True to train the agent, or to False to test it.
  • Some pre-trained weights are provided in the pre-trained models directory. To test the agent with them, copy them to the root folder of the project and set Train_FLAG to False.

Environments tested

  • Ant
  • InvertedDoublePendulum
  • Walker2d
  • Hopper
  • Humanoid
  • Swimmer
  • HalfCheetah

Reference

Proximal Policy Optimization Algorithms, Schulman et al., 2017

Acknowledgement
