easyrl

A thoroughly commented, clear implementation.

Proximal Policy Optimization

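An RL algorithm whose maximization objective, for a given state-action pair, is the advantage times the probability ratio: the probability of the action under the current policy divided by its probability under the old policy. The ratio is clipped so that a single update cannot move the policy too far from the old one (paper).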
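The objective described above can be sketched in a few lines of dependency-free Python. This is an illustrative sketch, not the code in this repository: the function names and the `clip_eps=0.2` default are assumptions. The outer `min` between the unclipped and clipped terms follows the clipped surrogate formulation in the PPO paper.

```python
import math


def clipped_surrogate(logp_new, logp_old, advantage, clip_eps=0.2):
    """Per-sample PPO clipped objective (a quantity to maximize).

    logp_new / logp_old are log-probabilities of the taken action under the
    current and old policies. clip_eps is an assumed, illustrative default.
    """
    # Probability ratio pi_new(a|s) / pi_old(a|s), computed in log space.
    ratio = math.exp(logp_new - logp_old)
    # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Take the more pessimistic of the unclipped and clipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)


def batch_objective(logps_new, logps_old, advantages, clip_eps=0.2):
    """Mean clipped objective over a batch of state-action pairs."""
    terms = [
        clipped_surrogate(n, o, a, clip_eps)
        for n, o, a in zip(logps_new, logps_old, advantages)
    ]
    return sum(terms) / len(terms)
```

With a positive advantage, the objective stops growing once the ratio exceeds `1 + clip_eps`; with a negative advantage, it stops growing once the ratio falls below `1 - clip_eps`. In a gradient-based implementation the negated batch mean would typically serve as the policy loss.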

Works with any environment with discrete actions. Works with multiple envs in parallel. Tested on OpenAI Retro's Sonic environment.
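As a rough illustration of what discrete actions across parallel envs involve, the sketch below samples one action per environment from a batch of policy logits and records the log-probability needed later for the probability ratio. It is a hypothetical, dependency-free example: `softmax`, `sample_actions`, and the list-of-lists logits layout are assumptions, not this repository's API.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over one env's action logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_actions(batch_logits, rng):
    """Sample one discrete action per env.

    batch_logits holds one list of action logits per parallel env (assumed
    layout). Returns the sampled actions and their log-probabilities.
    """
    actions, logps = [], []
    for logits in batch_logits:
        probs = softmax(logits)
        # Draw an action index in proportion to its probability.
        action = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        actions.append(action)
        logps.append(math.log(probs[action]))
    return actions, logps
```

For example, `sample_actions([[0.0, 0.0, 0.0], [2.0, -1.0]], random.Random(0))` returns two actions, one for an env with three actions and one for an env with two.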

albertwujj/easyrl-pytorch
