Actor-Sharer-Learner (ASL): An Efficient Training Framework for Off-policy Deep Reinforcement Learning


Introduction

The Actor-Sharer-Learner (ASL) is a highly efficient training framework for off-policy DRL algorithms that simultaneously enhances sample efficiency, shortens training time, and improves final performance. Specifically, the ASL framework employs a Vectorized Data Collection (VDC) mode to expedite data acquisition, decouples data collection from model optimization via multithreading, and partially reconnects the two procedures through a Time Feedback Mechanism (TFM) that avoids data underuse or overuse. A minimal sketch of this decoupling is given below.
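
To make the decoupling concrete, here is a minimal sketch of the two threads and the TFM pacing. It is not the repository's implementation: env_step, CAPACITY, BATCH, and learner_step_time are all illustrative placeholders.

import random
import threading
import time
from collections import deque

CAPACITY, BATCH = 100_000, 256
buffer = deque(maxlen=CAPACITY)   # the Sharer's shared experience pool
lock = threading.Lock()
learner_step_time = [0.0]         # written by the learner, read by the actor
stop = threading.Event()

def env_step():
    # Placeholder for one vectorized environment step (VDC mode would
    # advance N environments at once and return N transitions).
    return ("obs", "action", "reward", "next_obs")

def actor():
    while not stop.is_set():
        t0 = time.time()
        with lock:
            buffer.append(env_step())
        # TFM: pace collection against optimization speed so transitions
        # are neither discarded unused (underuse) nor replayed while the
        # model lags far behind (overuse).
        gap = learner_step_time[0] - (time.time() - t0)
        if gap > 0:
            time.sleep(gap)

def learner():
    while not stop.is_set():
        t0 = time.time()
        with lock:
            batch = random.sample(list(buffer), min(BATCH, len(buffer)))
        # ... one gradient update on `batch` would happen here ...
        learner_step_time[0] = time.time() - t0  # feed timing back to actor

threading.Thread(target=actor, daemon=True).start()
threading.Thread(target=learner, daemon=True).start()
time.sleep(1.0)   # let both threads run briefly for demonstration
stop.set()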

Dependencies

envpool >= 0.6.6  (https://envpool.readthedocs.io/en/latest/)
torch >= 1.13.0  (https://pytorch.org/)
numpy >= 1.23.4  (https://numpy.org/)
tensorboard >= 2.11.0  (https://pytorch.org/docs/stable/tensorboard.html)
python >= 3.8.0 
ubuntu >= 18.04.1 
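
Assuming a standard pip setup, the Python dependencies above can be installed with, for example:

pip install "envpool>=0.6.6" "torch>=1.13.0" "numpy>=1.23.4" "tensorboard>=2.11.0"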

Quick Start

After installation, you can use the ASL framework to train an Atari agent via:

python main.py

where the default environment is Alien and the underlying DRL algorithm is DDQN. For more details on the experiment setup, please check main.py. Training curves for all 57 Atari games are shown in the repository's README.
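
Training statistics are logged with tensorboard (see Dependencies). Assuming the logs are written to a directory such as runs/ (the actual location is configured in main.py), they can be monitored with:

tensorboard --logdir runs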

Citing the Project

To cite this repository in publications:

@article{Color2023JinghaoXin,
  title={Train a Real-world Local Path Planner in One Hour via Partially Decoupled Reinforcement Learning and Vectorized Diversity},
  author={Jinghao Xin and Jinwoo Kim and Zhi Li and Ning Li},
  journal={arXiv preprint arXiv:2305.04180},
  url={https://doi.org/10.48550/arXiv.2305.04180},
  year={2023}
}

Maintenance History

  • 2023/6/20
    • sample_core() in Sharer.py is optimized:
      • a more PyTorch-idiomatic approach is used to remove self.ptr-1 from ind (see the sketch below)
      • for Sharer.shared_data_cuda(), ind and env_ind are generated directly on self.B_dvc, which runs faster
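
As a hedged illustration of the first change, here is one PyTorch-idiomatic way to draw sample indices directly on a target device while excluding the slot at self.ptr-1 (whose transition may still be incomplete). This is a sketch only: size, ptr, batch_size, and B_dvc mirror the names mentioned above, but the actual code in Sharer.py may differ.

import torch

def sample_indices(size, ptr, batch_size, B_dvc):
    # Build candidate indices directly on the target device (e.g. a GPU),
    # avoiding a host-to-device copy afterwards.
    valid = torch.arange(size, device=B_dvc)
    # Drop ptr-1, the slot currently being overwritten by the actor.
    valid = valid[valid != (ptr - 1) % size]
    # Uniformly sample batch_size indices from the remaining slots.
    choice = torch.randint(0, valid.numel(), (batch_size,), device=B_dvc)
    return valid[choice]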
