Actor-Sharer-Learner (ASL): An Efficient Training Framework for Off-policy Deep Reinforcement Learning
The Actor-Sharer-Learner (ASL) is a highly efficient training framework for off-policy DRL algorithms, capable of enhancing sample efficiency, shortening training time, and improving final performance simultaneously. Specifically, the ASL framework employs a Vectorized Data Collection (VDC) mode to expedite data acquisition, decouples data collection from model optimization via multithreading, and partially reconnects the two procedures through a Time Feedback Mechanism (TFM) to avoid data underuse or overuse.
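The interplay between the decoupled threads and the TFM can be illustrated with a minimal, standard-library-only sketch. It is not the repository's implementation: the names `feedback_sleep`, `updates_per_step`, `run_demo`, and the toy `Sharer` class below are hypothetical, and the rule "the faster thread sleeps until the desired update-to-step ratio is met" is only one plausible reading of a time feedback mechanism.

```python
import threading
import time
from collections import deque


def feedback_sleep(actor_time, learner_time, updates_per_step=1.0):
    """Return (actor_sleep, learner_sleep) in seconds.

    actor_time:       measured duration of one vectorized environment step
    learner_time:     measured duration of one gradient update
    updates_per_step: desired updates per vectorized step (hypothetical knob)
    """
    target = updates_per_step * learner_time  # learner time needed per actor step
    if actor_time < target:
        # Actor is too fast: fresh data would be underused, so the actor waits.
        return target - actor_time, 0.0
    # Learner is too fast: old data would be overused, so the learner waits.
    return 0.0, (actor_time - target) / updates_per_step


class Sharer:
    """Toy shared state between the two threads (stand-in for a replay buffer)."""

    def __init__(self):
        self.buffer = deque(maxlen=10_000)
        self.lock = threading.Lock()
        self.stop = threading.Event()
        self.actor_time = 0.0
        self.learner_time = 0.0
        self.updates = 0


def actor(sharer, num_steps, num_envs):
    for step in range(num_steps):
        t0 = time.perf_counter()
        # Fake vectorized step: one transition per environment copy.
        batch = [(step, env_id) for env_id in range(num_envs)]
        with sharer.lock:
            sharer.buffer.extend(batch)
        sharer.actor_time = time.perf_counter() - t0
        pause, _ = feedback_sleep(sharer.actor_time, sharer.learner_time)
        time.sleep(pause)
    sharer.stop.set()


def learner(sharer):
    while not sharer.stop.is_set():
        t0 = time.perf_counter()
        with sharer.lock:
            _ = list(sharer.buffer)[-32:]  # fake minibatch
        time.sleep(0.001)  # stand-in for a backpropagation step
        sharer.learner_time = time.perf_counter() - t0
        sharer.updates += 1
        _, pause = feedback_sleep(sharer.actor_time, sharer.learner_time)
        time.sleep(pause)


def run_demo(num_steps=20, num_envs=8):
    sharer = Sharer()
    threads = [
        threading.Thread(target=actor, args=(sharer, num_steps, num_envs)),
        threading.Thread(target=learner, args=(sharer,)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sharer
```

Calling `run_demo()` returns the toy `Sharer`; its `buffer` then holds `num_steps * num_envs` fake transitions and `updates` counts the learner iterations. In this sketch only the faster thread ever sleeps, so the two loops stay loosely coupled through the measured times rather than through a blocking handshake.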
envpool >= 0.6.6 (https://envpool.readthedocs.io/en/latest/)
torch >= 1.13.0 (https://pytorch.org/)
numpy >= 1.23.4 (https://numpy.org/)
tensorboard >= 2.11.0 (https://pytorch.org/docs/stable/tensorboard.html)
python >= 3.8.0
ubuntu >= 18.04.1
After installation, you can use the ASL framework to train an Atari agent via:
python main.py
where the default environment is Alien and the underlying DRL algorithm is DDQN. For more details about the experiment setup, please check main.py. The training curves of 57 Atari games are listed as follows.
To cite this repository in publications:
@article{Color2025XJH,
title = {Train a real-world local path planner in one hour via partially decoupled reinforcement learning and vectorized diversity},
journal = {Engineering Applications of Artificial Intelligence},
volume = {141},
pages = {109726},
year = {2025},
issn = {0952-1976},
doi = {https://doi.org/10.1016/j.engappai.2024.109726},
}
- 2023/6/20: sample_core() in Sharer.py is optimized, where
  - we use a more PyTorch way to delete self.ptr-1 in ind
  - for Sharer.shared_data_cuda(), the ind and env_ind are generated on self.B_dvc to run faster
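As an illustration of what a tensor-only way to drop one index from a sampled batch can look like, here is a minimal sketch. It is not the code in Sharer.py: the function `sample_indices` and its parameters are hypothetical, and it assumes the goal is to sample buffer indices uniformly while never returning `ptr - 1` (presumably the most recently written slot). Instead of sampling and then filtering, it samples from a range that is one element shorter and shifts every index at or above the excluded slot by one.

```python
import torch


def sample_indices(ptr, size, batch_size, device="cpu"):
    """Sample batch_size indices uniformly from [0, size) excluding ptr - 1.

    ptr:  current write pointer of a circular buffer (hypothetical name)
    size: number of valid slots in the buffer
    """
    excluded = (ptr - 1) % size  # wraps to size - 1 when ptr == 0
    # Draw from size - 1 candidates, created directly on the target device.
    ind = torch.randint(0, size - 1, (batch_size,), device=device)
    # Shift indices at or above the excluded slot so it is never produced.
    ind += (ind >= excluded).long()
    return ind
```

For example, `sample_indices(ptr=5, size=10, batch_size=4)` returns four indices drawn from 0 to 9 that never include 4. Creating the tensor with `device=...` avoids a host-to-device copy, which matches the spirit of generating indices on the training device.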