Summary: Using the Stable-Baselines3 PPO reinforcement learning algorithm to train a dynamic window approach controller
1️⃣ Stable-Baselines3 [🔗LINK]
The PPO algorithm from the open-source Stable-Baselines3 library is used for reinforcement learning. Click the installation link for Stable-Baselines3 above.
The code was tested on Windows in a conda virtual environment with Python 3.7.
Please make sure to install mutually compatible versions of stable-baselines3, tensorboard, pytorch, python, and so on.
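As an illustration only, a Python 3.7 environment could be pinned with a requirements file like the one below. These version numbers are assumptions, not taken from this repository; verify them against the Stable-Baselines3 compatibility notes before installing.

```text
# Hypothetical pins for a Python 3.7 environment -- adjust to your setup.
stable-baselines3==1.6.2
torch==1.13.1
tensorboard==2.11.2
pygame==2.1.2
```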
2️⃣ Pygame Environment [🔗LINK]
The base idea for the dynamic window approach pygame environment comes from the following link.
You can find the modified code in scripts/dynamic_window_approach_game.py.
The main difference is that the control output of the mobile robot was changed from vr, vl (right and left wheel angular speeds) to v, w (vehicle linear and angular speed).
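The mapping between the two control spaces is standard differential-drive kinematics. As a sketch (the wheel radius and track width below are illustrative values, not taken from this repository):

```python
def to_wheel_speeds(v, w, wheel_radius=0.05, track_width=0.3):
    """Convert vehicle linear speed v [m/s] and angular speed w [rad/s]
    into right/left wheel angular speeds vr, vl [rad/s]."""
    vr = (2.0 * v + w * track_width) / (2.0 * wheel_radius)
    vl = (2.0 * v - w * track_width) / (2.0 * wheel_radius)
    return vr, vl

def to_body_speeds(vr, vl, wheel_radius=0.05, track_width=0.3):
    """Inverse mapping: wheel angular speeds back to (v, w)."""
    v = wheel_radius * (vr + vl) / 2.0
    w = wheel_radius * (vr - vl) / track_width
    return v, w
```

Driving straight (w = 0) gives equal wheel speeds, while a pure rotation (v = 0) gives equal and opposite ones, which is a quick sanity check for the signs.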
In your command prompt (Anaconda Powershell Prompt), execute:
$ python DWA-learn-main.py
There is an option to log only the reward, or to also log training hyperparameters and losses. If you want detailed training logs, go to scripts/DWA_learn_main.py and uncomment lines 22 and 28, or add the following if you cannot find them:
c_logger = configure(logdir, ["stdout", "csv", "tensorboard"])
model.set_logger(c_logger)
While training, you can check by executing:
$ tensorboard --logdir=logs # simple logs
$ tensorboard --logdir=${your saved log directory name} # detail logs
Detailed information is available at the following link: HERE.