Distributed PPO algorithm with the CARLA Reinforcement Learning Library

This library implements a distributed version of the Proximal Policy Optimization (PPO) algorithm on top of the carla_rllib environment. It also contains an older implementation of the Asynchronous Actor-Critic, which is no longer developed.

To get started with the CARLA simulator, click here.

To get started with the CARLA reinforcement learning library, click here.

Prerequisites

1. Install NumPy and PyTorch.

2. Follow the instructions here to install ROS Melodic.

3. Follow the instructions here to create a catkin workspace.

4. Clone this repository into the src folder of your catkin workspace:

cd catkin_ws/src
git clone https://github.com/50sven/ros_rl.git

5. Build your package with catkin_make, then source the workspace so that ROS can find the newly built package:

cd catkin_ws
catkin_make
source devel/setup.bash

Note: The package must be rebuilt every time the code is changed.

Get Started (with PPO)

1. Before you start training, take a look at the following files:

node_ppo.launch: configure the parameters of the algorithm and the environment
setting.config: configure the parameters regarding ROS and the workstations used

2. Start training with the bash script train.sh:

cd catkin_ws/src/ros_carla_rllib/scripts/
./train.sh

Idea:

train.sh starts three types of nodes (MasterNode, EvalNode and EnvNode) as well as the CARLA servers. Each node and each server runs in its own tmux session.
There is only one MasterNode and only one EvalNode per training.
In contrast, there can be more than one EnvNode.
Each EnvNode and the EvalNode runs its own carla_rllib environment, which is uniquely assigned to one CARLA server.
Multiple trainings/nodes can run on the same workstations, provided that each of them uses unique ports.
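
The layout described above can be sketched as a small planning helper. This is only an illustration and not the logic of train.sh: the names NodeSpec and plan_training, the base port 2000 and the port stride of 10 are assumptions made for this example. The stride leaves room between servers in case a CARLA server occupies more than one consecutive port.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeSpec:
    """Hypothetical description of one node in a training run."""
    name: str                  # could double as the tmux session name
    role: str                  # "master", "eval" or "env"
    carla_port: Optional[int]  # None for the master, which runs no environment


def plan_training(num_env_nodes: int,
                  base_port: int = 2000,
                  port_stride: int = 10) -> List[NodeSpec]:
    """Lay out one MasterNode, one EvalNode and num_env_nodes EnvNodes.

    Every node that runs an environment gets its own CARLA server port.
    """
    if num_env_nodes < 1:
        raise ValueError("at least one EnvNode is required")
    nodes = [
        NodeSpec("master", "master", None),
        NodeSpec("eval", "eval", base_port),
    ]
    for i in range(num_env_nodes):
        port = base_port + (i + 1) * port_stride
        nodes.append(NodeSpec(f"env_{i}", "env", port))
    return nodes


def ports_are_unique(nodes: List[NodeSpec]) -> bool:
    """Check that no two environments share a CARLA server port."""
    ports = [n.carla_port for n in nodes if n.carla_port is not None]
    return len(ports) == len(set(ports))


plan = plan_training(num_env_nodes=3)
for node in plan:
    print(node)
```

To run a second training on the same workstation, one would pass a different base_port so that the two port ranges do not overlap.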

Note: The current implementation uses the trajectory environment from carla_rllib, which was built for a specific use case. To fit this implementation to individual needs, one must adapt the PPO implementation and the online evaluation.

Concept (of PPO)

Master:

stores rollout data received from the environment nodes
executes PPO optimization steps
logs training diagnostics
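
For orientation, the core of a PPO optimization step is the clipped surrogate objective. The following is a minimal sketch in plain Python rather than the PyTorch code of this repository; the function name ppo_clip_loss and the default clip_eps=0.2 are assumptions made for this example.

```python
import math


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Negative clipped surrogate objective, averaged over the batch.

    new_log_probs: log pi_new(a|s) under the policy being optimized
    old_log_probs: log pi_old(a|s) under the policy that collected the data
    advantages:    advantage estimates for the same (s, a) pairs
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the new and the old policy
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps]
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms
        total += min(ratio * adv, clipped * adv)
    # Optimizers minimize, so return the negated objective
    return -total / len(advantages)


# Ratio 1.5 with a positive advantage is clipped to 1.2
loss_clipped = ppo_clip_loss([math.log(1.5)], [0.0], [1.0])
# Ratio 1.5 with a negative advantage is not clipped (pessimistic bound)
loss_unclipped = ppo_clip_loss([math.log(1.5)], [0.0], [-1.0])
print(loss_clipped, loss_unclipped)
```

Clipping removes the incentive to move the new policy far away from the policy that collected the rollouts, which is what allows several optimization epochs on the same batch of data.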

Environment:

runs the policy in the environment to collect training data and sends it to the master
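
The data collection on an environment node can be pictured as follows. This is a toy sketch, not the actual EnvNode: CountdownEnv is a stand-in for a carla_rllib environment, and collect_rollout and discounted_returns are hypothetical helpers. In the real setup the resulting dictionary would be sent to the master via ROS instead of being processed locally.

```python
class CountdownEnv:
    """Toy stand-in for an environment: an episode ends after `length` steps."""

    def __init__(self, length=3):
        self.length = length
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.length
        return self.t, 1.0, done  # observation, reward, done


def collect_rollout(env, policy, horizon):
    """Run the policy for `horizon` steps and record the transitions."""
    rollout = {"obs": [], "actions": [], "rewards": [], "dones": []}
    obs = env.reset()
    for _ in range(horizon):
        action = policy(obs)
        next_obs, reward, done = env.step(action)
        rollout["obs"].append(obs)
        rollout["actions"].append(action)
        rollout["rewards"].append(reward)
        rollout["dones"].append(done)
        # Start a new episode whenever the current one has ended
        obs = env.reset() if done else next_obs
    return rollout


def discounted_returns(rewards, dones, gamma=0.99):
    """Discounted return per step, restarting at episode boundaries."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        if dones[t]:
            running = 0.0  # nothing carries over from the next episode
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


rollout = collect_rollout(CountdownEnv(length=3), policy=lambda obs: 0, horizon=5)
returns = discounted_returns(rollout["rewards"], rollout["dones"], gamma=0.5)
print(rollout["dones"], returns)
```

For simplicity the sketch does not bootstrap the value of an unfinished episode at the end of the rollout; a full implementation would usually add a value estimate for the last observation.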

Evaluation:

runs online evaluations during training
logs evaluation metrics
