Folders and files
examples
src/rllib_a2c/a2c
tests
tuned_examples
BUILD
README.md
pyproject.toml
requirements.txt

README.md

A2C (Advantage Actor-Critic)

A2C is a synchronous, deterministic version of A3C; that is why it is named “A2C”, with the first “A” (“asynchronous”) removed. In A3C, each agent talks to the global parameters independently, so thread-specific agents may sometimes be playing with different versions of the policy, and the aggregated update would then not be optimal. To resolve this inconsistency, a coordinator in A2C waits for all the parallel actors to finish their work before updating the global parameters; in the next iteration, the parallel actors all start from the same policy. The synchronized gradient update keeps training more cohesive and potentially makes convergence faster.
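The coordination described above can be illustrated with a minimal sketch. It is not the code in this package: the names `average_gradients`, `synchronous_step`, and `make_toy_actor` are hypothetical, and the "actors" return the gradient of a toy quadratic loss instead of a real policy gradient. It only shows the barrier (wait for every actor) followed by a single aggregated update.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

Params = List[float]
Gradient = List[float]


def average_gradients(gradients: Sequence[Gradient]) -> Gradient:
    """Element-wise mean of the gradients returned by all actors."""
    n = len(gradients)
    return [sum(component) / n for component in zip(*gradients)]


def synchronous_step(
    params: Params,
    actor_fns: Sequence[Callable[[Params], Gradient]],
    lr: float,
) -> Params:
    """One A2C-style iteration: every actor sees the same parameter snapshot."""
    snapshot = list(params)  # identical policy version for every actor
    with ThreadPoolExecutor(max_workers=len(actor_fns)) as pool:
        futures = [pool.submit(fn, snapshot) for fn in actor_fns]
        # Collecting every result acts as the barrier: the coordinator
        # waits for all parallel actors before touching the parameters.
        gradients = [f.result() for f in futures]
    mean_grad = average_gradients(gradients)
    # A single aggregated gradient-descent update to the global parameters.
    return [p - lr * g for p, g in zip(params, mean_grad)]


def make_toy_actor(target: float) -> Callable[[Params], Gradient]:
    """Hypothetical actor: gradient of 0.5 * (p - target)**2 per parameter."""
    def actor(params: Params) -> Gradient:
        return [p - target for p in params]
    return actor


if __name__ == "__main__":
    params = [0.0, 0.0]
    actors = [make_toy_actor(1.0), make_toy_actor(3.0)]
    for _ in range(3):
        # Each iteration starts all actors from the same updated params.
        params = synchronous_step(params, actors, lr=0.5)
    print(params)
```

In an asynchronous (A3C-style) variant, each actor would instead apply its own gradient as soon as it finished, so other actors could still be computing against an older snapshot.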

Installation

conda create -n rllib-a2c python=3.10
conda activate rllib-a2c
pip install -r requirements.txt
pip install -e '.[development]'

Usage

A2C Example
