[RLlib] RLlib contrib (ray-project#35141)
Signed-off-by: Avnish <[email protected]>
Showing 26 changed files with 2,528 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
# RLlib-Contrib | ||
|
||
RLlib-Contrib is a directory for more experimental community contributions to RLlib, including contributed algorithms. **This directory has a more relaxed bar for contributions than Ray or RLlib.** If you are interested in contributing to RLlib-Contrib, please see the [contributing guide](CONTRIBUTING.md). | ||
|
||
## Getting Started and Installation | ||
Navigate to the algorithm sub-directory you are interested in and see the README.md for installation instructions and example scripts to help you get started! | ||
|
||
## Maintenance | ||
|
||
**Any issues filed in `rllib_contrib` will be addressed on a best-effort basis by the community, and there is no expectation of maintenance by the RLlib team.** | ||
|
||
**The API surface between algorithms in `rllib_contrib` and current versions of Ray / RLlib is not guaranteed. This means that any APIs used in `rllib_contrib` could be modified or removed in newer versions of Ray / RLlib.** | ||
|
||
We will generally accept contributions to this directory that meet any of the following criteria: | ||
|
||
1. Updating dependencies. | ||
2. Submitting community-contributed algorithms that have been tested and are ready for use. | ||
3. Enabling algorithms to be run in different environments (e.g., adding support for a new type of gymnasium environment). | ||
4. Updating algorithms for use with the newer RLlib APIs. | ||
5. General bug fixes. | ||
|
||
We will generally not accept contributions that add a significant maintenance burden. In this case, users should instead create their own repo for their contribution, using the same guidelines as this directory, and the RLlib team can help to market/promote it in the Ray docs. | ||
|
||
## Getting Involved | ||
|
||
| Platform | Purpose | Support Level | | ||
| --- | --- | --- | | ||
| [Discuss Forum](https://discuss.ray.io) | For discussions about development and questions about usage. | Community | | ||
| [GitHub Issues](https://github.com/ray-project/rllib-contrib-maml/issues) | For reporting bugs and filing feature requests. | Community | | ||
| [Slack](https://forms.gle/9TSdDYUgxYs8SA9e8) | For collaborating with other Ray users. | Community | |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
A3C (Asynchronous Advantage Actor-Critic) | ||
----------------------------------------- | ||
|
||
`A3C <https://arxiv.org/abs/1602.01783>`__ is the asynchronous version of A2C: gradients are computed on the workers directly after trajectory rollouts and only then shipped to a central learner, which accumulates them on the central model. After the central model update, parameters are broadcast back to all workers. Similar to A2C, A3C scales to 16-32+ worker processes depending on the environment. | ||
|
||
|
||
Installation | ||
------------ | ||
|
||
.. code-block:: bash | ||
|
||
    conda create -n rllib-a3c python=3.10 | ||
    conda activate rllib-a3c | ||
    pip install -r requirements.txt | ||
    pip install -e '.[development]' | ||
|
||
Usage | ||
----- | ||
|
||
.. literalinclude:: examples/a3c_cartpole_v1.py |
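The A3C description in the README above (workers compute gradients locally, a central learner applies them, and updated parameters are broadcast back) can be illustrated with a toy loop. This is only a sketch under stated assumptions: it replaces the policy-gradient loss with a one-parameter regression loss, simulates asynchrony by picking a random worker each step, and uses hypothetical names (`worker_gradient`, `run_a3c_style_updates`) that do not exist in RLlib.

```python
import random


def worker_gradient(local_w, batch):
    """Gradient of the mean of 0.5 * (w * x - y)^2 with respect to w.

    Stands in for the gradient a worker would compute after a rollout.
    """
    return sum((local_w * x - y) * x for x, y in batch) / len(batch)


def run_a3c_style_updates(num_workers=4, steps=200, lr=0.5, seed=0):
    """Toy parameter-server loop: the target relation is y = 3 * x."""
    rng = random.Random(seed)
    central_w = 0.0
    # Every worker starts with a copy of the central parameters.
    worker_w = [central_w] * num_workers

    for _ in range(steps):
        # Assumption: a random worker "finishes" next, mimicking asynchrony.
        i = rng.randrange(num_workers)
        xs = [rng.uniform(-1.0, 1.0) for _ in range(8)]
        batch = [(x, 3.0 * x) for x in xs]

        # 1. The gradient is computed on the worker from its local copy.
        grad = worker_gradient(worker_w[i], batch)
        # 2. The central learner applies the shipped gradient.
        central_w -= lr * grad
        # 3. Updated parameters are broadcast back to all workers.
        worker_w = [central_w] * num_workers

    return central_w


if __name__ == "__main__":
    print(run_a3c_style_updates())
```

The point of the sketch is the data flow: only gradients travel from workers to the learner, and only parameters travel back.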
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
from rllib_a3c.a3c import A3C, A3CConfig | ||
|
||
import ray | ||
from ray import air, tune | ||
|
||
if __name__ == "__main__": | ||
    ray.init() | ||
|
||
    config = ( | ||
        A3CConfig() | ||
        .rollouts(num_rollout_workers=1) | ||
        .framework("torch") | ||
        .environment("CartPole-v1") | ||
        .training( | ||
            gamma=0.95, | ||
        ) | ||
    ) | ||
|
||
    num_iterations = 100 | ||
|
||
    tuner = tune.Tuner( | ||
        A3C, | ||
        param_space=config.to_dict(), | ||
        run_config=air.RunConfig( | ||
            stop={"episode_reward_mean": 150, "timesteps_total": 200000}, | ||
            failure_config=air.FailureConfig(fail_fast="raise"), | ||
        ), | ||
    ) | ||
    results = tuner.fit() |
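The `stop` dictionary in the example script above lists two criteria, and a trial ends as soon as either one is reached. The following is a minimal sketch of that "whichever comes first" check, not Ray Tune's actual implementation; `should_stop` is a hypothetical helper, and treating a missing metric as "not yet reached" is an assumption made for this sketch.

```python
def should_stop(result, stop):
    """Return True once any metric named in `stop` has reached its threshold.

    `result` is a dict of reported metrics; `stop` maps metric names to
    thresholds. Metrics absent from `result` are ignored (an assumption).
    """
    return any(
        key in result and result[key] >= threshold
        for key, threshold in stop.items()
    )


# Same criteria as the example script.
stop = {"episode_reward_mean": 150, "timesteps_total": 200000}

early = {"episode_reward_mean": 120.0, "timesteps_total": 50000}
solved = {"episode_reward_mean": 155.0, "timesteps_total": 80000}
out_of_budget = {"episode_reward_mean": 90.0, "timesteps_total": 200000}

if __name__ == "__main__":
    for report in (early, solved, out_of_budget):
        print(should_stop(report, stop))
```

With these criteria, a run ends either because CartPole reaches a mean reward of 150 or because the 200,000-timestep budget is exhausted.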
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
[build-system] | ||
requires = ["setuptools>=61.0"] | ||
build-backend = "setuptools.build_meta" | ||
|
||
[tool.setuptools.packages.find] | ||
where = ["src"] | ||
|
||
[project] | ||
name = "rllib-a3c" | ||
authors = [{name = "Anyscale Inc."}] | ||
version = "0.1.0" | ||
description = "" | ||
readme = "README.md" | ||
requires-python = ">=3.7, <3.11" | ||
dependencies = ["gym[accept-rom-license]", "gymnasium[mujoco]==0.26.3", "higher", "ray[rllib]==2.3.1"] | ||
|
||
[project.optional-dependencies] | ||
development = ["pytest>=7.2.2", "pre-commit==2.21.0", "tensorflow==2.11.0", "torch==1.12.0"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
tensorflow==2.11.0 | ||
torch==1.12.0 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
from rllib_a3c.a3c.a3c import A3C, A3CConfig | ||
|
||
from ray.tune.registry import register_trainable | ||
|
||
__all__ = ["A3CConfig", "A3C"] | ||
|
||
register_trainable("rllib-contrib-a3c", A3C) |
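The `register_trainable` call above associates the string `"rllib-contrib-a3c"` with the `A3C` class so that the algorithm can be referred to by name. The general pattern can be sketched with a plain dictionary. This is a stand-in for illustration only, not Ray's registry; `register`, `resolve`, and `ToyAlgorithm` are hypothetical names.

```python
# A minimal name-to-class registry, illustrating the pattern only.
_REGISTRY = {}


def register(name, cls):
    """Associate a string name with a class."""
    _REGISTRY[name] = cls


def resolve(name_or_cls):
    """Accept either a registered name or the class itself."""
    if isinstance(name_or_cls, str):
        return _REGISTRY[name_or_cls]
    return name_or_cls


class ToyAlgorithm:
    """Placeholder standing in for an algorithm class such as A3C."""


register("toy-algo", ToyAlgorithm)

if __name__ == "__main__":
    # Both spellings resolve to the same class.
    print(resolve("toy-algo") is resolve(ToyAlgorithm))
```

Registering at import time, as the package `__init__` above does, means the name becomes available as a side effect of importing the package.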