[RLlib] MARL examples updated to support RLModule #31169

Merged

sven1977 merged 21 commits into ray-project:master from kouroshHakha:marl-rlm-examples

Dec 21, 2022

Contributor

kouroshHakha commented Dec 17, 2022

Why are these changes needed?

Added unittests to cover PPORLModule examples for MARL.

Related issue number

Checks

- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested :(

kouroshHakha added 11 commits

December 15, 2022 11:55


ma_pendulum_ppo.py works

d123cd3

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>


test passed on rllib/examples/multi_agent_cartpole.py

58e8df2

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>


changed PG -> PPO in multi_agent_custom_policy.py

03bde7f

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>


tested multi_agent_custom_policy.py with rl_modules

46bd7a6

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>


Updates to examples/multi_agent_different_spaces_for_agents.py:

d90d2d9

1. The test now runs on PPO by default instead of APPO
2. Tests pass when enable_rl_module_api

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>


added todos to multi_agent_parameter_sharing.py

bac5ef6

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>


fixed command line issues

2062acd

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>


wip discovered a bug with open-spiel self play when using connectors. need to investigate

68b5f1c

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

wip

50daa6b

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

wip

741d0c5

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>


reverting debugging changes

ad76a6d

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

kouroshHakha requested review from sven1977, gjoliver, avnishn, ArturNiederfahrenhorst, smorad, maxpumperla and krfricke as code owners

December 17, 2022 03:06

kouroshHakha added 5 commits

December 16, 2022 19:08


Merge branch 'master' into marl-rlm-examples

67187b6


updated BUILD

7f201d4

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>


updated BUILD

d91c155

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>


Merge branch 'master' into marl-rlm-examples

4130f5d


fixed multi_agent_custom_policy.py unittest

6bffd1b

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

gjoliver approved these changes

Member

gjoliver left a comment

One minor thing.
One random suggestion: if you want to make this job easier going forward, we can let AlgorithmConfig.validate() turn on enable_connectors and _enable_rl_module_api based on, e.g., an env variable,
and you can add a new pipeline in pipeline.ml.yml, say "rl_module".

Then, if you want to test any example or tuned_example in rl_module mode, you can simply add a build rule with the "rl_module" tag without updating any of the files.
You can also have a separate CI group for RLModule :)

rllib/examples/self_play_with_open_spiel.py Outdated

- ray.init(num_cpus=args.num_cpus or None, include_dashboard=False)
+ args = get_cli_args()
+ ray.init(num_cpus=args.num_cpus or None, include_dashboard=False, local_mode=True)

Member

gjoliver Dec 19, 2022

revert local_mode

Contributor Author

kouroshHakha commented Dec 19, 2022

One minor thing. One random suggestion: if you want to make this job easier going forward, we can let AlgorithmConfig.validate() turn on enable_connectors and _enable_rl_module_api based on, e.g., an env variable, and you can add a new pipeline in pipeline.ml.yml, say "rl_module".

Then, if you want to test any example or tuned_example in rl_module mode, you can simply add a build rule with the "rl_module" tag without updating any of the files. You can also have a separate CI group for RLModule :)

Very good idea, I'll set that up here in this PR since I have to revert local_mode anyway.

kouroshHakha added 2 commits

December 19, 2022 22:56


introduced a new temporary tag for RLModule specific unittests

e967f1e

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>


typo

86e943d

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

kouroshHakha commented

rllib/examples/multi_agent_custom_policy.py

      .environment("multi_agent_cartpole")
      .framework(args.framework)
      .multi_agent(
          # The multiagent Policy map.
          policies={
              # The Policy we are actually learning.
-             "pg_policy": PolicySpec(
-                 config=PGConfig.overrides(framework_str=args.framework)
+             "learnable_policy": PolicySpec(

Contributor Author

kouroshHakha Dec 20, 2022

just using better names :)

kouroshHakha added 2 commits

December 19, 2022 23:00


removed extra prints

4b8781a

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>


enable connectors and the new api in algoconfig.validate()

87b8eb8

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

sven1977 reviewed

@@ -3182,6 +3182,15 @@ py_test(
  args = ["--as-test", "--framework=torch", "--stop-reward=70.0", "--num-cpus=4"]
 )
+py_test(
+ name = "examples/multi_agent_cartpole_torch_w_rlm",

Contributor

sven1977 Dec 20, 2022

"rlm" -> "rl_module"

Also: see my comment below: Can we use a command line arg here (--use-rl-module)?

Contributor Author

kouroshHakha Dec 20, 2022

answered below.

sven1977 reviewed

@@ -3200,6 +3209,15 @@ py_test(
  args = ["--as-test", "--framework=torch", "--stop-reward=80"]
 )
+py_test(
+ name = "examples/multi_agent_custom_policy_torch_w_rlm",

Contributor

sven1977 Dec 20, 2022

"rlm" -> "rl_module"

Also: see my comment below: Can we use a command line arg here (--use-rl-module)?

Contributor Author

kouroshHakha Dec 20, 2022

answered below.

sven1977 reviewed

@@ -3218,6 +3236,16 @@ py_test(
  args = ["--stop-iters=4", "--framework=torch"]
 )
+py_test(
+ name = "examples/multi_agent_different_spaces_for_agents_torch_w_rlm",

Contributor

sven1977 Dec 20, 2022

"rlm" -> "rl_module"

Also: see my comment below: Can we use a command line arg here (--use-rl-module)?

Contributor Author

kouroshHakha Dec 20, 2022

answered below.

sven1977 reviewed

@@ -3538,6 +3566,15 @@ py_test(
  args = ["--framework=torch", "--env=connect_four", "--win-rate-threshold=0.6", "--stop-iters=2", "--num-episodes-human-play=0"]
 )
+py_test(
+ name = "examples/self_play_with_open_spiel_connect_4_torch_w_rlm",

Contributor

sven1977 Dec 20, 2022

"rlm" -> "rl_module"

Also: see my comment below: Can we use a command line arg here (--use-rl-module)?

Contributor Author

kouroshHakha Dec 20, 2022

answered below

sven1977 reviewed

@@ -3556,6 +3593,15 @@ py_test(
  args = ["--framework=torch", "--env=markov_soccer", "--win-rate-threshold=0.6", "--stop-iters=2", "--num-episodes-human-play=0"]
 )
+py_test(
+ name = "examples/self_play_league_based_with_open_spiel_markov_soccer_torch_w_rlm",

Contributor

sven1977 Dec 20, 2022

"rlm" -> "rl_module"

Also: see my comment below: Can we use a command line arg here (--use-rl-module)?

Contributor Author

kouroshHakha Dec 20, 2022

These will show up in CI under RL Module tests, so it will be super clear in the CI UI. I'd like to keep it concise here. Also, it's a transitional solution. At some point it will become the default and we will have to get rid of this stuff anyway.

sven1977 reviewed

rllib/algorithms/algorithm_config.py

@@ -765,6 +766,12 @@ def validate(self) -> None:
  "`config.rollouts(enable_connectors=True)`."
  )
+ if bool(os.environ.get("RLLIB_ENABLE_RL_MODULE", False)):

Contributor

sven1977 Dec 20, 2022

use command line arg

Contributor Author

kouroshHakha Dec 20, 2022

Actually, I had it the way you suggested and then changed it to this. It will be easier for us to flip it on by default later. Also, adding that flag to every other unittest down the line would create a lot of duplication and is not very clean for a temporary transition. This is a temporary solution until RLModule is proven out across the board and fully rolled out.
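
For readers following the env-variable discussion, here is a minimal sketch of how such a toggle behaves. The variable name `RLLIB_ENABLE_RL_MODULE` and the `bool(os.environ.get(...))` expression come from the diff above; the helper names `flag_as_in_diff` and `flag_strict` are hypothetical and only serve the illustration. One thing worth noting is that `bool()` on a string is true for any non-empty value, so setting the variable to "0" would still enable the flag. The stricter variant is an assumption about what one might want, not what the PR does.

```python
import os
from typing import Mapping

ENV_VAR = "RLLIB_ENABLE_RL_MODULE"  # name taken from the diff in this PR


def flag_as_in_diff(environ: Mapping[str, str] = os.environ) -> bool:
    # Same expression as in the diff: any non-empty string counts as "on",
    # including "0" and "false".
    return bool(environ.get(ENV_VAR, False))


def flag_strict(environ: Mapping[str, str] = os.environ) -> bool:
    # Hypothetical stricter variant: only explicit "on" values enable the flag.
    return environ.get(ENV_VAR, "").strip().lower() in {"1", "true", "yes", "on"}
```

With the variable unset both helpers return False; with it set to "1" both return True; they differ only for values such as "0" or "false".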

sven1977 reviewed

rllib/examples/multi_agent_cartpole.py

- gamma=random.choice([0.95, 0.99]),
- )
+ if bool(os.environ.get("RLLIB_ENABLE_RL_MODULE", False)):

Contributor

sven1977 Dec 20, 2022

Can we instead use a command line arg here, like we do for all other settings?
I know we have the gpu env setting, but the reason for that is to be able to set it globally so we can more gracefully handle that case in our buildkite/pipeline.ml.yml file. But for this case here, I don't see that need. Thx!

Contributor Author

kouroshHakha Dec 20, 2022

I have to disagree due to the reason above. I went through both solutions and found this one to be more scalable for our transition.

sven1977 reviewed

rllib/examples/multi_agent_different_spaces_for_agents.py Outdated

@@ -120,31 +120,31 @@ def get_cli_args():
  "episode_reward_mean": args.stop_reward,
  }
+ config = {

Contributor

sven1977 Dec 20, 2022

Please no more config dicts.
Use AlgorithmConfig instead.

Contributor

sven1977 Dec 20, 2022

I know you just moved this from below, but either way :)

Contributor Author

kouroshHakha Dec 20, 2022

oh yeah. Done.

sven1977 reviewed

rllib/examples/multi_agent_independent_learning.py Outdated

- "policy_mapping_fn": (
- lambda agent_id, episode, worker, **kwargs: agent_id
- ),
+ "policies": env.get_agent_ids(),

Contributor

sven1977 Dec 20, 2022

Sorry, same here. Let's translate to AlgorithmConfig.

Contributor Author

kouroshHakha Dec 20, 2022

done.
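
The independent-learning setup in the diff above (one policy per agent, with the mapping function returning the agent id unchanged) can be sketched without RLlib as follows. The agent ids are hypothetical stand-ins for whatever `env.get_agent_ids()` returns, and `episode` and `worker` are given defaults here only so the function can be called directly in the sketch; the lambda in the diff takes them positionally.

```python
# Hypothetical agent ids; in the example script they come from env.get_agent_ids().
agent_ids = {"agent_0", "agent_1", "agent_2"}

# One policy per agent: the set of policy ids equals the set of agent ids.
policies = set(agent_ids)


def policy_mapping_fn(agent_id, episode=None, worker=None, **kwargs):
    # Independent learning: each agent maps to the policy that shares its id.
    return agent_id


# Every agent resolves to a policy id that exists in the policy set.
mapping = {agent_id: policy_mapping_fn(agent_id) for agent_id in agent_ids}
```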

sven1977 reviewed

rllib/examples/self_play_league_based_with_open_spiel.py

-)
-args = parser.parse_args()
+def get_cli_args():

Contributor

sven1977 Dec 20, 2022

Nice!
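
For context, the refactor praised here moves module-level `parser.parse_args()` into a `get_cli_args()` function, so importing the example no longer parses the command line as a side effect. A rough sketch of that pattern follows. The flag names are the ones passed in the BUILD rules quoted earlier in this thread; the defaults, the `choices` list, and the optional `argv` parameter are assumptions for illustration and may not match the actual script.

```python
import argparse


def get_cli_args(argv=None):
    """Create the CLI parser and return the parsed arguments.

    `argv` is a hypothetical convenience parameter for this sketch:
    None means "read sys.argv", a list lets callers or tests pass arguments.
    """
    parser = argparse.ArgumentParser()
    # Defaults below are placeholders, not the values used by the real example.
    parser.add_argument("--framework", choices=["tf", "tf2", "torch"], default="torch")
    parser.add_argument("--env", type=str, default="connect_four")
    parser.add_argument("--stop-iters", type=int, default=200)
    parser.add_argument("--win-rate-threshold", type=float, default=0.95)
    parser.add_argument("--num-episodes-human-play", type=int, default=10)
    parser.add_argument("--num-cpus", type=int, default=0)
    return parser.parse_args(argv)
```

A script would then call `args = get_cli_args()` under its `if __name__ == "__main__":` guard, as the diff for self_play_with_open_spiel.py does before `ray.init(...)`.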

sven1977 reviewed

rllib/examples/self_play_with_open_spiel.py

-)
-args = parser.parse_args()
+def get_cli_args():

Contributor

sven1977 Dec 20, 2022

Super nice! :)

kouroshHakha assigned sven1977 and gjoliver


dict to AlgorithmConfig

0c0604a

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

sven1977 approved these changes

Contributor

sven1977 left a comment

LGTM. Thanks for the explanation on the command line arg question. Yes, sounds reasonable.

sven1977 merged commit c8b3df6 into ray-project:master

Capiru pushed a commit to Capiru/ray that referenced this pull request


[RLlib] MARL examples updated to support RLModule. (ray-project#31169)

d53577e

Signed-off-by: Capiru <[email protected]>

AmeerHajAli pushed a commit that referenced this pull request


[RLlib] MARL examples updated to support RLModule. (#31169)

d682985

tamohannes pushed a commit to ju2ez/ray that referenced this pull request


[RLlib] MARL examples updated to support RLModule. (ray-project#31169)

681ca10

Signed-off-by: tmynn <[email protected]>

tamohannes pushed a commit to ju2ez/ray that referenced this pull request


[RLlib] MARL examples updated to support RLModule. (ray-project#31169)

c03a878

Signed-off-by: tmynn <[email protected]>

tamohannes pushed a commit to ju2ez/ray that referenced this pull request


[RLlib] MARL examples updated to support RLModule. (ray-project#31169)

dd5b57a

Signed-off-by: tmynn <[email protected]>
