
[RLlib] MARL examples updated to support RLModule #31169

Merged: 21 commits into ray-project:master on Dec 21, 2022

Conversation

@kouroshHakha (Contributor) commented Dec 17, 2022

Why are these changes needed?

Added unit tests to cover the PPO RLModule examples for MARL.

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

1. The test now runs on PPO by default instead of APPO.
2. Tests pass when enable_rl_module_api is turned on.

… need to investigate

@gjoliver (Member) left a comment:

One minor thing. Also one random suggestion: if you want to make this job easier going forward, we can let AlgorithmConfig.validate() turn on enable_connectors and _enable_rl_module_api based on, e.g., an env variable, and you can add a new pipeline in pipeline.ml.yml, say "rl_module".

Then, if you want to test any example or tuned_example in rl_module mode, you can simply add a build rule with the "rl_module" tag without updating any of the files. You can also have a separate CI group for RLModule :)
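For illustration, a minimal sketch of that suggestion; the flag names _enable_rl_module_api and enable_connectors come from this thread, but the surrounding code is illustrative only and not necessarily the exact change that landed.

import os

# Sketch: transitional toggle inside AlgorithmConfig.validate().
def validate(self) -> None:
    # ... existing validation checks ...

    # CI can export RLLIB_ENABLE_RL_MODULE to force the new stack on without
    # editing every example / tuned_example script individually.
    # Note: env var values are strings, so any non-empty value counts as "on".
    if bool(os.environ.get("RLLIB_ENABLE_RL_MODULE", False)):
        self._enable_rl_module_api = True
        self.enable_connectors = True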

ray.init(num_cpus=args.num_cpus or None, include_dashboard=False)

args = get_cli_args()
ray.init(num_cpus=args.num_cpus or None, include_dashboard=False, local_mode=True)
Member:

revert local_mode

@kouroshHakha (Contributor, Author):
> One minor thing. Also one random suggestion: if you want to make this job easier going forward, we can let AlgorithmConfig.validate() turn on enable_connectors and _enable_rl_module_api based on, e.g., an env variable, and you can add a new pipeline in pipeline.ml.yml, say "rl_module".
>
> Then, if you want to test any example or tuned_example in rl_module mode, you can simply add a build rule with the "rl_module" tag without updating any of the files. You can also have a separate CI group for RLModule :)

Very good idea, I'll set that up here in this PR since I have to revert local_mode anyway.

     .environment("multi_agent_cartpole")
     .framework(args.framework)
     .multi_agent(
         # The multiagent Policy map.
         policies={
             # The Policy we are actually learning.
-            "pg_policy": PolicySpec(
+            "learnable_policy": PolicySpec(
                 config=PGConfig.overrides(framework_str=args.framework)
Contributor (Author):

just using better names :)
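For reference, a hedged sketch of how the renamed policy map might read in full, modeled on RLlib's multi_agent_custom_policy example; the RandomPolicy entry, the mapping function body, and policies_to_train are assumptions and not copied from this diff.

from ray.rllib.algorithms.pg import PGConfig
from ray.rllib.examples.policy.random_policy import RandomPolicy
from ray.rllib.policy.policy import PolicySpec

config = (
    PGConfig()
    .environment("multi_agent_cartpole")
    .framework(args.framework)
    .multi_agent(
        # The multiagent Policy map.
        policies={
            # The Policy we are actually learning (renamed from "pg_policy").
            "learnable_policy": PolicySpec(
                config=PGConfig.overrides(framework_str=args.framework)
            ),
            # A fixed, heuristic policy that is never trained (assumption).
            "random": PolicySpec(policy_class=RandomPolicy),
        },
        # Alternate agents between the learnable and the fixed policy.
        policy_mapping_fn=(
            lambda agent_id, episode, worker, **kwargs: (
                ["learnable_policy", "random"][agent_id % 2]
            )
        ),
        # Only the learnable policy receives gradient updates.
        policies_to_train=["learnable_policy"],
    )
)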

@@ -3182,6 +3182,15 @@ py_test(
args = ["--as-test", "--framework=torch", "--stop-reward=70.0", "--num-cpus=4"]
)

py_test(
name = "examples/multi_agent_cartpole_torch_w_rlm",
Contributor:

"rlm" -> "rl_module"

Also: see my comment below: Can we use a command line arg here (--use-rl-module)?

Contributor (Author):

answered below.
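For context, such a test rule might look roughly like the sketch below; the rule name and args mirror the existing rule above, while the tags and the env-based export of RLLIB_ENABLE_RL_MODULE are assumptions about the approach discussed in this thread, not the PR's exact text.

py_test(
    name = "examples/multi_agent_cartpole_torch_w_rlm",
    main = "examples/multi_agent_cartpole.py",
    srcs = ["examples/multi_agent_cartpole.py"],
    size = "medium",
    # Tests tagged "rl_module" can be collected into their own CI group,
    # e.g. via bazel's --test_tag_filters=rl_module.
    tags = ["team:rllib", "exclusive", "examples", "rl_module"],
    # Assumed mechanism for flipping the transitional toggle in CI.
    env = {"RLLIB_ENABLE_RL_MODULE": "1"},
    args = ["--as-test", "--framework=torch", "--stop-reward=70.0", "--num-cpus=4"],
)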

@@ -3200,6 +3209,15 @@ py_test(
args = ["--as-test", "--framework=torch", "--stop-reward=80"]
)

py_test(
name = "examples/multi_agent_custom_policy_torch_w_rlm",
Contributor:

"rlm" -> "rl_module"

Also: see my comment below: Can we use a command line arg here (--use-rl-module)?

Contributor (Author):

answered below.

@@ -3218,6 +3236,16 @@ py_test(
args = ["--stop-iters=4", "--framework=torch"]
)

py_test(
name = "examples/multi_agent_different_spaces_for_agents_torch_w_rlm",
Contributor:

"rlm" -> "rl_module"

Also: see my comment below: Can we use a command line arg here (--use-rl-module)?

Contributor (Author):

answered below.

@@ -3538,6 +3566,15 @@ py_test(
args = ["--framework=torch", "--env=connect_four", "--win-rate-threshold=0.6", "--stop-iters=2", "--num-episodes-human-play=0"]
)

py_test(
name = "examples/self_play_with_open_spiel_connect_4_torch_w_rlm",
Contributor:

"rlm" -> "rl_module"

Also: see my comment below: Can we use a command line arg here (--use-rl-module)?

Contributor (Author):

answered below

@@ -3556,6 +3593,15 @@ py_test(
args = ["--framework=torch", "--env=markov_soccer", "--win-rate-threshold=0.6", "--stop-iters=2", "--num-episodes-human-play=0"]
)

py_test(
name = "examples/self_play_league_based_with_open_spiel_markov_soccer_torch_w_rlm",
Contributor:

"rlm" -> "rl_module"

Also: see my comment below: Can we use a command line arg here (--use-rl-module)?

Contributor (Author):

These will show up in CI under the RL Module tests, so it will be super clear in the CI UI. I'd like to keep it concise here. Also, it's a transitional solution; at some point it will become the default and we will have to get rid of all of this anyway.

@@ -765,6 +766,12 @@ def validate(self) -> None:
"`config.rollouts(enable_connectors=True)`."
)

if bool(os.environ.get("RLLIB_ENABLE_RL_MODULE", False)):
Contributor:

use command line arg

Contributor (Author):

Actually, I had it the way you suggested at first and then changed it to this. It will be easier for us to flip it on by default later. Also, adding that flag to every other unit test down the line would create a lot of duplication and is not very clean for a temporary transition. This is a temporary solution until RLModule is proven out across the board and fully rolled out.

gamma=random.choice([0.95, 0.99]),
)

if bool(os.environ.get("RLLIB_ENABLE_RL_MODULE", False)):
@sven1977 (Contributor) commented Dec 20, 2022:

Can we instead use a command line arg here, like we do for all other settings?
I know we have the gpu env setting, but the reason for that is to be able to set it globally so we can more gracefully handle that case in our buildkite/pipeline.ml.yml file. But for this case here, I don't see that need. Thx!

Contributor (Author):

I have to disagree, for the reason above. I went through both solutions and found this one to be more scalable for our transition.

@@ -120,31 +120,31 @@ def get_cli_args():
"episode_reward_mean": args.stop_reward,
}

config = {
@sven1977 (Contributor) commented Dec 20, 2022:

Please no more config dicts.
Use AlgorithmConfig instead.

Contributor:

I know you just moved this from below, but either way :)

Contributor (Author):

oh yeah. Done.

"policy_mapping_fn": (
lambda agent_id, episode, worker, **kwargs: agent_id
),
"policies": env.get_agent_ids(),
Contributor:

Sorry, same here. Let's translate to AlgorithmConfig.

Contributor (Author):

done.
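A hedged sketch of the requested translation for the fragment above; the choice of PPOConfig is an assumption, and only the policies and policy_mapping_fn fields are taken from the diff.

from ray.rllib.algorithms.ppo import PPOConfig  # algorithm choice is an assumption

config = (
    PPOConfig()
    .framework(args.framework)
    .multi_agent(
        # One policy per agent ID; `env` is the example's environment instance.
        policies=env.get_agent_ids(),
        # Each agent is mapped to the policy carrying its own ID.
        policy_mapping_fn=(lambda agent_id, episode, worker, **kwargs: agent_id),
    )
)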

)
args = parser.parse_args()

def get_cli_args():
Contributor:

Nice!

)
args = parser.parse_args()

def get_cli_args():
Contributor:

Super nice! :)
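Both of these comments refer to the same refactor: argument parsing moved into a get_cli_args() helper, as shown in the fragments above. A hedged sketch of that helper follows; the specific flags are taken from the test args used elsewhere in this PR rather than from the final diff.

import argparse

def get_cli_args():
    """Create a CLI parser and return the parsed arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--framework", choices=["tf", "tf2", "torch"], default="torch")
    parser.add_argument("--num-cpus", type=int, default=0)
    parser.add_argument("--as-test", action="store_true")
    parser.add_argument("--stop-reward", type=float, default=70.0)
    parser.add_argument("--stop-iters", type=int, default=200)
    args = parser.parse_args()
    return args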

@sven1977 (Contributor) left a comment:

LGTM. Thanks for the explanation on the command line arg question. Yes, sounds reasonable.

@sven1977 merged commit c8b3df6 into ray-project:master on Dec 21, 2022
Capiru pushed a commit to Capiru/ray that referenced this pull request Dec 22, 2022
tamohannes pushed a commit to ju2ez/ray that referenced this pull request Jan 16, 2023
tamohannes pushed a commit to ju2ez/ray that referenced this pull request Jan 25, 2023
tamohannes pushed a commit to ju2ez/ray that referenced this pull request Jan 25, 2023