[Question] How best to implement self-play/multiple agents in the same environment? #181
Comments
Hello, I think @AdamGleave tackled that problem in the Adversarial Policies repo; you should take a look ;)
I never finished the self-play implementation, but it might still be worth looking at: https://github.com/HumanCompatibleAI/adversarial-policies/blob/master/src/aprl/agents/ppo_self_play.py
@AdamGleave I can't access the page. Is there still an available/public version of it?
Yeah, it's still in the commit history.
Here is an example for your reference.
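Since the file mentioned above now only exists in the repo's commit history, here is a sketch of how to dig it out with plain git. The commands are demonstrated on a throwaway repo so they are self-contained; for the real case you would run the same `git log`/`git show` pair inside a clone of adversarial-policies with the path `src/aprl/agents/ppo_self_play.py`.

```shell
set -e
# Throwaway repo standing in for adversarial-policies: add a file, then delete it.
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.email a@example.com
git config user.name a
echo "self-play code" > ppo_self_play.py
git add ppo_self_play.py
git commit -qm "add self-play sketch"
git rm -q ppo_self_play.py
git commit -qm "remove self-play sketch"

# --full-history lists commits that touched the now-deleted path
git log --all --full-history --oneline -- ppo_self_play.py

# --diff-filter=A picks the commit that added the file; git show
# <commit>:<path> then prints the file as it existed there
add_commit=$(git log --all --full-history --format=%H --diff-filter=A -- ppo_self_play.py)
git show "$add_commit":ppo_self_play.py
```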
I'm trying to train a model using self-play, and I really love the work that has been done here so far. I was wondering whether anyone might have advice on how to adapt PPO2 to allow multiple models to play against each other in the same environment.
The overall strategy would be to:
I have written a custom environment that can take an array of actions, update the game state, and then return a list of rewards for each agent. My main issue is prying the actual model apart from its interactions with the gym environment. I have been trying to decouple the model from the runner, but they seem quite tightly intertwined and I'm having a difficult time. Has anyone else played around with this idea before? Or could anyone point me in the right direction?
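One common alternative to pulling the runner apart is to leave PPO2's single-agent training loop untouched and hide the second player inside the environment: a wrapper exposes a normal single-agent interface, queries a frozen opponent policy for the other action, and the opponent is periodically replaced with a snapshot of the learner. Below is a minimal sketch assuming the two-player interface described above (step takes a list of actions and returns per-agent rewards); all class names here are illustrative, not part of Stable Baselines, and in practice `SelfPlayEnv` would subclass `gym.Env` and forward the action/observation spaces.

```python
class TwoPlayerEnv:
    """Toy stand-in for the custom env described above: step() takes a
    list of actions and returns per-agent observations and rewards."""

    def reset(self):
        self.t = 0
        return [0, 0]  # one observation per agent

    def step(self, actions):
        self.t += 1
        # zero-sum toy reward: agent 0 wins the step if its action is larger
        r0 = 1.0 if actions[0] > actions[1] else -1.0
        obs = [self.t, self.t]
        done = self.t >= 3
        return obs, [r0, -r0], done, {}


class SelfPlayEnv:
    """Single-agent view of a two-player env: the learner (e.g. PPO2)
    only ever sees its own observation, reward, and action space, while
    the opponent's move is supplied by a frozen policy."""

    def __init__(self, env, opponent_policy):
        self.env = env
        self.opponent_policy = opponent_policy  # callable: obs -> action
        self._opp_obs = None

    def reset(self):
        obs = self.env.reset()
        self._opp_obs = obs[1]  # remember the opponent's observation
        return obs[0]

    def step(self, action):
        opp_action = self.opponent_policy(self._opp_obs)
        obs, rewards, done, info = self.env.step([action, opp_action])
        self._opp_obs = obs[1]
        return obs[0], rewards[0], done, info


# Usage: the learner trains against a fixed opponent that always plays 0;
# in real self-play, opponent_policy would be a snapshot of the learner.
env = SelfPlayEnv(TwoPlayerEnv(), opponent_policy=lambda obs: 0)
obs = env.reset()
total, done = 0.0, False
while not done:
    obs, reward, done, info = env.step(1)  # learner always plays 1
    total += reward
print(total)  # → 3.0 (wins all three steps against the always-0 opponent)
```

The design choice here is that the frozen opponent keeps the environment stationary from the learner's point of view, so the standard PPO2 machinery works unchanged; swapping in fresh snapshots of the learner as the opponent is what turns this into self-play.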