is the continuous_a3c code valid? #20
Hi,
No, the algorithm provided in this repo does not work out of the box. You can try combining MACAD-Gym with MARLlib (or RLlib); we have successfully trained cooperative agents with MAPPO.
Okay. Thanks.
Hi, @Morphlng
I've tried continuous_A3C.py, and there are some problems.
1. Incorrect dictionary update
macad-agents/src/macad_agents/a3c/continuous_A3C.py
Lines 30 to 33 in b2726b3
Using dict.update to update a dictionary replaces pre-existing keys wholesale. Here the "fixed_delta_seconds" key is lost, and as a result macad-gym can't initialize.
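The failure mode can be sketched as follows. The "fixed_delta_seconds" key comes from the issue; the surrounding config structure and the "render" key are made up for illustration, not the repo's exact defaults:

```python
# Illustrative configs: "env" sub-dict mimics a MACAD-Gym env config.
default_config = {"env": {"fixed_delta_seconds": 0.05, "render": False}}
user_config = {"env": {"render": True}}

# dict.update replaces the whole "env" value, dropping its other keys:
broken = dict(default_config)
broken.update(user_config)
assert "fixed_delta_seconds" not in broken["env"]

def deep_update(base, overrides):
    """Recursively merge overrides into base, keeping untouched keys."""
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_update(base[key], value)
        else:
            base[key] = value
    return base

# A recursive merge keeps the pre-existing keys intact:
fixed = deep_update({"env": dict(default_config["env"])}, user_config)
assert fixed["env"] == {"fixed_delta_seconds": 0.05, "render": True}
```

A recursive merge (or merging each nested section explicitly) preserves defaults that the override dict doesn't mention.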
2. Serialization Problem
macad-agents/src/macad_agents/a3c/continuous_A3C.py
Lines 145 to 154 in b2726b3
Putting the environment inside Net is not a good idea: when mp.Process serializes this object, it tries to serialize the environment as well, resulting in a "can't pickle pygame.Font object" error.
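A minimal sketch of the problem and one common fix. The Font and Env classes below are stand-ins for pygame.font.Font and the MACAD-Gym environment; the class and method names are hypothetical, chosen only to mirror the pattern in continuous_A3C.py:

```python
import pickle

class Font:
    """Stand-in for pygame.font.Font, which cannot be pickled."""
    def __reduce__(self):
        raise TypeError("cannot pickle 'Font' object")

class Env:
    """Stand-in for an environment that holds an unpicklable handle."""
    def __init__(self):
        self.font = Font()

class NetHoldingEnv:
    """Pattern from the issue: the env lives on the network object,
    so mp.Process must pickle it when spawning a worker -> fails."""
    def __init__(self):
        self.env = Env()

class NetHoldingFactory:
    """Fix: store a zero-arg factory instead; functions and classes
    pickle by reference, and each worker builds its own env."""
    def __init__(self, env_fn):
        self.env_fn = env_fn
        self.env = None
    def setup(self):
        # Call this inside the worker's run(), after the process starts.
        self.env = self.env_fn()

# Pickling the env-holding net reproduces the error:
try:
    pickle.dumps(NetHoldingEnv())
    raise AssertionError("expected pickling to fail")
except TypeError:
    pass

# The factory-holding net serializes without issue:
pickle.dumps(NetHoldingFactory(Env))
```

Keeping the environment out of anything that crosses a process boundary (and constructing it lazily inside each worker) sidesteps the pickling entirely.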
Training
Even after fixing those problems, it still doesn't seem to work. The mean reward curve doesn't trend upward (nor does the distance curve trend downward), and I haven't seen a single successful episode yet (3M steps; maybe that's not enough to draw a conclusion?).
I know that PPO and IMPALA are the recommended algorithms, but since A3C is available in the repo, I want to know if it actually works.