Recording Expert Data from myself in Discrete Action Space #319
Comments
Well, MultiBinary actions are in {0, 1}^n, and there is a bijective mapping from {0, 1}^n to [[0, m]] (the discrete actions), where m = 2^n - 1: each bit vector is simply the binary representation of the corresponding number. An example with n=2: That way, you can easily map multibinary actions to discrete ones, and use the
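The mapping described above can be sketched in a few lines of Python. This is only an illustration, not code from the library: the function names are hypothetical, and it assumes the first entry of the bit vector is the most significant bit. A given environment may order its bits differently.

```python
from typing import List, Sequence


def multibinary_to_discrete(bits: Sequence[int]) -> int:
    """Read a bit vector as a binary number (assumption: first entry is the most significant bit)."""
    value = 0
    for bit in bits:
        if bit not in (0, 1):
            raise ValueError(f"expected 0 or 1, got {bit!r}")
        # Shift the bits collected so far and append the new one.
        value = (value << 1) | int(bit)
    return value


def discrete_to_multibinary(action: int, n: int) -> List[int]:
    """Inverse mapping: write the integer as an n-bit vector, most significant bit first."""
    if not 0 <= action < 2 ** n:
        raise ValueError(f"action must be in [0, {2 ** n - 1}]")
    return [(action >> shift) & 1 for shift in range(n - 1, -1, -1)]


# Enumerate the four multibinary actions for n = 2 together with their discrete index.
for bits in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(bits, "->", multibinary_to_discrete(bits))
```

Because the two functions are inverses of each other, every multibinary action corresponds to exactly one integer in [[0, 2^n - 1]] and vice versa.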
Good morning araffin, thanks for the fast response. Unfortunately, I could not quite implement a solution based on your answer. I think I understood what you were saying, but my tests didn't work out as expected.
Afterwards I tried discrete actions that fit your explanation, or rather my understanding of it, and the outcome was not what I expected. Test Code: So discrete 64 (MultiBinary = binary value of the discrete action = 000001000000) makes Mario walk left, so apparently the bit at position 6 is the left button. But discrete 4 (binary = 001000000000) makes Mario go left again... so is the bit at position 3 the left button as well? By going through some numbers by trial and error I could make Mario go left and right (each with several different combinations), but I could not find jumping (button B) or anything else. So all in all I got confused, because I could not match one bit to one action: either there was no in-game reaction for a bit, or different bits (discrete values) produced the same action. Sorry for the long issue; I hope it's not an unnecessary bother caused by my misunderstanding.
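One thing worth checking when bits and buttons do not line up is the bit order used to write down the binary string. The sketch below is only an assumption about a possible source of confusion, not a confirmed diagnosis. It prints the 12-bit string for the two values mentioned above, both most-significant-bit first and least-significant-bit first. The helper names are hypothetical, and the sketch assumes the discrete actions really are the plain binary encoding, which may not hold for a particular emulator wrapper.

```python
from typing import List


def bit_string(action: int, n: int, msb_first: bool = True) -> str:
    """Return the n-bit binary string of `action` in the requested bit order."""
    s = format(action, f"0{n}b")  # most significant bit first
    return s if msb_first else s[::-1]


def set_positions(bits: str) -> List[int]:
    """1-based positions (counted from the left) of the bits that are set."""
    return [i + 1 for i, ch in enumerate(bits) if ch == "1"]


for action in (64, 4):
    msb = bit_string(action, 12, msb_first=True)
    lsb = bit_string(action, 12, msb_first=False)
    print(action, "MSB-first:", msb, set_positions(msb), "LSB-first:", lsb, set_positions(lsb))
```

Under these assumptions, 000001000000 is the MSB-first string for 64, while 001000000000 is the LSB-first string for 4, so the two strings quoted above follow different conventions and their bit positions cannot be compared directly.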
Good Morning,
I am trying to pretrain my A2C agent with expert data. To do so, I would like to record myself playing Super Mario (SuperMarioAdvance4 for Game Boy Advance). As documented, the pretrain() method only works with a discrete action space. I can record myself and write all the needed data to a .npz file, similar to the file created by generate_expert_traj().
Now the problem: unfortunately I can only record with a MultiBinary action space, not with the required discrete one. I read through the documentation and tried to figure out how the discrete action space is encoded, but I could not find a solution.
Is there any way to translate a MultiBinary action space into a discrete one? Or is there something like a map that explains which action Mario takes for each discrete number?
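For a whole recorded trajectory, the translation could be done in one step with NumPy. The following is a minimal sketch under stated assumptions: the recorded actions are taken to be an array of shape (T, n) with entries 0 or 1, the first column is treated as the most significant bit, and the function name is hypothetical. Whether the resulting indices match what the environment expects depends on how its discrete action space is actually encoded.

```python
import numpy as np


def multibinary_batch_to_discrete(actions: np.ndarray) -> np.ndarray:
    """Convert a (T, n) array of 0/1 actions into a (T,) array of integer actions."""
    actions = np.asarray(actions, dtype=np.int64)
    if actions.ndim != 2:
        raise ValueError("expected an array of shape (T, n)")
    n = actions.shape[1]
    # Column j carries weight 2^(n - 1 - j), i.e. the first column is the most significant bit.
    weights = 1 << np.arange(n - 1, -1, -1, dtype=np.int64)
    return actions @ weights


# Two hypothetical recorded steps with n = 12 buttons.
recorded = np.array([
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
])
print(multibinary_batch_to_discrete(recorded))
```

The converted array could then replace the multibinary actions before the data is written to the .npz file.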