[question] Reproduce the result of PPO on RoboschoolHumanoidFlagrunHarder #179
Comments
Yeah. That variable is created when the policy is created. In PPO2, the first call happens during the construction of step_model, so the variable scope is "model". Please let me know whether you are able to get that variable. |
This works for me (without the
with tf.variable_scope('model', reuse=True):
    print(tf.get_variable(name='pi/logstd')) |
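For readers following along, here is a minimal sketch of how the lookup above could be combined with an assign op to overwrite logstd between updates. It is an illustration under assumptions, not the library's own mechanism: the variable is created by hand here as a stand-in for the policy construction, its shape [1, 3] is made up for the example, and the TF1-style API is imported through tensorflow.compat.v1. Depending on where the lookup is called from, the enclosing "model" scope may or may not be needed.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # TF1-style graph API

graph = tf.Graph()
with graph.as_default():
    # Stand-in for the policy construction (assumption: in the real code the
    # library creates this variable; the shape [1, 3] is hypothetical).
    with tf.variable_scope("model"):
        tf.get_variable("pi/logstd", shape=[1, 3],
                        initializer=tf.zeros_initializer())

    # Look the variable up again by re-entering the scope with reuse=True.
    with tf.variable_scope("model", reuse=True):
        logstd = tf.get_variable("pi/logstd")

    # Build the assign op once and feed a new value whenever it should change.
    new_value = tf.placeholder(tf.float32, shape=logstd.shape)
    assign_op = tf.assign(logstd, new_value)

    with tf.Session(graph=graph) as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(assign_op,
                 feed_dict={new_value: np.full((1, 3), -0.7, dtype=np.float32)})
        current = sess.run(logstd)  # read the value back to check it
```

Building the op once and feeding the value through a placeholder avoids adding a new node to the graph on every update. Note that if the variable stays trainable, the optimizer will keep changing it between assignments.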
I will try that. Thank you. |
@doviettung96 Let me know if you're able to train RoboschoolHumanoidFlagrunHarder successfully. I was not able to, even with annealing the logstd. |
@BruceK4t1qbit , |
@doviettung96 I didn't use tensorboard - just looked at the rendering. I've found the pybullet_env is much easier to install than roboschool... |
@BruceK4t1qbit , |
@BruceK4t1qbit , |
@doviettung96 |
@BruceK4t1qbit , |
@BruceK4t1qbit , |
So it is just a matter of good luck? |
I don't think so. Changes are necessary to improve the performance on those tasks. It is just quite difficult to know what we should add to improve it. |
So what did you change? |
@erniejunior , |
Thanks! I didn't try such a big network |
Hi @araffin ,
Currently I am trying to reproduce the results of the PPO paper on the RoboschoolHumanoidFlagrunHarder environment.
Although I have tried almost every setting, there is still a big gap between my results and theirs.
I have just modified the code to make logstd=LinearAnneal(-0.7, -1.6) as in the paper.
When I printed the logstd in the distribution.py file, I got:
I got this:
I also tried using just the variable name "pi/logstd", but it still failed.
How could I change the value of logstd during training?
Thanks.
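As a point of reference for the LinearAnneal(-0.7, -1.6) setting mentioned above, a linear schedule can be sketched in plain Python as follows. This assumes the value is interpolated linearly in the fraction of training completed; the function name linear_anneal and the update count are hypothetical and chosen only for illustration.

```python
import math


def linear_anneal(start, end, progress):
    """Linearly interpolate from start (progress=0) to end (progress=1)."""
    progress = min(max(progress, 0.0), 1.0)  # clamp to [0, 1]
    return start + (end - start) * progress


total_updates = 4  # hypothetical; a real run would use many more updates

# Log standard deviation at the start of each update, plus the final value.
schedule = [linear_anneal(-0.7, -1.6, u / total_updates)
            for u in range(total_updates + 1)]

# Corresponding standard deviations of the Gaussian action distribution.
stds = [math.exp(v) for v in schedule]
```

The resulting value would then be written into the policy's logstd once per update, so exploration noise shrinks steadily as training progresses.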