[RLlib]: PPO agent training error: Invalid NaN values in Normal distribution parameters #46442
Comments
@InigoGastesi Thanks for raising this issue. I guess this behavior is not an error, but due to the training process. Could you check if your KL divergence is very high? I guess that is the reason for the logits turning NaN. You could try to increase the …
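The parameter name in the suggestion above is cut off; kl_coeff and grad_clip are common candidates for stabilizing PPO, so tuning them might look like the following minimal sketch (the env name and all values are assumptions, not from this thread):

from ray.rllib.algorithms.ppo import PPOConfig

# Minimal sketch (not from the thread): tighten PPO's KL penalty and clip
# gradients to guard against exploding updates that drive logits to NaN.
config = (
    PPOConfig()
    .environment("Pendulum-v1")  # hypothetical env, for illustration only
    .training(
        kl_coeff=1.0,    # stronger KL penalty (default is 0.2)
        kl_target=0.01,  # keep updates close to the previous policy
        grad_clip=40.0,  # clip the global gradient norm
        lr=1e-4,         # a smaller learning rate also improves stability
    )
)
algo = config.build()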
I am getting the same error as well.
@simonsays1980 Sorry for not answering sooner; I had not received the notification. On TensorBoard I have two KL-related parameters for the model. Before getting the error again, kl was at 0.01.
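For anyone wanting to check the same metrics outside TensorBoard: on the old API stack, the per-policy KL statistics also appear in the training result dict, roughly as below (a sketch; "default_policy" assumes a single-policy setup, and key paths can differ across Ray versions):

# Sketch: reading PPO's KL stats from a training result (old API stack).
result = algo.train()
stats = result["info"]["learner"]["default_policy"]["learner_stats"]
print("kl:", stats["kl"], "cur_kl_coeff:", stats["cur_kl_coeff"])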
What happened + What you expected to happen
Hello,
I am encountering an error while training a PPO agent using RLlib. During training, I receive the following error message:
File "/opt/conda/envs/prueba3.11/lib/python3.11/site-packages/ray/rllib/algorithms/ppo/ppo_torch_policy.py", line 85, in loss curr_action_dist = dist_class(logits, model) ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/opt/conda/envs/prueba3.11/lib/python3.11/site-packages/ray/rllib/models/torch/torch_action_dist.py", line 250, in __init__ self.dist = torch.distributions.normal.Normal(mean, torch.exp(log_std)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/opt/conda/envs/prueba3.11/lib/python3.11/site-packages/torch/distributions/normal.py", line 56, in __init__ super().__init__(batch_shape, validate_args=validate_args) File "/opt/conda/envs/prueba3.11/lib/python3.11/site-packages/torch/distributions/distribution.py", line 68, in __init__ raise ValueError( ValueError: Expected parameter loc (Tensor of shape (128, 2)) of distribution Normal(loc: torch.Size([128, 2]), scale: torch.Size([128, 2])) to satisfy the constraint Real(), but found invalid values: tensor([[nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan], [nan, nan]], grad_fn=<SplitBackward0>)
I have checked all observations to ensure there are no NaN values, but the error persists. Can you please help me identify the cause of this issue and how to resolve it?
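Since NaNs can also enter through rewards or only appear once the network weights themselves diverge, one way to rule out the environment entirely is a wrapper that fails fast on any non-finite value it emits. A minimal sketch, assuming a gymnasium-style env with plain array observations:

import gymnasium as gym
import numpy as np

class AssertFiniteWrapper(gym.Wrapper):
    """Fail fast if the env emits a non-finite observation or reward."""

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        assert np.all(np.isfinite(obs)), f"non-finite obs on reset: {obs}"
        return obs, info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        assert np.all(np.isfinite(obs)), f"non-finite obs: {obs}"
        assert np.isfinite(reward), f"non-finite reward: {reward}"
        return obs, reward, terminated, truncated, info

If this never fires but the crash still occurs, the NaNs originate inside the model update rather than in the data.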
Thank you for your assistance.
Versions / Dependencies
ray: 2.31
Python: 3.11
torch: 2.3.1
Reproduction script
from ray.rllib.algorithms.ppo import PPO
from ray.tune import Tuner

tuner = Tuner(
    trainable=PPO,
    param_space=...,  # PPO config (elided in the original report)
    run_config=...,   # run configuration (elided in the original report)
)
tuner.fit()
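The actual param_space and run_config are elided above; purely for illustration, a filled-in version of the same call could look like this (the env, stop criterion, and hyperparameters are hypothetical):

from ray import train, tune
from ray.rllib.algorithms.ppo import PPO, PPOConfig

# Illustrative only -- not the reporter's actual configuration.
param_space = (
    PPOConfig()
    .environment("Pendulum-v1")  # hypothetical env
    .training(lr=tune.grid_search([1e-4, 5e-5]))
    .to_dict()
)
tuner = tune.Tuner(
    trainable=PPO,
    param_space=param_space,
    run_config=train.RunConfig(stop={"training_iteration": 100}),
)
tuner.fit()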
Issue Severity
Low: It annoys or frustrates me.