-
Notifications
You must be signed in to change notification settings - Fork 412
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UCT/IB/MLX5: suggest user to increase PF_LOG_BAR_SIZE when UAR allocation fails #9986
base: master
Are you sure you want to change the base?
UCT/IB/MLX5: suggest user to increase PF_LOG_BAR_SIZE when UAR allocation fails #9986
Conversation
src/uct/ib/mlx5/ib_mlx5.c
Outdated
@@ -554,7 +554,9 @@ uct_ib_mlx5_devx_alloc_uar(uct_ib_mlx5_md_t *md, unsigned flags, int log_level, | |||
uar = mlx5dv_devx_alloc_uar(md->super.dev.ibv_context, flags); | |||
if (uar == NULL) { | |||
sprintf(buf, "mlx5dv_devx_alloc_uar(device=%s, flags=0x%x(%s)) " | |||
"failed: %m", uct_ib_device_name(&md->super.dev), flags, title); | |||
"failed: %m. This may be due to insufficient UAR space. " | |||
"Consider increasing PF_LOG_BAR_SIZE from the default value of 5.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- "Consider increasing PF_LOG_BAR_SIZE using mlxconfig tool",
- should we also check errno to have more specific error condition?
- i think there could be other places that may fail because of the same reason, for example "mlx5dv_devx_obj_create(CQ)"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- added the "mlxconfig tool" to the string.
- do you mean check "ENOMEM"?
- added this, checking for other places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- As you mentioned here: openmpi-3.1.1 np=256 ppn=128 UCX_TLS=rc fail open-mpi/ompi#6854 this requires a reboot so I added it to the string as well.
- It looks like
mlx5dv_devx_alloc_uar
->_mlx5dv_devx_alloc_uar
->mlx5_attach_dedicated_uar
->mlx5_alloc_dyn_uar
->execute_ioctl
->ioctl
it returns theerrno
but I don't know if we can know for sure that it'll setENOMEM
whenPF_LOG_BAR_SIZE
is insufficient. mlx5dv_devx_obj_create(CQ)
doesn't fail because ofPF_LOG_BAR_SIZE
, in the email "
After applying the PF_LOG_BAR_SIZE parameter update".
I removed the commit for it.
Regarding non-DEVX flows
uct_worker_tl_data_get
-> uct_ib_mlx5_iface_get_res_domain
-> uct_ib_mlx5_res_domain_init
-> ibv_alloc_td
-> mlx5_alloc_td
-> mlx5_attach_dedicated_uar
-> mlx5_alloc_dyn_uar
-> execute_ioctl
-> ioctl
If that fails due to insufficient PF_LOG_BAR_SIZE
, uct_ib_mlx5_res_domain_init
function does not fail. Instead, it returns UCS_OK
and logs a debug message.
The fallback mechanism uses the PD directly from the MD structure. This ensures continued operation, though with potentially reduced performance, especially for users sending a high volume of messages to different destinations.
This is because the shared UAR may not handle the high message rate as efficiently.
I can add this if ibv_alloc_td
fails in uct_ib_mlx5_res_domain_init
:
ucs_log(log_level, "Consider increasing PF_LOG_BAR_SIZE using mlxconfig tool");
However, this log might not be noticed because the entire flow does not fail.
Another suggestion is to change the log level from ucs_debug
to ucs_warn
:
ucs_warn("ibv_alloc_td() on %s failed: %m. Falling back to using the PD directly from the MD structure, which may reduce performance.", uct_ib_device_name(&md->dev))
This would make the suggestion to increase PF_LOG_BAR_SIZE
more noticeable and highlight the potential performance impact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michal-shalev per our offline discussion,
- (2) ENOMEM - please check whether it indeed returns.
- Other flows - IMO generally we shouldn't change the behavior in this PR. Flows should not turn from failing to succeeding as a result of this PR, neither logging levels should change. Only add a suggestion for the user, in the already existing log level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gleon99 I'll write a test that calls mlx5dv_devx_alloc_uar
until it fails and checks the errno to make sure what it returns exactly.
Other flows - I'll add a suggestion.
6964498
to
afd177f
Compare
d505e5c
to
379d502
Compare
379d502
to
fcc6726
Compare
fcc6726
to
b7c21f8
Compare
What ?
Improve error logging for UAR allocation failure in
uct_ib_mlx5_devx_alloc_uar
.Why ?
To inform users to increase
PF_LOG_BAR_SIZE
and reboot when UAR allocation fails due to insufficient memory.see also: open-mpi/ompi#6854
How ?
Update the error message in
uct_ib_mlx5_devx_alloc_uar
to suggest increasingPF_LOG_BAR_SIZE
and mention the reboot requirement.