Build failure: rocblas times out on hydra #301937

Closed
Tungsten842 opened this issue Apr 5, 2024 · 3 comments

Comments

@Tungsten842
Member

Tungsten842 commented Apr 5, 2024

Build log

https://hydra.nixos.org/build/254878615

Notify maintainers

@NixOS/rocm-maintainers @mschwaig

Add a 👍 reaction to issues you find important.

@mschwaig
Member

mschwaig commented Apr 8, 2024

The build is quite flaky recently, yes. Sometimes it fails, sometimes it succeeds:
https://hydra.nixos.org/job/nixos/trunk-combined/nixpkgs.rocmPackages.rocblas.x86_64-linux/all
When I looked last week I think I saw more timeouts than now.

My guess is that it times out because composable_kernel is not cached, and that this issue will go away once #299589 is merged.

@Tungsten842
Member Author

> The build is quite flaky recently, yes. Sometimes it fails, sometimes it succeeds:
> https://hydra.nixos.org/job/nixos/trunk-combined/nixpkgs.rocmPackages.rocblas.x86_64-linux/all
> When I looked last week I think I saw more timeouts than now.
>
> My guess is that it times out because composable_kernel is not cached, and that this issue will go away once #299589 is merged.

Are you sure that is actually the cause? It looks like Hydra is sometimes a little slower at building the package, so the build exceeds the two-hour timeout. Raising the Hydra timeout should be enough to fix this kind of build failure.

@mschwaig
Member

mschwaig commented Apr 8, 2024

No, I am not sure. I have not found specific information about how long it should actually take for builds to time out.

What I have checked so far is that only one package in ROCm has an explicit timeout set: rocsolver, with a timeout of 4h.
It recently timed out after less than 3h here: https://hydra.nixos.org/build/254023071
And it built successfully after more than 3h here: https://hydra.nixos.org/build/254390757
composable_kernel, which does not have a timeout set, keeps building for much longer than that.

That makes me think that dependencies which are not cached count towards the time budget of a given package.
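
For context, here is a minimal sketch of how an explicit per-package timeout like the one on rocsolver can be expressed in nixpkgs. As far as I know this is done through `meta.timeout`, given in seconds. The package name, version, and source below are hypothetical placeholders, not the actual rocsolver or rocblas definition.

```nix
# Hypothetical package definition, for illustration only.
# Everything except the `meta.timeout` attribute is a placeholder.
{ stdenv }:

stdenv.mkDerivation {
  pname = "example-rocm-package"; # placeholder name
  version = "0.0.0";              # placeholder version
  src = ./.;                      # placeholder source

  meta = {
    # Build timeout for Hydra, in seconds: 4 hours.
    timeout = 4 * 60 * 60;
  };
}
```

Whether time spent building uncached dependencies counts against this value is exactly the open question above, so raising it may or may not be enough on its own.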

This was referenced Apr 10, 2024