rocmPackages.* 5.7.1→ 6.0.2 #287846

Merged
merged 33 commits into from
Mar 22, 2024

mschwaig
Member

@mschwaig mschwaig commented Feb 10, 2024

Description of changes

CC #197885

This adds ROCm 6.0.2 alongside ROCm 5.7.

This PR

  • copies the version 5 folder and creates a new one for version 6
    • version 5 is still available as rocmPackages_5
  • uses nix-update and some manual updates to bring the package sources up to date
  • fixes the resulting build issues, including a few contributions from rocmPackages_6: init at 6.0.2(?) #289187

Remaining issues

Related work

I was not able to take the changes from https://github.com/Madouura/nixpkgs/tree/dev/rocm-rework,
because that branch has a large number of changes that would have taken me a long time to review and understand, especially since I did not know how solid those changes already are.

Closes #280927

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.05 Release Notes (or backporting 23.05 and 23.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

@mschwaig
Member Author

Result of nixpkgs-review run on x86_64-linux

14 packages marked as broken and skipped:
  • rocmPackages.llvm.flang
  • rocmPackages.llvm.flang.doc
  • rocmPackages.llvm.flang.info
  • rocmPackages.llvm.flang.man
  • rocmPackages.llvm.libclc
  • rocmPackages.rdc
  • rocmPackages.rdc.doc
  • rocmPackages_5.llvm.flang
  • rocmPackages_5.llvm.flang.doc
  • rocmPackages_5.llvm.flang.info
  • rocmPackages_5.llvm.flang.man
  • rocmPackages_5.llvm.libclc
  • rocmPackages_5.rdc
  • rocmPackages_5.rdc.doc
13 packages failed to build:
  • magma (magma-hip ,magma_2_7_2)
  • magma_2_6_2
  • opensyclWithRocm
  • rocmPackages.composable_kernel
  • rocmPackages.migraphx
  • rocmPackages.miopen (rocmPackages.miopen-hip)
  • rocmPackages.miopen-opencl
  • rocmPackages.rccl
  • rocmPackages.rocmlir
  • rocmPackages.rocmlir-rock
  • rocmPackages.rocmlir.external
  • rocmPackages.rocprofiler
  • rocmPackages_5.rocprofiler
175 packages built:
  • blender-hip
  • rocmPackages.clang-ocl
  • rocmPackages.clr
  • rocmPackages.clr.icd
  • rocmPackages.half
  • rocmPackages.hip-common
  • rocmPackages.hipblas
  • rocmPackages.hipcc
  • rocmPackages.hipcub
  • rocmPackages.hipfft
  • rocmPackages.hipfort
  • rocmPackages.hipify
  • rocmPackages.hiprand
  • rocmPackages.hipsolver
  • rocmPackages.hipsparse
  • rocmPackages.hsa-amd-aqlprofile-bin
  • rocmPackages.llvm.bintools
  • rocmPackages.llvm.clang
  • rocmPackages.llvm.clang-tools-extra
  • rocmPackages.llvm.clang-tools-extra.doc
  • rocmPackages.llvm.clang-tools-extra.info
  • rocmPackages.llvm.clang-tools-extra.man
  • rocmPackages.llvm.clang-unwrapped
  • rocmPackages.llvm.clang-unwrapped.doc
  • rocmPackages.llvm.clang-unwrapped.info
  • rocmPackages.llvm.clang-unwrapped.man
  • rocmPackages.llvm.compiler-rt
  • rocmPackages.llvm.libc
  • rocmPackages.llvm.libc.doc
  • rocmPackages.llvm.libcxx
  • rocmPackages.llvm.libcxx.doc
  • rocmPackages.llvm.libcxxabi
  • rocmPackages.llvm.libunwind
  • rocmPackages.llvm.libunwind.doc
  • rocmPackages.llvm.lld
  • rocmPackages.llvm.lld.doc
  • rocmPackages.llvm.lldb
  • rocmPackages.llvm.lldb.doc
  • rocmPackages.llvm.lldb.info
  • rocmPackages.llvm.lldb.man
  • rocmPackages.llvm.llvm
  • rocmPackages.llvm.llvm.doc
  • rocmPackages.llvm.llvm.info
  • rocmPackages.llvm.llvm.man
  • rocmPackages.llvm.mlir
  • rocmPackages.llvm.openmp
  • rocmPackages.llvm.openmp.doc
  • rocmPackages.llvm.openmp.info
  • rocmPackages.llvm.openmp.man
  • rocmPackages.llvm.polly
  • rocmPackages.llvm.polly.doc
  • rocmPackages.llvm.polly.info
  • rocmPackages.llvm.polly.man
  • rocmPackages.llvm.pstl
  • rocmPackages.llvm.rocmClangStdenv
  • rocmPackages.rocalution
  • rocmPackages.rocblas
  • rocmPackages.rocdbgapi
  • rocmPackages.rocdbgapi.doc
  • rocmPackages.rocfft
  • rocmPackages.rocgdb
  • rocmPackages.rocm-cmake
  • rocmPackages.rocm-comgr
  • rocmPackages.rocm-core
  • rocmPackages.rocm-device-libs
  • rocmPackages.rocm-docs-core
  • rocmPackages.rocm-docs-core.dist
  • rocmPackages.rocm-runtime
  • rocmPackages.rocm-smi
  • rocmPackages.rocm-thunk
  • rocmPackages.rocminfo
  • rocmPackages.rocprim
  • rocmPackages.rocr-debug-agent
  • rocmPackages.rocsolver
  • rocmPackages.rocsparse
  • rocmPackages.rocthrust
  • rocmPackages.roctracer
  • rocmPackages.rocwmma
  • rocmPackages.rpp (rocmPackages.rpp-hip)
  • rocmPackages.rpp-cpu
  • rocmPackages.rpp-opencl
  • rocmPackages.tensile
  • rocmPackages.tensile.dist
  • rocmPackages_5.clang-ocl
  • rocmPackages_5.clr
  • rocmPackages_5.clr.icd
  • rocmPackages_5.composable_kernel
  • rocmPackages_5.half
  • rocmPackages_5.hip-common
  • rocmPackages_5.hipblas
  • rocmPackages_5.hipcc
  • rocmPackages_5.hipcub
  • rocmPackages_5.hipfft
  • rocmPackages_5.hipfort
  • rocmPackages_5.hipify
  • rocmPackages_5.hiprand
  • rocmPackages_5.hipsolver
  • rocmPackages_5.hipsparse
  • rocmPackages_5.hsa-amd-aqlprofile-bin
  • rocmPackages_5.llvm.bintools
  • rocmPackages_5.llvm.clang
  • rocmPackages_5.llvm.clang-tools-extra
  • rocmPackages_5.llvm.clang-tools-extra.doc
  • rocmPackages_5.llvm.clang-tools-extra.info
  • rocmPackages_5.llvm.clang-tools-extra.man
  • rocmPackages_5.llvm.clang-unwrapped
  • rocmPackages_5.llvm.clang-unwrapped.doc
  • rocmPackages_5.llvm.clang-unwrapped.info
  • rocmPackages_5.llvm.clang-unwrapped.man
  • rocmPackages_5.llvm.compiler-rt
  • rocmPackages_5.llvm.libc
  • rocmPackages_5.llvm.libc.doc
  • rocmPackages_5.llvm.libcxx
  • rocmPackages_5.llvm.libcxx.doc
  • rocmPackages_5.llvm.libcxxabi
  • rocmPackages_5.llvm.libunwind
  • rocmPackages_5.llvm.libunwind.doc
  • rocmPackages_5.llvm.lld
  • rocmPackages_5.llvm.lld.doc
  • rocmPackages_5.llvm.lldb
  • rocmPackages_5.llvm.lldb.doc
  • rocmPackages_5.llvm.lldb.info
  • rocmPackages_5.llvm.lldb.man
  • rocmPackages_5.llvm.llvm
  • rocmPackages_5.llvm.llvm.doc
  • rocmPackages_5.llvm.llvm.info
  • rocmPackages_5.llvm.llvm.man
  • rocmPackages_5.llvm.mlir
  • rocmPackages_5.llvm.openmp
  • rocmPackages_5.llvm.openmp.doc
  • rocmPackages_5.llvm.openmp.info
  • rocmPackages_5.llvm.openmp.man
  • rocmPackages_5.llvm.polly
  • rocmPackages_5.llvm.polly.doc
  • rocmPackages_5.llvm.polly.info
  • rocmPackages_5.llvm.polly.man
  • rocmPackages_5.llvm.pstl
  • rocmPackages_5.llvm.rocmClangStdenv
  • rocmPackages_5.migraphx
  • rocmPackages_5.miopen (rocmPackages_5.miopen-hip)
  • rocmPackages_5.miopen-opencl
  • rocmPackages_5.miopengemm
  • rocmPackages_5.miopengemm.doc
  • rocmPackages_5.rccl
  • rocmPackages_5.rocalution
  • rocmPackages_5.rocblas
  • rocmPackages_5.rocdbgapi
  • rocmPackages_5.rocdbgapi.doc
  • rocmPackages_5.rocfft
  • rocmPackages_5.rocgdb
  • rocmPackages_5.rocm-cmake
  • rocmPackages_5.rocm-comgr
  • rocmPackages_5.rocm-core
  • rocmPackages_5.rocm-device-libs
  • rocmPackages_5.rocm-docs-core
  • rocmPackages_5.rocm-docs-core.dist
  • rocmPackages_5.rocm-runtime
  • rocmPackages_5.rocm-smi
  • rocmPackages_5.rocm-thunk
  • rocmPackages_5.rocminfo
  • rocmPackages_5.rocmlir
  • rocmPackages_5.rocmlir-rock
  • rocmPackages_5.rocmlir.external
  • rocmPackages_5.rocprim
  • rocmPackages_5.rocr-debug-agent
  • rocmPackages_5.rocsolver
  • rocmPackages_5.rocsparse
  • rocmPackages_5.rocthrust
  • rocmPackages_5.roctracer
  • rocmPackages_5.rocwmma
  • rocmPackages_5.rpp (rocmPackages_5.rpp-hip)
  • rocmPackages_5.rpp-cpu
  • rocmPackages_5.rpp-opencl
  • rocmPackages_5.tensile
  • rocmPackages_5.tensile.dist
Screenshot illustrating what failed and the very long build times:

[Image: rocm_build_results_crop]

@mschwaig
Member Author

I added a few more fixes, and the composable_kernel, miopen and rocmlir builds now succeed if I set gpuTargets to target only my RX 7600 XT (gfx1102), with a bunch of code along these lines:

--- a/pkgs/development/rocm-modules/6/default.nix
+++ b/pkgs/development/rocm-modules/6/default.nix
@@ -16,6 +16,7 @@
 let
   rocmUpdateScript = callPackage ./update.nix { };
 in rec {
+  gpuTargets = [ "gfx1102" ];
   ## ROCm ##
   llvm = recurseIntoAttrs (callPackage ./llvm/default.nix { inherit rocmUpdateScript rocm-device-libs rocm-runtime rocm-thunk clr; });
 
@@ -73,7 +74,7 @@ in rec {
 
   # Broken, too many errors
   rdc = callPackage ./rdc {
-    inherit rocmUpdateScript rocm-smi rocm-runtime;
+    inherit rocmUpdateScript rocm-smi rocm-runtime gpuTargets;
     stdenv = gcc12Stdenv;
     # stdenv = llvm.rocmClangStdenv;
   };
+[... and so on ...]

I can see that composable_kernel still has some issue (besides taking a long time to build): it fails if I build for all the gpuTargets.
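The single-architecture restriction described above can also be tried without editing the package set. The following is only a rough sketch: it assumes that the package function accepts a `gpuTargets` list argument (as the diff above suggests for this branch), which may not hold for every package in the set.

```nix
# Sketch only: build composable_kernel for a single GPU architecture.
# Assumption: the package function takes a `gpuTargets` list argument;
# check the package's default.nix before relying on this.
let
  pkgs = import <nixpkgs> { };
in
pkgs.rocmPackages.composable_kernel.override {
  # gfx1102 is the architecture mentioned in the comment above.
  gpuTargets = [ "gfx1102" ];
}
```

Saved to a file, this can be handed to `nix-build`. If the argument does not exist for a given package, `.override` fails at evaluation time with an unexpected-argument error, which makes the assumption easy to check.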

@ScatteredRay
Contributor

Hi! I just noticed that we've been working on the same thing. I hope you don't mind, but I went ahead and cherry-picked your changes into my branch: https://github.com/ScatteredRay/nixpkgs/tree/rocm6 #289187

@mschwaig
Member Author

> Hi! I just noticed that we've been working on the same thing. I hope you don't mind, but I went ahead and cherry-picked your changes into my branch: https://github.com/ScatteredRay/nixpkgs/tree/rocm6 #289187

That's great. Looks like you already looked at rocprofiler.

@ScatteredRay
Contributor

ScatteredRay commented Feb 16, 2024

Yeah. Rocprofiler seems to be building now with the GCC stdenv. I want to get it building with the same rocm-llvm stdenv, but I'm running into a few conflicts. I'm not certain, but I think it's basically running into this issue: #192459, but with rocm-llvm.

@mschwaig
Member Author

> Yeah. Rocprofiler seems to be building now with the GCC stdenv. I want to get it building with the same rocm-llvm stdenv, but I'm running into a few conflicts. I'm not certain, but I think it's basically running into this issue: #192459, but with rocm-llvm.

I don't know anything about that unfortunately.
Is there some actual benefit to building that with rocm-llvm?

@ScatteredRay
Contributor

> Is there some actual benefit to building that with rocm-llvm?

I don't know specifically. I think some projects need it to link against the HIP code correctly.

From there, there were some problems with conflicting libraries when linking across separate stdenvs that each bring their own libraries. So I think in general it is a bit easier to link with stdenvs that have fewer differences.

@mschwaig
Member Author

mschwaig commented Mar 2, 2024

Results of building this against current master (fae7388):

Result of nixpkgs-review pr 287846 run on x86_64-linux

14 packages marked as broken and skipped:
  • rocmPackages.llvm.flang
  • rocmPackages.llvm.flang.doc
  • rocmPackages.llvm.flang.info
  • rocmPackages.llvm.flang.man
  • rocmPackages.llvm.libclc
  • rocmPackages.rdc
  • rocmPackages.rdc.doc
  • rocmPackages_6.llvm.flang
  • rocmPackages_6.llvm.flang.doc
  • rocmPackages_6.llvm.flang.info
  • rocmPackages_6.llvm.flang.man
  • rocmPackages_6.llvm.libclc
  • rocmPackages_6.rdc
  • rocmPackages_6.rdc.doc
7 packages failed to build:
  • magma (magma-hip ,magma_2_7_2)
  • magma_2_6_2
  • opensyclWithRocm
  • rocmPackages.migraphx (rocmPackages_6.migraphx)
  • rocmPackages.miopen-opencl (rocmPackages_6.miopen-opencl)
  • rocmPackages.rocmlir (rocmPackages_6.rocmlir)
  • rocmPackages.rocmlir.external (rocmPackages_6.rocmlir.external)
88 packages built:
  • blender-hip
  • rocmPackages.clang-ocl (rocmPackages_6.clang-ocl)
  • rocmPackages.clr (rocmPackages_6.clr)
  • rocmPackages.clr.icd (rocmPackages_6.clr.icd)
  • rocmPackages.composable_kernel (rocmPackages_6.composable_kernel)
  • rocmPackages.half (rocmPackages_6.half)
  • rocmPackages.hip-common (rocmPackages_6.hip-common)
  • rocmPackages.hipblas (rocmPackages_6.hipblas)
  • rocmPackages.hipcc (rocmPackages_6.hipcc)
  • rocmPackages.hipcub (rocmPackages_6.hipcub)
  • rocmPackages.hipfft (rocmPackages_6.hipfft)
  • rocmPackages.hipfort (rocmPackages_6.hipfort)
  • rocmPackages.hipify (rocmPackages_6.hipify)
  • rocmPackages.hiprand (rocmPackages.rocrand ,rocmPackages_6.hiprand ,rocmPackages_6.rocrand)
  • rocmPackages.hipsolver (rocmPackages_6.hipsolver)
  • rocmPackages.hipsparse (rocmPackages_6.hipsparse)
  • rocmPackages.hsa-amd-aqlprofile-bin (rocmPackages_6.hsa-amd-aqlprofile-bin)
  • rocmPackages.llvm.bintools (rocmPackages_6.llvm.bintools)
  • rocmPackages.llvm.clang (rocmPackages_6.llvm.clang)
  • rocmPackages.llvm.clang-tools-extra (rocmPackages_6.llvm.clang-tools-extra)
  • rocmPackages.llvm.clang-tools-extra.doc (rocmPackages_6.llvm.clang-tools-extra.doc)
  • rocmPackages.llvm.clang-tools-extra.info (rocmPackages_6.llvm.clang-tools-extra.info)
  • rocmPackages.llvm.clang-tools-extra.man (rocmPackages_6.llvm.clang-tools-extra.man)
  • rocmPackages.llvm.clang-unwrapped (rocmPackages_6.llvm.clang-unwrapped)
  • rocmPackages.llvm.clang-unwrapped.doc (rocmPackages_6.llvm.clang-unwrapped.doc)
  • rocmPackages.llvm.clang-unwrapped.info (rocmPackages_6.llvm.clang-unwrapped.info)
  • rocmPackages.llvm.clang-unwrapped.man (rocmPackages_6.llvm.clang-unwrapped.man)
  • rocmPackages.llvm.compiler-rt (rocmPackages_6.llvm.compiler-rt)
  • rocmPackages.llvm.libc (rocmPackages_6.llvm.libc)
  • rocmPackages.llvm.libc.doc (rocmPackages_6.llvm.libc.doc)
  • rocmPackages.llvm.libcxx (rocmPackages_6.llvm.libcxx)
  • rocmPackages.llvm.libcxx.doc (rocmPackages_6.llvm.libcxx.doc)
  • rocmPackages.llvm.libcxxabi (rocmPackages_6.llvm.libcxxabi)
  • rocmPackages.llvm.libunwind (rocmPackages_6.llvm.libunwind)
  • rocmPackages.llvm.libunwind.doc (rocmPackages_6.llvm.libunwind.doc)
  • rocmPackages.llvm.lld (rocmPackages_6.llvm.lld)
  • rocmPackages.llvm.lld.doc (rocmPackages_6.llvm.lld.doc)
  • rocmPackages.llvm.lldb (rocmPackages_6.llvm.lldb)
  • rocmPackages.llvm.lldb.doc (rocmPackages_6.llvm.lldb.doc)
  • rocmPackages.llvm.lldb.info (rocmPackages_6.llvm.lldb.info)
  • rocmPackages.llvm.lldb.man (rocmPackages_6.llvm.lldb.man)
  • rocmPackages.llvm.llvm (rocmPackages_6.llvm.llvm)
  • rocmPackages.llvm.llvm.doc (rocmPackages_6.llvm.llvm.doc)
  • rocmPackages.llvm.llvm.info (rocmPackages_6.llvm.llvm.info)
  • rocmPackages.llvm.llvm.man (rocmPackages_6.llvm.llvm.man)
  • rocmPackages.llvm.mlir (rocmPackages_6.llvm.mlir)
  • rocmPackages.llvm.openmp (rocmPackages_6.llvm.openmp)
  • rocmPackages.llvm.openmp.doc (rocmPackages_6.llvm.openmp.doc)
  • rocmPackages.llvm.openmp.info (rocmPackages_6.llvm.openmp.info)
  • rocmPackages.llvm.openmp.man (rocmPackages_6.llvm.openmp.man)
  • rocmPackages.llvm.polly (rocmPackages_6.llvm.polly)
  • rocmPackages.llvm.polly.doc (rocmPackages_6.llvm.polly.doc)
  • rocmPackages.llvm.polly.info (rocmPackages_6.llvm.polly.info)
  • rocmPackages.llvm.polly.man (rocmPackages_6.llvm.polly.man)
  • rocmPackages.llvm.pstl (rocmPackages_6.llvm.pstl)
  • rocmPackages.llvm.rocmClangStdenv (rocmPackages_6.llvm.rocmClangStdenv)
  • rocmPackages.miopen (rocmPackages.miopen-hip ,rocmPackages_6.miopen ,rocmPackages_6.miopen-hip)
  • rocmPackages.rccl (rocmPackages_6.rccl)
  • rocmPackages.rocalution (rocmPackages_6.rocalution)
  • rocmPackages.rocblas (rocmPackages_6.rocblas)
  • rocmPackages.rocdbgapi (rocmPackages_6.rocdbgapi)
  • rocmPackages.rocdbgapi.doc (rocmPackages_6.rocdbgapi.doc)
  • rocmPackages.rocfft (rocmPackages_6.rocfft)
  • rocmPackages.rocgdb (rocmPackages_6.rocgdb)
  • rocmPackages.rocm-cmake (rocmPackages_6.rocm-cmake)
  • rocmPackages.rocm-comgr (rocmPackages_6.rocm-comgr)
  • rocmPackages.rocm-core (rocmPackages_6.rocm-core)
  • rocmPackages.rocm-device-libs (rocmPackages_6.rocm-device-libs)
  • rocmPackages.rocm-docs-core (rocmPackages_6.rocm-docs-core)
  • rocmPackages.rocm-docs-core.dist (rocmPackages_6.rocm-docs-core.dist)
  • rocmPackages.rocm-runtime (rocmPackages_6.rocm-runtime)
  • rocmPackages.rocm-smi (rocmPackages_6.rocm-smi)
  • rocmPackages.rocm-thunk (rocmPackages_6.rocm-thunk)
  • rocmPackages.rocminfo (rocmPackages_6.rocminfo)
  • rocmPackages.rocmlir-rock (rocmPackages_6.rocmlir-rock)
  • rocmPackages.rocprim (rocmPackages_6.rocprim)
  • rocmPackages.rocprofiler (rocmPackages_6.rocprofiler)
  • rocmPackages.rocr-debug-agent (rocmPackages_6.rocr-debug-agent)
  • rocmPackages.rocsolver (rocmPackages_6.rocsolver)
  • rocmPackages.rocsparse (rocmPackages_6.rocsparse)
  • rocmPackages.rocthrust (rocmPackages_6.rocthrust)
  • rocmPackages.roctracer (rocmPackages_6.roctracer)
  • rocmPackages.rocwmma (rocmPackages_6.rocwmma)
  • rocmPackages.rpp (rocmPackages.rpp-hip ,rocmPackages_6.rpp ,rocmPackages_6.rpp-hip)
  • rocmPackages.rpp-cpu (rocmPackages_6.rpp-cpu)
  • rocmPackages.rpp-opencl (rocmPackages_6.rpp-opencl)
  • rocmPackages.tensile (rocmPackages_6.tensile)
  • rocmPackages.tensile.dist (rocmPackages_6.tensile.dist)

When building this branch by itself, llvm.clang-tools-extra and rocprofiler also failed.
The former ran until it was patching interpreter paths for the info output, with no errors indicated, and the latter had its tests time out. I did the build with the extra failures in the cloud, while the build against master was done locally.

The rocmlir build failing would be a regression if it also fails with gpuTargets = [ "gfx1102" ];.

@mschwaig
Member Author

mschwaig commented Mar 2, 2024

opensyclWithRocm is broken due to the old version of ROCm. AdaptiveCpp/AdaptiveCpp@3952b46. My plan is to keep building this against ROCm 5.7.

[ 90%] Building CXX object src/runtime/CMakeFiles/rt-backend-hip.dir/hip/hip_hardware_manager.cpp.o
[ 92%] Building CXX object src/runtime/CMakeFiles/rt-backend-hip.dir/hip/hip_backend.cpp.o
[ 93%] Building CXX object src/runtime/CMakeFiles/rt-backend-hip.dir/hip/hip_code_object.cpp.o
/build/source/src/runtime/hip/hip_allocator.cpp:144:35: error: no member named 'memoryType' in 'hipPointerAttribute_t'
  out.is_optimized_host = attribs.memoryType == hipMemoryTypeHost;
                          ~~~~~~~ ^
1 error generated.
make[2]: *** [src/runtime/CMakeFiles/rt-backend-hip.dir/build.make:132: src/runtime/CMakeFiles/rt-backend-hip.dir/hip/hip_allocator.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....

magma probably fails for a similar reason, and will be handled similarly.

[257/3332] Building CXX object CMakeFiles/magma.dir/src/zpotf2_vbatched.cpp.o
[258/3332] Building CXX object CMakeFiles/magma.dir/interface_hip/blas_z_v2.cpp.o
FAILED: CMakeFiles/magma.dir/interface_hip/blas_z_v2.cpp.o 
/nix/store/4ic7l20hx0508q6gks795hp65npllmik-clr-6.0.2/bin/hipcc -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -Dmagma_EXPORTS -I/build/magma-2.7.2/build/include -I/build/magma-2.7.2/include -I/build/magma-2.7.2/control -I/build/magma-2.7.2/magmablas_hip -I/build/magma-2.7.2/sparse_hip/include -I/build/magma-2.7.2/sparse_hip/control -I/build/magma-2.7.2/testing -std=c++11 -DADD_ -fopenmp=libomp -Wall -Wno-unused-function -O3 -DNDEBUG -fPIC -MD -MT CMakeFiles/magma.dir/interface_hip/blas_z_v2.cpp.o -MF CMakeFiles/magma.dir/interface_hip/blas_z_v2.cpp.o.d -o CMakeFiles/magma.dir/interface_hip/blas_z_v2.cpp.o -c /build/magma-2.7.2/interface_hip/blas_z_v2.cpp
/build/magma-2.7.2/interface_hip/blas_z_v2.cpp:1853:9: error: no matching function for call to 'hipblasZtrmm'
        hipblasZtrmm(
        ^~~~~~~~~~~~
/nix/store/kw5m73874x1bzkamdws0w41z70xpilrz-hipblas-6.0.2/include/hipblas/hipblas.h:17918:32: note: candidate function not viable: requires 14 arguments, but 12 were provided
HIPBLAS_EXPORT hipblasStatus_t hipblasZtrmm(hipblasHandle_t             handle,
                               ^
1 error generated when compiling for gfx906.
[259/3332] Building CXX object CMakeFiles/magma.dir/src/zpotrf_panel_vbatched.cpp.o
[260/3332] Building CXX object CMakeFiles/magma.dir/interface_hip/interface.cpp.o
FAILED: CMakeFiles/magma.dir/interface_hip/interface.cpp.o 
/nix/store/4ic7l20hx0508q6gks795hp65npllmik-clr-6.0.2/bin/hipcc -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -Dmagma_EXPORTS -I/build/magma-2.7.2/build/include -I/build/magma-2.7.2/include -I/build/magma-2.7.2/control -I/build/magma-2.7.2/magmablas_hip -I/build/magma-2.7.2/sparse_hip/include -I/build/magma-2.7.2/sparse_hip/control -I/build/magma-2.7.2/testing -std=c++11 -DADD_ -fopenmp=libomp -Wall -Wno-unused-function -O3 -DNDEBUG -fPIC -MD -MT CMakeFiles/magma.dir/interface_hip/interface.cpp.o -MF CMakeFiles/magma.dir/interface_hip/interface.cpp.o.d -o CMakeFiles/magma.dir/interface_hip/interface.cpp.o -c /build/magma-2.7.2/interface_hip/interface.cpp
/build/magma-2.7.2/interface_hip/interface.cpp:216:65: error: no member named 'gcnArch' in 'hipDeviceProp_tR0600'
                    g_magma_devices[dev].cuda_arch       = prop.gcnArch;
                                                           ~~~~ ^
/build/magma-2.7.2/interface_hip/interface.cpp:467:22: error: no member named 'gcnArch' in 'hipDeviceProp_tR0600'
                prop.gcnArch );
                ~~~~ ^
/build/magma-2.7.2/interface_hip/interface.cpp:531:30: error: no member named 'memoryType' in 'hipPointerAttribute_t'
                return (attr.memoryType == hipMemoryTypeDevice);
                        ~~~~ ^
3 errors generated when compiling for gfx906.
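The plan stated above, keeping opensyclWithRocm and magma on ROCm 5.7 while the default set moves to 6.0.2, could look roughly like the overlay below. This is a sketch under an assumption: it presumes the package function takes a `rocmPackages` argument, which should be checked against the actual package definition. The same pattern would apply to opensyclWithRocm if its function exposes a similar argument.

```nix
# Sketch of an overlay that pins a dependent package to the ROCm 5.7 set.
# Assumption: magma's package function accepts a `rocmPackages` argument.
final: prev: {
  magma = prev.magma.override {
    # rocmPackages_5 is the attribute this PR keeps for the 5.7 packages.
    rocmPackages = final.rocmPackages_5;
  };
}
```

Pinning at the call site keeps the API breakage seen in the logs above (`memoryType`, `gcnArch`, the `hipblasZtrmm` signature) out of the dependent package until its upstream supports ROCm 6.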

@mschwaig
Member Author

mschwaig commented Mar 3, 2024

The build for migraphx still fails with the following error:

[ 89%] Building CXX object src/targets/gpu/CMakeFiles/migraphx_gpu.dir/reverse.cpp.o
[ 89%] Building CXX object src/targets/gpu/CMakeFiles/migraphx_gpu.dir/rnn_variable_seq_lens.cpp.o
In file included from /build/source/src/targets/gpu/prefuse_ops.cpp:31:
/build/source/src/targets/gpu/include/migraphx/gpu/ck.hpp:33:10: fatal error: 'ck/host/device_gemm_multiple_d.hpp' file not found
#include "ck/host/device_gemm_multiple_d.hpp"
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
make[2]: *** [src/targets/gpu/CMakeFiles/migraphx_gpu.dir/build.make:524: src/targets/gpu/CMakeFiles/migraphx_gpu.dir/prefuse_ops.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:13502: src/targets/gpu/CMakeFiles/migraphx_gpu.dir/all] Error 2
make: *** [Makefile:156: all] Error 2

The only similar file I can find is https://github.com/ROCm/composable_kernel/blob/9ce18b045d6ffa6cfa29134229b422c07984ffb7/include/ck/tensor_operation/gpu/device/device_gemm_multiple_d.hpp, but that does seem like an intentional difference, since that file lives in tensor_operation/gpu/device and not host.

I am not sure how to resolve this situation.

@mschwaig
Member Author

mschwaig commented Mar 4, 2024

I would like to mark ROCm 5.7 as deprecated, but using lib.warn for that makes ofborg-eval unhappy:
This PR does not cleanly list package outputs after merging.

--- a/pkgs/top-level/all-packages.nix
+++ b/pkgs/top-level/all-packages.nix
@@ -7757,7 +7757,7 @@ with pkgs;
   rar2fs = callPackage ../tools/filesystems/rar2fs { };
 
   rocmPackages = rocmPackages_6;
-  rocmPackages_5 = recurseIntoAttrs (callPackage ../development/rocm-modules/5 { });
+  rocmPackages_5 = lib.warn "ROCm 5.7 is deprecated in nixpkgs, please upgrade to ROCm 6.+" recurseIntoAttrs (callPackage ../development/rocm-modules/5 { });
   rocmPackages_6 = recurseIntoAttrs (callPackage ../development/rocm-modules/6 { });
 
   rune = callPackage ../development/interpreters/rune { };

EDIT: fixed eval for now
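For context, here is a small standalone sketch of how `lib.warn` behaves. It takes a message and a value, and returns the value unchanged while printing the message when that value is evaluated. The attribute set and message below are hypothetical stand-ins, not the code from this PR.

```nix
# Sketch: attach a deprecation warning to a value with lib.warn.
let
  pkgs = import <nixpkgs> { };
  inherit (pkgs) lib;
  # Hypothetical stand-in for a real package set.
  oldSet = { inherit (pkgs) hello; };
in
{
  # The message is printed to stderr whenever `deprecatedSet` is forced.
  deprecatedSet = lib.warn "this set is deprecated (illustrative message)" oldSet;
}
```

Because the warning fires on evaluation, any tool that walks every attribute of the package set will trigger it too, which is one plausible reading of why the output listing was no longer clean.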

@mschwaig
Member Author

It looks like all the packages marked as broken are still maintained upstream. I did not look into fixing them yet.

I also looked at the sizes of all of the output derivations. They increased a bit, but none of the outputs grew significantly, and none seem to be close to the 3 GB size limit.

@mschwaig
Member Author

mschwaig commented Mar 17, 2024

Since I don't have any ideas about how to get 'migraphx' to build, I have opened an issue about the build error upstream: ROCm/AMDMIGraphX#2892

I am marking this as ready for review to get some feedback.
Maybe someone else has ideas; otherwise the solution for now may be to just mark that package as broken and get this merged.

EDIT: I found a PR which adds the file in question to composable_kernel, so let's see if that's enough to make migraphx build.

@mschwaig mschwaig marked this pull request as ready for review March 17, 2024 17:44
@mschwaig
Member Author

That PR (ROCm/composable_kernel#1134) is not in the ROCm release tagged for 6.0.2 yet, and simply adding the correct include files to the migraphx build is not enough to make it build.

I'll see what they say in the issue that I opened, but it looks like marking migraphx as broken until there are tagged releases that work might be sensible.
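Marking the package as broken, as suggested above, usually means setting `meta.broken = true` in the package definition. As a hedged illustration, the same effect can be sketched from an overlay without touching the package file:

```nix
# Sketch: mark migraphx as broken via an overlay.
# In the PR itself the flag would more likely be set in the package's own meta.
final: prev: {
  rocmPackages_6 = prev.rocmPackages_6 // {
    migraphx = prev.rocmPackages_6.migraphx.overrideAttrs (old: {
      # Keep the existing meta fields and only flip the broken flag.
      meta = (old.meta or { }) // { broken = true; };
    });
  };
}
```

A package marked this way is skipped by nixpkgs-review and refuses to evaluate unless broken packages are explicitly allowed, so the rest of the set can be merged and the flag removed once a working tagged release exists.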

@ulrikstrid
Member

ulrikstrid commented Mar 19, 2024

I'm running nixpkgs-review and it is currently stuck on rocblas: it seems to hang "forever" at "4 warnings generated when compiling for gfx942". I plan on letting it sit for some hours to see whether it is doing something that is not showing, but judging from CPU/RAM usage it doesn't look like it...

Edit: it seems like clang is doing something: 2855456 nixbld10 20 0 655788 586932 65536 R 100,0 0,2 2:13.88 clang-17

Other than that, 46/54 packages are built, so it looks promising.

@ulrikstrid
Member

Result of nixpkgs-review pr 287846 run on x86_64-linux

16 packages marked as broken and skipped:
  • rocmPackages.llvm.flang
  • rocmPackages.llvm.flang.doc
  • rocmPackages.llvm.flang.info
  • rocmPackages.llvm.flang.man
  • rocmPackages.llvm.libclc
  • rocmPackages.migraphx
  • rocmPackages.rdc
  • rocmPackages.rdc.doc
  • rocmPackages_6.llvm.flang
  • rocmPackages_6.llvm.flang.doc
  • rocmPackages_6.llvm.flang.info
  • rocmPackages_6.llvm.flang.man
  • rocmPackages_6.llvm.libclc
  • rocmPackages_6.migraphx
  • rocmPackages_6.rdc
  • rocmPackages_6.rdc.doc
90 packages built:
  • blender-hip
  • rocmPackages.clang-ocl (rocmPackages_6.clang-ocl)
  • rocmPackages.clr (rocmPackages_6.clr)
  • rocmPackages.clr.icd (rocmPackages_6.clr.icd)
  • rocmPackages.composable_kernel (rocmPackages_6.composable_kernel)
  • rocmPackages.half (rocmPackages_6.half)
  • rocmPackages.hip-common (rocmPackages_6.hip-common)
  • rocmPackages.hipblas (rocmPackages_6.hipblas)
  • rocmPackages.hipcc (rocmPackages_6.hipcc)
  • rocmPackages.hipcub (rocmPackages_6.hipcub)
  • rocmPackages.hipfft (rocmPackages_6.hipfft)
  • rocmPackages.hipfort (rocmPackages_6.hipfort)
  • rocmPackages.hipify (rocmPackages_6.hipify)
  • rocmPackages.hiprand (rocmPackages.rocrand ,rocmPackages_6.hiprand ,rocmPackages_6.rocrand)
  • rocmPackages.hipsolver (rocmPackages_6.hipsolver)
  • rocmPackages.hipsparse (rocmPackages_6.hipsparse)
  • rocmPackages.hsa-amd-aqlprofile-bin (rocmPackages_6.hsa-amd-aqlprofile-bin)
  • rocmPackages.llvm.bintools (rocmPackages_6.llvm.bintools)
  • rocmPackages.llvm.clang (rocmPackages_6.llvm.clang)
  • rocmPackages.llvm.clang-tools-extra (rocmPackages_6.llvm.clang-tools-extra)
  • rocmPackages.llvm.clang-tools-extra.doc (rocmPackages_6.llvm.clang-tools-extra.doc)
  • rocmPackages.llvm.clang-tools-extra.info (rocmPackages_6.llvm.clang-tools-extra.info)
  • rocmPackages.llvm.clang-tools-extra.man (rocmPackages_6.llvm.clang-tools-extra.man)
  • rocmPackages.llvm.clang-unwrapped (rocmPackages_6.llvm.clang-unwrapped)
  • rocmPackages.llvm.clang-unwrapped.doc (rocmPackages_6.llvm.clang-unwrapped.doc)
  • rocmPackages.llvm.clang-unwrapped.info (rocmPackages_6.llvm.clang-unwrapped.info)
  • rocmPackages.llvm.clang-unwrapped.man (rocmPackages_6.llvm.clang-unwrapped.man)
  • rocmPackages.llvm.compiler-rt (rocmPackages_6.llvm.compiler-rt)
  • rocmPackages.llvm.libc (rocmPackages_6.llvm.libc)
  • rocmPackages.llvm.libc.doc (rocmPackages_6.llvm.libc.doc)
  • rocmPackages.llvm.libcxx (rocmPackages_6.llvm.libcxx)
  • rocmPackages.llvm.libcxx.doc (rocmPackages_6.llvm.libcxx.doc)
  • rocmPackages.llvm.libcxxabi (rocmPackages_6.llvm.libcxxabi)
  • rocmPackages.llvm.libunwind (rocmPackages_6.llvm.libunwind)
  • rocmPackages.llvm.libunwind.doc (rocmPackages_6.llvm.libunwind.doc)
  • rocmPackages.llvm.lld (rocmPackages_6.llvm.lld)
  • rocmPackages.llvm.lld.doc (rocmPackages_6.llvm.lld.doc)
  • rocmPackages.llvm.lldb (rocmPackages_6.llvm.lldb)
  • rocmPackages.llvm.lldb.doc (rocmPackages_6.llvm.lldb.doc)
  • rocmPackages.llvm.lldb.info (rocmPackages_6.llvm.lldb.info)
  • rocmPackages.llvm.lldb.man (rocmPackages_6.llvm.lldb.man)
  • rocmPackages.llvm.llvm (rocmPackages_6.llvm.llvm)
  • rocmPackages.llvm.llvm.doc (rocmPackages_6.llvm.llvm.doc)
  • rocmPackages.llvm.llvm.info (rocmPackages_6.llvm.llvm.info)
  • rocmPackages.llvm.llvm.man (rocmPackages_6.llvm.llvm.man)
  • rocmPackages.llvm.mlir (rocmPackages_6.llvm.mlir)
  • rocmPackages.llvm.openmp (rocmPackages_6.llvm.openmp)
  • rocmPackages.llvm.openmp.doc (rocmPackages_6.llvm.openmp.doc)
  • rocmPackages.llvm.openmp.info (rocmPackages_6.llvm.openmp.info)
  • rocmPackages.llvm.openmp.man (rocmPackages_6.llvm.openmp.man)
  • rocmPackages.llvm.polly (rocmPackages_6.llvm.polly)
  • rocmPackages.llvm.polly.doc (rocmPackages_6.llvm.polly.doc)
  • rocmPackages.llvm.polly.info (rocmPackages_6.llvm.polly.info)
  • rocmPackages.llvm.polly.man (rocmPackages_6.llvm.polly.man)
  • rocmPackages.llvm.pstl (rocmPackages_6.llvm.pstl)
  • rocmPackages.llvm.rocmClangStdenv (rocmPackages_6.llvm.rocmClangStdenv)
  • rocmPackages.miopen (rocmPackages.miopen-hip ,rocmPackages_6.miopen ,rocmPackages_6.miopen-hip)
  • rocmPackages.rccl (rocmPackages_6.rccl)
  • rocmPackages.rocalution (rocmPackages_6.rocalution)
  • rocmPackages.rocblas (rocmPackages_6.rocblas)
  • rocmPackages.rocdbgapi (rocmPackages_6.rocdbgapi)
  • rocmPackages.rocdbgapi.doc (rocmPackages_6.rocdbgapi.doc)
  • rocmPackages.rocfft (rocmPackages_6.rocfft)
  • rocmPackages.rocgdb (rocmPackages_6.rocgdb)
  • rocmPackages.rocm-cmake (rocmPackages_6.rocm-cmake)
  • rocmPackages.rocm-comgr (rocmPackages_6.rocm-comgr)
  • rocmPackages.rocm-core (rocmPackages_6.rocm-core)
  • rocmPackages.rocm-device-libs (rocmPackages_6.rocm-device-libs)
  • rocmPackages.rocm-docs-core (rocmPackages_6.rocm-docs-core)
  • rocmPackages.rocm-docs-core.dist (rocmPackages_6.rocm-docs-core.dist)
  • rocmPackages.rocm-runtime (rocmPackages_6.rocm-runtime)
  • rocmPackages.rocm-smi (rocmPackages_6.rocm-smi)
  • rocmPackages.rocm-thunk (rocmPackages_6.rocm-thunk)
  • rocmPackages.rocminfo (rocmPackages_6.rocminfo)
  • rocmPackages.rocmlir (rocmPackages_6.rocmlir)
  • rocmPackages.rocmlir-rock (rocmPackages_6.rocmlir-rock)
  • rocmPackages.rocmlir.external (rocmPackages_6.rocmlir.external)
  • rocmPackages.rocprim (rocmPackages_6.rocprim)
  • rocmPackages.rocprofiler (rocmPackages_6.rocprofiler)
  • rocmPackages.rocr-debug-agent (rocmPackages_6.rocr-debug-agent)
  • rocmPackages.rocsolver (rocmPackages_6.rocsolver)
  • rocmPackages.rocsparse (rocmPackages_6.rocsparse)
  • rocmPackages.rocthrust (rocmPackages_6.rocthrust)
  • rocmPackages.roctracer (rocmPackages_6.roctracer)
  • rocmPackages.rocwmma (rocmPackages_6.rocwmma)
  • rocmPackages.rpp (rocmPackages.rpp-hip, rocmPackages_6.rpp, rocmPackages_6.rpp-hip)
  • rocmPackages.rpp-cpu (rocmPackages_6.rpp-cpu)
  • rocmPackages.rpp-opencl (rocmPackages_6.rpp-opencl)
  • rocmPackages.tensile (rocmPackages_6.tensile)
  • rocmPackages.tensile.dist (rocmPackages_6.tensile.dist)

Member

@ulrikstrid ulrikstrid left a comment

From what I can tell this looks good to go

Comment on lines +48 to +54
let stdenv' = stdenv; in
let stdenv =
  if stdenv'.cc.cc.isGNU or false && lib.versionAtLeast stdenv'.cc.cc.version "13.0"
  then gcc12Stdenv
  else stdenv';
in

Member

Should this still be here? Doesn't rocm-llvm 6 build with gcc 13?

Member Author

Good catch. I will verify that it builds with gcc 13 and remove this if it does.

Member

I have a good machine for building things so let me know if you want me to test something

Member Author

> I have a good machine for building things so let me know if you want me to test something

Thanks for the offer. I have 128 GB of RAM now as well, and that's enough for me to build ROCm now.

Member Author

It still does not build with GCC 13.
What fails to build without gcc12Stdenv is rStdenv and runtimes inside llvm/default.nix.

I was thinking about pushing a change to reduce the usages down to strictly those, but I have the feeling that it is better not to have more than one specific version of GCC involved there.

@@ -0,0 +1,58 @@
{ # stdenv FIXME: Try changing back to this with a new ROCm release https://github.com/NixOS/nixpkgs/issues/271943
gcc12Stdenv
Member

Also this.

@mschwaig
Member Author

I think that in terms of content this PR is ready to be merged now.
I just want to allow a bit more time for reviews and for people to give input on those remaining issues.

Member

@Flakebi Flakebi left a comment

Wow, this is great!

pkgs/development/rocm-modules/6/clr/default.nix Outdated
NixOS#286720 introduced these patches to address
a specific compilation error mentioned in ROCm/HIP#3403,
but added them to the source tree because they were originally for ROCm 6.

For ROCm 6, we can now switch to using fetchpatch to get the original commits as patches.
@mschwaig mschwaig requested a review from Flakebi March 22, 2024 00:39
@ulrikstrid
Member

I'm currently running nixpkgs-review, but from what I can tell the changes look good. So if everything builds I would be happy to merge this.

@mschwaig
Member Author

Technically some reviews happened that left comments and not approvals.
I think I addressed those comments reasonably, but I don't know if I should still wait for those reviewers to approve or leave further comments.

@ulrikstrid, if you say those reviewers left their comments and we don't have to go through the motions of waiting for their approval I'm fine with you merging this now. I would give them some more time, because I don't have that confidence, but I get that there is also a cost to dragging things out unnecessarily.


These are the results of my run (of just the commit, not merging the PR into master).

The build of rocprofiler for both 5.7 and 6.0 seems a bit flaky. With nixpkgs-review it fails quite often for me.

Result of nixpkgs-review run on x86_64-linux

15 packages marked as broken and skipped:
  • rocmPackages.llvm.flang
  • rocmPackages.llvm.flang.doc
  • rocmPackages.llvm.flang.info
  • rocmPackages.llvm.flang.man
  • rocmPackages.llvm.libclc
  • rocmPackages.migraphx
  • rocmPackages.rdc
  • rocmPackages.rdc.doc
  • rocmPackages_5.llvm.flang
  • rocmPackages_5.llvm.flang.doc
  • rocmPackages_5.llvm.flang.info
  • rocmPackages_5.llvm.flang.man
  • rocmPackages_5.llvm.libclc
  • rocmPackages_5.rdc
  • rocmPackages_5.rdc.doc
1 package failed to build:
  • rocmPackages_5.rocprofiler
182 packages built:
  • blender-hip
  • rocmPackages.clang-ocl
  • rocmPackages.clr
  • rocmPackages.clr.icd
  • rocmPackages.composable_kernel
  • rocmPackages.half
  • rocmPackages.hip-common
  • rocmPackages.hipblas
  • rocmPackages.hipcc
  • rocmPackages.hipcub
  • rocmPackages.hipfft
  • rocmPackages.hipfort
  • rocmPackages.hipify
  • rocmPackages.hiprand
  • rocmPackages.hipsolver
  • rocmPackages.hipsparse
  • rocmPackages.hsa-amd-aqlprofile-bin
  • rocmPackages.llvm.bintools
  • rocmPackages.llvm.clang
  • rocmPackages.llvm.clang-tools-extra
  • rocmPackages.llvm.clang-tools-extra.doc
  • rocmPackages.llvm.clang-tools-extra.info
  • rocmPackages.llvm.clang-tools-extra.man
  • rocmPackages.llvm.clang-unwrapped
  • rocmPackages.llvm.clang-unwrapped.doc
  • rocmPackages.llvm.clang-unwrapped.info
  • rocmPackages.llvm.clang-unwrapped.man
  • rocmPackages.llvm.compiler-rt
  • rocmPackages.llvm.libc
  • rocmPackages.llvm.libc.doc
  • rocmPackages.llvm.libcxx
  • rocmPackages.llvm.libcxx.doc
  • rocmPackages.llvm.libcxxabi
  • rocmPackages.llvm.libunwind
  • rocmPackages.llvm.libunwind.doc
  • rocmPackages.llvm.lld
  • rocmPackages.llvm.lld.doc
  • rocmPackages.llvm.lldb
  • rocmPackages.llvm.lldb.doc
  • rocmPackages.llvm.lldb.info
  • rocmPackages.llvm.lldb.man
  • rocmPackages.llvm.llvm
  • rocmPackages.llvm.llvm.doc
  • rocmPackages.llvm.llvm.info
  • rocmPackages.llvm.llvm.man
  • rocmPackages.llvm.mlir
  • rocmPackages.llvm.openmp
  • rocmPackages.llvm.openmp.doc
  • rocmPackages.llvm.openmp.info
  • rocmPackages.llvm.openmp.man
  • rocmPackages.llvm.polly
  • rocmPackages.llvm.polly.doc
  • rocmPackages.llvm.polly.info
  • rocmPackages.llvm.polly.man
  • rocmPackages.llvm.pstl
  • rocmPackages.llvm.rocmClangStdenv
  • rocmPackages.miopen
  • rocmPackages.rccl
  • rocmPackages.rocalution
  • rocmPackages.rocblas
  • rocmPackages.rocdbgapi
  • rocmPackages.rocdbgapi.doc
  • rocmPackages.rocfft
  • rocmPackages.rocgdb
  • rocmPackages.rocm-cmake
  • rocmPackages.rocm-comgr
  • rocmPackages.rocm-core
  • rocmPackages.rocm-device-libs
  • rocmPackages.rocm-docs-core
  • rocmPackages.rocm-docs-core.dist
  • rocmPackages.rocm-runtime
  • rocmPackages.rocm-smi
  • rocmPackages.rocm-thunk
  • rocmPackages.rocminfo
  • rocmPackages.rocmlir
  • rocmPackages.rocmlir-rock
  • rocmPackages.rocmlir.external
  • rocmPackages.rocprim
  • rocmPackages.rocprofiler
  • rocmPackages.rocr-debug-agent
  • rocmPackages.rocsolver
  • rocmPackages.rocsparse
  • rocmPackages.rocthrust
  • rocmPackages.roctracer
  • rocmPackages.rocwmma
  • rocmPackages.rpp (rocmPackages.rpp-hip)
  • rocmPackages.rpp-cpu
  • rocmPackages.rpp-opencl
  • rocmPackages.tensile
  • rocmPackages.tensile.dist
  • rocmPackages_5.clang-ocl
  • rocmPackages_5.clr
  • rocmPackages_5.clr.icd
  • rocmPackages_5.composable_kernel
  • rocmPackages_5.half
  • rocmPackages_5.hip-common
  • rocmPackages_5.hipblas
  • rocmPackages_5.hipcc
  • rocmPackages_5.hipcub
  • rocmPackages_5.hipfft
  • rocmPackages_5.hipfort
  • rocmPackages_5.hipify
  • rocmPackages_5.hiprand
  • rocmPackages_5.hipsolver
  • rocmPackages_5.hipsparse
  • rocmPackages_5.hsa-amd-aqlprofile-bin
  • rocmPackages_5.llvm.bintools
  • rocmPackages_5.llvm.clang
  • rocmPackages_5.llvm.clang-tools-extra
  • rocmPackages_5.llvm.clang-tools-extra.doc
  • rocmPackages_5.llvm.clang-tools-extra.info
  • rocmPackages_5.llvm.clang-tools-extra.man
  • rocmPackages_5.llvm.clang-unwrapped
  • rocmPackages_5.llvm.clang-unwrapped.doc
  • rocmPackages_5.llvm.clang-unwrapped.info
  • rocmPackages_5.llvm.clang-unwrapped.man
  • rocmPackages_5.llvm.compiler-rt
  • rocmPackages_5.llvm.libc
  • rocmPackages_5.llvm.libc.doc
  • rocmPackages_5.llvm.libcxx
  • rocmPackages_5.llvm.libcxx.doc
  • rocmPackages_5.llvm.libcxxabi
  • rocmPackages_5.llvm.libunwind
  • rocmPackages_5.llvm.libunwind.doc
  • rocmPackages_5.llvm.lld
  • rocmPackages_5.llvm.lld.doc
  • rocmPackages_5.llvm.lldb
  • rocmPackages_5.llvm.lldb.doc
  • rocmPackages_5.llvm.lldb.info
  • rocmPackages_5.llvm.lldb.man
  • rocmPackages_5.llvm.llvm
  • rocmPackages_5.llvm.llvm.doc
  • rocmPackages_5.llvm.llvm.info
  • rocmPackages_5.llvm.llvm.man
  • rocmPackages_5.llvm.mlir
  • rocmPackages_5.llvm.openmp
  • rocmPackages_5.llvm.openmp.doc
  • rocmPackages_5.llvm.openmp.info
  • rocmPackages_5.llvm.openmp.man
  • rocmPackages_5.llvm.polly
  • rocmPackages_5.llvm.polly.doc
  • rocmPackages_5.llvm.polly.info
  • rocmPackages_5.llvm.polly.man
  • rocmPackages_5.llvm.pstl
  • rocmPackages_5.llvm.rocmClangStdenv
  • rocmPackages_5.migraphx
  • rocmPackages_5.miopen (rocmPackages_5.miopen-hip)
  • rocmPackages_5.miopen-opencl
  • rocmPackages_5.miopengemm
  • rocmPackages_5.miopengemm.doc
  • rocmPackages_5.rccl
  • rocmPackages_5.rocalution
  • rocmPackages_5.rocblas
  • rocmPackages_5.rocdbgapi
  • rocmPackages_5.rocdbgapi.doc
  • rocmPackages_5.rocfft
  • rocmPackages_5.rocgdb
  • rocmPackages_5.rocm-cmake
  • rocmPackages_5.rocm-comgr
  • rocmPackages_5.rocm-core
  • rocmPackages_5.rocm-device-libs
  • rocmPackages_5.rocm-docs-core
  • rocmPackages_5.rocm-docs-core.dist
  • rocmPackages_5.rocm-runtime
  • rocmPackages_5.rocm-smi
  • rocmPackages_5.rocm-thunk
  • rocmPackages_5.rocminfo
  • rocmPackages_5.rocmlir
  • rocmPackages_5.rocmlir-rock
  • rocmPackages_5.rocmlir.external
  • rocmPackages_5.rocprim
  • rocmPackages_5.rocr-debug-agent
  • rocmPackages_5.rocsolver
  • rocmPackages_5.rocsparse
  • rocmPackages_5.rocthrust
  • rocmPackages_5.roctracer
  • rocmPackages_5.rocwmma
  • rocmPackages_5.rpp (rocmPackages_5.rpp-hip)
  • rocmPackages_5.rpp-cpu
  • rocmPackages_5.rpp-opencl
  • rocmPackages_5.tensile
  • rocmPackages_5.tensile.dist

Member

@Flakebi Flakebi left a comment

Didn’t build all of this, but ran the OpenCL and rocm-smi tests. LGTM from me.

eval $(nix-build -A rocmPackages.clr.icd.impureTests.rocm-smi)
eval $(nix-build -A rocmPackages.clr.icd.impureTests.opencl-example)

Member

@ulrikstrid ulrikstrid left a comment

I had 0 failures when building and I think it looks good to go.

I'll give it a bit more time if you want to @mschwaig before I merge. I guess it's just @Tungsten842 that has an open comment, right?

@mschwaig
Member Author

That's right. There are no other open review comments, besides @Tungsten842's.

Using my own judgement about that issue and the discussion around it, I think we are at the point where we should merge.

I think most remaining issues listed in the top comment are worse than still using GCC 12, and we are not waiting for those to be fixed either.

@ulrikstrid
Member

Let's ship it then!

@ulrikstrid ulrikstrid merged commit b10ff24 into NixOS:master Mar 22, 2024
30 of 31 checks passed
@Tungsten842
Member

Great job 🎉

@kurnevsky
Member

torchWithRocm still depends on miopen-opencl so it now fails with 'miopen-opencl' has been deprecated.

@kurnevsky
Member

Also rocblas times out on hydra: https://hydra.nixos.org/build/254017164

@mschwaig
Member Author

And composable_kernel exceeds the output limit: https://hydra.nixos.org/build/254015643

@mschwaig mschwaig deleted the rocm-6.0.2 branch March 24, 2024 11:53
@mschwaig
Member Author

> Also rocblas times out on hydra: https://hydra.nixos.org/build/254017164

I don't understand why rocblas times out, but other derivations that take much longer to build, like composable_kernel, do not.

rocsolver is the only ROCm package that I could find that has an explicit timeout set:
https://github.com/mschwaig/nixpkgs/blob/9a4f48bb251a2275293a85611a503b46bbdcf9cb/pkgs/development/rocm-modules/6/rocsolver/default.nix#L88-L98

How should we address this issue?

@mschwaig
Member Author

> And composable_kernel exceeds the output limit: https://hydra.nixos.org/build/254015643

Why is composable_kernel too large?

$ du --max-depth 1 -h /nix/store/sfybgqj64vhgp386fd3xw0fkjx0wvzwf-composable_kernel-6.0.2
3.7M	/nix/store/sfybgqj64vhgp386fd3xw0fkjx0wvzwf-composable_kernel-6.0.2/include
30K	/nix/store/sfybgqj64vhgp386fd3xw0fkjx0wvzwf-composable_kernel-6.0.2/share
681M	/nix/store/sfybgqj64vhgp386fd3xw0fkjx0wvzwf-composable_kernel-6.0.2/lib
684M	/nix/store/sfybgqj64vhgp386fd3xw0fkjx0wvzwf-composable_kernel-6.0.2

That's well below the 3 GB limit. I would be surprised if symlinked contents counted, but I also cannot find any symlinks:

$ find /nix/store/sfybgqj64vhgp386fd3xw0fkjx0wvzwf-composable_kernel-6.0.2 -type l -ls
$ 

@mschwaig
Member Author

The NAR size reported by nix-tree is several times larger (3.69 GiB instead of 684 MB):

┌────────────────────────────┬────────────────────────────┬────────────────────────────┐
│                            │composable_kernel-6 3.69 GiB│                            │
│                            │                            │                            │
└────────────────────────────┴────────────────────────────┴────────────────────────────┘
/nix/store/sfybgqj64vhgp386fd3xw0fkjx0wvzwf-composable_kernel-6.0.2                     
NAR Size: 3.69 GiB | Closure Size: 3.69 GiB | Added Size: 3.69 GiB                      
Immediate Parents: -                                                                    

@mschwaig
Member Author

mschwaig commented Mar 26, 2024

Turns out I should have used the --apparent-size flag with du to get the actual size of the uncompressed data on ZFS, so the composable_kernel output is (probably) really too large for the cache.
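
The difference between the two du modes can be reproduced without ZFS. The following is a minimal sketch, assuming GNU coreutils (for truncate and du --apparent-size). It uses a sparse file as a stand-in for a transparently compressed one, since in both cases fewer blocks are allocated than the length of the file suggests. The scratch directory and the 64 MiB size are arbitrary choices for illustration.

```shell
# Create a scratch directory for the experiment.
workdir=$(mktemp -d)

# A sparse file: 64 MiB long, but (almost) no blocks are allocated for it.
truncate -s 64M "$workdir/sparse.bin"

# Default du reports allocated disk space (in KiB because of -k).
disk_kib=$(du -k "$workdir/sparse.bin" | cut -f1)

# --apparent-size reports the length of the data itself,
# which is closer to what a NAR has to serialize.
apparent_kib=$(du -k --apparent-size "$workdir/sparse.bin" | cut -f1)

echo "allocated: ${disk_kib} KiB, apparent: ${apparent_kib} KiB"
rm -rf "$workdir"
```

On a filesystem with transparent compression the same gap should show up for ordinary files, which would explain the mismatch between the du output and the NAR size above.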

@mschwaig
Member Author

mschwaig commented Mar 26, 2024

Turns out the issue is one huge .a file, which can actually be compressed quite nicely.

zstd ck6/lib/libdevice_operations.a -o libdevice_operations.a
ck6/lib/libdevice_operations.a :  4.94%   (  3.69 GiB =>    186 MiB, libdevice_operations.a) 

And migraphx and miopen, the only downstream dependencies that I found, do not retain a runtime dependency on libdevice_operations.a and are also nowhere near that offensively large.

So I am actually tempted to compress that file in the output of the original build derivation, and then have another derivation based on runCommandLocal to decompress it.
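
The round trip behind that idea can be tried outside of Nix first. This is only a sketch, assuming the zstd CLI is installed. The file name libexample.a and its synthetic, highly repetitive contents are made up for illustration and say nothing about how well the real libdevice_operations.a compresses. In nixpkgs the compression step would run in the build derivation and the decompression step in the runCommandLocal one.

```shell
workdir=$(mktemp -d)

# Hypothetical stand-in for the large static library: 8 MiB of repetitive bytes.
head -c 8388608 /dev/zero | tr '\0' 'A' > "$workdir/libexample.a"

# Build side: produce the compressed archive that would be stored in the output.
zstd -q "$workdir/libexample.a" -o "$workdir/libexample.a.zst"

# Consumer side: restore the archive before anything links against it.
zstd -q -d "$workdir/libexample.a.zst" -o "$workdir/restored.a"

original_size=$(wc -c < "$workdir/libexample.a")
compressed_size=$(wc -c < "$workdir/libexample.a.zst")

# cmp -s exits 0 only if both files are byte-for-byte identical.
if cmp -s "$workdir/libexample.a" "$workdir/restored.a"; then
  roundtrip=identical
else
  roundtrip=different
fi

echo "original: ${original_size} bytes, compressed: ${compressed_size} bytes, round trip: ${roundtrip}"
rm -rf "$workdir"
```

Since zstd is lossless, the restored archive should be usable exactly like the original, and only the compressed file would have to fit under the cache limit.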

Is that the kind of evil sorcery that would be allowed in nixpkgs?

Development

Successfully merging this pull request may close these issues.

Update request: rocmPackages.*5.7.1→ 6.0.1
6 participants