Build failure: rocmPackages_5.rocblas #302412
Comments
As a work-around:
rocmPackages_5 = super.rocmPackages_5 // {
  rocblas = (super.rocmPackages_5.rocblas.override {
    # Work-around for https://github.com/ROCm/Tensile/issues/1757
    # https://www.reddit.com/r/ROCm/comments/1bd8vde/psa_rdna1_gfx1010gfx101_gpus_should_start_working/
    # for ROCm 5.2+ till 6.1 released
    tensileLazyLib = false;
    tensileSepArch = false; # https://github.com/ROCm/rocBLAS/issues/1339#issuecomment-1682846493
    gpuTargets = ["gfx1010"];
  }).overrideDerivation (oldAttrs: {
    # work-around for https://github.com/NixOS/nixpkgs/issues/302412
    postPatch = ''
      ${oldAttrs.postPatch}
      rm -v /build/source/build/Tensile/library/Kernels.so-000-gfx1010.hsaco
    '';
  });
};
But it took ~5h for me to rebuild all downstream dependencies because I used this in an overlay. So maybe it is also worth reconsidering the default values until ROCm 6.1 is released, as mentioned in https://www.reddit.com/r/ROCm/comments/1bd8vde/psa_rdna1_gfx1010gfx101_gpus_should_start_working/
Or maybe there is a way to split the package into build-time and run-time dependencies. |
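For readers who want to try the work-around, here is a minimal sketch of where such a fragment would sit in a NixOS configuration. It assumes a `configuration.nix`-style module and only sets the two Tensile flags discussed above; the `gpuTargets` and `postPatch` parts of the work-around are left out for brevity. As noted in the comment, expect a long local rebuild of everything that depends on rocblas.

```nix
# configuration.nix (sketch only; adapt to your own setup)
{ config, pkgs, ... }:
{
  nixpkgs.overlays = [
    (self: super: {
      # Shallow-merge a customised rocblas into the ROCm 5 package set.
      rocmPackages_5 = super.rocmPackages_5 // {
        rocblas = super.rocmPackages_5.rocblas.override {
          tensileLazyLib = false;
          tensileSepArch = false;
        };
      };
    })
  ];
}
```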
We have an open PR which I think should address this for ROCm 6.0: Could you also use ROCm 6.0 if we had that fix? EDIT: Thanks for opening this issue and sharing your workaround. |
Yes! That PR looks like exactly what I need. It fixes the issue with the options not being usable by dropping those GPU buckets for Tensile, and it also switches lazy loading off by default.
I could, once it is in a stable release. Right now I'm on NixOS 23.11, which still has only 5.x. But in 1-2 months we are going to have 24.05, as I understand. P.S. Thank you for maintaining these packages. |
No luck after the upgrade to 24.05 😢. Without
(and it is ROCm 6.0.2 judging by symlinks like
And that's despite the fact that currently
# https://github.com/ROCm/Tensile/issues/1757
# Allows gfx101* users to use rocBLAS normally.
# Turn the below two values to `true` after the fix has been cherry-picked
# into a release. Just backporting that single fix is not enough because it
# depends on some previous commits.
, tensileSepArch ? false
, tensileLazyLib ? false
I also double-checked that those are
rocmPackages_6 = prev.rocmPackages_6 // {
  rocblas = (prev.rocmPackages_6.rocblas.override {
    # Work-around for https://github.com/ROCm/Tensile/issues/1757
    # https://www.reddit.com/r/ROCm/comments/1bd8vde/psa_rdna1_gfx1010gfx101_gpus_should_start_working/
    # for ROCm 5.2+ till 6.1 released
    tensileLazyLib = false;
    tensileSepArch = false; # https://github.com/ROCm/rocBLAS/issues/1339#issuecomment-1682846493
    # gpuTargets = ["gfx1010"];
  });
};
It looks as if those flags are not effective with ROCm 6. P.S. I think that |
rocblas is in the cache (https://hydra.nixos.org/search?query=rocblas), but Ollama was doing an override, so it resulted in a cache miss. Does it work if you use nixpkgs after this commit: fbb5b1b? I can confirm that on my system I was getting a cache miss, but it is now able to use rocblas from Hydra after syncing nixpkgs past that point.
Edit: Another note is that overriding rocblas the way you are doesn't entirely work. It will override a direct reference, but I find that other rocmPackages (e.g. rocsolver and hipblas) still refer to a non-overridden version of rocblas. I'm still trying to figure out the proper way to do that override. |
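The behaviour described in the edit above (a `//` override changes only the direct reference) can be reproduced with a toy model. This is a sketch, not the real nixpkgs code: `mkSet`, `fix`, and the string "packages" are hypothetical stand-ins, and it assumes only that the package set is built so that its members refer to each other through a shared `self`.

```nix
let
  # Toy package set: hipblas refers to rocblas through `self`.
  mkSet = self: {
    rocblas = "rocblas-default";
    hipblas = "hipblas(${self.rocblas})";
  };

  # Tie the knot so that `self` is the finished set.
  fix = f: let x = f x; in x;

  base = fix mkSet;

  # Shallow override: replaces the attribute, but hipblas was already
  # built against the rocblas that `base` saw.
  shallow = base // { rocblas = "rocblas-custom"; };

  # Override applied inside the fixed point: siblings see the new rocblas.
  deep = fix (self: mkSet self // { rocblas = "rocblas-custom"; });
in
{
  shallowRocblas = shallow.rocblas; # "rocblas-custom"
  shallowHipblas = shallow.hipblas; # "hipblas(rocblas-default)"
  deepHipblas = deep.hipblas;       # "hipblas(rocblas-custom)"
}
```

Saved to a file, this can be evaluated with `nix-instantiate --eval --strict`. Whether the real `rocmPackages` set exposes a hook for the second style of override is not established in this thread.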
Nice find! I tried to pick it from
Would be nice to backport it to
P.S. The following override, which replicates that change, works too:
ollama = pkgs.ollama.override {
  rocmPackages = pkgs.rocmPackages // {
    # Ignore overrides for rocblas
    rocblas = pkgs.rocmPackages.rocblas // { override = (attrs: pkgs.rocmPackages.rocblas); };
  };
}; |
As of 2024-09-01 it has been backported to 24.05. Closing, as the original issue no longer exists with ROCm 6. |
Steps To Reproduce
Steps to reproduce the behavior:
Set tensileLazyLib and tensileSepArch both to false to follow the work-around for RDNA 1, like in Debian
/build/source/build/Tensile/library/Kernels.so-000-gfx1010.hsaco
Build log
Additional context
This is likely caused by symlinking to the Nix store in nixpkgs/pkgs/development/rocm-modules/5/rocblas/default.nix (lines 152 to 154 in 3feca97).
This might also make overriding gpuTargets a bit less effective, as it is re-overridden in intermediate packages like rocblas-tensile-gfx90 regardless of what the original rocblas package has.
Notify maintainers
From teams.rocm.members:
Metadata
Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.
Add a 👍 reaction to issues you find important.