-
-
Notifications
You must be signed in to change notification settings - Fork 14.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
adaptivecpp: init at 24.06.0 #360893
base: master
Are you sure you want to change the base?
adaptivecpp: init at 24.06.0 #360893
Conversation
Interesting that you apparently had no issues with the rocm build, while I had quite some trouble just to try make it compile. I do see that the clhpp package in nixpkgs was updated a few weeks ago, and in general, some time has passed, so that might have helped. As for the open questions that you posed, I'd say updating to rocm6 is preferable, if it's possible as well as creating an alias in |
False alarm on ROCm 6 building, I forgot However, it SHOULD be possible. Building the arch-linux package in distrobox works just fine. Their build script doesn't do anything special. I think we might just need a slightly newer ROCm for some fixes? They use ROCm 6.2.4 and LLVM 18.1. We're on ROCm 6.0.2 |
As it turns out, the issue was caused by Nix' default hardening options (specifically zerocallusedregs)
|
b4658da
to
f10b721
Compare
Adds adaptivecpp, based on opensycl package. Updated to newest version and to use LLVM 17 and ROCm 6.
f10b721
to
1f7d58e
Compare
I've added the test suite, but I'm not sure it actually runs on GPU. It compiles and runs without complaints but it only seems to use CPU (nothing shows up in nvtop while btop spikes). Looking at their CI, they seem to specify specific Target Architecture. It's not the most elegant, but I've passed it through like this, though it doesn't actually seem to make any difference in what runs: nix-build -A adaptivecpp.tests --arg config '{ rocmSupport = true; }' --arg targetsBuild '"omp;hip:gfx1030"' --arg targetsRun '"omp;hip"' Could you kindly take a look at this @yboettcher? |
I might be wrong, but given that nix builds are usually rather "sealed off" I would not expect a nix-build call to be able to use (or even detect) a gpu. I cloned your branch, built the adativecppWithRocm package (with nix-build) and then used that to try and build the adaptivecpp tests manually (in a clone of the adaptivecpp repo).
to the postFixup in addition to whats already there.
messages. Setting the mentioned variables however, only added some initial output about a few AMD_COMGR_ACTION_... calls that all ended with AMD_COMGR_STATUS_SUCCESS, so I guess that's not helpful. And I think this is also how far I've gotten in trying to make this work: Explicit compilation for a specific target works (when adding that extra That said, I think I'd rather have a dysfunctional generic target with an all around updated compiler where you can at least specify a specific target, than no updated compiler (which also does not even have a generic target). Although I would prefer if we could somehow make this work, but I honestly don't know why it fails. It might even be that the way rocm is installed on nixos is the problem. Or not. |
I second this, having a up to date version would be much appreciated even if the generic backend does not yet work. |
Hi - this discussion has made it into the AdaptiveCpp discord ;) Without having seen the JIT logs it's hard to say why If ROCm thinks that it has generated the device binary successfully, then it is possible that there is an internal ROCm issue. IIRC there were also some ROCm versions where the retrieval of the ROCm log was prevented by a bug (in case of an error, ROCm would return before filling the error log). This might play a role here. I don't want get involved in your internal prioritization, but perhaps it is helpful for your decision process to outline how we see these compilation flows from the perspective of the AdaptiveCpp project.
|
Looking at the output more closely, I'm not convinced the executables should be wrapped. I've built it without any wrapping, and I got different results for the tests though (both with and without wrapping): |
I just tested the generic backend on Nvidia hardware, it seems to be working correctly. |
@illuhad, thank you very much for your insights! Given that we get the
Given that I added some more details (including the dumped IR) here. @blenderfreaky I noticed that when I run a binary that was compiled with Log excerpt when executing
|
@yboettcher thanks for the details. I assumed that since
there was an issue retrieving the logs. Anyway, from you
for each of the relevant ROCm bitcode libraries like From you IR output, we can see that it tries to call The way the AdaptiveCpp amdgpu backend discovers these bitcode libraries is at the moment a bit hacky (for various reasons): It tries to invoke
You can also try |
I am guessing here, but given that it works on omp and nvidia, I would assume that the problem is not with this adaptivecpp derivation, but maybe somewhere in how rocm is working on nixos. But that's just a guess. In any case, unless we can find some "simple fix" by just supplying adaptivecpp with some parameter or including another buildInput or something like that, I would say that fixing this issue is out of scope of this PR, and I would say we can merge. Cpu and NVidia users gain the |
It turns out I linked to the wrong path in the nix store. I now also get the Looking into the code @illuhad linked, acpp should try to run
Passing I tried re-adding the wrapper, however that didn't help at all. @yboettcher, how exactly did you pass the arguments for it to see ROCm? Revisting the One thing I noticed is that Nonetheless, peppering in some debug statements, it doesn't seem like it actually ever receives those arguments from the json files, so maybe this is a dead-end path anyways? Even though
If you want to try the patch-stuff I did, I've put it in a separate branch for now to keep this one somewhat merge-able. See blenderfreaky:adaptivecpp-amd |
Yep, I think too that one of them is likely incorrect.
The fallback paths are not well tested. I think the main assumption is that we require a working ROCm installation, and a working ROCm installation will have a functioning Is it possible to fix |
I don't think Afaict, all that's needed is to somehow pass |
What is the workflow for users? Can they just invoke As I said, the fallback path is not well-tested. It would be more predictable to get it to work with The fallback path seems to assume a standard ROCm directory layout where the bitcode libraries live in |
Ah ok, makes sense. I compiled and ran the example in arch via distrobox and compared the comgr commands, there are some interesting differences. On Arch:
On Nix:
It seems the two are using different methods for linking, no idea why though. |
@blenderfreaky Are we looking at the same ROCm versions? IIRC there was a change where they changed how builtin bitcode libraries were included in the compilation. But it was back around ROCm 5.7. I'm also a bit surprised to see EDIT: Notice the two different comgr actions:
This indicates that the difference might be related to the mentioned ROCm change. |
@illuhad We're using ROCm 6.0.2 (Nix seems a little out of date here), the arch one is on 6.2.1 & 6.2.4 for some packages. Both are definitely >= ROCm 6 though. I interpreted the difference in Action as the Arch one doing Linking and stuff in one step, whereas on the Nix one splits it out into several, linking in a later step. @yboettcher Can you make any sense of the ofBorg failures? It looks like it's just the test failing, however:
We could just comment them out to get this merged sooner though I think, seeing as they don't run as they should anyways and it's working for Nvidia already. |
I actually have the same failures ofborg has on my end. It appears that for the normal
but when building the normal Edit:
|
Good catch, added python to the build deps. Now ofBorg fails on darwin, but works on linux. Just to finally get something for NVIDIA and OMP merged, I've just simply marked the tests package as broken on darwin for now. |
I've been playing around with #367695, and I've managed to successfully run adaptivecpp on my AMD GPU (with the generic target afaict). With the newer ROCm 6.3.1 in place (just merging the two PRs branches), all that's needed is a merged ROCm derivation with the wrapped hipcc, and telling CMake about the ROCm path. Along the way, I've also updated to the 24.10 version. @yboettcher Originally, the package used @tdavidcl, could you verify the updated version still plays nice with NVIDIA? If NVIDIA works, I'd maintain that this can be merged as is. Once the ROCm update lands, this package should get working ROCm automatically. Maybe mark it as |
Awesome that you got it to work 😁 When you say As for darwin, this is where the darwin call was initially added: https://github.com/NixOS/nixpkgs/pull/236274/files#r1264687523. I also had issues with ofborg builds on darwin and apparently, that was what was necessary to make it work. Maybe using a newer sdk would help the darwin build? I don't know whether there is any other way to quickly check this without waiting for ofborg (aside from using apple hardware, which I don't have). It does seem though that nixpkgs-vet does not like it when a package in Would it be possible to use the version of the rocm package in the broken check? As in As for the tests, I do like automated tests, but since the generic workflow depends on the available hardware and nixpkgs will only be able to test omp, it would not be able to detect any rocm/cuda specific failures, which does feel suboptimal. One could remove the targetsBuild input, or at least make it default to null, since — if I remember correctly — not passing ACPP_TARGETS should make it use the generic workflow (which you talk about in one of the comments), which in turn would only be able to discover the cpu and then run the omp backend. Then the test would at least show that the compiler and generic/omp workflow are working in a simple situation. |
If the generic JIT compiler is enabled in the AdaptiveCpp build, then it will indeed default to that for an empty As you say
Note that device discovery is independent from |
The reason I moved it was because nix-vet complained. By telling CMake I meant the two lines
which seem to be the variables it looks up. Marking it as I'm gonna refactor the tests as |
e6635f0
to
8213174
Compare
If I'm understanding right, ofBorg is down for good now that 2024 has ended. The Eval GitHub Actions pass, though I don't think they're building tests. |
88445bc
to
41f5d9c
Compare
41f5d9c
to
f32a72f
Compare
OpenSYCL
has been renamed toAdaptiveCpp
, "due to external legal pressure" (see their repo)The package is a 1:1 of
pkgs/development/compilers/opensycl/default.nix
, with the repository and name updated, as well as the version bumped to the newest.Open questions:
opensycl
package and add an alias or warning?and does seem towork now, but I haven't done sufficient testing yetTagging maintainers: @yboettcher
Things done
Built successfully on x86_64 with AMD (ROCm) GPU.
nix.conf
? (See Nix manual)sandbox = relaxed
sandbox = true
nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"
. Note: all changes have to be committed, also see nixpkgs-review usage./result/bin/
)Add a 👍 reaction to pull requests you find important.