-
-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move libtorch bits to PREFIX & enable cuDSS, cuSPARSELt, CUPTI on win; fix duplicate .pyc
versions
#328
Conversation
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR ( I do have some suggestions for making it better though... For recipe/meta.yaml:
This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/12939294964. Examine the logs at this URL for more detail. |
Would also be good to address #327 in this PR. |
(draft until all builds are enabled again) |
So the extra CUDA deps on windows seem to work fine 🥳 |
Okay, I'll try adding some logic for that here later today. |
Ok, I've decided to instead start by adding error handling to By the way, any chance you could give me the power to run CI myself? |
You already should have AFAIU, and I don't have the power to change it further. This should be a matter of Quansight/open-gpu-server#52 plus a sync of the setup to https://github.com/conda-forge/.cirun (which happened), so I'm really not sure where this goes wrong. Pinging @jaimergp & @aktech One conceivable angle is that the lack of authorization for the windows agents prevents you from starting any job at all (though IIRC you already weren't able to start things before we merged windows support). For that, you could make a PR like conda-forge/.cirun#27 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to add
if %ERRORLEVEL% neq 0 exit 1
to calls that can fail, rather than adding || goto error
everywhere. There's no functional difference AFAIK, and although both styles have their pros & cons, the goto-variant is quite rare in conda-forge (a quick search shows 170 feedstocks using goto, while ~1.6 feedstocks use an errorlevel variant).
@mgorny You need to add your name here: https://github.com/conda-forge/.cirun/blob/218150c90899c7dfeb771d0bd9b176ca191848f3/.access.yml#L89 like this PR: conda-forge/.cirun#27 and this is subject to @wolfv's approval I believe as they (prefix.dev) are the ones sponsoring the windows machine. and you need to be in any of these roles for this repo to be able to trigger workflows that involves GPU CI runs. cc @jaimergp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're going to need existence tests for the DLLs to be under %LIBRARY_BIN%
.
The tests are already there, and just need to be modified, e.g.
pytorch-cpu-feedstock/recipe/meta.yaml
Line 199 in 39b84c3
- if not exist %SP_DIR%\torch\lib\{{ each_lib }}.dll exit 1 # [win] |
(similar below)
Thanks for the quick response @aktech!
Ah, perhaps it's just a question of making @mgorny a maintainer here? That's a hypothesis we can test easily (given your involvement here, I'm assuming you have nothing against this @mgorny) 😉 |
Well, perhaps I did have the power after all: 4ff92b3 ;-) We'll find out on your next push here! |
Ok, I'll replace |
Sorry, I was talking about the linux builds. For the windows agents, we'll definitely still need to wait for conda-forge/.cirun#32 |
You're probably aware, but just for completeness (and because batch is an insane language): |
Ah, okay, sorry. Either case, I really need to take a break now, so maybe it'll be done when I get back. |
Don't you want to enable the remaining Windows builds? I wanted to add removal of |
Of course, but I wanted to solve all the current issues first, and also ask @hmaarrfk and @isuruf for review, if they have time. CC also @danpetry if interested.
How high is your confidence that this will work first try? I think this PR is already doing a lot, and I'd like to get it out so we can start building dependent packages and test out whether things work as intended. I wouldn't be surprised if follow-ups turn out to be necessary anyway, so I'd prefer to put |
.pyc
versions
Nevermind, I mind found an issue with the windows CUDA builds that needs fixing anyway, so pushing a rerendered build, but will cancel everything but the windows+CUDA job. And it'll also be good to see if CI on osx passes with the couple relevant changes here. |
…nda-forge-pinning 2025.01.23.20.59.37
This reverts commit fc5693c.
the win+cuda build passed fine, and so did the osx builds. I've launched aarch+CUDA as a final smoketest, but I expect this PR to be ready. Please let me know if there are comments @conda-forge/pytorch-cpu, otherwise I'm looking to merge this soon. |
I'd still do the |
Can you raise a PR on your fork against this branch? If you enable actions on your repo, this should actually run on the open-gpu server (at least it does by default on my fork). That way we can keep the CI runs separate and don't have to roll back if it goes wrong. |
Done that in mgorny#4. It seems that it created the 16 jobs, but I don't see the CIrun gating job, so I don't think they'll start. But I have another idea: I'll just open another PR to this repo, adding the extra commits, and you'll merge whichever works. |
Sounds good to me! |
Any comments here @conda-forge/pytorch-cpu? CI in #331 (which adds one more change on top of this PR) is looking good and I'd like to merge this soon. |
Alrighty! Thanks a lot for the reviews! Post-merge comments are still very welcome, we'll have yet another round with #318 coming right up anyway where we can take this up. My goal would be to wrap up that PR in a few days as well and then get 2.6 out the door. 🙃 |
Ah dammit, the workaround for conda-forge/conda-forge-pinning-feedstock#6910, causes windows to fail here 😑
(logs, originally for commit 1c645fa, now force-pushed away) |
Checklist
0
(if the version changed)conda-smithy
(Use the phrase@conda-forge-admin, please rerender
in a comment in this PR for automated rerendering)Fixes #319
Fixes #320
Fixes #321
Fixes #327
Fixes #329