-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Pixi for dependency management #4
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jayqi This is excellent. 👏 👏 An enormous thank you for digging into this and finding a much more efficient solution!!
Poking around, I agree and like the idea of switching to Pixi. It's faster and easy to work with, a really great suggestion.
Pixi is a full environment management tool. It supports Conda and PyPI packages but is not interoperable with Conda environments. That means we'll need to update our Dockerfile and entrypoint.sh script to use Pixi. I think this is probably fine: I expect pixi install -e gpu and pixi run -e gpu should work.
Before officially switching over, I'd like to poke around with making these updates to check if we run into any additional issues.
@klwetstone it'll also be good to throw in some PyPI packages just to make sure that works as expected. |
@klwetstone you should just take over this branch and push any commits you think make sense. |
Documenting steps updating docker container:
We may need to change the base image from |
@klwetstone Yeah, we don't need the Micromamba image because Pixi is entirely an alternative to Micromamba. Looks like there are official Pixi Docker images: https://github.com/prefix-dev/pixi-docker There's a |
(In case you didn't know, |
So weird -- I searched docker hub and that one didn't come up. I ended up making it worth using |
They're not on Docker Hub, they're on a GitHub Container Registry repository. |
@klwetstone I'd recommend using the images they already publish. https://github.com/prefix-dev/pixi-docker/pkgs/container/pixi Firstly, you won't need to muck with installing Pixi. Secondly, that lets you easily pin to a particular version of Pixi, instead of installing the latest version each time that you build. |
Agreed -- thank you! I didn't think to search Github container registry in addition to docker hub |
Agreed -- thank you! I didn't think to search Github container registry in addition to docker hub The thing to do is just here is to just google " |
@jayqi I ran into some interesting snags here -- I'm continuing to work on debugging, curious if you have any initial thoughts about the right directions to investigate. When I add in pypi dependencies, Some hypothesis I have for what might be going on here:
I added to
Error when I run on my mac:
(the above runs fine in the docker container) |
Okay, quick fix is that I think you're going to have to add osx-64 to the platforms. We're using Pixi in a way that I think isn't 100% intended, so this complaint feels like something that isn't necessary but is just how it's set up right now. |
@klwetstone alternative idea: does it make sense to set up a Docker entrypoint that runs the locking inside a container? That way, we can still run the locking without needing to add macOS or Windows to the platforms. (While this PyPI kludge is a thing.) One possible other benefit there is that someone can run the locking without needing to install Pixi. |
Okay this is a known limitation in Pixi right now: prefix-dev/pixi#1130 Basically, because of the way Python packages work (packages can run arbitrary Python code to set their package metadata), Pixi needs a Python interpreter to resolve the PyPI packages. Because your OS (osx-64) is not listed in the platforms, Pixi is unable to install a Python that fits the requirements. By my understanding, this is a technically correct thing. I think it's possible they come up with something smart that can work around this in the future, or they relax some assumptions (I think conda-lock and pip-compile make assumptions that platform and/or Python versions don't matter), but for now this is a limitation. The two approaches here would be:
|
Thank you this is extremely helpful!!
I don't think this will work with our GPU environment. When I add
This makes sense, because we can't install cudatoolkit on a mac. I also see an option 3, which is to stick with conda-lock for now and get it to run more efficiently by significantly simplify the requirements (a la this comment). I'm going to see if I can get option 3 working for now, mainly in the interest of time so I can test some different node sizes. I do think option 2 (using pixi) is the better long term solution, because conda-lock might just get extremely slow again once participants start adding packages. We could have another Dockerfile that just updates and checks the lock files, and run that dockerfile with |
@klwetstone FWIW, conda-lock being slow like what you experienced is not normal behavior, and should either be considered a bug or a pathological edge case that should be addressed rather than lived with. |
Thanks! For my understanding, by "pathological edge case" do you mean it's more than just unresolvable dependencies -- Ie. It may be related to my installation of conda, and not just something that can be solved by loosening dependencies? Either way, I think it's a good idea to test out running conda-lock on some super-basic yamls to check that it runs in a reasonable time. |
"related to my installation of conda" = bug "pathological edge case" = your specific set of dependencies and versions and the solver interact in a specific way where the algorithm works in an especially inefficient way. https://en.wikipedia.org/wiki/Worst-case_complexity |
@jayqi I think I have everything set up to use pixi instead of conda-lock, do you have time to review and make sure the new approach makes sense? Details are in the updated PR description |
* Fix Dockerfiles * Fix permissions and directories and stuff * Update test command * Fix command ordering * Use clean cache command * Add maximize build space action * Add more root reserve space * Remove unwanted software directly * Add some diagnostics * Print out pixi info * Fix typo --------- Co-authored-by: Jay Qi <[email protected]>
There's still something weird and horrible going on with the GPU image. I've spent too much time on this already, but just some loose ideas: Test is failing:
What is this? This error comes from Pixi, and I think In theory, if we have CUDA installed correctly, Pixi should "detect" this as an available virtual package. If you google about this, there have been bugs in the past but nothing obviously still open that is relevant. In general, our GPU stuff is kind of brittle.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
- I can re-run
make update-lockfile
and nothing changes - If I delete
pixi.lock
it regenerates without error (nicely, regenerating from scratch will result in a differentpixi.lock
, proving that running on top of an existingpixi.lock
doesn't change anything it doesn't need to) - It's fast!
One suggested change, on principle more than anything else. Otherwise, looks good to me!
@r-b-g-b ready for a final look! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Smallest nit! Looks really good!
Sets up dependency management with Pixi instead of conda-lock / conda, including both generating lock files and managing the environment. Also prevents logging outside of smoke tests (resolves issue 83)
Generating lockfiles
The lockfile is now generated within a docker container, because of the known Pixi limitation that we can't solve cross platform pypi dependencies. To generate
runtime/pixi.lock
, the commandmake update-lockfile
:Dockerfile-lock
. This new dockerfile just runs the command to generatepixi.lock
. It doesn't install any dependencies or run the submission.docker create
to create a dummy container from the image without running itpixi.lock
from the dummy container back to the hostHaving a separate Dockerfile allows us to update the lockfile more quickly. If we used the existing
Dockerfile
, the full submission would run every time we need to update the lockfile.make update-lockfiles
runs from scratch (with no existingpixi.lock
) in about 2 minutes. I also tested that it works with some pypi packages included.Outstanding
Check whether we still need test_lockfile.py -- try and install a package in both conda and pip, see if pixi lets us --> pixi resolves this for us, we don't need test_lockfile.py
test_lockfile.py
just checks whether there are both conda and pip versions of the same package.pixi list
shows one entry for each package that has been installed. That one entry is either installed from pypi or conda.) Scrap work in this commitMake sure
entrypoint.sh
runspython main.py
in the correct environmentpixi run
to runmain.py
. This means we also need to specify which pixi environment withCPU_OR_GPU
, so we setCPU_OR_GPU
as an environment variable in the dockerfileUpdate README (tracked in separate project issue)
Background about Pixi from JQ
Pixi is the new thing in the Conda ecosystem that has been gaining momentum for a while. It's made by the team that makes Mamba. They claim to be production-ready.
What benefits does Pixi have?
Some notes on implementation:
thea "base" feature that gets inherited bydefault
environmentcpu
andgpu
.The default environment on its own shouldn't be used though.pytorch-cuda
andcpuonly
metapackages to help you pin, so I'm using those. It's how they do it in their official docs.Running it yourself:
brew install pixi
on macOS.pixi ls
orpixi tree
which are basically no-op commands normally but will trigger it to check the lockfile for freshness.rm pixi.lock
first.Some helpful references:
Also, I saw the conda-lock maintainers are considering endorsing Pixi as a better default solution so that feels like more of a turnoff from using conda-lock.