proposal: pex3 cache introspection/gc command #2201
Comments
I would love any comments from anyone at all about how they currently manage the pex cache (or don't)!
A user request for cache clearing from pip: pypa/pip#12176.
I think starting things off with 0 magic would be great. Simply supporting
I am requesting a
@zmanji that seems to make sense, although there is a detail to iron out: what does cache add for an sdist mean? Naively this builds and installs the sdist in All that said, I don't think this will help your offline goals. Pip does more work than you think it does even if you hand it all pinned deps and Pex doesn't currently try to ameliorate that. I just recently added
The
@zmanji I think That said, I'm adding a new resolver type for #1907 in #2512 that allows fully offline PEX creation using
Discussed in https://github.com/pantsbuild/pex/discussions/2200
Originally posted by cosmicexplorer August 1, 2023
forked from a response to #2175 (comment):
Cache GC Policies
Generalizing this a bit, I recall that pantsd used to have a flag for how often it garbage collects the rust store--if there are concerns about the bloat of pex cache directories, are there any opportunities for pex itself to help the user automate the cache management outside of just rm -rf ~/.pex? What is currently the easiest way to implement e.g. LRU eviction? I guess I can do something like this?
The above probably works, but I'm wondering if the dilemma about cache bloat that you describe is partially because the user isn't given enough tools to mediate it? Or am I misinterpreting you?
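For concreteness, here is a minimal sketch of what LRU eviction over a cache directory could look like. It is not taken from Pex and assumes nothing about the layout under ~/.pex: the function name evict_lru and the max_bytes budget are made up for illustration, and it trusts file access times, which many filesystems update lazily (relatime) or not at all (noatime).

```python
import os
from pathlib import Path
from typing import List


def evict_lru(cache_root: Path, max_bytes: int) -> List[Path]:
    """Delete least-recently-accessed files until the tree fits in max_bytes.

    Hypothetical helper: returns the removed paths, oldest access first.
    """
    entries = []
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            path = Path(dirpath) / name
            st = path.lstat()  # lstat so a dangling symlink does not raise
            entries.append((st.st_atime, st.st_size, path))

    total = sum(size for _atime, size, _path in entries)
    removed = []
    # Walk entries from oldest access time to newest.
    for _atime, size, path in sorted(entries, key=lambda e: e[0]):
        if total <= max_bytes:
            break
        path.unlink()
        total -= size
        removed.append(path)
    return removed
```

Something like evict_lru(Path.home() / ".pex", 5 * 1024**3) would then trim the directory to roughly 5 GiB. Deleting individual files out from under a tool that expects whole cache entries may well be unsafe, which is part of why support inside pex itself seems preferable.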
Insight: evict cache entries based on usage frequency
In particular, one GC heuristic that pex (or pip) itself would be in the best place to record is not just how recently each cache entry was accessed, but how often. Something like this could be fun:
Does that sound like a fruitful thing to investigate further? Or are there better ways to address the disk usage pressure?
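To make the frequency idea concrete, here is a rough sketch of a usage ledger that a tool could update whenever it reads a cache entry. Everything in it is an assumption for illustration: the UsageLedger class, the JSON file format, and the "fewest hits first, then oldest use" ordering are not something pex or pip implement.

```python
import json
import time
from pathlib import Path
from typing import Dict, List, Optional


class UsageLedger:
    """Hypothetical record of how often and how recently cache entries are used."""

    def __init__(self, ledger_path: Path) -> None:
        self.ledger_path = ledger_path
        self.stats: Dict[str, Dict[str, float]] = {}
        if ledger_path.exists():
            self.stats = json.loads(ledger_path.read_text())

    def record_access(self, key: str, now: Optional[float] = None) -> None:
        # Bump the hit count and remember when the entry was last touched.
        entry = self.stats.setdefault(key, {"hits": 0, "last_used": 0.0})
        entry["hits"] += 1
        entry["last_used"] = time.time() if now is None else now

    def save(self) -> None:
        self.ledger_path.write_text(json.dumps(self.stats))

    def eviction_order(self) -> List[str]:
        # Least frequently used first; ties go to the least recently used.
        return sorted(
            self.stats,
            key=lambda k: (self.stats[k]["hits"], self.stats[k]["last_used"]),
        )
```

A GC pass would then walk eviction_order() and delete entries until the cache is under budget. A real implementation would also need locking for concurrent pex processes, which this sketch ignores.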
Prior Art
Examples of this from other tools:
pip example
One useful bit of prior art is the new pip cache subcommand within pip (it's on the main branch, not sure which version it first appeared in):
spack comparison
I know spack users also have the same issue, but it's less pressing because:
spack uninstall 'emacs~tree-sitter') or "anything compiled by a version of clang less than or equal to X.Y.Z and any transitive dependees" (that looks like spack uninstall --all '%clang@:X.Y.Z') by deferring to the clingo ASP logic solver (e.g. https://github.com/spack/spack/blob/936c6045fc0686e683c6b3da20967d2e30a7ec87/lib/spack/spack/solver/concretize.lp#L7).
So spack users generally have the ability to very finely tune the tool's disk usage to suit their own immediate needs, and pruning or even seeding a cache e.g. for export to an internal environment is considered a top-level feature. While pex (and especially pex3) also make the creation of python environments a top-level feature, we currently aren't able to apply the same selection logic to prune our cache directories.
Insight: select cache entries to evict using our existing platform/interpreter selection logic
Along those lines, to expand on the proposed pex3 cache command, we could introduce platform selection logic:
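As a rough illustration of selecting entries by platform or interpreter, the sketch below filters wheel filenames by their compatibility tags, using the standard name-version[-build]-python-abi-platform.whl layout. It assumes the cache holds wheels identifiable by filename, which may not match how pex actually lays out its cache; parse_wheel_name and select_for_eviction are hypothetical helpers, not pex APIs, and it does not reuse pex's real platform selection code.

```python
from typing import Iterable, List, NamedTuple, Optional


class WheelTags(NamedTuple):
    project: str
    version: str
    python: str
    abi: str
    platform: str


def parse_wheel_name(filename: str) -> WheelTags:
    """Split a wheel filename into project, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # name-version[-build]-python-abi-platform
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    python, abi, platform = parts[-3:]
    return WheelTags(parts[0], parts[1], python, abi, platform)


def select_for_eviction(
    filenames: Iterable[str],
    python: Optional[str] = None,
    platform_prefix: Optional[str] = None,
) -> List[str]:
    """Return the wheels matching every filter that was given."""
    selected = []
    for filename in filenames:
        tags = parse_wheel_name(filename)
        # Compressed tag sets such as "py2.py3" are split on ".".
        if python is not None and python not in tags.python.split("."):
            continue
        if platform_prefix is not None and not any(
            p.startswith(platform_prefix) for p in tags.platform.split(".")
        ):
            continue
        selected.append(filename)
    return selected
```

For example, select_for_eviction(names, python="cp39") would pick out everything built for CPython 3.9, which a cache command could then offer to delete.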