
Improving pipenv lock performances #3417

Closed
fbertola opened this issue Dec 31, 2018 · 4 comments
Labels
Category: Performance Issue relates to performance

Comments

@fbertola

fbertola commented Dec 31, 2018

Hi!
I was recently bitten by the infamous slowness of the lock command. I was thinking that I could improve performance by pre-calculating the hashes and dependencies of all our company's projects and modifying the underlying implementation a bit.
Scaling this up, I think it would be overkill to do it a priori for the whole PyPI database, but maybe we could add a mechanism that uploads your local cache to a central location every now and then.
It wouldn't be that hard to develop and maintain (provided there is a place to store the data) and, with time, it would cover pretty much all the Python packages currently in use.
I could certainly help in this regard, as I'm doing something similar in my spare time.

What do you think?

@jxltom
Contributor

jxltom commented Jan 3, 2019

It would be nice for PyPI to do that hash calculation, and even the dependency resolution. :D

But it doesn't, so we can build our own. It would be really cool to offer this dependency resolution as a service, and I'm definitely +1 for it!

@stewartmiles

In addition to caching hashes from PyPI (pip already caches downloaded packages), it's probably worth caching the dependency sets of local editable package installs. We have a project with a load of editable packages installed in a venv, and it's nice to work in. However, pipenv lock takes forever, partly because it queries each editable package multiple times (which builds a wheel, etc.). On our pretty beefy workstations each iteration takes ~1 s, which quickly adds up with many editable packages and the many passes pipenv makes as it attempts to find a set of packages that work together.

@matteius matteius added the Category: Performance Issue relates to performance label Aug 22, 2022
@matteius
Member

I think there are two primary things we can do to improve performance overall:
1.) Batch up the installs in batch_install instead of writing a temp file per requirement and passing each one to a separate pip subprocess -- there is already a working prototype PR for this (the implication is that we drop the status bar): #5301
2.) Allow PyPI servers other than pypi.org to support the same style of JSON API that lets us quickly retrieve package hashes. I worked on a prototype of this as well; it also involved converting the test runner to the pypiserver project, and it is available for review: #5284

There may be other things, but I see those as the big two, and I have performance numbers on the #1 PR that are quite promising. Open to feedback, so let me know what you think.
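The JSON API mentioned in point 2 can be illustrated as follows: PyPI's `GET /pypi/<name>/<version>/json` endpoint returns a `urls` list with a `digests` entry per released file, so hashes can be collected with a single request instead of downloading and hashing every artifact. The helper names below are illustrative; only the endpoint shape and response fields come from PyPI's documented JSON API.

```python
import json
from urllib.request import urlopen


def extract_sha256_hashes(release_json: dict) -> list:
    """Pull the per-file sha256 digests out of a parsed PyPI JSON API
    response; the 'urls' entries cover every wheel/sdist of the release."""
    return [
        "sha256:" + f["digests"]["sha256"]
        for f in release_json.get("urls", [])
        if "sha256" in f.get("digests", {})
    ]


def fetch_release_hashes(name: str, version: str, index="https://pypi.org") -> list:
    """One GET per release -- far cheaper than downloading and hashing each
    artifact, which is the fallback for indexes lacking this API."""
    with urlopen(f"{index}/pypi/{name}/{version}/json") as resp:
        return extract_sha256_hashes(json.load(resp))
```

Supporting the same endpoint shape on alternative index servers (e.g. via the pypiserver project mentioned above) would let the fast path work beyond pypi.org.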

@matteius matteius closed this as completed Mar 4, 2023
@matteius
Member

matteius commented Mar 4, 2023
