
Large RAM spike when caching large wheels #9678

Closed
1 task done
astrojuanlu opened this issue Mar 2, 2021 · 3 comments
Labels
type: bug A confirmed bug or unintended behavior

Comments

@astrojuanlu
Contributor

pip version

21.0.1 (also tested 19.2.3 and 20.0.2)

Python version

3.8.2

OS

Ubuntu

Additional information

No response

Description

A large spike in RAM consumption happens right after downloading a big wheel, before actual installation begins.

(This comes from https://discuss.python.org/t/large-ram-spike-when-caching-large-wheels/7465?u=astrojuanlu)

After going over the code again, I think that the reason is outlined in the description of #6879:

Now, we securely create a temporary file next to the target file, write to it, and rename it to the target path (removing the target path first if it exists).
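For context, a minimal sketch of that write-then-rename pattern, assuming the whole body is passed in as one in-memory bytes object (illustrative only, not pip's or cachecontrol's actual code; the function name is made up):

# Minimal sketch of the write-then-rename pattern quoted above (illustrative,
# not pip's actual implementation).
import os
import tempfile

def atomic_cache_write(target_path, body):
    # Create the temporary file next to the target so the final rename stays
    # on the same filesystem and is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(target_path), suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(body)  # 'body' here is the complete wheel held in RAM
        os.replace(tmp_path, target_path)  # replaces the target if it exists
    except BaseException:
        os.unlink(tmp_path)
        raise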

Indeed, I suspended the process at the cache-writing stage and found these two file descriptors open (5 and 6 in the listing below):

root@b9bf3ee06983:/# ls -l /proc/15/fd
total 0
lrwx------ 1 root root 64 Mar  2 22:32 0 -> /dev/pts/0
lrwx------ 1 root root 64 Mar  2 22:32 1 -> /dev/pts/0
lrwx------ 1 root root 64 Mar  2 22:32 2 -> /dev/pts/0
lrwx------ 1 root root 64 Mar  2 22:32 3 -> 'socket:[1665829]'
lrwx------ 1 root root 64 Mar  2 22:32 4 -> 'socket:[1668023]'
l-wx------ 1 root root 64 Mar  2 22:32 5 -> /tmp/pip-unpack-usy34lcd/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl
lrwx------ 1 root root 64 Mar  2 22:32 6 -> /root/.cache/pip/http/6/d/0/1/5/6d0158b0405ef103e321da6fbc081896dfd6cf2711e52d0323754ad8ummvuv8n.tmp
root@b9bf3ee06983:/# du -h /tmp/pip-unpack-usy34lcd/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl
741M    /tmp/pip-unpack-usy34lcd/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl
root@b9bf3ee06983:/# du -h /root/.cache/pip/http/6/d/0/1/5/6d0158b0405ef103e321da6fbc081896dfd6cf2711e52d0323754ad8ummvuv8n.tmp
741M    /root/.cache/pip/http/6/d/0/1/5/6d0158b0405ef103e321da6fbc081896dfd6cf2711e52d0323754ad8ummvuv8n.tmp

741 MB × 3 ≈ 2.2 GB, which roughly matches the observed VmPeak:

root@b9bf3ee06983:/# grep VmPeak /proc/15/status
VmPeak:  2323784 kB

Finally, passing --no-cache-dir removes the issue entirely.

Expected behavior

Memory stays contained during installation.

How to Reproduce

$ docker run -it --rm --name pip-wheel python:3.8 bash
# pip install torch -vv

In another terminal:

$ docker exec -it pip-wheel bash
# grep VmPeak "/proc/$(ps -C pip -o pid= | xargs)/status"  # Before the caching stage, it is more or less contained
# grep VmPeak "/proc/$(ps -C pip -o pid= | xargs)/status"  # After the caching stage, it is much larger
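
Optionally, a small polling script makes it easier to watch the spike develop over time (a sketch; it assumes the same container and relies on procps's ps, like the grep commands above):

# poll_vmpeak.py -- print pip's VmPeak once per second (sketch)
import subprocess
import time

def pip_pid():
    # Same lookup as the grep commands above: ps -C pip -o pid=
    out = subprocess.run(["ps", "-C", "pip", "-o", "pid="],
                         capture_output=True, text=True).stdout.strip()
    return out.split()[0] if out else None

while True:
    pid = pip_pid()
    if pid:
        with open(f"/proc/{pid}/status") as status:
            for line in status:
                if line.startswith("VmPeak"):
                    print(time.strftime("%H:%M:%S"), line.strip())
    time.sleep(1)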

Output

No response

Code of Conduct

  • I agree to follow the PSF Code of Conduct
@astrojuanlu astrojuanlu added S: needs triage Issues/PRs that need to be triaged type: bug A confirmed bug or unintended behavior labels Mar 2, 2021
@RonnyPfannschmidt
Contributor

This is technically a bug in the HTTP caching library pip uses: it reads the complete content into RAM before dumping it to the cache files.
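
In other words, the difference looks roughly like this (a sketch with hypothetical helpers, not cachecontrol's actual API):

# Illustration of buffered vs. streamed cache writes (hypothetical helpers,
# not cachecontrol's real API).
import shutil

def cache_buffered(response, cache_path):
    body = response.read()            # the entire ~741 MB wheel held in RAM at once
    with open(cache_path, "wb") as f:
        f.write(body)

def cache_streamed(response, cache_path, chunk_size=64 * 1024):
    with open(cache_path, "wb") as f:
        # copy in fixed-size chunks so memory use stays bounded
        shutil.copyfileobj(response, f, chunk_size)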

@astrojuanlu
Contributor Author

Do you mean psf/cachecontrol#145, which in turn took me to #2984?

@pradyunsg
Member

Aha! So, this is a long-standing issue.

Thanks for filing this issue and locating that older one @astrojuanlu! Closing as a duplicate of #2984. :)

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 30, 2021
@pradyunsg pradyunsg removed the S: needs triage Issues/PRs that need to be triaged label Mar 17, 2023