Memory leak when training multiple models sequentially #8463

tchaton · 2021-07-19T09:04:21Z

🐛 Bug

Please reproduce using the BoringModel

There seems to be a memory leak when training multiple models sequentially, possibly caused by optimizer states remaining on the GPU.

Check out: https://discuss.pytorch.org/t/how-to-avoid-memory-leak-when-training-multiple-models-sequentially/100315/5
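A minimal sketch of how one might check for this, assuming plain PyTorch rather than the BoringModel reproduction requested by the template. The helper names `train_once` and `allocated` are hypothetical and exist only for illustration. It trains a few throwaway models in sequence, drops the model and optimizer after each run, and reports how much GPU memory is still allocated above the starting baseline. On a machine without CUDA it falls back to CPU and only checks that the model objects were freed.

```python
import gc
import weakref

import torch


def train_once(device, steps=3):
    """Train a tiny throwaway model and return a weak reference to it."""
    model = torch.nn.Linear(32, 2).to(device)
    # Adam keeps per-parameter state tensors on the same device as the parameters.
    optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = model(torch.randn(8, 32, device=device)).sum()
        loss.backward()
        optimizer.step()
    model_ref = weakref.ref(model)
    # Drop the strong references explicitly; the optimizer state goes with the optimizer.
    del model, optimizer, loss
    return model_ref


def allocated(device):
    """Bytes currently held by tensors on `device` (reported as 0 on CPU)."""
    return torch.cuda.memory_allocated(device) if device.type == "cuda" else 0


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
baseline = allocated(device)
refs = []
for run in range(3):
    refs.append(train_once(device))
    gc.collect()  # collect any reference cycles left behind by the run
    if device.type == "cuda":
        torch.cuda.empty_cache()  # release cached, unused blocks held by the allocator
    print(f"run {run}: {allocated(device) - baseline} bytes allocated above baseline")

# Any weak reference that still resolves points at a model something is keeping alive.
leaked = [r for r in refs if r() is not None]
print("models still alive:", len(leaked))
```

If the allocated figure keeps growing from run to run, or a weak reference still resolves after `gc.collect()`, something is holding on to the model or its optimizer state between runs.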

To Reproduce

Use the following BoringModel and post here

Expected behavior

Environment

Note: Bugs with code are solved faster! Colab Notebook should be made public!

IDE: Please, use our python bug_report_model.py template.
Colab Notebook: Please copy and paste the output from our environment collection script (or fill out the checklist below manually).

You can get the script and run it with:

wget https://raw.githubusercontent.com/PyTorchLightning/pytorch-lightning/master/tests/collect_env_details.py
# For security purposes, please check the contents of collect_env_details.py before running it.
python collect_env_details.py

PyTorch Lightning Version (e.g., 1.3.0):
PyTorch Version (e.g., 1.8)
Python version:
OS (e.g., Linux):
CUDA/cuDNN version:
GPU models and configuration:
How you installed PyTorch (conda, pip, source):
If compiling from source, the output of torch.__config__.show():
Any other relevant information:

Additional context


awaelchli · 2021-07-19T15:07:31Z

possibly the same as here: #8430

tchaton added the "bug" (Something isn't working) and "help wanted" (Open to be worked on) labels Jul 19, 2021

edenlightning added this to the v1.3.x milestone Jul 19, 2021

edenlightning assigned tchaton Jul 19, 2021

tchaton mentioned this issue Jul 20, 2021

[bugfix] Reduce memory leaks #8490

Merged

12 tasks

tchaton closed this as completed in #8490 Jul 21, 2021
