Tqdm progress bar error #721

Closed
ChristofHenkel opened this issue Jan 21, 2020 · 7 comments

Labels: bug (Something isn't working), duplicate (This issue or pull request already exists), help wanted (Open to be worked on)
Milestone: 0.6.1

@ChristofHenkel

When running an epoch with both a train and a val dataloader, the progress bar starts printing a new line for every iteration as soon as validation begins. I see this bug in PyCharm as well as in Kaggle kernels. Below is a typical example: the first 80% of the epoch runs smoothly, but as soon as validation starts, each tqdm update is written on a new line.

Selected optimization level O1: Insert automatic casts around Pytorch functions and Tensor methods.
Defaults for this optimization level are:
enabled : True
opt_level : O1
cast_model_type : None
patch_torch_functions : True
keep_batchnorm_fp32 : None
master_weights : None
loss_scale : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled : True
opt_level : O1
cast_model_type : None
patch_torch_functions : True
keep_batchnorm_fp32 : None
master_weights : None
loss_scale : dynamic
Epoch 1: 80%|████████ | 1216/1520 [09:01<02:08, 2.36batch/s, batch_nb=1215, gpu=0, loss=0.649, train_loss=0.616, v_nb=0]
Validating: 0%| | 0/304 [00:00<?, ?batch/s]
Epoch 1: 80%|████████ | 1217/1520 [09:01<01:44, 2.90batch/s, batch_nb=1215, gpu=0, loss=0.649, train_loss=0.616, v_nb=0]
Epoch 1: 80%|████████ | 1218/1520 [09:02<01:26, 3.48batch/s, batch_nb=1215, gpu=0, loss=0.649, train_loss=0.616, v_nb=0]
Epoch 1: 80%|████████ | 1219/1520 [09:02<01:14, 4.05batch/s, batch_nb=1215, gpu=0, loss=0.649, train_loss=0.616, v_nb=0]
Epoch 1: 80%|████████ | 1220/1520 [09:02<01:05, 4.58batch/s, batch_nb=1215, gpu=0, loss=0.649, train_loss=0.616, v_nb=0]
Epoch 1: 80%|████████ | 1221/1520 [09:02<00:59, 5.04batch/s, batch_nb=1215, gpu=0, loss=0.649, train_loss=0.616, v_nb=0]
Epoch 1: 80%|████████ | 1222/1520 [09:02<00:54, 5.42batch/s, batch_nb=1215, gpu=0, loss=0.649, train_loss=0.616, v_nb=0]
Epoch 1: 80%|████████ | 1223/1520 [09:02<00:51, 5.72batch/s, batch_nb=1215, gpu=0, loss=0.649, train_loss=0.616, v_nb=0]

Environment

PyTorch version: 1.2.0
Is debug build: No
CUDA used to build PyTorch: 10.0.130

OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CMake version: Could not collect

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.1.243
GPU models and configuration:
GPU 0: GeForce GTX 1080 Ti
GPU 1: GeForce GTX 1080 Ti
GPU 2: GeForce GTX 1080 Ti

Nvidia driver version: 418.87.00
cuDNN version: Could not collect

Versions of relevant libraries:
[pip] numpy==1.16.4
[pip] pytorch-lightning==0.5.3.2
[pip] pytorchcv==0.0.50
[pip] torch==1.2.0
[pip] torchaudio==0.3.0
[pip] torched==0.11
[pip] torchfile==0.1.0
[pip] torchvision==0.4.0
[conda] pytorch-lightning 0.5.3.2 pypi_0 pypi
[conda] pytorchcv 0.0.50 pypi_0 pypi
[conda] torch 1.2.0 pypi_0 pypi
[conda] torchaudio 0.3.0 pypi_0 pypi
[conda] torched 0.11 pypi_0 pypi
[conda] torchfile 0.1.0 pypi_0 pypi
[conda] torchvision 0.4.0 pypi_0 pypi

Additional context

ChristofHenkel added the bug (Something isn't working) label on Jan 21, 2020
@bzz commented Jan 21, 2020

Indeed, this is quite annoying in an otherwise amazing CLI user experience!
For me it only happens on Colab, though; everything works as expected locally :/

I'm not very familiar with the code base, but it seems that the tqdm progress bars for test/val and train are created with slightly different sets of parameters.

A quick search lands on a similar question on SO that suggests initializing tqdm with position=0 and leave=True.

I don't exactly understand how that is supposed to fix the issue, but since according to the tqdm docs leave is already set by default, I suspect it has something to do with the initial position value.
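The suggested workaround would look roughly like the sketch below. position and leave are standard tqdm arguments; whether and how Lightning's internal validation bar can be given them is an assumption here, so this only illustrates the call shape from the SO answer.

```python
from tqdm import tqdm

# Pin the bar to the first terminal row and keep it after completion,
# as suggested in the linked SO answer.
for batch in tqdm(range(304), desc="Validating", position=0, leave=True):
    pass  # validation step would go here
```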

@neggert (Contributor) commented Jan 21, 2020

I think this is a tqdm issue, since I've seen it across a variety of code that uses tqdm. I've mostly seen it when my terminal isn't wide enough to fit the progress bar plus all of the printed quantities.
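If terminal width really is the trigger, capping the bar width is one possible mitigation. ncols and dynamic_ncols are standard tqdm arguments; whether Lightning exposes them for its own bar is an assumption, so this is only a sketch of the idea.

```python
from tqdm import tqdm

# Hard-cap the bar to 80 columns so it never wraps onto a second line...
for _ in tqdm(range(1000), ncols=80):
    pass

# ...or let tqdm re-measure the terminal width on each refresh.
for _ in tqdm(range(1000), dynamic_ncols=True):
    pass
```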

@Borda (Member) commented Jan 22, 2020

I am afraid there is not much we can do about tqdm here. I have seen the bar being printed on a new line in other projects as well, and it typically happens when another process moves the stderr cursor or prints anything else to stdout.
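A minimal sketch of the interaction described above: a plain print() while a bar is active moves the cursor and pushes the bar onto a new line, whereas tqdm.write() is the documented way to emit text without breaking it. The logging interval below is made up for illustration.

```python
from tqdm import tqdm

for i in tqdm(range(100)):
    if i % 25 == 0:
        tqdm.write(f"checkpoint at step {i}")   # keeps the bar on one line
        # print(f"checkpoint at step {i}")      # this would split the bar
```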

@sneiman (Contributor) commented Jan 24, 2020

I am not sure this is tqdm, as I don't use it (I use my own progress bar). The circumstances are slightly different: things work fine and then suddenly the progress bar, whether train, val, or test, starts a new line on each call. My current theory is that this is an Ubuntu terminal problem, but I have yet to prove it.

btw, I'm happy to donate my progress bar code. It simply draws the moving bar and takes a lead-in and a lead-out string to print before and after it. Unicode terminals only, and it has only been tested on Ubuntu.
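A hypothetical sketch of the kind of dependency-free bar described above: a single carriage-return-updated line with lead-in and lead-out strings. The function and argument names are illustrative only and are not taken from the actual donated code.

```python
import sys

def progress(step, total, leadin="", leadout="", width=40):
    """Redraw a Unicode progress bar in place on the current terminal line."""
    filled = int(width * step / total)
    bar = "█" * filled + " " * (width - filled)
    sys.stdout.write(f"\r{leadin} |{bar}| {step}/{total} {leadout}")
    sys.stdout.flush()
    if step == total:
        sys.stdout.write("\n")  # move to a fresh line once the loop finishes

# Usage: call progress(i + 1, n, leadin="val", leadout="loss=0.65") inside a loop.
```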

@Borda (Member) commented Jan 25, 2020

btw, probably similar to #330

@bzz commented Feb 1, 2020

#765 seems also relevant.

Borda added the help wanted (Open to be worked on) and need fix labels on Feb 3, 2020
Borda added this to the 0.6.1 milestone on Feb 3, 2020
@Borda (Member) commented Mar 2, 2020

I will close this in favour of #765, so please let's continue the discussion there... 🤖

Borda closed this as completed on Mar 2, 2020
Borda added the duplicate (This issue or pull request already exists) label on Mar 3, 2020