Training stuck at one certain epoch #51

Open
derkbreeze opened this issue Apr 21, 2023 · 0 comments


Hi Meitar,

I was training on the MNIST dataset using pretrained features, with the following command:

python DeepDPM.py --dataset MNIST --dir './pretrained_embeddings/umap_embedded_datasets/MNIST' --gpus 0

but every time, training gets stuck at epoch 44 and does not continue. Log:

Epoch 0: 100%|███████████| 547/547 [00:00<00:00, 661.71it/s, loss=nan, v_num=]
Initializing clusters params using Kmeans...
Epoch 44: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 547/547 [00:19<00:00, 27.60it/s, loss=0, v_num=]

Also, why does the loss become nan in the first epoch? I would appreciate any suggestions!
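
One way to narrow down the nan loss is to record the per-step loss values and locate the first one that is not finite. The sketch below is a generic illustration only: `first_non_finite` is a hypothetical helper, not part of DeepDPM, and the example values are made up rather than taken from the run above.

```python
import math
from typing import Iterable, Optional, Tuple


def first_non_finite(losses: Iterable[float]) -> Optional[Tuple[int, float]]:
    """Return (step, value) for the first NaN/Inf loss, or None if all are finite.

    Hypothetical helper for illustration; not part of DeepDPM.
    """
    for step, value in enumerate(losses):
        if not math.isfinite(value):
            return step, value
    return None


# Made-up loss values for illustration only, not from a real training run.
example_losses = [0.9, 0.7, float("nan"), 0.5]
result = first_non_finite(example_losses)
```

Knowing the first offending step would show whether the nan appears before or after the cluster parameters are initialized.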
