I noticed in the supplementary material that the number of steps is 50,000, but in `main.py`, `steps_per_epoch=500`. Is this a mistake? Additionally, the `batch_size` and gradient accumulation settings also differ from those used in the paper.
@LiJiaBei-7 Thank you so much for pointing this out. Please set the hyper-parameters as described in the paper; I modified the default values while testing the maximum batch size on 80GB GPUs. I will push an updated set of defaults along with the fix for your other issue about the poor training results.
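For anyone cross-checking the defaults in `main.py` against the paper, the quantities discussed here relate through two simple products: total optimizer steps = `steps_per_epoch` × epochs, and effective batch size = `batch_size` × gradient-accumulation steps. A minimal sketch, where only `steps_per_epoch=500` and the 50,000-step total come from this thread and every other number is hypothetical:

```python
def effective_schedule(steps_per_epoch: int, epochs: int,
                       batch_size: int, grad_accum: int):
    """Return (total optimizer steps, effective batch size per step)."""
    total_steps = steps_per_epoch * epochs
    effective_batch = batch_size * grad_accum
    return total_steps, effective_batch

# With steps_per_epoch=500, reaching the paper's 50,000 steps would
# require 100 epochs; batch_size=32 and grad_accum=4 are placeholders,
# not the paper's actual settings.
steps, batch = effective_schedule(steps_per_epoch=500, epochs=100,
                                  batch_size=32, grad_accum=4)
print(steps, batch)  # 50000 128
```

This is just a consistency check; the actual values to use are those listed in the paper's supplementary material.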