
train_one_epoch does not seem to perform backpropagation to update the gradients; which line updates them? #11

Open
Yang-bug-star opened this issue Mar 2, 2024 · 4 comments

Comments

@Yang-bug-star

No description provided.

@shansongliu
Owner

shansongliu commented Mar 2, 2024

It's in the loss_scaler call at line 69 of engine_train.py, which points to misc.py in the util/ folder. You will see the backward call in the NativeScalerWithGradNormCount class at line 255 of misc.py.
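For readers looking for the exact mechanism: `loss_scaler` is an AMP loss scaler, and a minimal sketch of how such a `NativeScalerWithGradNormCount` performs the backward pass and optimizer step is shown below. The class name and signature follow the MAE-style `util/misc.py` convention this repo appears to reuse; treat the details as an assumption, not this repo's exact code.

```python
import torch

class NativeScalerWithGradNormCount:
    """AMP loss scaler: runs backward() on the scaled loss, then steps
    the optimizer. Sketch after the MAE-style util/misc.py."""

    def __init__(self):
        self._scaler = torch.cuda.amp.GradScaler()

    def __call__(self, loss, optimizer, clip_grad=None, parameters=None,
                 create_graph=False, update_grad=True):
        # This is the backward call the question asks about.
        self._scaler.scale(loss).backward(create_graph=create_graph)
        norm = None
        if update_grad:
            if clip_grad is not None:
                # Gradients must be unscaled before clipping.
                self._scaler.unscale_(optimizer)
                norm = torch.nn.utils.clip_grad_norm_(parameters, clip_grad)
            self._scaler.step(optimizer)  # optimizer update
            self._scaler.update()         # adjust the loss scale for next step
        return norm
```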

@Yang-bug-star
Author

Thank you very much.

@Yang-bug-star
Author

Then, I also want to ask why the number of training epochs in the code differs from the paper: the paper uses 5, 5, and 2 epochs for the three stages, while the training script uses 5, 2, and 2. In addition, I noticed that min_lr=1e-2 is greater than lr=1e-5, which causes the learning rate to increase with the number of iterations in adjust_lr_rate. Is this normal? The paper sets lr to 1e-4; what is the appropriate learning rate?

@crypto-code
Collaborator

> Then, I also want to ask why the number of training epochs in the code differs from the paper: the paper uses 5, 5, and 2 epochs for the three stages, while the training script uses 5, 2, and 2. In addition, I noticed that min_lr=1e-2 is greater than lr=1e-5, which causes the learning rate to increase with the number of iterations in adjust_lr_rate. Is this normal? The paper sets lr to 1e-4; what is the appropriate learning rate?

The uploaded code comes from one of our experiments testing the model with different configurations. The configuration that gave us the best results was lr=1e-4 with 5, 5, and 2 epochs per stage. We will set the hyperparameters to the paper's configuration. Thank you for bringing this to our attention.
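For context on the learning-rate behavior discussed above: MAE-style training code typically adjusts the rate with a half-cycle cosine schedule after linear warmup. A minimal sketch is below; the function name and `args` fields follow the MAE `util/lr_sched.py` convention and are assumptions about this repo's `adjust_lr_rate`. With `min_lr > lr`, the cosine term interpolates from `lr` up toward `min_lr`, which is exactly why the learning rate increases over iterations.

```python
import math

def adjust_learning_rate(optimizer, epoch, args):
    """Linear warmup followed by half-cycle cosine decay (MAE-style sketch)."""
    if epoch < args.warmup_epochs:
        lr = args.lr * epoch / args.warmup_epochs
    else:
        # Interpolates between args.lr and args.min_lr. If min_lr > lr,
        # this curve rises instead of decaying, as observed above.
        progress = (epoch - args.warmup_epochs) / (args.epochs - args.warmup_epochs)
        lr = args.min_lr + (args.lr - args.min_lr) * 0.5 * (1.0 + math.cos(math.pi * progress))
    for param_group in optimizer.param_groups:
        # Per-group scaling, e.g. for layer-wise lr decay.
        param_group["lr"] = lr * param_group.get("lr_scale", 1.0)
    return lr
```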
