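Hello, and sorry to bother you. My questions are below.
Training command: CUDA_VISIBLE_DEVICES=0,1,2,3 python train_dist.py --dataset minc --model deepten_resnet50_minc --batch-size 2 --lr 0.004 --epochs 80 --lr-step 60 --lr-scheduler step --weight-decay 5e-4
Question 1: The output of print(model) does not match the model defined in forward, and it does not match the architecture described in the paper.
print(model) shows fully connected layers that do not appear in the forward definition in deepten.py. See the attachments:
print(model).txt
deepten.txt
The architecture in the paper does not appear to contain this fully connected layer.
Question 2: During training, the gradients of the fc layer are not updated.
The gradients I printed during training are attached:
gra.txt
In short, I am not sure whether the fully connected layer (2048, 1000) should be part of the model. If it should be, how do I fix the gradient update problem? If it should not be, where should I remove it?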
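One possible explanation, which is an assumption on my side and not confirmed by the maintainers: the fc layer belongs to the pretrained backbone, so it is registered as a submodule and shows up in print(model), but forward never calls it, so it never receives a gradient. The sketch below reproduces that situation with a toy model. ToyBackbone, ToyNet, and params_without_grad are hypothetical names for illustration only and are not taken from deepten.py.

```python
import torch
import torch.nn as nn


class ToyBackbone(nn.Module):
    """Hypothetical stand-in for a pretrained backbone that owns an fc layer."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        # Registered as a submodule, so print(model) lists it.
        self.fc = nn.Linear(8, 1000)


class ToyNet(nn.Module):
    """Hypothetical model that reuses only part of the backbone."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.pretrained = ToyBackbone()
        self.head = nn.Linear(8, num_classes)

    def forward(self, x):
        x = self.pretrained.conv(x)  # pretrained.fc is never called here
        x = x.mean(dim=(2, 3))       # global average pooling
        return self.head(x)


def params_without_grad(model):
    """Return names of parameters that received no gradient in backward."""
    return [name for name, p in model.named_parameters() if p.grad is None]


model = ToyNet()
loss = model(torch.randn(2, 3, 16, 16)).sum()
loss.backward()

unused = params_without_grad(model)
print(unused)  # ['pretrained.fc.weight', 'pretrained.fc.bias']

# If the layer is confirmed to be unused, one way to drop it is to
# replace it with a parameter-free module.
model.pretrained.fc = nn.Identity()
remaining = [name for name, _ in model.named_parameters()]
print(remaining)
```

If the same check on the real model lists only the backbone's fc parameters, the missing gradients would be expected behavior and not a training bug, and the layer could be left in place or replaced as in the last lines of the sketch.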