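Question 1: print(model) output does not match the model defined in forward or the architecture in the paper. Question 2: the fc layer's gradients are not updated during training #427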

githubsuperfans · 2024-05-09T07:56:54Z

Hello, and sorry to bother you. My questions are below.

Training command: CUDA_VISIBLE_DEVICES=0,1,2,3 python train_dist.py --dataset minc --model deepten_resnet50_minc --batch-size 2 --lr 0.004 --epochs 80 --lr-step 60 --lr-scheduler step --weight-decay 5e-4

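Question 1: the print(model) output does not match the model defined in forward, nor the architecture in the paper.
Compared with the structure defined in forward in deepten.py, print(model) shows an additional fully connected layer. See the attachments below:
print(model).txt
deepten.txt
The architecture in the paper, however, does not seem to contain this fully connected layer?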
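
For context, here is a minimal sketch of how such a mismatch can arise in PyTorch in general. It assumes the extra fc layer is a submodule registered in `__init__` (for example, the classifier of a reused backbone) that `forward` never calls. `ToyBackbone` and `ToyNet` are hypothetical names used only for illustration and are not taken from deepten.py.

```python
import torch
import torch.nn as nn


class ToyBackbone(nn.Module):
    """Hypothetical stand-in for a reused backbone that ships with its own classifier."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.fc = nn.Linear(8, 1000)  # registered here, but never used by ToyNet.forward


class ToyNet(nn.Module):
    """Hypothetical model that borrows only the conv layer of the backbone."""

    def __init__(self, nclass=5):
        super().__init__()
        self.pretrained = ToyBackbone()
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(8, nclass),
        )

    def forward(self, x):
        x = self.pretrained.conv(x)  # pretrained.fc is skipped on purpose
        return self.head(x)


model = ToyNet()
# print(model) walks the registered submodules, not the forward() code path,
# so pretrained.fc shows up even though forward() never touches it.
registered = [name for name, _ in model.named_modules()]
print(model)
out = model(torch.randn(2, 3, 16, 16))
```

In this sketch `registered` contains `pretrained.fc`, while the output shape is determined only by `head`. Whether the same explanation applies to the model in this issue depends on the attached files.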

Question 2: the fc layer's gradients are not updated during training?
Below are the gradients printed during my training run:
gra.txt

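In short, I am not sure whether the fully connected layer (2048, 1000) should be included. If it should, how do I fix the gradient update problem? If it should not, where should I remove it?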
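
One way to check both points is sketched below, under the assumption that the fc layer is registered on the backbone but not called in forward. Parameters that take no part in the forward pass keep `grad` set to `None` after `backward()`, and an unused submodule can be swapped for `nn.Identity()` so that it no longer appears in `print(model)` or in the parameter list. `TinyBackbone` and `TinyNet` are hypothetical names, not the classes from this repository.

```python
import torch
import torch.nn as nn


class TinyBackbone(nn.Module):
    """Hypothetical backbone carrying a classifier that the outer model ignores."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.fc = nn.Linear(8, 1000)


class TinyNet(nn.Module):
    def __init__(self, nclass=5):
        super().__init__()
        self.pretrained = TinyBackbone()
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(8, nclass),
        )

    def forward(self, x):
        return self.head(self.pretrained.conv(x))  # pretrained.fc is not called


model = TinyNet()
model(torch.randn(2, 3, 16, 16)).sum().backward()

# Parameters that did not contribute to the loss have no gradient at all.
unused = [name for name, p in model.named_parameters() if p.grad is None]
print("parameters without gradients:", unused)

# Replace the unused layer so it disappears from the parameter list.
model.pretrained.fc = nn.Identity()
remaining = [name for name, _ in model.named_parameters()]
print("parameters after replacement:", remaining)
```

If the layer really is unused, a missing gradient on it is expected and does not affect training of the layers that forward does call. With `torch.nn.parallel.DistributedDataParallel`, unused parameters may additionally require `find_unused_parameters=True`. Whether train_dist.py needs that is an assumption to verify against the script itself.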
