
High Confidence Loss on first epoch #175

Closed
mlpotter opened this issue Mar 30, 2019 · 1 comment

@mlpotter

Is a high confidence loss for a new dataset to be expected on the first epoch?

image

@glenn-jocher
Member

@mlpotter the high conf loss is intentional: the conf loss is multiplied by 64 in the compute_loss() function. That weight came out of a hyperparameter search from a few months ago #2 (comment), but the search was run against the incorrect mAP metric used back then, so it should probably be redone.

The cls loss is also divided by 4, so the natural conf and cls losses at epoch 0 batch 0 are really about 2 and 13 (139 / 64 and 3.29 × 4) rather than 139 and 3.29. You can run your own evaluation of these loss component weights, or reset them to 1 (which, as I understand it, would align more closely with the darknet loss function), but in our tests resetting them gave much worse results.

yolov3/utils/utils.py

Lines 264 to 277 in 09b02d2

        # Compute losses
        k = 1  # nT / bs
        if len(b) > 0:
            pi = pi0[b, a, gj, gi]  # predictions closest to anchors
            tconf[b, a, gj, gi] = 1  # conf

            lxy += k * MSE(torch.sigmoid(pi[..., 0:2]), txy[i])  # xy loss
            lwh += k * MSE(pi[..., 2:4], twh[i])  # wh loss
            lcls += (k / 4) * CE(pi[..., 5:], tcls[i])  # class_conf loss

        # pos_weight = FT([gp[i] / min(gp) * 4.])
        # BCE = nn.BCEWithLogitsLoss(pos_weight=pos_weight)
        lconf += (k * 64) * BCE(pi0[..., 4], tconf)  # obj_conf loss
    loss = lxy + lwh + lconf + lcls
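
To make the weighting easier to reason about, here is a minimal sketch in plain Python (no PyTorch) that applies and undoes the per-component gains from the excerpt above. The names GAINS, weighted_total and unweight are hypothetical and are not part of the repository. The gain values are copied from the excerpt with k = 1, and the input numbers are the ones quoted in this thread.

```python
# Gains as they appear in the excerpt above (with k = 1):
# xy and wh use k, cls uses k / 4, conf uses k * 64.
# GAINS, weighted_total and unweight are hypothetical names for illustration.
GAINS = {"xy": 1.0, "wh": 1.0, "cls": 1.0 / 4, "conf": 64.0}


def weighted_total(raw, gains=GAINS):
    """Sum raw (unweighted) loss components after applying per-component gains."""
    return sum(gains[name] * value for name, value in raw.items())


def unweight(reported, gains=GAINS):
    """Recover raw loss components from the weighted values printed in training."""
    return {name: value / gains[name] for name, value in reported.items()}


# Weighted values quoted in this thread for epoch 0, batch 0.
reported = {"conf": 139.0, "cls": 3.29}
raw = unweight(reported)
print(raw)  # conf is close to 2.17, cls is close to 13.16

# Setting every gain to 1 corresponds to the "reset them to 1" option.
unit_gains = {name: 1.0 for name in GAINS}
print(weighted_total(raw, unit_gains))
```

Keeping the gains in one place like this is only one possible design. It would make it simpler to rerun a search over the weights, but whether any particular setting improves mAP has to be checked empirically.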
