Divergence and Metrics #1612
Hey guys, I had two new questions:

1.) Is there a rule of thumb for when to stop training early after the train and validation losses diverge? Something like "if it has been X number of epochs with X difference in loss, stop training"? Up to now I have been intuitively flying by the seat of my pants on it.

2.) Here is a screenshot of my best model thus far. After my last post on 0 instances being predicted and the suggestion that it could be from inconsistent labeling, I went ahead and simplified the skeleton a bit and added the catheter as a label to be used as the centroid of my top-down model. I presently have 900 labeled frames and was curious what metrics I should be shooting for. When meeting with Talmo a few months ago I wrote down "mean above 0.7 is good, 95% below 10 pixels is good, but 95% below 5 pixels is even better but hard to do." In terms of mean, which of the 6 mean metrics is that referring to? OKS? PCK? Etc. Thanks!

Replies: 2 comments
Hi @jramborger78,
Yes! And actually, SLEAP will do this by default, so there's nothing you should have to worry about. By default it'll reduce the learning rate and/or stop training early if the validation loss stops improving for a certain number of epochs. The default values are specified in the training configs (https://github.com/talmolab/sleap/tree/develop/sleap/training_profiles) and will result in early stopping when the validation loss hasn't improved by at least 1e-08 for 10 epochs.
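For reference, here is a minimal sketch of what those defaults correspond to as plain Keras callbacks (SLEAP trains on TensorFlow/Keras; the early-stopping values match the defaults quoted above, while the learning-rate-reduction values are illustrative assumptions rather than SLEAP's exact config):

```python
# Minimal sketch of SLEAP's plateau behavior as standard Keras callbacks.
# Early stopping values match the quoted defaults (min_delta=1e-08, 10 epochs);
# the ReduceLROnPlateau values are illustrative assumptions.
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

callbacks = [
    # Reduce the learning rate when validation loss plateaus.
    ReduceLROnPlateau(
        monitor="val_loss", factor=0.5, patience=5, min_delta=1e-6, min_lr=1e-8
    ),
    # Stop training if validation loss hasn't improved by 1e-08 for 10 epochs.
    EarlyStopping(monitor="val_loss", min_delta=1e-8, patience=10),
]
# These would be passed to model.fit(..., callbacks=callbacks).
```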
This looks pretty good to me! I was referring to <10 px for the localization error, which is the plot on the right broken down by node type. In terms of the summary metrics, average distance is good, and it looks like you're in great shape there at 4.3!

Since not all poses are equally easy to predict, the 95th and 99th percentiles tell you about the worst-case scenarios, which it looks like you're doing well on too.

The OKS VOC mAP is usually a good summary measure when comparing different models on the same dataset, since it incorporates many other forms of error (like missed points) into a single number, but it can be hard to interpret when comparing across datasets.

Let us know if you have any more questions!

Cheers,
Talmo
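As an aside, if you'd rather pull those summary numbers programmatically than read them off the evaluation plots, SLEAP can load them from a trained model folder. A minimal sketch, with a hypothetical model path (the metric keys follow SLEAP's evaluation docs, but verify them against your installed version):

```python
# Minimal sketch: load validation metrics from a trained SLEAP model folder.
# The model path is hypothetical.
import sleap

metrics = sleap.load_metrics("models/baseline.centered_instance", split="val")

print("Average distance (px):", metrics["dist.avg"])
print("95th percentile (px):", metrics["dist.p95"])
print("99th percentile (px):", metrics["dist.p99"])
print("OKS VOC mAP:", metrics["oks_voc.mAP"])
```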
1.) Awesome! I wasn't sure whether it would be a good idea to stop training myself before SLEAP did it on its own, so I appreciate the clarification!

2.) That is great to hear, thank you sir! I've built up an intuition over the last few months and haven't seen numbers close to this good, so I'm happy to hear it's headed in the right direction. I have a spreadsheet to keep track of different model augmentations, noting the metrics as well as computing an F1 score and color-coding them all (a rough sketch of doing a comparison like that programmatically is below). This model specifically was the best on many of the metrics, such as the average through 99th-percentile distances, and well above average on the bottom 6 comparison measures.
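For reference, a rough sketch of tabulating the same validation metrics across several trained models, similar to tracking them in a spreadsheet (the model directory names are hypothetical):

```python
# Rough sketch: compare validation metrics across trained SLEAP models.
# Model directory names are hypothetical.
import sleap

model_dirs = ["models/baseline", "models/rot_aug", "models/rot_scale_aug"]

for model_dir in model_dirs:
    m = sleap.load_metrics(model_dir, split="val")
    print(
        f"{model_dir}: avg={m['dist.avg']:.2f}px "
        f"p95={m['dist.p95']:.2f}px p99={m['dist.p99']:.2f}px "
        f"mAP={m['oks_voc.mAP']:.3f}"
    )
```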
I am sure I will have questions soon about tracking in some regards, but I appreciate the continued help and your always being available. Cheers.