Update PR TEMPLATE to include prediction algorithm metrics #221
Conversation
LGTM
.github/PULL_REQUEST_TEMPLATE.md (Outdated)
## Metrics (if appropriate):

<!---
If you submitting a PR for a prediction algorithm (segmentation, identification, or classification
Would you mind inserting a closing parenthesis at the end? :)
.github/PULL_REQUEST_TEMPLATE.md (Outdated)
algorithm | relevant metrics
---------------|------------------
segmentation | jaccard loss, training time, prediction time, data IO, disk space usage, memory usage
How about adding some already-implemented metrics for segmentation algorithms, such as the Hausdorff distance, sensitivity, specificity or Dice coefficient, to provide more evaluation information?
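For reference, here is a minimal sketch (not part of this PR) of how a few of these mask-level metrics could be computed with NumPy. The function names and the binary-mask representation are assumptions for illustration, not anything defined in the template:

```python
import numpy as np

def dice_coefficient(pred, truth):
    """Dice coefficient between two binary masks (1 = nodule, 0 = background)."""
    pred, truth = np.asarray(pred, dtype=bool), np.asarray(truth, dtype=bool)
    intersection = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    return 2.0 * intersection / total if total else 1.0

def jaccard_index(pred, truth):
    """Jaccard index (IoU); the jaccard loss in the table would be 1 minus this."""
    pred, truth = np.asarray(pred, dtype=bool), np.asarray(truth, dtype=bool)
    intersection = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return intersection / union if union else 1.0

def sensitivity_specificity(pred, truth):
    """Sensitivity (recall on nodule voxels) and specificity (recall on background).

    Assumes both classes are present in the ground-truth mask, so the
    denominators are non-zero.
    """
    pred, truth = np.asarray(pred, dtype=bool), np.asarray(truth, dtype=bool)
    tp = np.logical_and(pred, truth).sum()
    tn = np.logical_and(~pred, ~truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    return tp / (tp + fn), tn / (tn + fp)

# The Hausdorff distance could be derived from the boundary point coordinates of
# the two masks, e.g. with scipy.spatial.distance.directed_hausdorff.
```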
One metric I left out is an estimation of model over/under fitting (video). This is something we should probably address at some point.
7938e84 to 237e99b (Compare)
@WGierke yep
LGTM.. :)
@reubano one thing that just came to my mind: isn't identification about predicting a (numerical) location of a nodule? How should we then calculate the accuracy based on that? From my understanding, accuracy can only be used for classification problems; since we have a regression problem here, something like the mean squared error would be more appropriate, wouldn't it? Or do you want to put every x, y and z value in its own class? That way the accuracy wouldn't differ between a prediction in which e.g. x_hat is x_true+1 and a prediction where x_hat is x_true+100.
I consulted my pillow and it said that one could count an identification as correct if the predicted nodule location is within the boundaries of the nodule, but not necessarily at the centroid (for LIDC images we can get the boundaries easily using pylidc). Another way to achieve this would be to count an identification as correct if the predicted location is within an Ɛ-pipe of the true location (x_true - Ɛ <= x_hat <= x_true + Ɛ, y_true - Ɛ <= y_hat ...). Ɛ could vary for z since the size of this dimension differs from x and y.
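A minimal sketch of that Ɛ-pipe check, assuming the predicted and true centroids are (x, y, z) tuples in voxel coordinates; the function name and the default tolerances are hypothetical:

```python
def within_epsilon(pred, truth, eps_xy=5, eps_z=2):
    """Count an identification as correct if the predicted centroid lies within
    a per-axis tolerance of the annotated location.

    eps_xy and eps_z are hypothetical voxel tolerances; z gets its own value
    because slice spacing usually differs from the in-plane resolution.
    """
    x_hat, y_hat, z_hat = pred
    x_true, y_true, z_true = truth
    return (abs(x_hat - x_true) <= eps_xy
            and abs(y_hat - y_true) <= eps_xy
            and abs(z_hat - z_true) <= eps_z)

# Accuracy over predicted/ground-truth pairs could then be computed as:
# accuracy = sum(within_epsilon(p, t) for p, t in pairs) / len(pairs)
```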
@WGierke good points! I think one way to address the identification accuracy calculation is to do something similar to what is mentioned in the log loss note:
This would convert identification into something more classification-like. As for the Ɛ error value you suggested, I think that's a great idea. We could implement that by counting a label as matched if it falls within Ɛ of the true location. If I understand your 3rd point correctly, you are saying that accuracy only accounts for exact matches, so a prediction off by 1 would count the same as one off by 100.
Description
This PR updates PULL_REQUEST_TEMPLATE.md to include various prediction algorithm metrics such as logloss and prediction time. View formatted file.

Motivation and Context
This PR will allow us to evaluate current and future machine learning algorithms in a standardized and objective fashion.
CLA