This repository has been archived by the owner on Feb 22, 2020. It is now read-only.

Update PR TEMPLATE to include prediction algorithm metrics #221

Merged 1 commit into master from pr-template on Nov 20, 2017

Conversation

@reubano (Contributor) commented Nov 10, 2017

Description

This PR updates PULL_REQUEST_TEMPLATE.md to include various prediction algorithm metrics such as logloss and prediction time. View formatted file.

Motivation and Context

This PR will allow us to evaluate current and future machine learning algorithms in a standardized and objective fashion.

CLA

  • I have signed the CLA; if other committers are in the commit history, they have signed the CLA as well

@lamby (Contributor) commented Nov 11, 2017

LGTM

## Metrics (if appropriate):

<!---
If you submitting a PR for a prediction algorithm (segmentation, identification, or classification
Review comment (Contributor):
Would you mind inserting a closing parenthesis at the end? :)


algorithm | relevant metrics
---------------|------------------
segmentation | jaccard loss, training time, prediction time, data IO, disk space usage, memory usage
Review comment (Contributor):
How about adding some already-implemented segmentation metrics, such as the Hausdorff distance, sensitivity, specificity, or Dice coefficient, to provide more evaluation information?
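For reference, a minimal sketch of how the Dice coefficient and Jaccard index could be computed on binary segmentation masks (NumPy arrays assumed; the function names are illustrative, not code from this repo):

```python
import numpy as np

def dice_coefficient(pred_mask, true_mask, eps=1e-7):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    return (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)

def jaccard_index(pred_mask, true_mask, eps=1e-7):
    """Jaccard index (IoU): |A∩B| / |A∪B|; the jaccard loss is 1 - IoU."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    union = np.logical_or(pred, true).sum()
    return (intersection + eps) / (union + eps)
```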

@reubano (Contributor, Author) commented Nov 17, 2017

One metric I left out is an estimation of model over-/underfitting (video). This is something we should probably address at some point.

@reubano force-pushed the pr-template branch 5 times, most recently from 7938e84 to 237e99b on November 17, 2017 16:17
@lamby (Contributor) commented Nov 18, 2017

@reubano Did you address all of @WGierke's comments? :)

@reubano (Contributor, Author) commented Nov 18, 2017

@WGierke yep

@lamby (Contributor) commented Nov 19, 2017

LGTM.. :)

@pjbull pjbull merged commit f09d302 into master Nov 20, 2017
@pjbull pjbull deleted the pr-template branch November 20, 2017 23:06
@WGierke (Contributor) commented Dec 13, 2017

@reubano one thing that just came to my mind: isn't identification about predicting a (numerical) location of a nodule? How should we then calculate the accuracy based on that? From my understanding, accuracy can only be used for classification problems; since we have a regression problem here, something like the mean squared error would be more appropriate, wouldn't it? Or do you want to put every x, y and z value into its own class? That way the accuracy wouldn't differ between a prediction in which, e.g., x_hat is x_true + 1 and one where x_hat is x_true + 100.

@WGierke (Contributor) commented Dec 14, 2017

I consulted my pillow and it said that one could count an identification as correct if the predicted nodule location is within the boundaries of the nodule, but not necessarily at the centroid (for LIDC images we can get the boundaries easily using pylidc). Another way to achieve this would be to count an identification as correct if the predicted location is within an Ɛ-pipe of the true location (x_true - Ɛ <= x_hat <= x_true + Ɛ, y_true - Ɛ <= y_hat ...). Ɛ could vary for z, since the size of that dimension differs from x and y.
This brings me to my third point: accuracy is the ratio of correct classifications to all the classifications the algorithm performed. If the algorithm only predicts one nodule out of five and that prediction is correct, its accuracy would be 100%, but it still wasn't very helpful since it missed four nodules. A metric that covers this case would be the recall, which can also easily be computed. Any thoughts on that? :)
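A rough sketch of what the Ɛ-pipe matching and the recall computation could look like (the per-axis tolerances and the list-of-(x, y, z)-tuples format are assumptions for illustration, not project conventions):

```python
def matches_within_epsilon(pred_loc, true_loc, eps=(10, 10, 3)):
    """True if the predicted (x, y, z) lies within the per-axis tolerance
    of the true location; the tolerance for z can differ from x/y."""
    return all(abs(p - t) <= e for p, t, e in zip(pred_loc, true_loc, eps))

def identification_recall(pred_locs, true_locs, eps=(10, 10, 3)):
    """Fraction of true nodules that have at least one matching prediction."""
    if not true_locs:
        return 1.0  # nothing to find, nothing missed
    found = sum(
        any(matches_within_epsilon(p, t, eps) for p in pred_locs)
        for t in true_locs
    )
    return found / len(true_locs)
```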

@reubano (Contributor, Author) commented Dec 18, 2017

@WGierke good points! I think one way to address the identification accuracy calculation is to do something similar to what is mentioned in the log loss note:

In order to calculate Log Loss for identification, the data needs to be arranged in a way that shows, for each pixel, whether or not it is a nodule centroid. Restated, the pixel-level labels of 1/0 would correspond to centroid/not-centroid.

This would convert identification into something more classification-like. As to the Ɛ error value you suggested, I think that's a great idea. We could implement it by counting a label as matched if it is within x pixels of the true centroid.
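As a rough illustration of that pixel-level framing, using scikit-learn's log_loss (the arrays here are hypothetical toy data, not project code):

```python
import numpy as np
from sklearn.metrics import log_loss

# Hypothetical example: each flattened pixel is labelled 1 (nodule centroid)
# or 0 (not a centroid), and the model outputs a per-pixel centroid probability.
true_labels = np.array([0, 0, 1, 0, 1, 0])
pred_probs = np.array([0.1, 0.05, 0.8, 0.2, 0.6, 0.3])

print(log_loss(true_labels, pred_probs))
```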

If I understand your third point correctly, you are saying that accuracy only accounts for precision. The recall metric you referenced is also known as sensitivity. Since that metric is already included in the list, I agree we should just add identification as a relevant algorithm for it.

@WGierke mentioned this pull request on Jan 3, 2018