This repository has been archived by the owner on Feb 22, 2020. It is now read-only.

Continuous improvement of nodule classification models (see #2) #131

Open
2 tasks
isms opened this issue Sep 20, 2017 · 9 comments

Comments

@isms (Contributor) commented Sep 20, 2017

Overview

We want to continuously improve the accuracy and reliability of models developed for nodule classification. This is a continuation of #2 and will remain open indefinitely.

Note: Substantive contributions are currently eligible for increased point awards.

Design doc reference:
Jobs to be done > Detect and select > Prediction service

Acceptance criteria

  • trained model for classification
  • documentation for the trained model (e.g., cross validation performance, data used) and how to re-train it

NOTE: All PRs must follow the standard PR checklist.

@isms isms added this to the 1-mvp milestone Sep 20, 2017
@isms isms removed this from the 1-mvp milestone Oct 10, 2017
@isms isms added this to the 2-feature-building milestone Oct 29, 2017
@WGierke (Contributor) commented Dec 13, 2017

Maybe we should proceed in small steps, e.g. by first adding evaluation methods that cover most of the metrics introduced in #221, so we have a more standardized way to compute them. That would let us quickly gauge the quality of the current implementations of the identification, classification, and segmentation algorithms, which in turn would make it easier to focus first on whichever algorithm is performing worst. Any thoughts @isms @reubano @lamby ?
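To make the idea concrete, here is a minimal sketch of what a shared evaluation helper could look like. The function names and the exact metric list are assumptions for illustration; #221 defines the actual metrics, and this is not code from the project.

```python
def confusion_counts(y_true, y_pred):
    """Count TP/FP/TN/FN for binary labels (1 = nodule, 0 = no nodule)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def evaluate(y_true, y_pred):
    """Score any algorithm's binary predictions with one shared set of metrics."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,  # a.k.a. recall
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }
```

A helper like this could be wrapped in standard tests so identification, classification, and segmentation are all scored the same way before and after a change.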

@pjbull (Member) commented Dec 22, 2017

For folks who are interested, see the latest announcement here:
https://concepttoclinic.drivendata.org/newsfeed

There are a limited number of AWS credits available for folks to continue to make progress on these algorithms.

@WGierke (Contributor) commented Dec 22, 2017

Thanks @pjbull !
I contacted the DrivenData team and asked for credits. If they can provide me with some, I'd like to work on this issue. Once I get their answer, I'll update this comment with the current state for transparency.

Update: I received the credits, but unfortunately I have to wait ~2 weeks for my credit card to be delivered; until then I won't be able to complete the AWS sign-up process. So in case someone else wants to start on this issue: feel free to do so! :)

@Serhiy-Shekhovtsov (Contributor) commented Dec 29, 2017

@WGierke, I have created a virtual machine; support should raise the instance limit any time now, and then the machine will be ready to use. I also have plenty of time in the coming week, so if you have some ideas, we can work on the issue together and share the points and the fun :)
If this sounds good to you, please contact me on Gitter.

@Serhiy-Shekhovtsov (Contributor) commented

@reubano @pjbull, what @WGierke said here makes perfect sense to me. If scores on those metrics are a required part of model improvement, it would be nice to have a set of standard tests for them.
I am happy to take part in their development. What do you think?

@reubano (Contributor) commented Jan 3, 2018

@Serhiy-Shekhovtsov @WGierke sounds good to me. Feel free to create the relevant issues.

@swarm-ai (Contributor) commented

Hi @reubano, I have been working on retraining the classifier and detector models for better performance. I am planning to document the process for both the detector and classifier models and submit a pull request to the concept-to-clinic clone of the GRT code base here: https://github.com/concept-to-clinic/DSB2017

Will that work? I did not find any training code set up in the concept-to-clinic repo.

@reubano (Contributor) commented Jan 18, 2018

@swarm-ai that repo is just for reference. Are you able to incorporate your performance enhancements into the code in this repo? The GRT model has already been included as per #4.

@caseyfitz commented Jan 18, 2018

Hi @swarm-ai, improved models would be very welcome; good luck! There is currently no workflow for including training processes in the application codebase, but this is something we'd love to have.

The minimum we currently need to incorporate an improved model is:

  1. the weights, which live in the assets/ subdir for each algorithm
  2. the architecture, which lives in the src/ subdir for each algorithm (so that we can load the trained weights)

With an eye toward future development, we'd be happy to see a PR that augments the algorithm directories with a training/ subdir (in addition to the current src/ and assets/).
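As a rough sketch of the convention described above, the snippet below shows the assumed per-algorithm layout and a small helper that resolves the trained weights inside assets/. The directory and file names here are illustrative assumptions, not paths taken from the repo.

```python
# Assumed per-algorithm layout (names are illustrative):
#   <algorithm>/
#     assets/    <- trained weights, e.g. model_weights.ckpt
#     src/       <- architecture definition used to load the weights
#     training/  <- proposed: scripts and docs for re-training the model
from pathlib import Path


def resolve_weights(algorithm_root, weights_name="model_weights.ckpt"):
    """Return the expected path of trained weights inside the assets/ subdir."""
    path = Path(algorithm_root) / "assets" / weights_name
    if not path.exists():
        raise FileNotFoundError(f"Missing trained weights: {path}")
    return path
```

A training/ subdir would then only need to write its output back to assets/ for the application to pick up an improved model.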
