This repository has been archived by the owner on Feb 22, 2020. It is now read-only.

Continuous improvement of nodule classification models (see #2) #131

Open
2 tasks
isms opened this issue Sep 20, 2017 · 9 comments

Comments

@isms (Contributor) commented Sep 20, 2017

Overview

We want to continuously improve the accuracy and reliability of models developed for nodule classification. This is a continuation of #2 and will remain open indefinitely.

Note: Substantive contributions are currently eligible for increased point awards.

Design doc reference:
Jobs to be done > Detect and select > Prediction service

Acceptance criteria

  • trained model for classification
  • documentation for the trained model (e.g., cross validation performance, data used) and how to re-train it

NOTE: All PRs must follow the standard PR checklist.

@isms isms added this to the 1-mvp milestone Sep 20, 2017
@isms isms removed this from the 1-mvp milestone Oct 10, 2017
@isms isms added this to the 2-feature-building milestone Oct 29, 2017
@WGierke (Contributor) commented Dec 13, 2017

Maybe we should proceed in small steps, e.g. by first adding evaluation methods that cover most of the metrics introduced in #221, so we have a more standardized way to compute them. That would let us quickly gauge the quality of the current implementations of the identification, classification, and segmentation algorithms, which in turn would make it easier to focus first on whichever algorithm is performing worst. Any thoughts @isms @reubano @lamby ?
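To make the idea concrete, here is a minimal sketch of what a shared evaluation helper could look like. The function names and the exact metric list are assumptions for illustration; #221 defines the actual metrics, and this is not code from the project.

```python
def confusion_counts(y_true, y_pred):
    """Count TP/FP/TN/FN for binary labels (1 = nodule, 0 = no nodule)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def evaluate(y_true, y_pred):
    """Score any algorithm's binary predictions with one shared set of metrics."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,  # a.k.a. recall
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }
```

A helper like this could be wrapped in standard tests so identification, classification, and segmentation are all scored the same way before and after a change.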

@pjbull (Member) commented Dec 22, 2017

For folks who are interested, see the latest announcement here:
https://concepttoclinic.drivendata.org/newsfeed

There are a limited number of AWS credits available for folks to continue to make progress on these algorithms.

@WGierke (Contributor) commented Dec 22, 2017

Thanks @pjbull !
I contacted the DrivenData team and asked for credits. If they can provide me with some, I'd like to work on this issue. Once I get their answer, I'll update this comment with the current state for transparency.

Update: I received the credits, but unfortunately I have to wait ~2 weeks for my credit card to be delivered; until then I won't be able to complete the AWS sign-up process. So in case someone else wants to start on this issue: feel free to do so! :)

@Serhiy-Shekhovtsov (Contributor) commented Dec 29, 2017

@WGierke, I have created a virtual machine; support should raise the instance limit any time now, and then the machine will be ready to use. I also have plenty of time in the coming week, so if you have some ideas, we can work on the issue together and share the points and the fun :)
If this sounds good to you, please contact me on Gitter.

@Serhiy-Shekhovtsov (Contributor) commented

@reubano @pjbull, what @WGierke said here makes perfect sense to me. If scores on those metrics are a required part of model improvement, it would be nice to have a set of standard tests for them.
I am happy to take part in their development. What do you think?

@reubano (Contributor) commented Jan 3, 2018

@Serhiy-Shekhovtsov @WGierke sounds good to me. Feel free to create the relevant issues.

@swarm-ai (Contributor) commented

Hi @reubano, I have been working on retraining the classifier and detector models for better performance. I am planning to document the process for both the detector and classifier models and submit a pull request to the concept-to-clinic clone of the GRT code base here: https://github.com/concept-to-clinic/DSB2017

Will that work? I did not find any training code set up in the concept-to-clinic repo.

@reubano (Contributor) commented Jan 18, 2018

@swarm-ai that repo is just for reference. Are you able to incorporate your performance enhancements into the code in this repo? The GRT model has already been included as per #4.

@caseyfitz commented Jan 18, 2018

Hi @swarm-ai, improved models would be very welcome; good luck! There is currently no workflow for including training processes in the application codebase, but this is something we'd love to have.

The minimum we currently need to incorporate an improved model is:

  1. the weights, which live in the assets/ subdir for each algorithm
  2. the architecture, which lives in the src/ subdir for each algorithm (so that we can load the trained weights)

With an eye toward future development, we'd be happy to see a PR that augments the algorithm directories with a training/ subdir (in addition to the current src/ and assets/).
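As a rough sketch of the convention described above, the snippet below shows the assumed per-algorithm layout and a small helper that resolves the trained weights inside assets/. The directory and file names here are illustrative assumptions, not paths taken from the repo.

```python
# Assumed per-algorithm layout (names are illustrative):
#   <algorithm>/
#     assets/    <- trained weights, e.g. model_weights.ckpt
#     src/       <- architecture definition used to load the weights
#     training/  <- proposed: scripts and docs for re-training the model
from pathlib import Path


def resolve_weights(algorithm_root, weights_name="model_weights.ckpt"):
    """Return the expected path of trained weights inside the assets/ subdir."""
    path = Path(algorithm_root) / "assets" / weights_name
    if not path.exists():
        raise FileNotFoundError(f"Missing trained weights: {path}")
    return path
```

A training/ subdir would then only need to write its output back to assets/ for the application to pick up an improved model.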
