This repository has been archived by the owner on Feb 22, 2020. It is now read-only.

Converted Team 'grt123' identification and classification algorithms #122

Merged (1 commit, Sep 29, 2017)

Conversation

dchansen
Contributor

Description

This pull request implements the grt123 identification algorithm.
It is based on PyTorch, which currently conflicts with Tensorflow, so the pull request also disables the Keras import.

While the algorithm is quite fast on GPU, it is painfully slow (but working) on the CPU due to a bug in PyTorch. The bug is fixed in the PyTorch development branch, but the next release is about 2 months off.

Reference to official issue

#4
#1

Motivation and Context

This pull request implements part of the prediction service (https://concept-to-clinic.readthedocs.io/en/latest/design-doc.html#detect-prediction-service).

How Has This Been Tested?

The algorithm has largely been tested by manually confirming the output on about five random LUNA2016 datasets; as the loaded model has already been tested extensively elsewhere, this was deemed sufficient.
It has been tested both outside Docker on a Titan X GPU and inside Docker on an Intel i7 CPU.

Screenshots (if appropriate):

CLA

  • I have signed the CLA; if other committers are in the commit history, they have signed the CLA as well




def test_classify_predict_model_load(dicom_path):
@WGierke (Contributor) commented Sep 18, 2017

As @lamby already stated here, it'd be nice to still have some tests that validate the output of the models. I'm currently working on some tags to execute these long-running tests only if one explicitly desires to do so, so feel free to make this test a bit more detailed :) Edit: I added this decorator in c3c626f
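A minimal sketch of what such a guard could look like, assuming an environment-variable switch; the variable name and the use of unittest.skipUnless are assumptions, not the actual decorator added in c3c626f:

```python
import os
import unittest

# Hypothetical guard: run the expensive model tests only when the
# RUN_SLOW_TESTS environment variable is set.
skip_slow_test = unittest.skipUnless(
    os.environ.get("RUN_SLOW_TESTS"),
    "slow test; set RUN_SLOW_TESTS=1 to run it",
)

@skip_slow_test
def test_classify_predict_model_load():
    ...  # would load the model and check its output here
```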

@dchansen (Contributor, author)

Sure, I'll have a look at that tomorrow.

@dchansen (Contributor, author)

I updated the test cases to be a little more comprehensive.

@isms
Contributor

isms commented Sep 18, 2017

[...] the pull request also disables the Keras import.

@dchansen I'm not sure this is a feasible compromise. Is there a way to reimplement this in Tensorflow?

@dchansen
Contributor Author

@isms I agree that it's not a feasible compromise at all. I did it mainly to highlight the issue.

I think we should have a single deep learning framework for the project as a whole, as they don't really play well together. I am not sure if Keras is the best choice, but either way, the project leads should make the decisions. Perhaps we should open an issue?

Converting the model itself to Keras would be relatively easy, but converting the weights is somewhat more involved. Converting to Caffe2 or CNTK would be easy, as they all support the ONNX format.

@dchansen dchansen changed the title Converted Team 'grt123' identification algorithm Converted Team 'grt123' identification and classification algorithms Sep 19, 2017
@reubano (Contributor) left a comment

Nice job with this! Here are some other thoughts...

  • use git-lfs for the dsb2017_detector.ckpt model
  • adopt Google-style docstrings
  • use new-style classes
  • fix spacing issues (have you run flake8?)
  • integrate skip_slow_test


reader = sitk.ImageSeriesReader()
filenames = reader.GetGDCMSeriesFileNames(dicom_path)
if len(filenames) == 0:
Contributor

if not filenames:

if np.average([min_distance[i] for i in range(label.shape[0]) if slice_area[i] > area_th]) < dist_th:
valid_label.add(vol.label)

if len(valid_label) == 0:
@reubano (Contributor), Sep 19, 2017

if not valid_label:

bw3 = bw1 & bw2
label = measure.label(bw, connectivity=1)
label3 = measure.label(bw3, connectivity=1)
l_list = list(set(np.unique(label)) - {0})
Contributor

You should be able to iterate fine without needing to cast this to a list

sum = 0
while sum < np.sum(area) * cover:
sum = sum + area[count]
count = count + 1
Contributor

sum += area[count] and count += 1
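Beyond the augmented assignments, the loop also shadows the builtin sum; a hedged alternative using np.cumsum, with placeholder data since the real `area` and `cover` come from the surrounding function:

```python
import numpy as np

# Placeholder values standing in for the real `area` and `cover`.
area = np.array([50, 30, 15, 5])
cover = 0.8

# Equivalent of the while-loop: count how many components are needed
# before the cumulative area reaches cover * total.
count = int(np.searchsorted(np.cumsum(area), area.sum() * cover) + 1)
print(count)  # → 2
```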

@dchansen
Contributor Author

dchansen commented Sep 19, 2017

I am a little unsure why, but the conflict between TensorFlow and PyTorch seems to be gone. It was related to this issue in TensorFlow: tensorflow/models#523

@dchansen dchansen mentioned this pull request Sep 19, 2017
1 task
@lamby
Contributor

lamby commented Sep 20, 2017

This should block on #118 due to the slow test stuff.

@reubano
Contributor

reubano commented Sep 22, 2017

Looking pretty good! Just a few questions for you. I noticed that in classify/trained_model/predict, you commented out the previous code instead of deleting it. Just curious: why did you decide to keep it around? Also, there are still some spacing errors. Can you run flake8 prediction in your terminal? Finally, I see this Travis error:

$ git lfs pull
Git LFS: (540 of 540 files, 2 skipped) 497.08 MB / 497.08 MB, 41.20 MB skipped 
[eb2c037dd55e2f49da95657f7a7851cbcfe2f2b516848ed03f8c5c820f3e16b4] Object does not exist on the server: [404] Object does not exist on the server
[3a46182c1c4883e170d5b576d3a2645309f34817c53c16d4f7f29fbdcf799308] Object does not exist on the server: [404] Object does not exist on the server

I'll try to pinpoint which files exactly are missing. Meanwhile, can you verify that the model you added to git-lfs was properly uploaded? Thanks for your work!

@dchansen
Contributor Author

The git lfs issue should be resolved now.
I kept the previous code because I was unsure what the long-term plans were regarding the style of the project; conceivably, this pull request could be changed to fit the style of the previous one.
While I don't agree with all the choices made, it was in many ways nicer than what I contributed here. The main reason I did not follow it more closely was time: I just wanted a working adaptation of the gtr123 model as soon as possible.

I have fixed all the flake8 issues, except the complexity ones. Hopefully, I will have them fixed by the end of the day.

@reubano
Contributor

reubano commented Sep 22, 2017

The git lfs issue should be resolved now.

Awesome!

I kept the previous code because I was unsure what the long-term plans were regarding the style of the project; conceivably, this pull request could be changed to fit the style of the previous one. While I don't agree with all the choices made, it was in many ways nicer than what I contributed here. The main reason I did not follow it more closely was time: I just wanted a working adaptation of the gtr123 model as soon as possible.

Ok, I understand. I'm taking a look to see how to integrate both versions.

I have fixed all the flake8 issues, except the complexity ones. Hopefully, I will have them fixed by the end of the day.

Great!

@reubano
Contributor

reubano commented Sep 22, 2017

Ok... here's my attempt. What do you think?

import SimpleITK as sitk
from src.algorithms.classify.src import gtr123_model
from src.preprocess.load_ct import load_ct, MetaData


def predict(dicom_path, centroids, model_path=None,
            preprocess_ct=None, preprocess_model_input=None):
    reader = sitk.ImageSeriesReader()
    filenames = reader.GetGDCMSeriesFileNames(dicom_path)

    if not filenames:
        raise FileNotFoundError()

    reader.SetFileNames(filenames)
    image = reader.Execute()

    if preprocess_ct:
        meta = load_ct(dicom_path)[1]
        voxel_data = preprocess_ct(image, MetaData(meta))
    else:
        voxel_data = image

    if preprocess_model_input:
        preprocessed = preprocess_model_input(voxel_data, centroids)
    else:
        preprocessed = voxel_data

    return gtr123_model.predict(preprocessed, centroids)

Since the Keras model is now gone, do you see any reason to keep model_path? If not, I can help you refactor it out.

@dchansen
Contributor Author

This would work for now.
I think the long term solution should be to move the gtr123 preprocessing into preprocess_ct. I will probably not have time to do that until next week though.

@reubano
Contributor

reubano commented Sep 22, 2017

I think the long term solution should be to move the gtr123 preprocessing into preprocess_ct.

Agreed. However, I think your PR can work without that at this stage (MVP). I looked around some more and saw that gtr123_model.predict actually takes a model_path param. So the last line could be return gtr123_model.predict(preprocessed, centroids, model_path). I think then, the only change would have to be #L14-L16 here in test_classify_trained_model_predict.py.

@reubano
Contributor

reubano commented Sep 26, 2017

Looking good! What are your thoughts about incorporating something like I mentioned here? Main reason is just so that we don't disable current functionality with this PR. I think with that, we can get this merged.

@dchansen
Contributor Author

I'm trying to do a reasonable merge right now, and algorithms.identify.trained_model is giving me a headache.
I think we really need a standardized way of specifying a model that is independent of frameworks (Keras, Torch, TensorFlow, etc.).

I would suggest that each model independently implements a predict method taking only a DICOM path (for identification) or a DICOM path and a list of candidates. trained_model.predict would then take the same arguments, plus a string identifying the algorithm to be used. How does this sound?

@vessemer
Contributor

@dchansen I'm working on that right now. I've also tried to convert the grt123 classification model into Keras (TF), but I've run into trouble with the BatchNorm layer: despite the weights transferring correctly, it gives me different results (if I throw away each BatchNorm from the graph, the outputs become identical, so I need to dig deeper).
As for standardizing the interface without transferring the model, I've almost finished that.

@reubano
Contributor

reubano commented Sep 26, 2017

I'm trying to do a reasonable merge right now, and algorithms.identify.trained_model is giving me a headache.

Do you mind elaborating? What specifically is causing issues?

I think we really need a standardized way of specifying a model that is independent of frameworks (Keras, Torch, TensorFlow, etc.).

I would suggest that each model independently implements a predict method taking only a DICOM path (for identification) or a DICOM path and a list of candidates. trained_model.predict would then take the same arguments, plus a string identifying the algorithm to be used. How does this sound?

So if I understand you correctly, algorithms.identify.src.gtr123_model would look like this:

def predict(dicom_path):
    model_path = 'src/algorithms/identify/assets/dsb2017_detector.ckpt'
    ...

and algorithms.identify.trained_model would look like this:

from src.algorithms.identify import src 

def predict(dicom_path, algorithm='gtr123_model'):
    model = func(src, algorithm)
    return model.predict(dicom_path)

Is this correct?
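A hedged sketch of how the `func(src, algorithm)` lookup above could work with plain getattr; the namespace below is a stand-in for the real src.algorithms.identify.src package:

```python
from types import SimpleNamespace

def predict(src, dicom_path, algorithm='gtr123_model'):
    # Pick the algorithm module by name, e.g. src.gtr123_model.
    model = getattr(src, algorithm)
    return model.predict(dicom_path)

# Stand-in namespace demonstrating the dispatch.
fake_src = SimpleNamespace(
    gtr123_model=SimpleNamespace(predict=lambda path: ('gtr123', path)),
)
print(predict(fake_src, '/data/scan'))  # → ('gtr123', '/data/scan')
```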

@dchansen
Contributor Author

@vessemer I am a little unsure about the benefit of moving the model to Keras now that Tensorflow and Torch seem to work without disturbing one another. Different approaches will require different enough post- and pre-processing that they can't be represented as a Keras model anyway.
Long term, I do see the advantage of having all models on a single deep-learning framework, but is Keras the right choice? Using ONNX as the way to save the model would allow for easy use of many frameworks, including Caffe2 and CNTK.
Either way, as soon as there is a consensus, I would be happy to help convert the models to that particular format.

One difference between the Torch and Keras batchnorm is that they use a different epsilon value (1e-5 vs 1e-3), so maybe that's the issue?
Could you make your WIP available on the model standardization?
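The effect of that epsilon mismatch is easy to demonstrate numerically; a small sketch with made-up batch-norm statistics:

```python
import numpy as np

# Made-up batch-norm parameters; only the epsilon differs below.
x = np.array([0.1, 0.5, 2.0])
mean, var, gamma, beta = 0.4, 0.01, 1.0, 0.0

def batchnorm(x, eps):
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

# PyTorch default eps=1e-5 vs the Keras default eps=1e-3.
diff = np.max(np.abs(batchnorm(x, 1e-5) - batchnorm(x, 1e-3)))
print(diff)  # clearly nonzero when the variance is small
```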

@dchansen
Contributor Author

@reubano Yes, that was my thought exactly. The problem right now is that all the models take ownership of trained_model.predict, and there is no good way of switching between particular models. Ideally, we might also like to build a meta-model combining many different ones, and a uniform interface would help greatly.

@vessemer
Contributor

@dchansen, thank you for your input, but I had already taken the different default values into account; I used nn-transfer for the model conversion.

@dchansen
Contributor Author

@vessemer Well, if you can share the code, I can have a look at it.

@reubano
Contributor

reubano commented Sep 26, 2017

The problem right now is that all the models take ownership of trained_model.predict, and there is no good way of switching between particular models. Ideally, we might also like to build a meta-model combining many different ones, and a uniform interface would help greatly.

For now, I think this is the preferred course of action. We want to focus on one main model for each of (classify, identify, and segment). Later on, we may consider making it easier to switch out the models. The current area of flexibility is in the serialized trained data, e.g., assets/dsb2017_detector.ckpt. So while it is still tied to the underlying Python ML model, you can tweak the training parameters as needed. See my comment on #2 for more info.

Also, let me know what exactly is giving you a headache so I can try and help you out.

@dchansen
Contributor Author

dchansen commented Sep 26, 2017

My issue was that I had to disable a lot of code in identify.trained_model
But I think this is what you want?

@reubano
Contributor

reubano commented Sep 26, 2017

I was literally suggesting you just copy/paste my code from here.

else:
preprocessed = voxel_data

return gtr123_model.predict(preprocessed, centroids)
@reubano (Contributor), Sep 26, 2017

...
model_path = model_path or 'src/algorithms/classify/assets/gtr123_model.ckpt'
...
return gtr123_model.predict(image, centroids, model_path)

@dchansen (Contributor, author)

Not sure what you're asking here?

@reubano (Contributor), Sep 26, 2017

Just reintroducing the model_path part from here. Also, I just noticed a typo, which I've fixed. Maybe that tripped you up... sorry!

@reubano
Contributor

reubano commented Sep 26, 2017

Just about there! Does this solve your headache with algorithms.identify.trained_model?

@reubano
Contributor

reubano commented Sep 26, 2017

From the Travis log:

$ flake8 prediction
prediction/src/algorithms/identify/trained_model.py:13:1: F401 'glob' imported but unused
prediction/src/algorithms/identify/trained_model.py:15:1: F401 'dicom' imported but unused
prediction/src/algorithms/identify/trained_model.py:16:1: F401 'src.preprocess.errors.EmptyDicomSeriesException' imported but unused
prediction/src/algorithms/identify/trained_model.py:17:1: F401 'src.preprocess.lung_segmentation.save_lung_segments' imported but unused
prediction/src/algorithms/classify/trained_model.py:68:56: E231 missing whitespace after ','
prediction/src/preprocess/extract_lungs.py:57:19: E127 continuation line over-indented for visual indent
prediction/src/preprocess/extract_lungs.py:66:18: E127 continuation line over-indented for visual indent
prediction/src/preprocess/extract_lungs.py:67:18: E127 continuation line over-indented for visual indent
prediction/src/preprocess/extract_lungs.py:68:18: E127 continuation line over-indented for visual indent
prediction/src/tests/test_endpoints.py:10:1: F811 redefinition of unused 'json' from line 7
The command "flake8 prediction" exited with 1.

$ pycodestyle prediction
prediction/src/algorithms/classify/trained_model.py:68:56: E231 missing whitespace after ','
prediction/src/preprocess/extract_lungs.py:57:19: E127 continuation line over-indented for visual indent
prediction/src/preprocess/extract_lungs.py:66:18: E127 continuation line over-indented for visual indent
prediction/src/preprocess/extract_lungs.py:67:18: E127 continuation line over-indented for visual indent
prediction/src/preprocess/extract_lungs.py:68:18: E127 continuation line over-indented for visual indent

@dchansen
Contributor Author

The tests run in my local environment (outside Docker). Inside Docker they pass when run individually, but not when running them all at once. I suspect it's TensorFlow/Torch shenanigans, but it's quite tricky to debug.

@dchansen
Contributor Author

Using a different docker image solved the problem for me.
I cannot for the life of me figure out where the segfault in the python3:6 image is coming from, however.

@reubano
Contributor

reubano commented Sep 27, 2017

Yea that is odd... which image works? And what is the one that doesn't?

@dchansen
Contributor Author

pytorch:36 segfaults.
varunagrawal/pytorch does not, but I had to manually install opencv and tkinter in it.

@dchansen
Contributor Author

I changed the image to be Ubuntu based instead. All tests should pass now.

@reubano
Contributor

reubano commented Sep 27, 2017

Ok.... looks like the tests passed, but there are some new lint errors now

$ flake8 prediction
prediction/src/tests/__init__.py:1:1: F401 'torch' imported but unused
prediction/src/tests/__init__.py:1:13: E261 at least two spaces before inline comment
prediction/src/tests/__init__.py:1:14: E262 inline comment should start with '# '
The command "flake8 prediction" exited with 1.

$ pycodestyle prediction
prediction/src/tests/__init__.py:1:13: E261 at least two spaces before inline comment
prediction/src/tests/__init__.py:1:14: E262 inline comment should start with '# '
The command "pycodestyle prediction" exited with 1.

@dchansen
Contributor Author

The torch import is required simply because torch always needs to be imported before Keras. I am very open to other ways of doing that.

@reubano
Contributor

reubano commented Sep 27, 2017

Try adding this on the offending line: # noqa pylint: disable=unused-import

Edit: if that doesn't work, try this: # noqa: F401
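Applied to the offending line, prediction/src/tests/__init__.py would start something like this (the comment wording is illustrative, not the actual file):

```python
import torch  # noqa: F401  # torch must be imported before Keras
```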

@reubano
Contributor

reubano commented Sep 28, 2017

Looks like everything is a go now! Do you mind squashing to one commit? Or perhaps two, one for each issue (if it's possible to separate them).

@dchansen dchansen force-pushed the master branch 2 times, most recently from d888503 to 11b8146 Compare September 28, 2017 12:38
@dchansen
Contributor Author

I seem to have messed that up rather thoroughly. This will take me a while to sort out.

@reubano
Contributor

reubano commented Sep 28, 2017

No worries :). Perhaps I can help out. What seems to be the issue(s)?

@dchansen
Contributor Author

I'm just not very experienced with git rebase; I found myself having to redo every merge made on the project in the last ten days.
I have messed up my repository quite badly. I think my best approach is to try again from home, where I have a pristine copy of my repo from before my attempts at rebasing.

@WGierke
Contributor

WGierke commented Sep 28, 2017

Instead of rebasing, I prefer to squash the commits like so. It fulfills the purpose as well, namely having only 1 commit in the end :)
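The link behind "like so" is elided above; one common way to squash a branch to a single commit, assuming the upstream remote is named upstream and the target branch is master, is:

```shell
# Find where the branch diverged from upstream/master, keep all
# changes staged, and record them as one new commit.
git fetch upstream
git reset --soft "$(git merge-base upstream/master HEAD)"
git commit -m "Converted grt123 identification and classification algorithms"
git push --force origin master
```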

@reubano
Contributor

reubano commented Sep 28, 2017

You can also take a look at the advice I posted here.

@reubano
Contributor

reubano commented Sep 28, 2017

Also, git config --global rerere.enabled true will save conflict resolutions so that in the future you only have to resolve them once.

@dchansen
Contributor Author

@WGierke Thank you! Uploading the squashed commit now.

@dchansen dchansen force-pushed the master branch 2 times, most recently from 1ac544e to 524f843 Compare September 28, 2017 15:11
@reubano
Contributor

reubano commented Sep 28, 2017

$ flake8 interface
interface/backend/api/tests.py:44:1: W293 blank line contains whitespace
interface/backend/api/tests.py:48:57: E502 the backslash is redundant between brackets
interface/backend/api/tests.py:49:81: E502 the backslash is redundant between brackets
interface/backend/api/serializers.py:74:1: E302 expected 2 blank lines, found 1
interface/backend/api/serializers.py:120:17: E127 continuation line over-indented for visual indent
interface/backend/api/views.py:24:1: F811 redefinition of unused 'Response' from line 13
interface/backend/api/views.py:47:37: W291 trailing whitespace
The command "flake8 interface" exited with 1.

$ pycodestyle interface
interface/backend/api/serializers.py:74:1: E302 expected 2 blank lines, found 1
interface/backend/api/serializers.py:120:17: E127 continuation line over-indented for visual indent
interface/backend/api/tests.py:44:1: W293 blank line contains whitespace
interface/backend/api/tests.py:48:57: E502 the backslash is redundant between brackets
interface/backend/api/tests.py:49:81: E502 the backslash is redundant between brackets
interface/backend/api/views.py:47:37: W291 trailing whitespace
The command "pycodestyle interface" exited with 1.

@dchansen
Contributor Author

But the pull request doesn't include those files??

@isms
Contributor

isms commented Sep 28, 2017

@reubano @dchansen Those issues were introduced by recent commits unrelated to this PR. I pushed fixes which should get the code style back to passing. Will restart the build on this branch to update the status.

(cc @lamby so we can avoid in the future!)

@reubano
Contributor

reubano commented Sep 28, 2017

Ok... master looks good. Do you mind rebasing, @dchansen, to ensure everything is good to go?

@reubano reubano merged commit 45db7be into drivendataorg:master Sep 29, 2017
@reubano
Contributor

reubano commented Sep 29, 2017

closes #4 :shipit:
