Classification model pipeline #298

vessemer · 2018-01-24T22:21:44Z

This PR provides a pipeline for training the model and predicting with it, along with an interface for classification_model.

Reference to official issue

Metrics:

The model's CPM score was described in PR #292:

CPM over 10-Fold cross validation:

0.125	0.25	0.5	1	2	4	8	Score (CPM)
0.595	0.670	0.731	0.793	0.835	0.868	0.887	0.76
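
For readers unfamiliar with the metric, here is a minimal sketch of how the score column relates to the other seven. It assumes the column headers are average false positives per scan and that CPM is the plain mean of the sensitivities at those rates (the usual definition). The variable names are illustrative only and are not taken from the PR.

```python
# Assumption: headers are false positives per scan, values are sensitivities.
fp_rates = [0.125, 0.25, 0.5, 1, 2, 4, 8]
sensitivities = [0.595, 0.670, 0.731, 0.793, 0.835, 0.868, 0.887]

# CPM as the unweighted mean of the sensitivities at the seven operating points.
cpm = sum(sensitivities) / len(sensitivities)

print(round(cpm, 3))  # 0.768
```

The mean of the listed values is about 0.768, which agrees with the reported 0.76 if that figure was truncated to two decimals.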

Item	Config
GPU	NVIDIA TITAN X
GPU memory usage	12GiB for batch size 32

CLA

I have signed the CLA; if other committers are in the commit history, they have signed the CLA as well

vessemer · 2018-01-24T22:23:02Z

I'm currently adjusting the training part of the grt123 algorithm.

reubano · 2018-01-25T11:45:31Z

Restarted the Travis build since the error occurred before even reaching the tests.

reubano · 2018-01-25T11:40:12Z

prediction/src/algorithms/classify/src/classification_model.py

+        pull_size (int): maximum amount of batches allowed to be stored in RAM.
+    """
+
+    @abc.abstractmethod


What does abstractmethod do in this case? The Python docs say:

Using this decorator requires that the class’s metaclass is ABCMeta or is derived from it.

Also, why not use a new style class, i.e., class ClassificationModel(object):

Indeed, thanks :)
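
To illustrate the point raised in this thread, here is a small sketch in Python 3 syntax of how `abc.abstractmethod` behaves with and without `ABCMeta`. The class names and the `predict` return value are hypothetical and do not reproduce the PR's `ClassificationModel`.

```python
import abc


class PlainModel(object):
    """No ABCMeta metaclass, so @abc.abstractmethod is not enforced."""

    @abc.abstractmethod
    def predict(self, candidates):
        pass


class AbstractModel(metaclass=abc.ABCMeta):
    """With ABCMeta, subclasses must override every abstract method."""

    @abc.abstractmethod
    def predict(self, candidates):
        pass


class IncompleteModel(AbstractModel):
    pass  # does not override predict


class CompleteModel(AbstractModel):
    def predict(self, candidates):
        # Hypothetical output, for illustration only.
        return [{"p_concerning": 0.5} for _ in candidates]


def can_instantiate(cls):
    """Return True if cls() succeeds, False if it raises TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True


print(can_instantiate(PlainModel))       # True
print(can_instantiate(IncompleteModel))  # False
print(can_instantiate(CompleteModel))    # True
```

Without the metaclass the decorator is silently ignored, which is why the abstract base needs `ABCMeta` (in Python 2, via `__metaclass__ = abc.ABCMeta` in the class body).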

vessemer · 2018-01-26T03:01:47Z

OK, this line, along with the removal of this one, caused me some pain for a while and forced updates to both the code and the tests. Unfortunately, I did not have the opportunity to review PR #272 at the time, so to be clear now:
First: the ndarray coordinates of a real-world point should be computed as (point - origin) / spacing and not (point - origin) * spacing, where spacing is the size of one voxel in real-world units.
Second: meta.spacing stores the aforementioned voxel size, so if we apply an affine transformation such as a zoom, meta.spacing should change too, which was not the previous behaviour. preprocess_ct.PreprocessCT handles the spacing parameter by zooming the ndarray exactly to the new spacing, setting zoom_fctr to old_spacing / new_spacing. This means we no longer need to store old_spacing in meta, and storing only the current spacing is enough for translating between real-world and ndarray coordinates. That's why this line is important.
These mistakes were propagated to the tests through the way coordinates were picked.
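
The two points above can be sketched with NumPy alone. This is an illustration under stated assumptions, not the project's implementation. The helper name `world_to_voxel`, the axis order, and all numbers are made up for the example.

```python
import numpy as np


def world_to_voxel(point, origin, spacing):
    """Map a real-world point to (fractional) ndarray coordinates.

    Hypothetical helper: divides by the voxel size, as argued above.
    """
    point = np.asarray(point, dtype=float)
    origin = np.asarray(origin, dtype=float)
    spacing = np.asarray(spacing, dtype=float)
    return (point - origin) / spacing


# Made-up scan geometry, in real-world units per axis.
origin = np.array([-100.0, -50.0, 0.0])
old_spacing = np.array([2.5, 0.7, 0.7])
new_spacing = np.array([1.0, 1.0, 1.0])

point = np.array([-75.0, -36.0, 14.0])

# Index of the point in the original array: approximately [10, 20, 20].
voxel_old = world_to_voxel(point, origin, old_spacing)

# Zooming to the new spacing scales every axis by old_spacing / new_spacing.
zoom_fctr = old_spacing / new_spacing

# After the zoom only the current spacing is needed to locate the same point.
voxel_new = world_to_voxel(point, origin, new_spacing)

# The index in the zoomed array equals the old index times the zoom factor.
assert np.allclose(voxel_new, voxel_old * zoom_fctr)
```

Multiplying by the spacing instead would move the point further from the origin as voxels get larger, which is the opposite of what an array index should do.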

Quick check:

lamby · 2018-01-26T03:44:20Z

@vessemer Can you rework your previous comment into the code itself? Remarks here are really really useful (!) but it won't be clear to anyone following the codebase later, you see...

vessemer · 2018-01-26T04:00:12Z

@lamby, good point, done :)

lamby · 2018-01-26T21:29:36Z

Thanks!

reubano · 2018-01-30T15:56:02Z

prediction/src/tests/test_classification.py

+#     predicted = trained_model.predict(dicom_paths[0], nodule_locations, model_path)
+#     assert predicted
+#     assert 0 <= predicted[0]['p_concerning'] <= 1
+#



Why comment out these tests? Anything else needed to keep them passing?

Not at all, I just forgot to uncomment them; done here.

vessemer force-pushed the 131_additional_model branch from 30dc1a5 to d7a95e9 Compare January 24, 2018 23:50

reubano reviewed Jan 25, 2018

Classification model pipeline

73226c2

vessemer force-pushed the 131_additional_model branch from d7a95e9 to 73226c2 Compare January 26, 2018 02:15

Preserve correct spacing translations

1729eb4

vessemer force-pushed the 131_additional_model branch from a7bd393 to dbf800e Compare January 26, 2018 02:36

Tests updates wrt spacing translation update

a4f5846

vessemer force-pushed the 131_additional_model branch from dbf800e to a4f5846 Compare January 26, 2018 03:07

vessemer force-pushed the 131_additional_model branch from d921425 to f5e1619 Compare January 26, 2018 04:05

Remarks

ab71933

vessemer force-pushed the 131_additional_model branch from f5e1619 to ab71933 Compare January 26, 2018 09:35

lamby merged commit 9087540 into drivendataorg:master Jan 26, 2018

vessemer deleted the 131_additional_model branch January 26, 2018 22:11

vessemer restored the 131_additional_model branch January 26, 2018 22:11

reubano reviewed Jan 30, 2018

This was referenced Jan 30, 2018

Chose an appropriate DATA_SHAPE #304

Closed

Reactivate tests and clear models #306

Merged
