
Multi-label Segmentation #51

Open

shanpriya3 opened this issue Oct 11, 2021 · 6 comments

@shanpriya3

Hi, I have a ground truth with 3 classes (including background), with pixel values 0, 127, and 255. As mentioned in #43, I changed num_classes=3 in axialnet.py.
In utils.py,

mask[mask<=127] = 0
mask[mask>127] = 1

which maps the ground truth to values 0 and 1, but I need 0, 1, and 2 for my case (with 3 classes).

I tried doing this:
mask[mask<127] = 0
mask[mask==127] = 1
mask[mask>127] = 2

But I got this error. Could you please help me with this?
[screenshot of the error]

@jeya-maria-jose
Owner

You should remove those mask lines if you are converting this to a multi-class problem. Your ground truth should just contain pixel values 0, 1, and 2 if you are working on a 3-class classification problem.
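
For reference, a minimal sketch of that remapping (assuming the mask is loaded as a NumPy array with exact values 0/127/255; remap_mask is a hypothetical helper, not part of the repository):

import numpy as np

def remap_mask(mask: np.ndarray) -> np.ndarray:
    # Convert a grayscale mask with values 0/127/255 into class indices 0/1/2.
    out = np.zeros_like(mask, dtype=np.int64)  # integer dtype for loss targets
    out[mask == 127] = 1
    out[mask == 255] = 2
    return out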

@shanpriya3
Author

Hi, thanks for your response. I did remove those two lines from my code. I have 3 classes/labels in total (including background), with values 0, 1, and 2 in my ground truth. I also changed num_classes=3 in axialnet.py, but when I run the code I get this error. Does it have to do with the loss function? Do I need to change anything else? Could you please help me with this error?
[screenshot of the error]

@shanpriya3
Author

Could you please explain what these lines (189-192) in train.py do?

tmp[tmp>=0.5] = 1
tmp[tmp<0.5] = 0
tmp2[tmp2>0] = 1
tmp2[tmp2<=0] = 0

And also, why do you do this (lines 205-206)?
yHaT[yHaT==1] =255
yval[yval==1] =255

I have to remove these lines for my case, right?

@Qiang19990514

Which dataset did you use for this?

@rw404

rw404 commented Feb 21, 2023

@shanpriya3, the code in lines 189-192 applies a hard 0.5 threshold, i.e. it translates all predictions into binary format (either 0 or 1) in order to then store the mask in the format described in the repository's Readme (values of 255 correspond to the object, 0 to the background).

Lines 205-206 are needed for the mask saving format described in the repository:

  1. From the input image, the model builds a response map: y_out = model(X_batch) on line 184;
  2. The output is converted to NumPy format. It is then assumed that the model output contains a probability map of whether each pixel belongs to each object class, i.e. the [batch_size, channels, width, height] output is interpreted as [batch_size, num_classes, width, height] (in this case, num_classes = 3), where each position holds a number from 0 to 1 such that summing over the class dimension gives a map of ones ([batch_size, num_classes, width, height].sum(dim=1) == 1*[batch_size, width, height] - the description is formal, just to add interpretability), BUT:
    • criterion = LogNLLLoss() is used as the criterion (line 111); however, this criterion, defined in metrics.py on line 9, implements not LogNLLLoss but CrossEntropy. That is, softmax is applied to the model prediction model(input) inside the criterion object, and no softmax is used in the model itself (in the _forward_impl and forward methods).
    • Therefore, in the validation part, before calling tmp[tmp>=0.5] = 1 on line 189, you need to apply a softmax transformation so that the raw model prediction can be interpreted as probabilities, i.e. replace y_out = model(X_batch) on line 184 with, for example, y_out = model.soft(model(X_batch)) or y_out = torch.nn.functional.softmax(model(X_batch), dim=1); see the sketch after this list.
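
A minimal sketch of that validation step (assuming y_out has shape [batch_size, num_classes, H, W]; model and X_batch as in train.py, the rest is illustrative):

import torch.nn.functional as F

y_out = model(X_batch)            # raw logits, [batch_size, num_classes, H, W]
probs = F.softmax(y_out, dim=1)   # per-pixel probabilities summing to 1 over dim=1
# for a multi-class mask, taking the most probable class per pixel is more
# natural than the fixed 0.5 threshold (which only fits the binary case)
pred = probs.argmax(dim=1)        # class indices, [batch_size, H, W]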

Then, as @jeya-maria-jose mentioned, instead of modifying the mask you need to remove those lines and assume that the gt (ground truth) contains integer class values (0, 1, or 2 in this case). Also, for simpler interpretation, it is easier to save the validation-set predictions for more than just the 1st channel, i.e. perhaps change line 214 to cv2.imwrite(fulldir+image_filename, yHaT[0,1:,:,:].transpose(1, 2, 0)), with optional zero-padding or keeping the background layer to avoid errors when saving two-channel images; a sketch follows below. The resulting mask will have num_classes-1 layers (no background), and each layer will contain 255 only where the model detects the corresponding object in that pixel (for example, 255 at position $(i, j)$ in the first layer means an object of the 1st class at $(i, j)$, and 255 at $(i, j)$ in the second layer means an object of the 2nd class at that position).
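
A sketch of that saving step (assuming yHaT is a NumPy array of shape [batch_size, num_classes, H, W] holding 0/255 values after the binarization above; fulldir and image_filename as in train.py):

import numpy as np
import cv2

# drop the background channel and move classes last:
# [num_classes-1, H, W] -> [H, W, num_classes-1]
pred_img = yHaT[0, 1:, :, :].transpose(1, 2, 0).astype(np.uint8)

# cv2.imwrite handles 1-, 3-, or 4-channel images, so with two foreground
# classes pad a zero channel to get a writable 3-channel image
if pred_img.shape[2] == 2:
    pred_img = np.concatenate([pred_img, np.zeros_like(pred_img[:, :, :1])], axis=2)

cv2.imwrite(fulldir + image_filename, pred_img)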

@rw404 mentioned this issue Feb 21, 2023
@twofeetcat

Hello, I have a question: in training, output = model(X_batch) contains a probability map of whether a pixel belongs to objects. Does that mean the values in the tensor are numbers from 0 to 1?
Secondly, in the training phase, do I need to process y_batch (in this case num_classes = 20, with pixel values 0, 1, 2, ... 19), or can I directly calculate the loss on it with loss = criterion(output, y_batch)?
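
For what it's worth, based on @rw404's explanation above, a minimal sketch of the training step under the assumption that the criterion behaves like cross-entropy on raw logits (an illustration, not the repository's exact code):

output = model(X_batch)   # raw logits, [batch_size, num_classes, H, W]
# y_batch: integer class indices per pixel, [batch_size, H, W], values
# 0..num_classes-1; cross-entropy-style losses expect an integer (long)
# target, not a one-hot or probability map
loss = criterion(output, y_batch.long())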
