
Multi-label Segmentation #51

Open

shanpriya3 opened this issue Oct 11, 2021 · 6 comments

@shanpriya3

Hi, I have a ground truth with 3 classes (including background), with pixel values 0, 127, and 255. As mentioned in #43, I changed num_classes=3 in axialnet.py.
In utils.py,

mask[mask<=127] = 0
mask[mask>127] = 1

which maps the ground truth to values 0 and 1, but I need 0, 1, and 2 for my case (with 3 classes).

I tried doing this:
mask[mask<127] = 0
mask[mask==127] = 1
mask[mask>127] = 2

But I got this error. Could you please help me with this?
[screenshot of the error]

@jeya-maria-jose
Owner

You should remove those mask lines if you are converting this to a multi-class problem. Your ground truth should just contain pixel values 0, 1, and 2 if you are working on a 3-class classification problem.
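
For reference, a minimal sketch of that remapping (assuming the mask is loaded as a NumPy array with exact values 0/127/255; remap_mask is a hypothetical helper, not part of the repository):

import numpy as np

def remap_mask(mask: np.ndarray) -> np.ndarray:
    # Convert a grayscale mask with values 0/127/255 into class indices 0/1/2.
    out = np.zeros_like(mask, dtype=np.int64)  # integer dtype for loss targets
    out[mask == 127] = 1
    out[mask == 255] = 2
    return out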

@shanpriya3
Author

Hi, thanks for your response. I did remove those two lines from my code. I have 3 classes/labels in total (including background), with values 0, 1, and 2 in my ground truth. I also changed num_classes=3 in axialnet.py, but when I run the code I get this error. Does it have to do with the loss function? Do I need to change anything else? Could you please help me with this error?
[screenshot of the error]

@shanpriya3
Author

Could you please explain what these lines (189-192) in train.py do?

tmp[tmp>=0.5] = 1
tmp[tmp<0.5] = 0
tmp2[tmp2>0] = 1
tmp2[tmp2<=0] = 0

And also, why do you do this (lines 205-206)?
yHaT[yHaT==1] =255
yval[yval==1] =255

I have to remove these lines for my case, right?

@Qiang19990514

Which dataset did you use for this?

@rw404

rw404 commented Feb 21, 2023

@shanpriya3, the code in lines 189-192 applies a hard 0.5 threshold, i.e. it translates all predictions into binary format (either 0 or 1) in order to then store the mask in the format described in the repository's Readme (values of 255 correspond to the object, 0 to the background).

Lines 205-206 are needed for the mask saving format described in the repository:

  1. From the input image, the model builds a response map: y_out = model(X_batch) on line 184;
  2. The output is converted to NumPy format. It is then assumed that the model output contains a probability map of whether each pixel belongs to each object class, i.e. the [batch_size, channels, width, height] output is interpreted as [batch_size, num_classes, width, height] (in this case, num_classes = 3), where each position holds a number from 0 to 1 such that summing over the class dimension gives a map of ones ([batch_size, num_classes, width, height].sum(dim=1) == 1*[batch_size, width, height] - the description is formal, just to add interpretability), BUT:
    • criterion = LogNLLLoss() is used as the criterion (line 111); however, this criterion, defined in metrics.py on line 9, implements not LogNLLLoss but CrossEntropy. That is, softmax is applied to the model prediction model(input) inside the criterion object, and no softmax is used in the model itself (in the _forward_impl and forward methods).
    • Therefore, in the validation part, before calling tmp[tmp>=0.5] = 1 on line 189, you need to apply a softmax transformation so that the raw model prediction can be interpreted as probabilities, i.e. replace y_out = model(X_batch) on line 184 with, for example, y_out = model.soft(model(X_batch)) or y_out = torch.nn.functional.softmax(model(X_batch), dim=1); see the sketch after this list.
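
A minimal sketch of that validation step (assuming y_out has shape [batch_size, num_classes, H, W]; model and X_batch as in train.py, the rest is illustrative):

import torch.nn.functional as F

y_out = model(X_batch)            # raw logits, [batch_size, num_classes, H, W]
probs = F.softmax(y_out, dim=1)   # per-pixel probabilities summing to 1 over dim=1
# for a multi-class mask, taking the most probable class per pixel is more
# natural than the fixed 0.5 threshold (which only fits the binary case)
pred = probs.argmax(dim=1)        # class indices, [batch_size, H, W]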

Then, as @jeya-maria-jose mentioned, instead of modifying the mask you need to remove those lines and assume that the gt (ground truth) contains integer class values (0, 1, or 2 in this case). Also, for simpler interpretation, it is easier to save the validation-set predictions for more than just the 1st channel, i.e. perhaps change line 214 to cv2.imwrite(fulldir+image_filename, yHaT[0,1:,:,:].transpose(1, 2, 0)), with optional zero-padding or keeping the background layer to avoid errors when saving two-channel images; a sketch follows below. The resulting mask will have num_classes-1 layers (no background), and each layer will contain 255 only where the model detects the corresponding object in that pixel (for example, 255 at position $(i, j)$ in the first layer means an object of the 1st class at $(i, j)$, and 255 at $(i, j)$ in the second layer means an object of the 2nd class at that position).
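
A sketch of that saving step (assuming yHaT is a NumPy array of shape [batch_size, num_classes, H, W] holding 0/255 values after the binarization above; fulldir and image_filename as in train.py):

import numpy as np
import cv2

# drop the background channel and move classes last:
# [num_classes-1, H, W] -> [H, W, num_classes-1]
pred_img = yHaT[0, 1:, :, :].transpose(1, 2, 0).astype(np.uint8)

# cv2.imwrite handles 1-, 3-, or 4-channel images, so with two foreground
# classes pad a zero channel to get a writable 3-channel image
if pred_img.shape[2] == 2:
    pred_img = np.concatenate([pred_img, np.zeros_like(pred_img[:, :, :1])], axis=2)

cv2.imwrite(fulldir + image_filename, pred_img)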

@rw404 mentioned this issue Feb 21, 2023
@twofeetcat

Hello, I have a question: in training, output = model(X_batch) contains a probability map of whether a pixel belongs to objects. Does that mean the values in the tensor are numbers from 0 to 1?
Secondly, in the training phase, do I need to process y_batch (in this case num_classes = 20, with pixel values 0, 1, 2, ... 19), or can I directly calculate the loss on it with loss = criterion(output, y_batch)?
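
For what it's worth, based on @rw404's explanation above, a minimal sketch of the training step under the assumption that the criterion behaves like cross-entropy on raw logits (an illustration, not the repository's exact code):

output = model(X_batch)   # raw logits, [batch_size, num_classes, H, W]
# y_batch: integer class indices per pixel, [batch_size, H, W], values
# 0..num_classes-1; cross-entropy-style losses expect an integer (long)
# target, not a one-hot or probability map
loss = criterion(output, y_batch.long())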
