
Trained model does not work #73
Open
urtepuod opened this issue Jan 22, 2024 · 2 comments
Comments

@urtepuod

Hello, I've tried to train a new model from scratch using these settings:

```
omnipose --train --use_gpu --dir "/home/urte/3D modeller/3d_cell_detector/trainingdata/Omni_5" \
    --img_filter '' --mask_filter _cp_masks \
    --pretrained_model None \
    --diameter 0 --nclasses 3 --nchan 3 --tyx 512,512 \
    --learning_rate 0.1 --RAdam --batch_size 5 --n_epochs 900 --save_every 300 --verbose
```
Training completes successfully; however, when I try to import the model into the GUI, I get this error:
```
2024-01-22 11:45:23,186 [INFO] ** TORCH GPU version installed and working. **
2024-01-22 11:45:23,188 [INFO] >>>> using GPU
ERROR: Error(s) in loading state_dict for CPnet:
size mismatch for downsample.down.res_down_0.conv.conv_0.0.weight: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([1]).
size mismatch for downsample.down.res_down_0.conv.conv_0.0.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([1]).
size mismatch for downsample.down.res_down_0.conv.conv_0.0.running_mean: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([1]).
size mismatch for downsample.down.res_down_0.conv.conv_0.0.running_var: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([1]).
size mismatch for downsample.down.res_down_0.conv.conv_0.2.weight: copying a param with shape torch.Size([32, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 1, 3, 3]).
size mismatch for downsample.down.res_down_0.proj.0.weight: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([1]).
size mismatch for downsample.down.res_down_0.proj.0.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([1]).
size mismatch for downsample.down.res_down_0.proj.0.running_mean: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([1]).
size mismatch for downsample.down.res_down_0.proj.0.running_var: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([1]).
size mismatch for downsample.down.res_down_0.proj.1.weight: copying a param with shape torch.Size([32, 3, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 1, 1, 1]).
size mismatch for output.2.weight: copying a param with shape torch.Size([4, 32, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 32, 1, 1]).
size mismatch for output.2.bias: copying a param with shape torch.Size([4]) from checkpoint, the shape in current model is torch.Size([3]).
```
I had also tried training with --nchan 2 and --nclasses 3, but during training it automatically reset to nchan 3. My training data consists of grayscale images with masks produced by Cellpose. I apologise if this is trivial; this is all very new to me.
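For readers new to this traceback: it means the checkpoint was saved from a network built for 3 input channels, while the network being rebuilt at load time expects 1 channel. A minimal PyTorch sketch reproducing the same class of error (the layer stack here is illustrative, not Omnipose's actual CPnet):

```python
import torch
import torch.nn as nn

# A stack trained with 3 input channels (analogous to nchan=3):
# BatchNorm2d(3) stores weight/bias/running_mean/running_var of shape [3],
# and Conv2d(3, 32, 3) stores a weight of shape [32, 3, 3, 3].
trained = nn.Sequential(nn.BatchNorm2d(3), nn.ReLU(), nn.Conv2d(3, 32, 3))
torch.save(trained.state_dict(), "checkpoint.pth")

# The same stack rebuilt for 1 input channel (analogous to nchan=1).
rebuilt = nn.Sequential(nn.BatchNorm2d(1), nn.ReLU(), nn.Conv2d(1, 32, 3))

# Raises RuntimeError with "size mismatch" messages for the norm
# parameters and the conv weight -- the same pattern as the traceback above.
rebuilt.load_state_dict(torch.load("checkpoint.pth"))
```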

@kevinjohncutler (Owner)

@urtepuod sorry for the delay! You may want to email me at [email protected] to debug further. I'd like to get your model and an example image to debug, and I also need your `pip list`. I usually see this issue when cellpose has not been fully uninstalled, or when working with an older version of cellpose_omni and omnipose. In the most recent version of the GUI, you can choose nchan and select "boundary field output" if you trained with nclasses 3. If your images are grayscale, however, I think the model should have been trained with no channels (though RGB grayscale could have messed that up).
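Outside the GUI, the same fix applies programmatically: build the inference model with the settings used for training. A minimal sketch, assuming the `cellpose_omni` Python API accepts `nchan`/`nclasses` keyword arguments as in recent Omnipose releases (the model path is a placeholder):

```python
from cellpose_omni import models

# Build the model with the SAME channel/class settings used in training,
# so the checkpoint tensor shapes match the rebuilt network.
model = models.CellposeModel(
    gpu=True,
    pretrained_model="/path/to/models/your_trained_model",
    nchan=3,      # must match the --nchan used in training
    nclasses=3,   # must match the --nclasses used in training
)
```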

@marieanselmet

Hello, I have the same problem here. If I train an Omnipose model with nclasses = 4, I need to specify nclasses = 4 at inference when using this model; otherwise I get the same error as above, since the default is now nclasses = 2. Why was this value chosen as the default for nclasses? And how much would it impair performance to lose two output branches when training an Omnipose model?
Thanks a lot!
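On the command line, the matching works the same way; a sketch reusing the flags already shown in the training command above (paths are placeholders, and flag support may vary by Omnipose version):

```
omnipose --dir /path/to/eval_images --use_gpu \
    --pretrained_model /path/to/your_trained_model \
    --nchan 2 --nclasses 4 --verbose
```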
