[New Models: adding PyTorch TorchVision's MaskRCNN_ResNet50_FPN_V2, FasterRCNN_ResNet50_FPN_V2 and RetinaNet_ResNet50_FPN_V2] #9653

Open
2 tasks done
medphisiker opened this issue Jan 19, 2023 · 0 comments

Models description

Hello.

There is a new and interesting version of the Mask R-CNN model in TorchVision (link):
maskrcnn_resnet50_fpn_v2, an improved Mask R-CNN model with a ResNet-50-FPN backbone from the paper Benchmarking Detection Transfer Learning with Vision Transformers.

The maskrcnn_resnet50_fpn_v2 model gives a noticeable improvement (link) in MS COCO metrics compared with the classic maskrcnn_resnet50_fpn.

image

I have looked at some fine-tuning examples. The code for fine-tuning maskrcnn_resnet50_fpn_v2 and maskrcnn_resnet50_fpn is identical.
MMDetection supports fine-tuning the classic TorchVision maskrcnn_resnet50_fpn. It would be great if MMDetection also supported the new TorchVision maskrcnn_resnet50_fpn_v2.

Describe the solution you'd like
It would be great if MMDetection also supported the new TorchVision maskrcnn_resnet50_fpn_v2. There are also updated versions of two other detectors: FasterRCNN_ResNet50_FPN_V2 and RetinaNet_ResNet50_FPN_V2.

image

P.S.
MMDetection already has many excellent detection networks. But it is important that Faster R-CNN and Mask R-CNN are multi-stage detectors, while most of the more accurate near-real-time detectors are single-stage.

In one competition I used YOLOv7, which has a higher detection metric on MS COCO (53). But the winners used the classic multi-stage Faster R-CNN, which scores only 37. It turned out that on a dataset with crowded objects, Faster R-CNN works better than the single-stage YOLOv7, even though the MS COCO metrics differ greatly in YOLOv7's favor.

Open source status

  • The model implementation is available
  • The model weights are available.

Provide useful links for the implementation

The improved Mask R-CNN v2 model with a ResNet-50-FPN backbone is described in the paper Benchmarking Detection Transfer Learning with Vision Transformers.
An implementation of this model is available in PyTorch TorchVision (link).
The [MaskRCNN_ResNet50_FPN_V2_Weights.COCO_V1] weights are available in TorchVision too (https://pytorch.org/vision/stable/models/generated/torchvision.models.detection.maskrcnn_resnet50_fpn_v2.html#torchvision.models.detection.MaskRCNN_ResNet50_FPN_V2_Weights).
Here is the link to the pull request (pytorch/vision#5773).
It seems that @datumbox is the author of the code.

fasterrcnn_resnet50_fpn_v2 constructs an improved Faster R-CNN model with a ResNet-50-FPN backbone from the paper Benchmarking Detection Transfer Learning with Vision Transformers.
An implementation of this model is available in PyTorch TorchVision (link).
The [FasterRCNN_ResNet50_FPN_V2_Weights.COCO_V1] weights are available in TorchVision too (https://pytorch.org/vision/stable/models/generated/torchvision.models.detection.fasterrcnn_resnet50_fpn_v2.html#torchvision.models.detection.FasterRCNN_ResNet50_FPN_V2_Weights).
Here is the link to the pull request (pytorch/vision#5763).
It seems that @datumbox is the author of the code.

There is no such information about RetinaNet_ResNet50_FPN_V2, but I think the TorchVision developers created it following the same principle.
An implementation of this model is available in PyTorch TorchVision (link).
The [RetinaNet_ResNet50_FPN_V2_Weights.COCO_V1] weights are available in TorchVision too (https://pytorch.org/vision/stable/models/generated/torchvision.models.detection.retinanet_resnet50_fpn_v2.html#torchvision.models.detection.RetinaNet_ResNet50_FPN_V2_Weights).
Here is the link to the pull request (pytorch/vision#5756).
It seems that @datumbox is the author of the code.

As I understand it, @datumbox created FasterRCNN_ResNet50_FPN_V2, MaskRCNN_ResNet50_FPN_V2 and RetinaNet_ResNet50_FPN_V2 following the same principle.
Perhaps you could improve the rest of the backbones available for these architectures in MMDetection in the same way =)
That would be just super )

@RangiLyu RangiLyu added the feature request Request new features label Jan 31, 2023