- The PyTorch model is directly converted to a native model built using the TensorRT API, rather than using torch_tensorrt. The converted model can run completely independently of PyTorch.
- Pure Python implementation that integrates seamlessly with the original repository; easy to use, with one-click conversion and per-model configuration files.
- A good example of using torch2trt to convert complex PyTorch models.
- Future updates will support more features.
- In the encoder, the msdeformattn operator is implemented as a separate custom CUDA operation, which blocks conventional conversion paths such as direct export to ONNX or TorchScript.
- In the decoder, the attn_mask parameter of PyTorch's native nn.MultiheadAttention does not accept 4D tensors with a batch dimension, so the input shapes differ from what the TensorRT implementation expects.
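To illustrate the decoder issue: a per-batch, per-head attention mask is naturally 4D, but `nn.MultiheadAttention` only accepts a 2D `(L, S)` or 3D `(N * num_heads, L, S)` mask, so the 4D mask has to be flattened first. A minimal sketch (shapes are illustrative, not taken from the repository):

```python
import torch
import torch.nn as nn

N, H, L, E = 2, 4, 5, 16  # batch, heads, sequence length, embed dim
mha = nn.MultiheadAttention(embed_dim=E, num_heads=H, batch_first=True)
q = k = v = torch.randn(N, L, E)

# A per-batch, per-head mask is naturally 4-D: (N, H, L, L) ...
mask_4d = torch.zeros(N, H, L, L, dtype=torch.bool)

# ... but nn.MultiheadAttention only accepts 2-D (L, L) or 3-D (N*H, L, L),
# so the 4-D mask must be flattened before the call:
mask_3d = mask_4d.reshape(N * H, L, L)
out, _ = mha(q, k, v, attn_mask=mask_3d)
```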
- Added MSDeformableAttnPlugin as a custom plugin to torch2trt.
- Implemented a multi-head attention in PyTorch that supports batched attn_mask tensors.
- Modified several model components and added numerous custom converter functions for torch2trt to ensure a smooth conversion.
- Integrated branch-free post-processing steps (no data-dependent if statements) into the model to further improve inference speed.
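The batched-mask attention mentioned above can be sketched as follows. This is an illustrative reimplementation, not the repository's exact code: it accepts a 4D boolean `attn_mask` of shape `(N, num_heads, L, S)` directly by relying on broadcasting over the head dimension.

```python
import torch

def mha_with_4d_mask(q, k, v, attn_mask, num_heads):
    """Minimal multi-head attention sketch accepting a 4-D boolean mask
    of shape (N, num_heads, L, S); True marks positions to be masked out."""
    N, L, E = q.shape
    d = E // num_heads  # per-head dimension

    def split(x):
        # (N, T, E) -> (N, num_heads, T, d)
        return x.view(x.shape[0], x.shape[1], num_heads, d).transpose(1, 2)

    qh, kh, vh = split(q), split(k), split(v)
    scores = qh @ kh.transpose(-2, -1) / d ** 0.5      # (N, H, L, S)
    scores = scores.masked_fill(attn_mask, float("-inf"))
    attn = scores.softmax(dim=-1)
    out = attn @ vh                                    # (N, H, L, d)
    return out.transpose(1, 2).reshape(N, L, E)
```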
- The tested version of TensorRT used in this repository is 8.6.1.6.
- The inference image sizes commonly used in the original Mask2Former repository are 800 and 1200. On machines with insufficient memory, conversion may fail with out-of-memory errors. It is recommended to adjust the MIN_SIZE_TEST and MAX_SIZE_TEST parameters in cfg.INPUT to reduce the model's input size.
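For example, the test-time resolution can be lowered before conversion through the detectron2-style config object that Mask2Former uses (the specific values below are illustrative, not defaults from this repository):

```python
# Sketch, assuming `cfg` is the detectron2 config already loaded
# (e.g. via get_cfg() + cfg.merge_from_file(...)).
cfg.INPUT.MIN_SIZE_TEST = 480  # shorter-side target at test time
cfg.INPUT.MAX_SIZE_TEST = 640  # cap on the longer side; lower both to save memory
```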
- Due to differences in operator implementation, there may be discrepancies in inference results compared to native PyTorch. If you encounter unacceptable discrepancies during use, please raise an issue for specific analysis.
- Do not use this repository for model training.
- Follow the official Mask2former library instructions to complete the installation of the native Mask2former.
- Clone my maintained torch2trt library and compile the newly added MSDeformableAttnPlugin.
git submodule init
git submodule update
cd torch2trt
Then change the paths of the TensorRT library and header files in the CMakeLists.txt of torch2trt to your own paths, and then compile and install torch2trt.
cmake -B build . && cmake --build build --target install && sudo ldconfig
python setup.py install
- Download the weights and test images, then run the script as shown below.
cd demo/
python demo_trt.py --config-file ../configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml \
--input input1.jpg \
[--other-options]
--opts MODEL.WEIGHTS /path/to/checkpoint_file
- Configuration file: panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml
- Input image size: 427 × 640
- Inference speed (RTX 3050, batch size 1):

| PyTorch 2.5 | TensorRT fp32 |
| --- | --- |
| 12.25 FPS | 20.36 FPS |
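The benchmark methodology is not specified in this repository; a common way to measure FPS is to warm up first, then average over many iterations. A generic sketch (for GPU inference, `run_once` should include a `torch.cuda.synchronize()` so the timing is accurate):

```python
import time

def measure_fps(run_once, warmup=10, iters=100):
    """Return frames per second for a zero-argument inference callable."""
    for _ in range(warmup):       # warm-up runs excluded from timing
        run_once()
    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    return iters / (time.perf_counter() - start)
```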
- Support Swin backbone (completed)
- Support semantic-segmentation models
- Complete testing and debugging for batch_size > 1
- fp16 int8 quantization
- Convert the mask2former_video model