[Bug] The pointPillars model got wrong output when I use TensorRT acceleration #1520
Comments
Some additional information: there are some warnings when I run tools/test.py:
2022-12-12 16:14:11,925 - mmdeploy - INFO - Successfully loaded tensorrt plugins from /home/yangkang/anaconda3/envs/ri_fusion/lib/python3.7/site-packages/mmdeploy/lib/libmmdeploy_tensorrt_ops.so
[ ] 1/3769, 0.5 task/s, elapsed: 2s, ETA: 7918s
2022-12-12 16:14:20,237 - mmdeploy - WARNING - Can not find torch._C._jit_pass_onnx_deduplicate_initializers, function rewrite will not be applied
[ ] 2/3769, 0.8 task/s, elapsed: 3s, ETA: 4826s
2022-12-12 16:14:20,700 - mmdeploy - WARNING - Can not find torch._C._jit_pass_onnx_deduplicate_initializers, function rewrite will not be applied
----------- AP11 Results ------------
Pedestrian AP11@0.50, 0.50, 0.50:
Overall AP11@easy, moderate, hard:
----------- AP40 Results ------------
Pedestrian AP40@0.50, 0.50, 0.50:
Overall AP40@easy, moderate, hard:
Let me try to reproduce it with TRT 8.5 later.
I have also tested PointPillars + CenterPoint on cu102 + TRT 8.4, and it passed.
Thanks for your reply.
With TRT 8.4.3.1 + CUDA 11.3, the running log is:
2022-12-17 10:19:01,704 - mmdeploy - INFO - Successfully loaded tensorrt plugins from /home/yangkang/anaconda3/envs/ri_fusion/lib/python3.7/site-packages/mmdeploy/lib/libmmdeploy_tensorrt_ops.so
[ ] 1/3769, 0.3 task/s, elapsed: 4s, ETA: 14365s
2022-12-17 10:19:10,382 - mmdeploy - WARNING - Can not find torch._C._jit_pass_onnx_deduplicate_initializers, function rewrite will not be applied
[ ] 2/3769, 0.5 task/s, elapsed: 4s, ETA: 7894s
2022-12-17 10:19:10,766 - mmdeploy - WARNING - Can not find torch._C._jit_pass_onnx_deduplicate_initializers, function rewrite will not be applied
[ ] 3/3769, 0.7 task/s, elapsed: 5s, ETA: 5740s
2022-12-17 10:19:11,143 - mmdeploy - WARNING - Can not find torch._C._jit_pass_onnx_deduplicate_initializers, function rewrite will not be applied
[ ] 4/3769, 0.8 task/s, elapsed: 5s, ETA: 4658s
2022-12-17 10:19:11,521 - mmdeploy - WARNING - Can not find torch._C._jit_pass_onnx_deduplicate_initializers, function rewrite will not be applied
----------- AP11 Results ------------
Pedestrian AP11@0.50, 0.50, 0.50:
I tested again on another server with CUDA 10.2 + TensorRT 8.4.3.1. The problem was gone and I got the correct results. Thanks a lot.
I have no TRT source code, just some assumptions.
Hi @ykqyzzs, I need your help on this.
The PyTorch version I used is 1.10.1.
I hit the same error with CUDA 11.4 + TensorRT 8.4.0.11. Have you found the reason?
Sorry, I have no idea. I have cross-verified some CUDA and TRT versions: CUDA 11.3 + TRT 8.4 also failed, while CUDA 10.2 + TRT 8.4.3.1 passed. Maybe you can give that a try.
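Since the results in this thread depend on the CUDA/TensorRT combination, it may help to post the exact versions alongside any new report. The following is a minimal sketch, not part of mmdeploy: the helper name collect_versions is hypothetical, and it only reads the standard __version__ attributes, torch.version.cuda, and checks whether the torch._C function named in the warning above is present.

```python
import importlib


def collect_versions():
    """Return a dict of library versions; libraries that fail to import map to None."""
    report = {}
    for name in ("torch", "tensorrt", "onnx", "mmdeploy"):
        try:
            module = importlib.import_module(name)
            report[name] = getattr(module, "__version__", "unknown")
        except Exception:
            # Not installed, or installed but broken (e.g. missing shared libraries).
            report[name] = None

    try:
        import torch

        # CUDA version that this PyTorch build was compiled against.
        report["torch_cuda"] = torch.version.cuda
        # The mmdeploy warning in the logs refers to this function being absent.
        report["has_dedup_pass"] = hasattr(
            torch._C, "_jit_pass_onnx_deduplicate_initializers"
        )
    except Exception:
        report["torch_cuda"] = None
        report["has_dedup_pass"] = None
    return report


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```

Running this on both a failing and a passing machine gives a side-by-side list of versions to compare.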
This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response.
This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now.
Checklist
Describe the bug
Hi,
I used mmdeploy/tools/test.py to test the converted PointPillars ONNX model. It finished successfully, but the AP results are abnormal:
----------- AP11 Results ------------
Pedestrian AP11@0.50, 0.50, 0.50:
bbox AP11:0.0000, 0.0000, 0.0000
bev AP11:0.0000, 0.0000, 0.0000
3d AP11:0.0000, 0.0000, 0.0000
aos AP11:0.00, 0.00, 0.00
Pedestrian AP11@0.50, 0.25, 0.25:
bbox AP11:0.0000, 0.0000, 0.0000
bev AP11:0.0000, 0.0000, 0.0000
3d AP11:0.0000, 0.0000, 0.0000
aos AP11:0.00, 0.00, 0.00
Cyclist AP11@0.50, 0.50, 0.50:
bbox AP11:0.0000, 0.0000, 0.0000
bev AP11:0.0000, 0.0000, 0.0000
3d AP11:0.0000, 0.0000, 0.0000
aos AP11:0.00, 0.00, 0.00
Cyclist AP11@0.50, 0.25, 0.25:
bbox AP11:0.0000, 0.0000, 0.0000
bev AP11:0.0000, 0.0000, 0.0000
3d AP11:0.0000, 0.0000, 0.0000
aos AP11:0.00, 0.00, 0.00
Car AP11@0.70, 0.70, 0.70:
bbox AP11:0.0000, 9.0909, 9.0909
bev AP11:0.0000, 9.0909, 9.0909
3d AP11:0.0000, 9.0909, 9.0909
aos AP11:0.00, 9.09, 9.09
Car AP11@0.70, 0.50, 0.50:
bbox AP11:0.0000, 9.0909, 9.0909
bev AP11:0.0000, 9.0909, 9.0909
3d AP11:0.0000, 9.0909, 9.0909
aos AP11:0.00, 9.09, 9.09
Overall AP11@easy, moderate, hard:
bbox AP11:0.0000, 3.0303, 3.0303
bev AP11:0.0000, 3.0303, 3.0303
3d AP11:0.0000, 3.0303, 3.0303
aos AP11:0.00, 3.03, 3.03
----------- AP40 Results ------------
Pedestrian AP40@0.50, 0.50, 0.50:
bbox AP40:0.0000, 0.0000, 0.0000
bev AP40:0.0000, 0.0000, 0.0000
3d AP40:0.0000, 0.0000, 0.0000
aos AP40:0.00, 0.00, 0.00
Pedestrian AP40@0.50, 0.25, 0.25:
bbox AP40:0.0000, 0.0000, 0.0000
bev AP40:0.0000, 0.0000, 0.0000
3d AP40:0.0000, 0.0000, 0.0000
aos AP40:0.00, 0.00, 0.00
Cyclist AP40@0.50, 0.50, 0.50:
bbox AP40:0.0000, 0.0000, 0.0000
bev AP40:0.0000, 0.0000, 0.0000
3d AP40:0.0000, 0.0000, 0.0000
aos AP40:0.00, 0.00, 0.00
Cyclist AP40@0.50, 0.25, 0.25:
bbox AP40:0.0000, 0.0000, 0.0000
bev AP40:0.0000, 0.0000, 0.0000
3d AP40:0.0000, 0.0000, 0.0000
aos AP40:0.00, 0.00, 0.00
Car AP40@0.70, 0.70, 0.70:
bbox AP40:0.0000, 2.5000, 2.5000
bev AP40:0.0000, 2.5000, 2.5000
3d AP40:0.0000, 2.5000, 2.5000
aos AP40:0.00, 2.50, 2.50
Car AP40@0.70, 0.50, 0.50:
bbox AP40:0.0000, 2.5000, 2.5000
bev AP40:0.0000, 2.5000, 2.5000
3d AP40:0.0000, 2.5000, 2.5000
aos AP40:0.00, 2.50, 2.50
Overall AP40@easy, moderate, hard:
bbox AP40:0.0000, 0.8333, 0.8333
bev AP40:0.0000, 0.8333, 0.8333
3d AP40:0.0000, 0.8333, 0.8333
aos AP40:0.00, 0.83, 0.83
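A side note on the nonzero Car numbers: they match simple fractions of the number of recall sample points (11 for AP11, 40 for AP40), and the Overall rows match the mean over the three classes. The sketch below only reproduces that arithmetic. Reading it as "a single sample point contributing precision 1.0" is an assumption about the cause, not something confirmed in this thread.

```python
# Car values: 100 / (number of recall sample points)
car_ap11 = 100 / 11
car_ap40 = 100 / 40

# Overall values: mean over Pedestrian (0), Cyclist (0) and Car
overall_ap11 = (0.0 + 0.0 + car_ap11) / 3
overall_ap40 = (0.0 + 0.0 + car_ap40) / 3

print(f"Car AP11:     {car_ap11:.4f}")
print(f"Car AP40:     {car_ap40:.4f}")
print(f"Overall AP11: {overall_ap11:.4f}")
print(f"Overall AP40: {overall_ap40:.4f}")
```

So these values look like an artifact of how the metric is averaged rather than evidence of real detections, which is consistent with the empty outputs.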
Then I checked the data_loader, and the inputs seem correct. I also checked the ONNX model: it seems OK, and its Netron visualization is almost the same as that of the PointPillars ONNX file from this link (which I found in the issue NVIDIA/TensorRT#2338):
https://drive.google.com/file/d/1FuZJWLIsJyUsUk_lM1euXzyPgagu-tXj/view?usp=sharing
I have tried both ONNX files, converting each into a .engine file and testing it, but got the same results.
So I printed the outputs and found that outputs = task_processor.single_gpu_test(model, data_loader, args.show, args.show_dir) returned empty outputs such as:
{'boxes_3d': LiDARInstance3DBoxes(
tensor([], size=(0, 7))), 'scores_3d': tensor([]), 'labels_3d': tensor([], dtype=torch.int64)}, {'boxes_3d': LiDARInstance3DBoxes(
tensor([], size=(0, 7))), 'scores_3d': tensor([]), 'labels_3d': tensor([], dtype=torch.int64)}, {'boxes_3d': LiDARInstance3DBoxes(
tensor([], size=(0, 7))), 'scores_3d': tensor([]), 'labels_3d': tensor([], dtype=torch.int64)}, {'boxes_3d': LiDARInstance3DBoxes(
tensor([], size=(0, 7))), 'scores_3d': tensor([]), 'labels_3d': tensor([], dtype=torch.int64)}, {'boxes_3d': LiDARInstance3DBoxes(
tensor([], size=(0, 7))), 'scores_3d': tensor([]), 'labels_3d': tensor([], dtype=torch.int64)}, {'boxes_3d': LiDARInstance3DBoxes(
tensor([], size=(0, 7))), 'scores_3d': tensor([]), 'labels_3d': tensor([], dtype=torch.int64)}, ... ...
So I would like to ask: what may be the cause of the wrong test results, and how can I solve it?
Thanks very much!
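To quantify the symptom rather than eyeball the printed list, the empty predictions can be counted. This is a minimal sketch: summarize_detections is a hypothetical helper, and it assumes outputs is a list of dicts with a scores_3d entry that supports len(), as in the dump above.

```python
def summarize_detections(outputs):
    """Return (number of samples with zero detections, total number of samples)."""
    total = len(outputs)
    empty = sum(1 for result in outputs if len(result["scores_3d"]) == 0)
    return empty, total


if __name__ == "__main__":
    # Stand-in data using plain lists; real results hold tensors in the same keys.
    fake_outputs = [
        {"scores_3d": []},
        {"scores_3d": [0.9, 0.4]},
        {"scores_3d": []},
    ]
    empty, total = summarize_detections(fake_outputs)
    print(f"{empty}/{total} samples have no detections")
```

If every sample is empty on the TensorRT engine but not on the PyTorch model, that points at the engine or its runtime environment rather than the data loader.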
Reproduction
I use this command to convert model:
python mmdeploy/tools/deploy.py mmdeploy/configs/mmdet3d/voxel-detection/voxel-detection_tensorrt_dynamic-kitti-32x4.py mmdetection3d/configs/pointpillars/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class.py checkpoints/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class_20220301_150306-37dc2420.pth mmdetection3d/demo/data/kitti/kitti_000008.bin --work-dir work-dir2 --device cuda:0 --show
I use this command to test the converted model:
python ../mmdeploy/tools/test.py ../mmdeploy/configs/mmdet3d/voxel-detection/voxel-detection_tensorrt_dynamic-kitti-32x4.py ./configs/pointpillars/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class.py --model ../mmdeploy/work-dir2/end2end.engine --metrics bbox --device cuda:0
Environment
Error traceback