Add FP16 support to batchedNMSPlugin #1002
Conversation
Thanks @pbridger. Can you please sign your commits with a Signed-off-by line?
@rajeevsrao Does this need to be integrated into master as well?
Yes, but we will cherry-pick it internally first, test and release it to master via 21.0x, and then merge this change.
(Commits on this PR carry Signed-off-by lines from Tyler Zhu, Paul Bridger, and Rajeev Rao.)
Thanks @pbridger. Will merge this commit once the corresponding cherry-pick for master is posted along with the 21.02 container update.
Cool! Many thanks @rajeevsrao and @pranavm-nvidia for all your work to include this.
Add support for float16 boxes and scores (but not mixed precision between boxes and scores).
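For context, a minimal sketch of how a TensorRT plugin can express this constraint through `supportsFormatCombination`. This is illustrative only, not the actual batchedNMSPlugin source; it assumes input 0 is boxes and input 1 is scores, and it elides the plugin's integer outputs (such as the kept-detection count) for brevity:

```cpp
#include "NvInfer.h"

// Illustrative sketch only -- not the actual batchedNMSPlugin implementation.
// Assumes input 0 is boxes and input 1 is scores; integer outputs are elided.
bool supportsFormatCombination(int32_t pos, const nvinfer1::PluginTensorDesc* inOut,
                               int32_t nbInputs, int32_t nbOutputs) noexcept
{
    const nvinfer1::PluginTensorDesc& desc = inOut[pos];
    // Only linear (non-vectorized) layouts are considered here.
    if (desc.format != nvinfer1::TensorFormat::kLINEAR)
    {
        return false;
    }
    // Boxes and scores may each be FP32 or FP16 ...
    if (desc.type != nvinfer1::DataType::kFLOAT && desc.type != nvinfer1::DataType::kHALF)
    {
        return false;
    }
    // ... but mixed precision between boxes (pos 0) and scores (pos 1) is rejected.
    return pos == 0 || desc.type == inOut[0].type;
}
```

With this check, the builder can select an all-FP32 or all-FP16 combination for boxes and scores, but never one of each.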
Gives inference results within expected numerical tolerances on SSD300 with the COCO2017 validation set (as detailed at https://paulbridger.com/posts/tensorrt-object-detection-quantized/).
Despite the `#if __CUDA_ARCH__ >= 800` guard, this has not been tested on Ampere, only on the Turing architecture.
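For reference, this is the kind of guard the description refers to: device code can branch on `__CUDA_ARCH__` so that half-precision intrinsics introduced for compute capability 8.0 are only compiled for sm_80 and newer, with a float round-trip fallback elsewhere. The helper below is a hypothetical sketch, not the plugin's actual kernel code:

```cpp
#include <cuda_fp16.h>

// Hypothetical helper, not the plugin's actual kernel code: __hmax() is only
// available as a native intrinsic when compiling for compute capability 8.0+
// (Ampere), so older architectures fall back to a float round-trip.
__device__ inline __half half_max(__half a, __half b)
{
#if __CUDA_ARCH__ >= 800
    return __hmax(a, b);  // native half-precision max on Ampere and newer
#else
    return __float2half(fmaxf(__half2float(a), __half2float(b)));
#endif
}
```

Note that on Turing (sm_75) it is the `#else` branch that actually executes, which is consistent with the statement that only the Turing path has been exercised.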