RT-DETR from DeepStream-Yolo bugs #559

Closed
Borntowarn opened this issue Nov 19, 2023 · 13 comments · Fixed by #570

Borntowarn commented Nov 19, 2023

I successfully ran the example from samples/intersection_traffic_meter with the standard yolov8, and added some other YOLO models from the DeepStream-Yolo repo by following its guide. But there is a problem with the RT-DETR model. I exported it according to the instructions and it works well on its own, but when I start the Savant module I get this error: ValueError: The key 'people' is expected to be fully qualified name of the form 'model_name.object_label'. INFO insight::savant::intersection_traffic_meter'. It always happens as soon as any object is detected on the frame.

If I disable every unit except the model and add the standard DrawFunc, DETR works correctly and bounding boxes are drawn on the output frames. module.yml and result:

name: intersection_traffic_meter

parameters:
  frame:
    width: 1920
    height: 1080
  output_frame:
    codec: ${oc.env:CODEC, 'h264'}
  draw_func: {}
  detected_object_label: transport
  send_stats: True
  batch_size: 1

pipeline:

  elements:
    - element: nvinfer@detector
      name: detr
      model:
        format: onnx
        model_file: detr.onnx
        config_file: config_infer_primary_rtdetr.txt
        workspace_size: 6144

image

I also tried running a clean module in devcontainers from the template and faced a similar problem: any YOLO model runs successfully, while RT-DETR crashes on the example image with these logs:

ERROR python::exception> [trace_id=568a09a792ac1ea9755984eb05ed717e, python.exception.value=The key 'tie' is expected to be fully qualified name of the form 'model_name.object_label'., python.exception.type=<class 'ValueError'>, python.exception.traceback=Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/savant/deepstream/pipeline.py", line 754, in update_frame_meta
    self._update_meta_for_single_frame(
  File "/usr/local/lib/python3.8/dist-packages/savant/deepstream/pipeline.py", line 781, in _update_meta_for_single_frame
    obj_meta, parent_id = nvds_obj_meta_output_converter(
  File "/usr/local/lib/python3.8/dist-packages/savant/deepstream/metadata.py", line 33, in nvds_obj_meta_output_converter
    model_name, label = parse_compound_key(nvds_obj_meta.obj_label)
, python.version=3.8.10 (default, May 26 2023, 14:05:08) 
[GCC 9.4.0]] Exception occurred
Aliases for entries in sys.path:
    <distpkg>: /usr/local/lib/python3.8/dist-packages
Traceback (most recent call last):
    <distpkg> /savant/deepstream/pipeline.py:754  update_frame_meta self._update_meta_for_single_frame(
    <distpkg> /savant/deepstream/pipeline.py:781  _update_meta_for_single_frame   obj_meta, parent_id = nvds_obj_meta_output_converter(
    <distpkg> /savant/deepstream/metadata.py:33   nvds_obj_meta_output_converter  model_name, label = parse_compound_key(nvds_obj_meta.obj_label)
ValueError: The key 'tie' is expected to be fully qualified name of the form 'model_name.object_label'
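
The traceback shows the object label being passed to `parse_compound_key`, which expects the form 'model_name.object_label'. A minimal sketch of that kind of check follows. It is a hypothetical stand-in written for illustration (the name `parse_compound_key_sketch` and its body are assumptions), not Savant's actual implementation; it only shows why a bare label such as 'tie' is rejected while 'detr.transport' is accepted.

```python
from typing import Tuple


def parse_compound_key_sketch(key: str) -> Tuple[str, str]:
    """Hypothetical stand-in: split 'model_name.object_label' into two parts."""
    parts = key.split(".")
    # Exactly two non-empty parts are required, otherwise the key is rejected.
    if len(parts) != 2 or not all(parts):
        raise ValueError(
            f"The key '{key}' is expected to be fully qualified name "
            "of the form 'model_name.object_label'."
        )
    return parts[0], parts[1]


# A label that carries the model name parses cleanly.
print(parse_compound_key_sketch("detr.transport"))

# A bare label, like the one in the traceback, raises ValueError.
try:
    parse_compound_key_sketch("tie")
except ValueError as exc:
    print(exc)
```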

Also, in the template module I get different logs in Jaeger for YOLO and DETR:
YOLO
image

DETR
image

Logs from template module from devcontainers:
logs_detr.txt
logs_yolo.txt

bwsw commented Nov 19, 2023

There are two possible bugs:

  • exception
  • invalid parent span (maybe connected with the exception)

Borntowarn commented

@bwsw Are you able to run DETR normally?

bwsw added the maybebug label Nov 20, 2023
bwsw added this to the 0.2.7 milestone Nov 20, 2023
abramov-oleg commented

> @bwsw Are you able to run DETR normally?

@Borntowarn Hello. I've implemented a quick and easy Savant module that uses RT-DETR model. You can take a look at it here: #562.

I haven't been able to reproduce the bug so far.

It's not really clear from the issue description when the bug manifests. Judging by the quoted module.yml and the screenshot, your RT-DETR module works fine when there's only the detector unit + default drawfunc in the pipeline. Is that correct?

abramov-oleg commented

@Borntowarn

OK, I think I've got it. The intersection_traffic_meter sample module crashes with the reported message if I replace the yolov8m detector with the RT-DETR detector. I'm looking into it.

Borntowarn commented

@abramov-oleg
Yep, thank you for the help.

abramov-oleg commented Nov 21, 2023

@Borntowarn

Can you please also share the full module config that manifests the bug, i.e. how you configured the RT-DETR detector unit?

And the nvinfer config file (something like config_infer_primary_rtdetr.txt) if it is used.

Thank you.

Borntowarn commented Nov 21, 2023

@abramov-oleg
I tried changing

#gie-unique-id=1
#process-mode=1

but nothing changed.

config_infer_primary_rtdetr.txt:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0
onnx-file=detr.onnx
model-engine-file=model_b1_gpu0_fp16.engine
#int8-calib-file=calib.table
labelfile-path=labels.txt
batch-size=1
network-mode=2
num-detected-classes=80
interval=0
#gie-unique-id=1
#process-mode=1
network-type=0
cluster-mode=2
maintain-aspect-ratio=0
#workspace-size=2000
parse-bbox-func-name=NvDsInferParseYolo
#parse-bbox-func-name=NvDsInferParseYoloCuda
custom-lib-path=/opt/savant/lib/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet

[class-attrs-all]
nms-iou-threshold=0.45
pre-cluster-threshold=0.25
topk=300

abramov-oleg commented

@Borntowarn

> I tried changing
>
> #gie-unique-id=1
> #process-mode=1
>
> but nothing changed.

This is expected, as these properties are managed by Savant.

Thank you for sharing the nvinfer config file. Nothing unexpected there, but it helps to clear up the case.

It will also be helpful to see your RT-DETR unit config from the module.yml that produces the bug.

abramov-oleg linked a pull request Nov 22, 2023 that will close this issue

Borntowarn commented Nov 22, 2023

@abramov-oleg
Thank you for the answer! I've just updated the model name and config in the basic yolov8 module.
Here is the detector module.yml:

parameters:
  ...
  detected_object_label: transport

elements:

    - element: pyfunc
      module: module.src.line_crossing
      class_name: ConditionalDetectorSkip
      kwargs:
        config_path: ${oc.env:PROJECT_PATH}/module/configs/polygon_config.yml

    - element: nvinfer@detector
      name: detr
      model:
        format: onnx
        model_file: detr.onnx
        config_file: config_infer_primary_rtdetr.txt
        # max GPU RAM used to build the engine, 6GB by default
        # set lower than total GPU RAM available on your hardware
        workspace_size: 6144
        output:
          objects:
            # COCO bicycle
            - class_id: 1
              label: ${parameters.detected_object_label}
              selector:
                module: savant.selector.detector
                class_name: BBoxSelector
                kwargs:
                  confidence_threshold: 0.2
            # COCO car
            - class_id: 2
              label: ${parameters.detected_object_label}
              selector:
                module: savant.selector.detector
                class_name: BBoxSelector
                kwargs:
                  confidence_threshold: 0.2
            # COCO motorcycle
            - class_id: 3
              label: ${parameters.detected_object_label}
              selector:
                module: savant.selector.detector
                class_name: BBoxSelector
                kwargs:
                  confidence_threshold: 0.2
            # COCO bus
            - class_id: 5
              label: ${parameters.detected_object_label}
              selector:
                module: savant.selector.detector
                class_name: BBoxSelector
                kwargs:
                  confidence_threshold: 0.2
            # COCO truck
            - class_id: 7
              label: ${parameters.detected_object_label}
              selector:
                module: savant.selector.detector
                class_name: BBoxSelector
                kwargs:
                  confidence_threshold: 0.2
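
The five `objects` entries above differ only in `class_id`; the label and selector are the same for each. As a rough illustration of that structure, here is a plain-Python sketch that builds the same list of entries. The helper `build_output_objects` and the `VEHICLE_CLASSES` mapping are hypothetical names introduced for this example; Savant reads the YAML directly, and nothing here is part of its API.

```python
from typing import Any, Dict, Iterable, List

# Class ids and names copied from the comments in the module.yml above.
VEHICLE_CLASSES = {1: "bicycle", 2: "car", 3: "motorcycle", 5: "bus", 7: "truck"}


def build_output_objects(
    class_ids: Iterable[int], label: str, confidence_threshold: float = 0.2
) -> List[Dict[str, Any]]:
    """Hypothetical helper: one output entry per class id, sharing label and selector."""
    return [
        {
            "class_id": class_id,
            "label": label,
            "selector": {
                "module": "savant.selector.detector",
                "class_name": "BBoxSelector",
                "kwargs": {"confidence_threshold": confidence_threshold},
            },
        }
        for class_id in sorted(class_ids)
    ]


objects = build_output_objects(VEHICLE_CLASSES, "transport")
# Every listed class is mapped to the same label; all other classes are not listed.
print([(obj["class_id"], obj["label"]) for obj in objects])
```

Note that only five of the 80 classes declared in `num-detected-classes` are listed, so the detector can still emit classes that have no entry here.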

I noticed that the model crashes when it detects any object that is not among the nvinfer detection labels.

Borntowarn commented

I looked into this problem a bit more. If I set num_detected_classes=1 and add only label 0 (for detected people) to the module, it doesn't crash on subsequent detections. But with num_detected_classes=3 (which adds cars and bicycles from labels.txt) it crashes. So if any detected class is not listed in the output param, the module crashes.

Hope it helps.
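
The observation above can be modelled with a small simulation. It rests on an assumption inferred from this thread, not taken from Savant's source or the fix: a detection whose class is listed under `output.objects` ends up with a 'model_name.label' key, while an unlisted class keeps the bare label from labels.txt. All names and the class ids in `RAW_LABELS` are hypothetical and chosen for illustration only.

```python
from typing import List

MODEL_NAME = "detr"

# class_id -> label configured under `output.objects` in the module.yml above.
CONFIGURED = {1: "transport", 2: "transport", 3: "transport", 5: "transport", 7: "transport"}

# Illustrative subset of a labels file; the ids here are assumptions.
RAW_LABELS = {0: "person", 1: "bicycle", 2: "car", 7: "truck", 27: "tie"}


def effective_label(class_id: int) -> str:
    """Label an object carries downstream in this simplified model."""
    if class_id in CONFIGURED:
        return f"{MODEL_NAME}.{CONFIGURED[class_id]}"
    # Assumption: an unlisted class keeps its bare label.
    return RAW_LABELS[class_id]


def is_fully_qualified(label: str) -> bool:
    """True when the label has the form 'model_name.object_label'."""
    parts = label.split(".")
    return len(parts) == 2 and all(parts)


def unqualified_detections(class_ids: List[int]) -> List[str]:
    """Labels that a strict 'model_name.object_label' parser would reject."""
    return [lbl for lbl in map(effective_label, class_ids) if not is_fully_qualified(lbl)]


print(unqualified_detections([2, 7]))   # only listed classes: nothing is rejected
print(unqualified_detections([2, 27]))  # an unlisted class slips through with a bare label
```

Under this model, a frame containing only listed classes passes, and a single unlisted detection is enough to trigger the error, which matches the behaviour reported above.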

abramov-oleg commented

@Borntowarn

> I looked into this problem a bit more. If I set num_detected_classes=1 and add only label 0 (for detected people) to the module, it doesn't crash on subsequent detections. But with num_detected_classes=3 (which adds cars and bicycles from labels.txt) it crashes. So if any detected class is not listed in the output param, the module crashes.
>
> Hope it helps.

Thank you for sharing the detector unit config and your findings. They confirm the source of the problem for me.

The fix is ready and currently in review. Should be merged soon.

Borntowarn commented Nov 23, 2023

@abramov-oleg

> The fix is ready and currently in review. Should be merged soon.

Thanks a lot! I have already cloned your PR and tested it. It works fine! Should I wait for the PR to be merged, or can I close the issue now?

bwsw commented Nov 23, 2023

We close automatically when it merges.

bwsw closed this as completed in #570 Nov 28, 2023
bwsw removed this from the 0.2.7 milestone Feb 7, 2024