Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Component manager load node service queue overflows and fails to launch some nodes #2362

Closed
xmfcx opened this issue Nov 8, 2023 · 1 comment · Fixed by #2363
Closed

Component manager load node service queue overflows and fails to launch some nodes #2362

xmfcx opened this issue Nov 8, 2023 · 1 comment · Fixed by #2363

Comments

@xmfcx
Copy link
Contributor

xmfcx commented Nov 8, 2023

Bug report

Required Info:

  • Operating System:
    • Ubuntu 22.04
  • Installation type:
    • binaries
  • Version or commit hash:
    • humble
  • DDS implementation:
    • CycloneDDS
  • Client library (if applicable):
    • rclcpp_components

Steps to reproduce issue

import launch
from launch_ros.actions import ComposableNodeContainer
from launch_ros.descriptions import ComposableNode
from launch_ros.actions import LoadComposableNodes


def generate_launch_description():
    container = ComposableNodeContainer(
            name='my_container',
            package='rclcpp_components',
            executable='component_container',
            namespace='',
            output='log',
    )

    loaders = []
    for i in range(30):
      
        node = ComposableNode(
            package='composition',
            plugin='composition::Talker',
            name='talker_' + str(i),
        )
        
        loaders.append(
          LoadComposableNodes(
                composable_node_descriptions=[node],
                target_container="my_container",
            )
        )
    
    return launch.LaunchDescription([container, *loaders])
  • Save this launch file to a file called composition_overload.py.
  • Source ROS 2 humble: source /opt/ros/humble/setup.bash
  • Run the launch file: ros2 launch composition_overload.py
  • Run ros2 component list

Expected behavior

To see all talkers loaded into the container (30 for this case):

$ ros2 component list
/my_container
  1  /talker_0
  2  /talker_1
  3  /talker_2
  4  /talker_3
  5  /talker_4
... (omitted for clarity)
 25  /talker_24
 26  /talker_25
 27  /talker_26
 28  /talker_27
 29  /talker_28
 30  /talker_29

Actual behavior

About 2/3 is loaded.

$ ros2 component list
/my_container
  1  /talker_0
  2  /talker_1
  3  /talker_2
  4  /talker_3
  5  /talker_4
  6  /talker_11
  7  /talker_12
  8  /talker_28
  9  /talker_20
  10  /talker_21
  11  /talker_22
  12  /talker_23
  13  /talker_24
  14  /talker_19
  15  /talker_25
  16  /talker_26
  17  /talker_27
  18  /talker_29

Additional information

@VRichardJP has found out that

loadNode_srv_ = create_service<LoadNode>(
"~/_container/load_node",
std::bind(&ComponentManager::on_load_node, this, _1, _2, _3));
was the cause of this issue.

Simultaneous calls fill up the QoS queue of the loadNode_srv_ service, which has 10 queue size by default.

Related issue:

Potential solutions

Solution A

Increase the default queue size for these load time critical services (loadNode_srv_, unloadNode_srv_) to a large number like 1000?
Pros:

  • This wouldn't change the interface
  • Probably higher numbers won't be needed, since I think this is the first time this issue is reported?
  • The service LoadNode.srv is a small service, so having a lot of items to the queue doesn't increase the computational burden a lot.

Cons:

  • If for some reason someone needs a larger queue, we would have to repeat this issue.
Solution B

Parameterize the queue size for loadNode_srv_, unloadNode_srv_.
Pros:

  • Can be set to any number by the user
  • With default values, it would act the same

Cons:

  • Now it takes a parameter, launch interface changes (backward compatible)

We are open to any other solutions, ideas and additional design considerations while implementing this to rolling.

cc. @mitsudome-r @isamu-takagi @wep21 @esteve

@clalancette
Copy link
Contributor

Increase the default queue size for these load time critical services (loadNode_srv_, unloadNode_srv_) to a large number like 1000?

I think we should try this to start with, though I'm going to suggest we start a bit more conservatively. The current queue size is 10, so I think if we bumped it to 100 or 200 to start with, that should be more than enough. If you'd like to open a PR doing that, I'd be happy to review it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants