Terminate called after throwing an instance of 'rclcpp::exceptions::RCLError' with Nav2 #8

Closed
ruffsl opened this issue Mar 3, 2021 · 6 comments
Labels
bug Something isn't working

Comments

@ruffsl
Member

ruffsl commented Mar 3, 2021

System Info

  • OS
    • Ubuntu 20.04
  • ROS version and installation type
    • foxy/binary
  • RTI Connext DDS version and installation type
    • rti_connext_dds-6.0.1-eval-x64Linux3gcc5.4.0.run
  • RMW version or commit hash

Bug Description

The Nav2 stack crashes immediately upon startup when using rmw_connextdds.

Expected Behavior

The nav2 stack works as expected when using rmw_cyclonedds or rmw_fastrtps_cpp, at least when DDS Security is disabled.

How to Reproduce

Clone the turtlebot3_demo repo from the ros security working group:

https://github.com/ros-swg/turtlebot3_demo.git

Apply the following patch to enable and use rmw_connextdds:

diff --git a/Dockerfile b/Dockerfile
index 18c71f1..21a4194 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -37,20 +37,20 @@ RUN apt-get update && apt-get install -y \
       vim \
     && rm -rf /var/lib/apt/lists/*
 
-# # install RTI Connext DDS
-# # set up environment
-# ENV NDDSHOME /opt/rti.com/rti_connext_dds-6.0.1
-# WORKDIR $NDDSHOME
-# COPY ./rti ./
-# RUN yes | ./rti_connext_dds-6.0.1-eval-x64Linux3gcc5.4.0.run && \
-#     mv y/*/* ./ && rm -rf y
-# # set RTI DDS environment
-# ENV CONNEXTDDS_DIR $NDDSHOME
-# ENV PATH "$NDDSHOME/bin":$PATH
-# ENV LD_LIBRARY_PATH "$NDDSHOME/lib/x64Linux3gcc5.4.0":$LD_LIBRARY_PATH
-# # set RTI openssl environment
-# ENV PATH "$NDDSHOME/third_party/openssl-1.1.1d/x64Linux4gcc7.3.0/release/bin":$PATH
-# ENV LD_LIBRARY_PATH "$NDDSHOME/third_party/openssl-1.1.1d/x64Linux4gcc7.3.0/release/lib":$LD_LIBRARY_PATH
+# install RTI Connext DDS
+# set up environment
+ENV NDDSHOME /opt/rti.com/rti_connext_dds-6.0.1
+WORKDIR $NDDSHOME
+COPY ./rti ./
+RUN yes | ./rti_connext_dds-6.0.1-eval-x64Linux3gcc5.4.0.run && \
+    mv y/*/* ./ && rm -rf y
+# set RTI DDS environment
+ENV CONNEXTDDS_DIR $NDDSHOME
+ENV PATH "$NDDSHOME/bin":$PATH
+ENV LD_LIBRARY_PATH "$NDDSHOME/lib/x64Linux3gcc5.4.0":$LD_LIBRARY_PATH
+# set RTI openssl environment
+ENV PATH "$NDDSHOME/third_party/openssl-1.1.1d/x64Linux4gcc7.3.0/release/bin":$PATH
+ENV LD_LIBRARY_PATH "$NDDSHOME/third_party/openssl-1.1.1d/x64Linux4gcc7.3.0/release/lib":$LD_LIBRARY_PATH
 
 # install overlay dependencies
 ARG OVERLAY_WS
diff --git a/configs/secure.conf b/configs/secure.conf
index b77a0d0..bd9166d 100644
--- a/configs/secure.conf
+++ b/configs/secure.conf
@@ -6,7 +6,7 @@ send-keys 'glances' Enter
 
 setenv FOO "foo"
 # setenv RMW_IMPLEMENTATION rmw_connext_cpp
-# setenv RMW_IMPLEMENTATION rmw_connextdds
+setenv RMW_IMPLEMENTATION rmw_connextdds
 setenv ROS_SECURITY_ENABLE true
 setenv ROS_SECURITY_STRATEGY Enforce
 setenv ROS_SECURITY_KEYSTORE $TB3_DEMO_DIR/keystore
diff --git a/configs/unsecure.conf b/configs/unsecure.conf
index 013fd54..9a078f8 100644
--- a/configs/unsecure.conf
+++ b/configs/unsecure.conf
@@ -6,6 +6,6 @@ send-keys 'glances' Enter
 
 setenv FOO "foo"
 # setenv RMW_IMPLEMENTATION rmw_connext_cpp
-# setenv RMW_IMPLEMENTATION rmw_connextdds
+setenv RMW_IMPLEMENTATION rmw_connextdds
 
 source configs/common.conf
diff --git a/overlay/overlay.repos b/overlay/overlay.repos
index 7ab9b81..73be2a7 100644
--- a/overlay/overlay.repos
+++ b/overlay/overlay.repos
@@ -1,8 +1,8 @@
 repositories:
-  # rticommunity/rmw_connextdds:
-  #   type: git
-  #   url: https://github.com/rticommunity/rmw_connextdds.git
-  #   version: master
+  rticommunity/rmw_connextdds:
+    type: git
+    url: https://github.com/rticommunity/rmw_connextdds.git
+    version: master
   # ROBOTIS-GIT/turtlebot3:
   #   type: git
   #   # url: https://github.com/ROBOTIS-GIT/turtlebot3.git

Download the rti_connext_dds-6.0.1-eval-x64Linux3gcc5.4.0.run setup file and place it, together with your rti_license.dat file, in the rti folder of the repo.
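
Since the Dockerfile copies ./rti into the image, a missing installer or license only surfaces partway through the build. A minimal preflight sketch in Python, assuming it is run from the repo root; the helper name `missing_rti_files` is hypothetical, and only the two file names come from the instructions above:

```python
from pathlib import Path

# File names as given in the reproduction steps; everything else is illustrative.
REQUIRED_RTI_FILES = (
    "rti_connext_dds-6.0.1-eval-x64Linux3gcc5.4.0.run",
    "rti_license.dat",
)


def missing_rti_files(rti_dir):
    """Return the required file names that are not present in rti_dir."""
    root = Path(rti_dir)
    return [name for name in REQUIRED_RTI_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_rti_files("rti")
    if missing:
        print("missing from ./rti:", ", ".join(missing))
    else:
        print("rti folder looks complete")
```

Running it before `docker build` is optional; it only checks that the files exist, not that the license is valid.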

Follow the instructions in Setting the Demo to install docker, rocker, etc. Then build and run the demo either with or without DDS Security, using byobu to switch to the navigation tmux tab to view the stdout or debug the launch of the nav2 stack.

docker build . -t rosswg/turtlebot3_demo
rocker --x11 --nvidia rosswg/turtlebot3_demo "byobu -f configs/unsecure.conf attach"
rocker --x11 --nvidia rosswg/turtlebot3_demo "byobu -f configs/secure.conf attach"

Workarounds

I recall this same demo working with rmw_connext_cpp, at least without DDS Security enabled. This can be tested again by building the docker image using rmw_connext_cpp and Connext 5.3.1 instead:

diff --git a/Dockerfile b/Dockerfile
index 18c71f1..ebd1479 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -73,27 +73,27 @@ RUN . /opt/ros/$ROS_DISTRO/setup.sh && \
       --symlink-install \
       --mixin $OVERLAY_MIXINS
 
-# # install RTI Connext
-# ENV RTI_NC_LICENSE_ACCEPTED yes
-# RUN apt-get update && apt-get install -y \
-#       ros-$ROS_DISTRO-rmw-connext-cpp \
-#     && rm -rf /var/lib/apt/lists/*
-# # set up environment
-# ENV NDDSHOME /opt/rti.com/rti_connext_dds-5.3.1
-# ENV PATH "$NDDSHOME/bin":$PATH
-# ENV LD_LIBRARY_PATH "$NDDSHOME/lib/x64Linux3gcc5.4.0":$LD_LIBRARY_PATH
-# # install RTI Security
-# WORKDIR $NDDSHOME
-# # ADD https://s3.amazonaws.com/RTI/Bundles/5.3.1/Evaluation/rti_connext_dds_secure-5.3.1-eval-x64Linux3gcc5.4.0.tar.gz ./
-# # RUN tar -xvf rti_connext_dds_secure-5.3.1-eval-x64Linux3gcc5.4.0.tar.gz -C ./
-# COPY ./rti ./
-# RUN rtipkginstall rti_security_plugins-5.3.1-eval-x64Linux3gcc5.4.0.rtipkg && \
-#     rtipkginstall openssl-1.0.2n-5.3.1-host-x64Linux.rtipkg && \
-#     tar -xvf openssl-1.0.2n-target-x64Linux3gcc5.4.0.tar.gz
-# ENV PATH "$NDDSHOME/openssl-1.0.2n/x64Linux3gcc5.4.0/release/bin":$PATH
-# ENV LD_LIBRARY_PATH "$NDDSHOME/openssl-1.0.2n/x64Linux3gcc5.4.0/release/lib":$LD_LIBRARY_PATH
-# # install RTI QoS
-# ENV NDDS_QOS_PROFILES "$NDDSHOME/NDDS_QOS_PROFILES.xml"
+# install RTI Connext
+ENV RTI_NC_LICENSE_ACCEPTED yes
+RUN apt-get update && apt-get install -y \
+      ros-$ROS_DISTRO-rmw-connext-cpp \
+    && rm -rf /var/lib/apt/lists/*
+# set up environment
+ENV NDDSHOME /opt/rti.com/rti_connext_dds-5.3.1
+ENV PATH "$NDDSHOME/bin":$PATH
+ENV LD_LIBRARY_PATH "$NDDSHOME/lib/x64Linux3gcc5.4.0":$LD_LIBRARY_PATH
+# install RTI Security
+WORKDIR $NDDSHOME
+# ADD https://s3.amazonaws.com/RTI/Bundles/5.3.1/Evaluation/rti_connext_dds_secure-5.3.1-eval-x64Linux3gcc5.4.0.tar.gz ./
+# RUN tar -xvf rti_connext_dds_secure-5.3.1-eval-x64Linux3gcc5.4.0.tar.gz -C ./
+COPY ./rti ./
+RUN rtipkginstall rti_security_plugins-5.3.1-eval-x64Linux3gcc5.4.0.rtipkg && \
+    rtipkginstall openssl-1.0.2n-5.3.1-host-x64Linux.rtipkg && \
+    tar -xvf openssl-1.0.2n-target-x64Linux3gcc5.4.0.tar.gz
+ENV PATH "$NDDSHOME/openssl-1.0.2n/x64Linux3gcc5.4.0/release/bin":$PATH
+ENV LD_LIBRARY_PATH "$NDDSHOME/openssl-1.0.2n/x64Linux3gcc5.4.0/release/lib":$LD_LIBRARY_PATH
+# install RTI QoS
+ENV NDDS_QOS_PROFILES "$NDDSHOME/NDDS_QOS_PROFILES.xml"
 
 # generate artifacts for keystore
 ENV TB3_DEMO_DIR $OVERLAY_WS/..

Remember to change or setenv the RMW_IMPLEMENTATION variable as needed in the tmux configs.
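
Switching between the two RMWs means flipping which `setenv RMW_IMPLEMENTATION` line is commented out in configs/secure.conf and configs/unsecure.conf. A rough Python sketch of that toggle, assuming the configs keep one line per candidate RMW as in the diffs above; `select_rmw` is a hypothetical helper, not part of the demo repo:

```python
import re

# Matches an active or commented-out "setenv RMW_IMPLEMENTATION <name>" line.
_RMW_LINE = re.compile(r"^\s*(?:#\s*)?setenv\s+RMW_IMPLEMENTATION\s+(\S+)\s*$")


def select_rmw(conf_text, rmw):
    """Enable the setenv line for `rmw` and comment out every other RMW line."""
    out = []
    found = False
    for line in conf_text.splitlines():
        match = _RMW_LINE.match(line)
        if match is None:
            out.append(line)  # unrelated line, keep untouched
        elif match.group(1) == rmw:
            out.append(f"setenv RMW_IMPLEMENTATION {rmw}")
            found = True
        else:
            out.append(f"# setenv RMW_IMPLEMENTATION {match.group(1)}")
    if not found:
        raise ValueError(f"no setenv line for {rmw!r} in config")
    return "\n".join(out) + "\n"
```

The function works on text rather than files, so it can be tried on a copy of a config before overwriting anything.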

Additional context

The issue seems to persist regardless of whether security is enabled; example stdout here:

ros-swg/turtlebot3_demo#34 (comment)

@asorbini
Collaborator

asorbini commented Mar 3, 2021

Hi @ruffsl, I'm able to reproduce the error and I'm investigating the root cause. I'll report back as soon as I have a better idea. In the meantime, thank you for your report!

@asorbini
Collaborator

asorbini commented Mar 3, 2021

Actually, I'm not 100% sure I'm getting the same error: on my side, the "error" manifests itself as a crash of rviz2 a few seconds after launch. Other components seem to be working (or at the very least they don't crash). Does this match your experience?

@ruffsl
Member Author

ruffsl commented Mar 3, 2021

Yes, that about matches my experience. The navigation launch file starts a number of nodes, among them rviz, which briefly appears but quickly crashes after launch. The policy file within the policy folder includes a listing of all the nodes and the topics, services, and actions they use. It seems to no longer be exhaustive, so it's temporarily commented out in the root-level enclave and swapped for a forgiving wildcard (*) sros2 permission profile instead.

@asorbini
Collaborator

asorbini commented Mar 5, 2021

I discovered a problem in the way some objects were registered on the DomainParticipant (wrappers to the ROS message type supports), and fixing that seems to have resolved the crash.

The fix has been pushed to master, please let me know if you experience any other issue.

I tested the demo in the unsecure configuration with rmw_connextdds using both RTI Connext DDS 5.3.1 and RTI Connext DDS 6.0.1 and it seems to be working (but I might be missing something in my tests since I'm not too familiar with it).

I'll wait to hear from you before closing the issue, and I'll test with DDS Security while I wait.

@asorbini added the bug (Something isn't working) label Mar 5, 2021
@ruffsl
Member Author

ruffsl commented Mar 5, 2021

> I'll wait to hear from you before closing the issue, and I'll test with DDS Security while I wait.

I just tested the demo again using 75975f3 , and both the navigation and mapping sections of the demo are working like a charm with rmw_connextdds. Even security with the permissive * policy is working as well!

I now finally have an rmw with a working baseline using Secure DDS on a real world ROS2 Foxy stack with which to debug our minimal spanning access control permission policy for the rest of the demo. I can see the previous policy is now missing a few permissions, or that some topics have changed since it was last audited. :D

[waypoint_follower-10] [D0500|CREATE Topic|T=rr/waypoint_follower_rclcpp_node/get_parametersReply] RTI_Security_AccessControl_check_create_topic:topic not allowed: cannot be published or subscribed
[waypoint_follower-10] [D0500|CREATE Topic|T=rr/waypoint_follower_rclcpp_node/get_parametersReply] DDS_DomainParticipantTrustPlugins_checkCreateTopic:!security function check_create_topic
[waypoint_follower-10] [D0500|CREATE Topic|T=rr/waypoint_follower_rclcpp_node/get_parametersReply] DDS_DomainParticipant_create_topic_disabledI:SECURITY ERROR: denied permissions
...
[gzserver-1] [D0300|CREATE Topic|T=rr/get_model_listReply] RTI_Security_AccessControl_check_create_topic:topic not allowed: cannot be published or subscribed
[gzserver-1] [D0300|CREATE Topic|T=rr/get_model_listReply] DDS_DomainParticipantTrustPlugins_checkCreateTopic:!security function check_create_topic
[gzserver-1] [D0300|CREATE Topic|T=rr/get_model_listReply] DDS_DomainParticipant_create_topic_disabledI:SECURITY ERROR: denied permissions
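
To turn output like the above into a list of permissions to add, the denied topics can be grouped per process. A small Python sketch, assuming every relevant line has the `[process] [ctx|ctx|T=topic] message` shape of the excerpt; that shape is inferred from these few lines only, not from RTI documentation, and `denied_topics` is a hypothetical helper:

```python
import re
from collections import defaultdict

# "[process] [field|field|T=topic] message", as seen in the quoted excerpt.
_LINE = re.compile(r"^\[(?P<proc>[^\]]+)\]\s+\[(?P<ctx>[^\]]*)\]\s+(?P<msg>.*)$")


def denied_topics(lines):
    """Map each process name to the sorted DDS topics it was denied."""
    denied = defaultdict(set)
    for line in lines:
        match = _LINE.match(line.strip())
        # Only the access-control line carries the "topic not allowed" text.
        if match is None or "topic not allowed" not in match.group("msg"):
            continue
        for field in match.group("ctx").split("|"):
            if field.startswith("T="):
                denied[match.group("proc")].add(field[2:])
    return {proc: sorted(topics) for proc, topics in denied.items()}
```

Feeding it the captured stdout of the navigation tab (for example via `open(path)`) yields one entry per node, which can then be compared against the policy file by hand.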

Nice work @asorbini, thanks for the quick response! Are there any plans to include rmw_connextdds in the nightly ros2 builds so downstream packages like nav2 could include it in their matrix of rmw CI jobs? E.g.:

https://github.com/osrf/docker_images#repo-info-3
https://github.com/ros-planning/navigation2/blob/936b163bcd25377012ba123082b78afd0b82f4ee/.circleci/config.yml#L451-L456


This isn't a priority, but might you know why rmw_connextdds now works where rmw_connext_cpp was failing with security using the permissive * policy mentioned?

@asorbini
Collaborator

asorbini commented Mar 5, 2021

> I just tested the demo again using 75975f3 , and both the navigation and mapping sections of the demo are working like a charm with rmw_connextdds. Even security with the permissive * policy is working as well!

That's awesome, great to hear it's working as expected! I'll go ahead and close the issue.

> I can see the previous policy is now missing a few permissions, or that some topics have changed since last audited. :D

Is this because you need to update your permissions file for some additional endpoints? Unless rmw_connextdds is not "mangling" the topic name correctly (but that doesn't look to be the case for a client's reply topic).
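
For reference, the "mangling" in question is the usual ROS 2 name mapping: topics get an `rt` prefix, service requests `rq` plus a `Request` suffix, and service replies `rr` plus a `Reply` suffix. A short Python sketch of that convention, assuming the default mapping is in effect; `mangle` is an illustrative helper rather than anything from rmw_connextdds, and the ROS-side service names in the usage note are guessed by reversing the logged DDS names:

```python
# Prefix/suffix pairs of the default ROS 2 to DDS name mapping.
_PREFIX = {"topic": "rt", "request": "rq", "reply": "rr"}
_SUFFIX = {"topic": "", "request": "Request", "reply": "Reply"}


def mangle(ros_name, kind="topic"):
    """Return the DDS topic name for a fully qualified ROS name."""
    if kind not in _PREFIX:
        raise ValueError(f"unknown kind: {kind!r}")
    if not ros_name.startswith("/"):
        raise ValueError("expected a fully qualified name starting with '/'")
    return f"{_PREFIX[kind]}{ros_name}{_SUFFIX[kind]}"
```

With this, `mangle("/get_model_list", "reply")` gives `rr/get_model_listReply`, the same name that appears in the gzserver log lines, which is consistent with the reply topics being mangled as expected.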

> Are there any plans to include rmw_connextdds into the nightly ros2 builds so downstream packages like nav2 could include it in their matrix of rmw CI jobs?

I'm trying to get to the bottom of some test failures that occur consistently on the ROS2 CI machines but not in my local test environment, and as soon as those are resolved rmw_connextdds should start being built by nightly jobs, and soon after replace rmw_connext_cpp in Rolling (and later Galactic). I'm planning to make a community announcement on discourse before then, but I've been mostly focused on testing and stabilization and haven't got around to making it yet.

> This isn't a priority, but might you know why rmw_connextdds now works where rmw_connext_cpp was failing with security using the permissive * policy mentioned?

Is the RMW implementation the only change, or did you also switch from Connext 5.3.1 to 6.0.1? If you upgraded the Connext version too, it might be possible that a wildcard permission was not supported earlier, and it is now (I haven't verified in the release notes). Looking at the RMW implementations, I don't see anything that rmw_connextdds might be doing more than rmw_connext_cpp. If anything, I noticed that it might be missing something (some environment variables to configure secure logging).
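
As far as I understand the DDS Security specification, topic entries in a permissions document are fnmatch-style expressions, so a bare `*` should cover every topic when the implementation honors it. A rough Python approximation for checking which denied topics a set of expressions would still leave uncovered; Python's `fnmatch` is not guaranteed to behave identically to the matcher inside Connext, and `uncovered` plus the `rt/cmd_vel` topic name are purely illustrative:

```python
from fnmatch import fnmatchcase


def uncovered(topics, allowed_expressions):
    """Return the topics that match none of the allowed expressions."""
    return [
        topic
        for topic in topics
        if not any(fnmatchcase(topic, expr) for expr in allowed_expressions)
    ]


if __name__ == "__main__":
    topics = ["rr/get_model_listReply", "rt/cmd_vel"]  # second name is made up
    print(uncovered(topics, ["rt/*"]))
    print(uncovered(topics, ["*"]))
```

This only models the matching step; whether a given Connext version accepts a wildcard entry at all would still need to be confirmed against its release notes.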
