Skip to content
This repository has been archived by the owner on Sep 16, 2022. It is now read-only.

Port vehicle_cmd_gate to ROS2 #3

Merged
merged 2 commits into from
Oct 13, 2020
Merged

Conversation

nnmm
Copy link
Contributor

@nnmm nnmm commented Oct 6, 2020

  • See https://github.com/nnmm/autoware.iv.universe/pull/1/files for a ROS1 <-> ROS2 diff

  • Tested the build with ROS2 Foxy by

    • following this but entering an AutowareAuto ADE environment instead (Foxy .aderc taken from here)
    • running colcon build --packages-up-to vehicle_cmd_gate in the ros2 branch of the AutowareArchitectureProposal repo (with this branch checked out in the autoware/autoware.iv sub-repo)
    • running the build/vehicle_cmd_gate/vehicle_cmd_gate binary, which doesn't print anything
    • running ros2 launch vehicle_cmd_gate vehicle_cmd_gate.launch.xml
  • I had to create two new messages in autoware_control_msgs to replace the bool message type

  • I decided to make the VehicleCmdGate a Node. It previously had two NodeHandles, a public one and a private one ("~"). The public one was unused and could be removed. Private nodes are not supported in ROS2, so I simply removed the private namespace. I found a ROS Answers post that recommends using public NodeHandles anyway for topics.

  • Regarding ros::Timer: In my understanding, rclcpp::WallTimer doesn't allow for using the /clock topic so it's not fully equivalent. An rclcpp::GenericTimer works however, with some manual glue code.

Copy link

@cho3 cho3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nnmm The changes look good to me.

Note I have only checked this for correct usage of ROS 2 constructs, namely:

  1. Publishers
  2. Subscriptions and callbacks
  3. Timers
  4. Parameters
  5. Spinning/executor

All of the constructs are being used correctly as far as I can see. I have not checked for style, or other forms of code correctness.

: Node("vehicle_cmd_gate"), is_engaged_(false), is_emergency_(false)
{
rclcpp::QoS durable_qos{1};
durable_qos.transient_local();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be 100% clear, you do want to have a history depth of 1, and you do want to have transient_local, correct?

A transient_local subscriber can only match with a transient_local publisher, and AFAIK the default QoS settings are volatile, which could lead to some hard to trace issues during integration.

E: I just noticed that only the publishers are set up as transient_local.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think this corresponds to the earlier behavior of queue size 1 and latch = true.

Copy link

@cho3 cho3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, regarding your questions:

Reviewer: please confirm whether this is fine given your use case. I found a ROS Answers post that recommends using public NodeHandles anyway for topics.

Again, all of the changes look fine and appear to be more or less idiomatic ROS 2. So the use of only an implicitly public NodeHandle equivalent seems fine.

Can you let me know whether a rclcpp::WallTimer fulfills your use case or whether the timer should stop when simulation time stops?

I cannot answer this, so I will defer to the original author/code maintainer.

@dejanpan
Copy link

dejanpan commented Oct 7, 2020

@nnmm I came to the point at which I could start the launch file ros2 launch vehicle_cmd_gate vehicle_cmd_gate.launch.xml

but then it segfaults:

[INFO] [launch]: All log files can be found below /home/dejan.pangercic/.ros/log/2020-10-06-16-54-27-815049-ade-18178
[INFO] [launch]: Default logging verbosity is set to INFO
[INFO] [vehicle_cmd_gate-1]: process started with pid [18180]
[vehicle_cmd_gate-1] free(): double free detected in tcache 2
[ERROR] [vehicle_cmd_gate-1]: process has died [pid 18180, exit code -6, cmd '/home/dejan.pangercic/aap_ws/install/vehicle_cmd_gate/lib/vehicle_cmd_gate/vehicle_cmd_gate --ros-args -r __node:=vehicle_cmd_gate --params-file /home/dejan.pangercic/aap_ws/install/vehicle_cmd_gate/share/vehicle_cmd_gate/config/vehicle_cmd_gate.yaml -r input/engage:=/autoware/engage -r input/emergency:=/system/emergency/is_emergency -r input/gate_mode:=/remote/gate_mode_cmd -r input/auto/control_cmd:=trajectory_follower/control_cmd -r input/auto/turn_signal_cmd:=/planning/turn_signal_decider/turn_signal_cmd -r input/auto/shift_cmd:=/control/shift_decider/shift_cmd -r input/remote/control_cmd:=/remote/control_cmd -r input/remote/turn_signal_cmd:=/remote/turn_signal_cmd -r input/remote/shift_cmd:=/remote/shift_cmd -r input/emergency/control_cmd:=/system/emergency/control_cmd -r input/emergency/turn_signal_cmd:=/system/emergency/turn_signal_cmd -r input/emergency/shift_cmd:=/system/emergency/shift_cmd -r output/vehicle_cmd:=vehicle_cmd -r output/control_cmd:=/control/control_cmd -r output/shift_cmd:=/control/shift_cmd -r output/turn_signal_cmd:=/control/turn_signal_cmd -r output/gate_mode:=/control/current_gate_mode'].

I then ran out of time but I think that you will be able to quickly debug this.

@dejanpan
Copy link

dejanpan commented Oct 7, 2020

@nnmm regarding this comment: #3 (review)

@sumanth-nirmal thinks that the following will also work (from https://github.com/ros2/rclcpp/blob/master/rclcpp/include/rclcpp/create_timer.hpp#L57):

rclcpp::create_timer(this,
this->get_clock(), // this should be inside rclcpp node and thus get the node clock
rclcpp::Duration(std::chrono::milliseconds(timer_ms)),
std::bind(&foo::cb, this))

To test this do the following:

  1. ros2 param set listener use_sim_time true
  2. record and play back a rosbag 1 with the --clock option (because rosbag2 does not support --clock yet)
  3. start ros1_bridge

If everything works the callback should be triggered.


If above does not work - talk to us and we will surrender to the alternative explanation that William offered (currently still in my chat).

@dejanpan
Copy link

dejanpan commented Oct 7, 2020

Putting my fully question into open to others can see it too:

We are trying to port https://github.com/tier4/autoware.iv.universe/blob/master/control/vehicle_cmd_gate/src/vehicle_cmd_gate.cpp#L117 as https://github.com/tier4/autoware.iv.universe/pull/3/files (ctrl+f for timer_ )
Since rclcpp::WallTimer doesn't allow for using the /clock topic this is not the same as ros::Timer.

Now William says this:

  1. this is a known issue Extend Time implementation into Timer and Rate objects ros2/rclcpp#465
  2. You could create a thread and wait on it, periodically checking the current time (which does respect ROS time) and executing it if it is ready.
  3. There is a way to get a callback anytime the time changes due to the /clock topic and that would be useful but I don’t have an example of that off hand.
  4. It really needs to be done properly and you might need to consult with tully if you’d like to try.

@dejanpan
Copy link

dejanpan commented Oct 7, 2020

Recap:

  1. You can get this .aderc file to develop against ROS 2 Foxy
  2. This PR has been reviewed - most of the fixes are easy
  3. What is more complex is:
    1. Port vehicle_cmd_gate to ROS2 #3 (comment)
    2. Port vehicle_cmd_gate to ROS2 #3 (comment)
  4. https://github.com/tier4/AutowareArchitectureProposal/pull/67/files is an evolving document for best practices. Lets merge and keep extending it.

@tfoote
Copy link

tfoote commented Oct 7, 2020

this is a known issue ros2/rclcpp#465

This is definitely something people need to do a good port.

There is a way to get a callback anytime the time changes due to the /clock topic and that would be useful but I don’t have an example of that off hand.

An example is here: https://github.com/ros2/rclcpp/blob/c88cc649d32cb3a0ca7eb3c2b92d678303adb0d5/rclcpp/test/rclcpp/test_time.cpp#L65-L69 set the thresholds to zero.

For the short term, I think that you could make a small helper class leveraging the jump callbacks and some signaling to wake the sleep or at worst have the sleep polling based on the clock.

And then we should try to make sure to push the priority up of the time implementation improvements.

@nnmm
Copy link
Contributor Author

nnmm commented Oct 7, 2020

@mitsudome-r, could you review?

@TakaHoribe
Copy link
Contributor

The handling of timer needs to be discussed.

Just report that ros2 launch vehicle_cmd_gate vehicle_cmd_gate.launch.xml worked properly in my environment.

ros2 node info /vehicle_cmd_gate shows below.

horibe@horibe-ThinkPad-X1-Extreme:~$ ros2 node info /vehicle_cmd_gate 
/vehicle_cmd_gate
  Subscribers:
    /autoware/engage: autoware_control_msgs/msg/EngageMode
    /control/shift_decider/shift_cmd: autoware_vehicle_msgs/msg/ShiftStamped
    /parameter_events: rcl_interfaces/msg/ParameterEvent
    /planning/turn_signal_decider/turn_signal_cmd: autoware_vehicle_msgs/msg/TurnSignal
    /remote/control_cmd: autoware_control_msgs/msg/ControlCommandStamped
    /remote/gate_mode_cmd: autoware_control_msgs/msg/GateMode
    /remote/shift_cmd: autoware_vehicle_msgs/msg/ShiftStamped
    /remote/turn_signal_cmd: autoware_vehicle_msgs/msg/TurnSignal
    /system/emergency/control_cmd: autoware_control_msgs/msg/ControlCommandStamped
    /system/emergency/is_emergency: autoware_control_msgs/msg/EmergencyMode
    /system/emergency/shift_cmd: autoware_vehicle_msgs/msg/ShiftStamped
    /system/emergency/turn_signal_cmd: autoware_vehicle_msgs/msg/TurnSignal
    /trajectory_follower/control_cmd: autoware_control_msgs/msg/ControlCommandStamped
  Publishers:
    /control/control_cmd: autoware_control_msgs/msg/ControlCommandStamped
    /control/current_gate_mode: autoware_control_msgs/msg/GateMode
    /control/shift_cmd: autoware_vehicle_msgs/msg/ShiftStamped
    /control/turn_signal_cmd: autoware_vehicle_msgs/msg/TurnSignal
    /parameter_events: rcl_interfaces/msg/ParameterEvent
    /rosout: rcl_interfaces/msg/Log
    /vehicle_cmd: autoware_vehicle_msgs/msg/VehicleCommand
  Service Servers:
    /vehicle_cmd_gate/describe_parameters: rcl_interfaces/srv/DescribeParameters
    /vehicle_cmd_gate/get_parameter_types: rcl_interfaces/srv/GetParameterTypes
    /vehicle_cmd_gate/get_parameters: rcl_interfaces/srv/GetParameters
    /vehicle_cmd_gate/list_parameters: rcl_interfaces/srv/ListParameters
    /vehicle_cmd_gate/set_parameters: rcl_interfaces/srv/SetParameters
    /vehicle_cmd_gate/set_parameters_atomically: rcl_interfaces/srv/SetParametersAtomically
  Service Clients:

  Action Servers:

  Action Clients:

@nnmm nnmm force-pushed the vehicle_cmd_gate branch from 1bcc3e0 to 9adf5ee Compare October 8, 2020 11:22
@nnmm
Copy link
Contributor Author

nnmm commented Oct 9, 2020

@sumanth-nirmal It seems your suggestions regarding the timer doesn't work unfortunately. When I replace

timer_ = create_wall_timer(500ms, timer_callback);

with

timer_ = rclcpp::create_timer(this, this->get_clock(), rclcpp::Duration(500ms), timer_callback);

in the minimal_timer example (lambda version), the node exits immediately, without any simulation running. Is that a bug?

@dejanpan
Copy link

@cho3 recommendation:

Timers in ROS 1 vs ROS 2

Caveat: I am not an expert in the full breadth of ROS 1, ROS 2, or clocks. I have decent working knowledge of a subset of ROS 1 and 2, and working knowledge of clocks.

tl;dr

  • Relying on the system clock is bad (nondeterministic, unreliable), prefer not to do it
  • Prefer to process data purely based on a data perspective (i.e. only look at the data timestamps; don't look at the system clock for anything other than measuring timeouts (e.g. anytime algorithms)
  • ROS 2 WallTimer is not an exact equivalent to the ROS 1 timer due to lack of /clock support
    • Instantiating a GenericTimer with the Clock and associated TimeSource may achieve the desired behavior; no convenience functions exist for this
  • For Autoware.IV, WallTimer is the closest immediate equivalent, but the resulting stack may not work as expected on non-target hardware and/or when replaying bag data

ROS 1 vs ROS 2

The ros::Timer and the ROS 2 GenericTimer allow for periodic executions of functions (ne callbacks). This is a convenient pattern to use when a result is desired independently of the data, or when used as a source of data.

The ROS 2 WallTimer is not an exact equivalent to the ros::Timer, since it implicitly assumes a wall clock (RCL_STEADY_TIME clock), and not a generic clock. This implies that it will not obey the /clock topic populated by simulation time. By contrast, the GenericTimer provides an interface to supply a clock, which can then be hooked up to a TimeSource object. The TimeSource can be set up so that its time is set via the /clock callback. These taken together imply it should be possible to replicate the simulation//clock-driven timer callback with ROS 2.

Unfortunately, one additional problem remains. ros2 bag does not record /clock (aka sim time) whereas rosbag does. This implies that in order to get the same behavior in ROS 2, either:

  • rosbag along with the ros1_brdge must be used
  • Some explicit time source must be used and explicitly recorded by ros2 bag

The Problem with Timer-Driven Patterns

It is well understood that a polling or timer-driven pattern increases jitter (i.e. variance of latency). (Consider, for example: if every data processing node in a chain operates on a timer what is the best and worst case latency?) As a consequence for more timing-sensitive applications, it is generally not preferred to use a timer-driven pattern.

On top of this, it is also reasonably well known that use of the clock is nondeterministic and internally this has been a large source of frustration with bad, or timing sensitive tests. Such tests typically require specific timing and/or implicitly require a certain execution order (loosely enforced by timing assumptions rather than explicitly via the code).

As a whole, introducing the clock explicitly (or implicitly via timers) is problematic because it introduces additional state, and thus assumptions on the requirements for the operation of the component. Consider also leap seconds and how that might ruin the operation and/or assumptions needed for the proper operation of the component.

Preferred Patterns

In general, a data-driven pattern should be preferred to a timer-driven pattern. One reasonable exception to this guideline is the state estimator/filter at the end of localization. A timer-driven pattern in this context is useful to provide smooth behavior and promote looser coupling between the planning stack and the remainder of the stack.

The core idea behind a data-driven pattern is that as soon as data arrives, it should be appropriately processed. Furthermore, the system clock (or any other source of time) should not be used to manipulate data or the timestamps. This pattern is valuable since it implicitly cuts down on hidden state (being the clock), and thus simplifies assumptions needed for the node to work.

For examples of this kind of pattern, see the lidar object detection stack in Autoware.Auto. By not using any mention of the clock save for in the drivers, the stack can run equivalently on bag data, simulation data, or live data. A similar pattern with multiple inputs can be seen in the MPC implementation both internally and externally.

@nnmm nnmm force-pushed the vehicle_cmd_gate branch from 9adf5ee to 0d9da63 Compare October 12, 2020 08:22
auto timer_callback = std::bind(&VehicleCmdGate::onTimer, this);
auto period = std::chrono::duration_cast<std::chrono::nanoseconds>(
std::chrono::duration<double>(update_period_));
timer_ = std::make_shared<rclcpp::GenericTimer<decltype(timer_callback)>>(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TakaHoribe @fred-apex-ai This is the timer replacement. I tried it out with a minimal publisher that sends /clock messages and a timer like this. The timer stops when no messages are sent and sim time is enabled, and goes faster when the published clock goes faster.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I'll test with this code in my package.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nnmm I will try it out, too

@nnmm
Copy link
Contributor Author

nnmm commented Oct 12, 2020

@TakaHoribe Could you do the final review and merge?

@dejanpan
Copy link

@wjwwood see the before mentioned recommendation for ros::Timer alternative in ROS 2:
https://github.com/tier4/Pilot.Auto/pull/3#issuecomment-706732396.

It also seems that accelerating or slowing down the timer works: https://github.com/tier4/Pilot.Auto/pull/3#discussion_r503121450.

@TakaHoribe TakaHoribe self-requested a review October 13, 2020 07:30
Copy link
Contributor

@TakaHoribe TakaHoribe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've confirmed it works in my local.The timer callback runs correctly for both real time and sim time.(tested with ros1_bridge to publish clock topic)

The code also looks ok.

@TakaHoribe TakaHoribe merged commit 5eed6f2 into tier4:ros2 Oct 13, 2020
cvasfi pushed a commit to cvasfi/Pilot.Auto that referenced this pull request Oct 28, 2020
* Port vehicle_cmd_gate to ROS2

* Sim-time-respecting timer
TakaHoribe pushed a commit that referenced this pull request Jan 10, 2021
rename *.launch files to *.launch.xml
taichiH pushed a commit that referenced this pull request Nov 10, 2021
* Port vehicle_cmd_gate to ROS2

* Sim-time-respecting timer
@tkimura4 tkimura4 mentioned this pull request Nov 10, 2021
10 tasks
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants