use_tf_static should be false by default #105

Closed
christian-rauch opened this issue Feb 25, 2019 · 12 comments

Comments

@christian-rauch

The behaviour of publishing fixed joints has changed significantly since the introduction of use_tf_static. By default, fixed-joint transforms are now published once to tf_static, with the earliest possible timestamp that robot transformations can have.
When joint positions are published and the robot state (as the set of its TFs) is logged some time after the robot_state_publisher has been started, the resulting bagfile contains static TFs from a time before the logging actually started and hence is not a correct representation of the robot state.

Although the static transforms are latched (and hence logged), they are not latched by default when replaying a rosbag. As a consequence, it is not possible to start any node that needs the full robot state (e.g. RViz) after the replay has started.

When reading the TFs into a TF buffer (which caches TFs for 10s by default), this creates distinct TF trees and results in failed lookups:

Lookup would require extrapolation at time 1551113242.324176550, but only time 1551109106.787312031 is in the buffer
Lookup would require extrapolation at time 1551113242.357692480, but only time 1551109106.787312031 is in the buffer
Lookup would require extrapolation into the future.  Requested time 1551113242.391208649 but the latest data is at time 1551113242.357692480
Lookup would require extrapolation into the future.  Requested time 1551113242.424724579 but the latest data is at time 1551113242.357692480

In this particular case the static transforms were published at 1551109106.787312031 and the non-static transforms were published at 1551113242.391208649.
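
For scale, the gap between the two stamps can be worked out directly. The snippet below is only a sketch: the stamps are copied from the log lines above, the 10 s figure is the default cache length mentioned earlier, and it assumes the static transforms end up in the same time-limited cache as the dynamic ones. The names `STATIC_STAMP`, `FIRST_LOOKUP`, `CACHE_LENGTH` and `stamp_gap` are made up for illustration.

```python
from decimal import Decimal

# Stamps copied from the log lines above (seconds since the epoch).
# Decimal keeps all nine fractional digits, which a float would not.
STATIC_STAMP = Decimal("1551109106.787312031")  # stamp of the static transforms
FIRST_LOOKUP = Decimal("1551113242.324176550")  # time of the first failing lookup
CACHE_LENGTH = Decimal("10")                    # default buffer length in seconds (from the text)


def stamp_gap(older, newer):
    """Return the gap in seconds between two stamps."""
    return newer - older


gap = stamp_gap(STATIC_STAMP, FIRST_LOOKUP)
print("gap: {} s, about {:.1f} min, exceeds cache: {}".format(
    gap, gap / 60, gap > CACHE_LENGTH))
```

The gap comes out at roughly 4135.5 s (about 69 minutes), far more than a 10 s cache can bridge.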

If fixed joints were published periodically as static TFs together with the non-static TFs, the robot_state_publisher would provide a complete state of the robot at any point in time and the aforementioned issues would be circumvented.

@sloretz
Contributor

sloretz commented Feb 25, 2019

> Although the static transforms are latched (and hence logged), they are not latched by default when replaying a rosbag.

Was the bag file filtered at some point? It looks like rosbag forgets that topics are latched if the bag is filtered ros/ros_comm#685 .

I'm not sure about disabling use_tf_static by default. I would be concerned about an increase in data sent over the wire, especially since so many nodes use tf.

@clalancette
Contributor

> I'm not sure about disabling use_tf_static by default. I would be concerned about an increase in data sent over the wire, especially since so many nodes use tf.

Right, that's my concern as well, and (I'm assuming) the reason it is currently set to true. Just looking a bit into the history, it looks like this was false in Indigo, and then was changed to true for Jade and newer, probably to reduce the amount of data. So switching back has its own set of downsides here, which is why it isn't so clear-cut (and also why there is a switch for it :).

@christian-rauch
Author

christian-rauch commented Feb 26, 2019

@sloretz
No, the issue with distinct TF trees occurs with the original bagfile, and it appears that tf_static is not latched. E.g. rostopic echo /tf_static only shows transforms if I start it before rosbag play. How can I verify which topics are stored as latched in the bagfile?

@clalancette
Is the reason for having latched static TFs really just saving storage and bandwidth? Usually, robots have more "dynamic" TFs than static TFs, and it would probably suffice to publish static TFs at a much lower frequency (e.g. 1Hz). Bagfile compression can take care of redundant message content. This way, you can obtain the full state of your robot within 1s of subscribing to all TF topics at any point in time.

The concept of latching in combination with stamped messages also appears contradictory to me. I think a latched message should either carry the timestamp of when it is published or have no timestamp at all. For example, storing a message with a timestamp from before the logging actually started is implausible.

In general, I prefer correctness (in having a valid complete robot state at any point in time) over storage efficiency.

@sloretz
Contributor

sloretz commented Feb 27, 2019

> Although the static transforms are latched (and hence logged), they are not latched by default when replaying a rosbag.

> No, the issue with distinct TF trees occurs with the original bagfile, and it appears that tf_static is not latched. E.g. rostopic echo /tf_static only shows transforms if I start it before rosbag play.

I can't reproduce the issue on melodic. It looks like rosbag correctly publishes a latched message.

Creating the bag

roscore &
rosrun tf2_ros static_transform_publisher 1 2 3 0 0 0 1 foo bar &
rosbag record --output-name "latch_test.bag" --all
# killed the static transform publisher and rosbag

Playing back the rosbag with --keep-alive since there is a latched topic

rosbag play --keep-alive latch_test.bag

In another terminal the message was received on a subscriber created after playback started, so it is being latched.

$ rostopic echo /tf_static 
transforms: 
  - 
    header: 
      seq: 0
      stamp: 
        secs: 1551225061
        nsecs:  91014945
      frame_id: "foo"
    child_frame_id: "bar"
    transform: 
      translation: 
        x: 1.0
        y: 2.0
        z: 3.0
      rotation: 
        x: 0.0
        y: 0.0
        z: 0.0
        w: 1.0
---

> How can I verify which topics are stored as latched in the bagfile?

It looks like the latched status is stored in the connection record header. I didn't see that info being output by the command line tool, but here's where rosbag reads that when playing a bag file: https://github.com/ros/ros_comm/blob/943f37382809da20170b6b4715f46c1a13363c97/tools/rosbag/src/player.cpp#L61-L65 . A quick way to tell if there is any latched topic in a bag file is:

$ grep latch_test.bag -e "latching=1"
Binary file latch_test.bag matches
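
The same check can be scripted, for instance from Python without any ROS dependency. This is a rough equivalent of the grep above, not a parser for the bag format: `has_latched_connection` is a hypothetical helper that only looks for the raw bytes `latching=1`, so it cannot say which topic was latched, and like grep it only sees bytes that are stored uncompressed.

```python
def has_latched_connection(bag_path, chunk_size=1 << 20):
    """Return True if the file's raw bytes contain b"latching=1".

    The file is read in chunks so that large bags need not fit in memory.
    """
    needle = b"latching=1"
    keep = len(needle) - 1  # bytes carried over so a match split across chunks is found
    tail = b""
    with open(bag_path, "rb") as bag_file:
        while True:
            block = bag_file.read(chunk_size)
            if not block:
                return False  # end of file reached without a match
            window = tail + block
            if needle in window:
                return True
            tail = window[-keep:]
```

Calling `has_latched_connection("latch_test.bag")` on the bag recorded above should agree with the grep result.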

@christian-rauch
Author

Thank you for looking into this.
I haven't used --keep-alive because the description suggests that this will latch all topics in the bag (IMHO). I expected that latched topics logged in a bag file are also replayed with latching by default, to reproduce the correct behaviour. I also do not see how the latching behaviour can be achieved with the rosbag API. The tf_static topic is lost when reading the bag file from a different start_time, e.g. t+10s.

In any case, even when replaying with active latching (-k), I get distinct TF trees when starting rqt with the TF Tree plugin after rosbag play logfile.bag -k. I only get a complete TF tree if I start the replay after rqt with the TF Tree plugin.

To make my case:
When logging the state of a robot as its set of TFs, the resulting bagfile must contain this complete representation at any point in time. And this must be the default behaviour. Therefore, use_tf_static must be false by default to provide a complete TF tree at any point in time (or at least within the publishing frequency).

@sloretz
Contributor

sloretz commented Feb 27, 2019

> I haven't used --keep-alive because the description suggests that this will latch all topics in the bag (IMHO).

By default rosbag play exits after publishing all messages in the bag. --keep-alive keeps the rosbag process alive after reaching the end of the bag. It does not latch all topics.

> I also do not see how the latching behaviour can be achieved with the rosbag API.

Are you asking how to use the rosbag API to check if there was a latched publisher on a topic? Rosbag player uses a rosbag::View to get connection info, and then checks if it has the latching field with a value of 1.

> In any case, even when replaying with active latching (-k), I get distinct TF trees when starting rqt with the TF Tree plugin after rosbag play logfile.bag -k. I only get a complete TF tree if I start the replay after rqt with the TF Tree plugin.

This sounds like a bug. Does the connection info for the /tf_static topic have latching=1?

> When logging the state of a robot as its set of TFs, the resulting bagfile must contain this complete representation at any point in time. And this must be the default behaviour. Therefore, use_tf_static must be false by default to provide a complete TF tree at any point in time (or at least within the publishing frequency).

If I understand correctly, there is a bag file with static and dynamic transforms, and it needs to be played to give some tool the complete tf tree. It sounds like something interesting happens in the middle of the bag file, so playback is starting from the middle. Rosbag doesn't publish the static transforms because the start time is after when the static transforms were published. I disagree that this is the default use case for robot_state_publisher.

If the goal is having static transforms regardless of when the bag was started, I recommend telling rosbag not to publish /tf_static and launching robot_state_publisher with the same URDF. Another option would be a PR to ros/ros_comm to add a flag to rosbag play to publish latched messages prior to the start time.
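
The second option could work roughly as follows. This is only a sketch of the selection logic in plain Python, not rosbag code: `Record` and `select_for_playback` are hypothetical names, and no such flag is claimed to exist in rosbag play.

```python
from collections import namedtuple

# Hypothetical stand-in for one recorded message; this is not a rosbag type.
Record = namedtuple("Record", ["topic", "stamp", "latched", "data"])


def select_for_playback(records, start_time):
    """Return the records to publish when playback starts at start_time.

    Every record stamped at or after start_time is kept. In addition, the
    most recent earlier record of each latched topic is placed in front,
    so that late-starting playback still delivers it.
    """
    ordered = sorted(records, key=lambda rec: rec.stamp)
    last_latched = {}
    for rec in ordered:
        if rec.latched and rec.stamp < start_time:
            last_latched[rec.topic] = rec  # later records overwrite earlier ones
    carried = sorted(last_latched.values(), key=lambda rec: rec.stamp)
    return carried + [rec for rec in ordered if rec.stamp >= start_time]
```

With a /tf_static record stamped at t=0 and playback starting at t=10, the static record would still be handed to the player first. A real implementation would probably need to track the last message per recorded connection rather than per topic, since more than one latched publisher can share a topic.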

@christian-rauch
Author

> Are you asking how to use the rosbag API to check if there was a latched publisher on a topic?

No. I meant that if I start reading the bagfile with a different start_time, the topic tf_static won't show up, i.e. it does not behave as if played back with -k.

> This sounds like a bug. Does the connection info for the /tf_static topic have latching=1?

I am using the Python API for rosbag, btw, and it looks like there are no wrappers for the connection info methods that you mentioned. However, the grepping approach (latching=1) shows a match.
It also appears that I have two sources of static TFs: one from the openni2 node, with transforms of camera frames that appear to be latched, and a second from the state publisher, with fixed-joint transforms that appear not to be latched.
When I (1) replay with -k and then (2) run rostopic echo /tf_static, it shows some camera TFs. The other way around, I get all static TFs, including those from the state publisher.

I might be able to share a minimal example bagfile with this issue to help debugging.

Nevertheless, even if these issues get resolved in the different parts of logging and replay, I wanted to point out that the latching approach is fragile (at least from my perspective). E.g. by default rosbag play ignores the latching attribute of topics, and it is very likely that the latching attribute gets lost when post-processing bagfiles.
Overall, publishing periodically on a topic is more robust than relying on the latching behaviour. I am therefore asking to revert the default state publisher behaviour to periodically publish static transforms (with optional lower frequency to save bandwidth and space).

@peci1

peci1 commented Mar 5, 2019

You've run into a known issue with multiple static tf publishers.

ros/ros_comm#146
ros/geometry2#181

I've written a workaround package that basically muxes all static TFs ever published into one big message and republishes it, hoping it is the last publisher and gets latched. https://github.com/tradr-project/static_transform_mux . I haven't released this package because I hoped it might get merged into geometry2, but as the issue has been dead for about 2 years, I'll consider releasing it into melodic. What it doesn't help with is replaying the bagfile from a later point than the original start time: then the static TFs get lost.
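
The merging idea can be sketched in a few lines of plain Python. This is not the code of static_transform_mux, just an illustration under simple assumptions: each static TF message is modelled as a list of `(parent, child, transform)` tuples, and a later transform for the same child frame replaces the earlier one. `mux_static_transforms` is a hypothetical name.

```python
def mux_static_transforms(messages):
    """Merge several static TF messages into one list of transforms.

    Each message is a list of (parent_frame, child_frame, transform) tuples.
    A child frame has exactly one parent in a TF tree, so the child frame is
    used as the key and the most recently seen entry wins.
    """
    merged = {}
    for message in messages:
        for parent, child, transform in message:
            merged[child] = (parent, child, transform)
    return list(merged.values())
```

Fed with one message from a camera driver and one from the state publisher, the result holds the transforms of both, which is what a single latched publisher would then need to send.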

@davidbsp

davidbsp commented May 8, 2019

How hard would it be to have the recording script inspect all static TFs at recording startup and provide the option to republish (all of) them as a simple tf at a configurable interval (e.g. 1Hz)?
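
One way to picture this proposal: take the static transforms seen at startup and emit re-stamped copies on a fixed schedule. A minimal sketch in plain Python, with no ROS calls; `restamp_schedule` and the way transforms are represented are assumptions made for illustration.

```python
def restamp_schedule(static_transforms, start, end, interval=1.0):
    """Return (stamp, transform) pairs repeating every transform periodically.

    Stamps run from start in steps of interval and never exceed end, so a
    reader joining at any time sees every transform within one interval.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    steps = int((end - start) / interval)  # whole intervals that fit in the window
    schedule = []
    for i in range(steps + 1):
        stamp = start + i * interval  # multiply rather than accumulate to limit drift
        for transform in static_transforms:
            schedule.append((stamp, transform))
    return schedule
```

At 1Hz the extra traffic is one copy of each static transform per second, which is the bandwidth cost weighed against completeness earlier in this thread.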

@doisyg
Contributor

doisyg commented May 14, 2019

Discovering this issue the hard way too; it resulted in multiple unclean recorded bags.

@peci1

peci1 commented May 14, 2019

@doisyg Give the above-mentioned static_transform_mux a try. You can first run it, then play the bag that contains the static transforms, and then you can play-record all the other bags and the static TFs will be there :) Or you can use the Python interface to rosbag to find the transforms in a script and copy them to all bags.
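
The scripted variant might look like the sketch below. It is written against duck-typed objects so that it runs without ROS: it assumes only that the source offers `read_messages(topics=[...])` yielding `(topic, msg, t)` tuples and that the target offers `write(topic, msg, t)`, which is how `rosbag.Bag` objects behave. `copy_static_transforms` is a hypothetical helper, and whether the latching flag survives the copy is not addressed here.

```python
def copy_static_transforms(source_bag, target_bag, topic="/tf_static"):
    """Copy every message on `topic` from source_bag into target_bag.

    source_bag must provide read_messages(topics=[...]) yielding
    (topic, msg, t) tuples; target_bag must provide write(topic, msg, t).
    Messages keep their original record time. Returns the number copied.
    """
    copied = 0
    for name, msg, stamp in source_bag.read_messages(topics=[topic]):
        target_bag.write(name, msg, stamp)
        copied += 1
    return copied
```

With ROS installed, the two arguments would be `rosbag.Bag` objects, with the target opened in append mode. Since the copied messages keep their original record time, playback that starts later than that time would still skip them.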

@clalancette
Contributor

clalancette commented Aug 27, 2019

Closing based on the reasoning in #109 (comment)
