Support --repeat-latched in ROS 2 (transient local) #1159
Comments
@amacneil Yes, we have a reason not to do this: deterministic replay.
I'll admit I am not super familiar with how tf_static and transient local durability work together. We have users who would like to open one file from a split bag and have the transforms work correctly. Your comment in that issue seems to be based on the fact that you don't have a way in rosbag2 to "play" a split bag file, but we do support opening a single file in Foxglove. I think it would be good to get more feedback from users on whether they need this feature.
If users want something, it doesn't mean that we should jump in and implement it without considering other cases and drawbacks. My main point is not that rosbag2 doesn't support playing a split bag, but rather that we need determinism in replay. Basically, we should not inject or copy messages that we didn't receive on the wire, in any case. There are many other cases where transient local durability is used, for instance camera images or LIDAR streams, and what then? Should we copy all messages from a camera topic from the previous file to the next file when doing a split?
My understanding of transient local durability is that there is a user-configured history size, so a split would only copy all messages from the previous file for a topic if the previous file was very short or the history size was unusually large. The default history size is 10, and 5 for the sensor QoS profile.
@jhurliman The way it's organized in DDS and ROS 2 is that the publisher keeps a cache with the last N messages when transient local durability is used, and when a new subscriber joins, it receives those last N messages from the publisher's cache rather than just the single latest message.

The case where transient local durability is used for /tf_static is only one scenario. It could also be used for camera image or LIDAR topics carrying a continuous data stream: a newly created subscriber will get the last N images from the cache and then all following messages from the ongoing stream, since the publisher continuously publishes new messages.

@jhurliman @amacneil After some consideration, I tend to agree that we can implement this feature, but not as default behavior, only as an option enabled via a CLI parameter. Some technical details and considerations on how to implement this feature follow.

Thoughts?
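The late-joiner behavior described above can be sketched with a toy model. The class below is a plain-Python illustration (a hypothetical `TransientLocalPublisher`, not a real DDS or ROS 2 API) of a bounded publisher cache that is replayed to subscribers that join late:

```python
from collections import deque

class TransientLocalPublisher:
    """Toy model of a transient-local publisher (illustrative only).

    Keeps the last `history_depth` messages and replays them to any
    subscriber that joins late, then keeps delivering the live stream.
    """

    def __init__(self, history_depth=10):
        self.cache = deque(maxlen=history_depth)  # bounded history cache
        self.subscribers = []

    def publish(self, msg):
        self.cache.append(msg)      # oldest messages fall off the left end
        for sub in self.subscribers:
            sub.append(msg)         # live delivery to existing subscribers

    def add_subscriber(self):
        sub = list(self.cache)      # late joiner first gets the cached N messages
        self.subscribers.append(sub)
        return sub

pub = TransientLocalPublisher(history_depth=5)
for i in range(12):                 # continuous stream, e.g. camera frames
    pub.publish(f"frame-{i}")

late = pub.add_subscriber()         # joins after 12 messages were published
print(late)                         # only the last 5 cached frames: frame-7 .. frame-11
pub.publish("frame-12")
print(late[-1])                     # then continues receiving the live stream
```

This is why repeating latched topics at a split is not a single-message question: the whole depth-N cache is what a late subscriber would see.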
@amacneil @MichaelOrlov Hi everyone, are there any updates on this feature?
@ga58lar It is unlikely that we will implement this feature, since it contradicts the entire DDS communication approach, would unavoidably introduce a significant performance impact on the split operation, and would cause message drops during recording. I would suggest a workaround instead.
The way I understand this, there are three different cases being considered here:

1. Playing back a whole recording from the beginning.
2. Starting playback from an offset partway into a recording.
3. Playing back a single file from a split recording.

I believe that Case 1 actually already works correctly. When you start playback from the beginning, publishers are created with the appropriate QoS, so if the original topic in the recorded system was transient local, the playback publisher will be as well, and late-joining subscribers will receive the latched messages.

Case 2, on the other hand, would still exhibit the same behavior, but only for messages originally published after the timestamp of the start of playback. To give the publishers the latched information, we could check for latched publishers and do a "burst publish" of their topic from the beginning of the bag up to the starting point, which would have that effect.

Case 3 would prove more difficult, though I don't see it as an engineering impossibility. We might create a separate entry type, an internal "latch-buffer" topic at the beginning of new splits, that is not played back during normal operation but used only for the "burst publish latch priming" when starting playback from that file.

Unfortunately, the History Depth (latching count) of the publisher is not always broadcast such that rosbag2 would know what it is. DDS considers History Depth (as well as Lifespan, but that's less relevant here) to be an internal QoS configuration; other participants don't need to know about it, so it's not guaranteed in every implementation to be conveyed in discovery.

Like @MichaelOrlov brought up, though, the potential extra features I mentioned are not without performance cost.
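The "burst publish" idea for Case 2 can be sketched as follows. This is a hypothetical helper, not a rosbag2 API, and it assumes the set of latched topics and their history depths are somehow known:

```python
def burst_publish_set(messages, latched_depths, start_time):
    """Pick, for each latched topic, the messages that would be "burst
    published" to prime latches when playback starts at `start_time`.

    `messages` is a time-ordered list of (topic, timestamp, data);
    `latched_depths` maps latched topic name -> history depth.
    Illustrative sketch only; names are not rosbag2 APIs.
    """
    burst = {topic: [] for topic in latched_depths}
    for topic, stamp, data in messages:
        if topic in latched_depths and stamp < start_time:
            burst[topic].append((stamp, data))
    # keep only the last `depth` messages per topic, like a DDS history cache
    return {t: msgs[-latched_depths[t]:] for t, msgs in burst.items()}

bag = [
    ("/tf_static", 1, "map->odom"),
    ("/camera", 2, "img0"),
    ("/tf_static", 3, "odom->base"),
    ("/camera", 4, "img1"),
]
print(burst_publish_set(bag, {"/tf_static": 10}, start_time=4))
# both /tf_static messages before t=4 would be replayed before normal playback
```

Everything before the start offset on a latched topic is replayed, trimmed to that topic's depth; non-latched topics like /camera are untouched.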
Hey, I just encountered this issue when trying to visualize split bag files in Foxglove. @amacneil or @MichaelOrlov, I understand that this is WIP, but have you found a workaround I could use at the moment, either on the Foxglove or the rosbag2 side? The only options for me are to publish /tf_static more frequently, which defeats its purpose, or to move my static TFs to /tf and publish at a high rate, which is a waste of resources. Thanks. Maybe @ga58lar you have an idea as well? P.S.: @MichaelOrlov your proposed workaround would unfortunately not work in my use case.
@tonynajjar Unfortunately, I have not found a really good solution that is not subject to the criticism raised above.
https://github.com/ros/robot_state_publisher/blob/ros2/src/robot_state_publisher.cpp#L151
https://github.com/ros/robot_state_publisher/blob/ros2/src/robot_state_publisher.cpp#L364
It would be better though to subscribe to …
Thanks for sharing @ga58lar, it does look like an okay workaround when all your static transforms come from the robot_state_publisher; unfortunately that's not the case for me, as they come from at least 3 different sources. I find it quite dirty to enable regular republishing of /tf_static from all the sources. :/
For anyone wanting to implement this in their own fork in the future: I implemented it on Humble in angsa-robotics#1 (inspired by #1051). I don't like maintaining a long-term fork, but I ran out of options. @MichaelOrlov I understand why you don't want this to be the default behavior, but I think having the option to enable it is not so bad.
As @tonynajjar mentioned, this feature is very useful for analyzing single split files, either with static analysis (e.g. in Foxglove) or even with playback. By embedding transient local topics in each split, playing back each split appears as if the recorder were just a late subscriber at the time of the split start. So while this feature is not exactly correct, it is incredibly useful, and I think it makes sense to have it as an opt-in feature.
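A minimal sketch of this opt-in split behavior, assuming the per-topic history depths are known. The function name and data shapes are hypothetical; this is not the rosbag2 implementation:

```python
from collections import deque

def split_with_latched_carryover(messages, latched_depths, split_size):
    """Toy sketch: when a bag is split, copy the cached last-N messages of
    each transient-local topic to the front of the next file, so each file
    plays back as if the recorder were a late subscriber at the split point.

    `messages` is a list of (topic, data); `latched_depths` maps latched
    topic name -> history depth; `split_size` is new messages per file.
    """
    caches = {t: deque(maxlen=d) for t, d in latched_depths.items()}
    files, carry, current = [], [], []
    for topic, data in messages:
        if topic in caches:
            caches[topic].append((topic, data))   # mimic the publisher cache
        current.append((topic, data))
        if len(current) - len(carry) >= split_size:
            files.append(current)
            carry = [m for c in caches.values() for m in c]
            current = list(carry)                 # next file starts with the carryover
    if len(current) > len(carry):                 # flush only if new messages arrived
        files.append(current)
    return files

msgs = [("/tf_static", "map->base"), ("/camera", "img0"),
        ("/camera", "img1"), ("/camera", "img2")]
files = split_with_latched_carryover(msgs, {"/tf_static": 1}, split_size=2)
print(files)   # the second file begins with a copy of the /tf_static message
```

Each file after the first starts with the latched-topic cache duplicated, which is exactly the storage overhead and the non-determinism @MichaelOrlov objects to, and why it should stay opt-in.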
@Aposhian @tonynajjar OK, it looks like this paints a picture for the feature design proposal.
@MichaelOrlov do you have any issues with @tonynajjar's implementation? https://github.com/angsa-robotics/rosbag2/pull/1/files
@tonynajjar @Aposhian We just need to rewrite it to adhere to the plan proposed in #1159 (comment).
One issue I have with this proposal is that the queue depth could be different per topic.
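The per-topic depth concern can be made concrete with a small sketch. The helper below is hypothetical; it just compares how many messages would be copied at a split under a single global depth versus per-topic depths:

```python
def carryover_counts(cache_sizes, depth_per_topic=None, global_depth=10):
    """How many messages would be copied at a split for each topic.

    `cache_sizes` maps topic -> messages available so far; a topic's depth
    comes from `depth_per_topic` when given, else the single global default.
    Illustrative only, not a rosbag2 API.
    """
    return {
        topic: min(n, (depth_per_topic or {}).get(topic, global_depth))
        for topic, n in cache_sizes.items()
    }

recorded = {"/tf_static": 2, "/camera": 500}    # messages seen so far per topic
print(carryover_counts(recorded))               # one global depth for everything
print(carryover_counts(recorded, {"/tf_static": 100, "/camera": 5}))
```

With one global depth, a high-rate sensor topic gets 10 messages duplicated into every split even though its publisher may only latch 5, while a depth configured per topic matches each publisher's actual history.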
@emersonknapp You mentioned in today's tooling WG meeting that there is a way to know the publisher queue size, but not all DDS implementations support it.
@MichaelOrlov the History Depth is the number of messages that get persisted in the queue and re-sent to new participants when Transient Local Durability is used.

In at least one of the DDS implementations, this History Depth is filled in correctly when we discover the reported QoS in get endpoint info by topic: https://github.com/ros2/rosbag2/blob/rolling/rosbag2_transport%2Fsrc%2Frosbag2_transport%2Frecorder.cpp#L697 In the other one, the History Depth is always filled with the default value (I think that's 10). I can't remember right now which of CycloneDDS vs FastDDS does which.

If I remember correctly, the DDS spec doesn't dictate this behavior, because History Depth does not affect matching compatibility, so it's not required to be reported. I could be wrong about that, in which case this is a bug in the one implementation.
I traced it down: all 3 DDS rmw layers (CycloneDDS, FastDDS, and ConnextDDS) use the graph cache from the rmw_dds_common package.
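The failure mode described in the last two comments can be illustrated with a small sketch: when discovery data does not carry the writer's history depth, a graph cache can only substitute a default. The function and field names below are hypothetical, not the rmw_dds_common API:

```python
DEFAULT_HISTORY_DEPTH = 10  # assumed fallback, mirroring the "always 10" observation above

def resolve_history_depth(discovered_qos):
    """Resolve a publisher's history depth from discovery data.

    `discovered_qos` is a dict of QoS fields seen in discovery; since DDS
    is not required to convey History Depth, the field may be missing, in
    which case only a default can be reported. Illustrative sketch only.
    """
    depth = discovered_qos.get("history_depth")
    return depth if depth is not None else DEFAULT_HISTORY_DEPTH

# one implementation reports the real depth, another omits it entirely
print(resolve_history_depth({"durability": "transient_local", "history_depth": 5}))
print(resolve_history_depth({"durability": "transient_local"}))
```

The second call cannot distinguish "the depth really is 10" from "the depth was never reported", which is exactly why rosbag2 cannot reliably know how many messages a latched publisher would replay.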
Okay. I've tried to research the topic of the publisher history depth and how it gets into the graph cache. As a follow-up, we will need to write a simple unit test with a publisher and a subscriber node and a call to the get endpoint info by topic API, to verify which history depth is actually reported.
cc: @mjcarroll However, the problem is that almost everybody wants this feature ASAP.
Description
ROS 1 supported a --repeat-latched feature (ros/ros_comm#1850) to repeat latched topics at the start of each bag split. As far as I can tell, rosbag2 does not have this feature. Is there any reason we should not add it?