Automatic/adaptive subscription QoS #1868

Open
emersonknapp opened this issue Jan 19, 2022 · 7 comments
@emersonknapp
Collaborator

emersonknapp commented Jan 19, 2022

Feature request

Feature description

Provide the ability for a subscription to automatically choose a QoS based on the detected QoS of publishers in the graph. Most of the time, for the business logic of robot applications, we want to explicitly specify QoS policies to clearly lay out communication behavior requirements.

However, some tooling and infrastructure by design does not care about the exact behavior and just wants to match with publishers no matter what, at as high a level of fidelity/service as possible. Three use cases come to mind up front:

  • rosbag2 - wants to subscribe to all publishers regardless of QoS, to record all traffic possible
  • ros2cli - ros2 topic echo in my view should always start printing messages, if there is a publisher providing them, regardless of QoS mismatch
  • topic_tools - these generic tools like relay, throttle and mux generally want to receive messages from publishers and imitate those publishers' behaviors, with modifications

This feature API might be:

  • A special Subscription subclass AdaptiveQoSSubscription that provides this behavior
  • A SubscriptionOptions value that turns this on
  • A special QoS profile object AdaptiveSubscriptionQoSProfile that acts as a flag to enable the behavior

Behavior considerations:

  • Does the subscription keep watching for new publishers and re-adapt if necessary? Or does it only adapt to publishers available at subscription time? Probably the former.
  • A subscription could use a "maximally lax"/"minimally strict" QoS profile request such that it matches with all possible publishers, but a more desirable behavior is to get the highest QoS possible - such that if a publisher is reliable with some durability, those latched messages are received, and reliable communication is used.
  • When there are multiple publishers on a topic, offering different QoS profiles, that's the most difficult case. rosbag2 uses adaptive logic (see links below) to craft a single QoS request that will match all publishers. This may downgrade the quality of service from one higher-qos publisher to match with a lower-qos one as well. This seems to be the only tradeoff we can make, given that we can't subscribe to a specific publisher individually - but I am open to alternative suggestions.
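
The multi-publisher trade-off in the last bullet can be sketched in plain Python. This is a minimal illustration and not the rosbag2 code: the enums, the `QoS` dataclass, and `adapt_request_to_offers` are hypothetical stand-ins, and only reliability and durability are modeled. The rule assumed here is to request the stronger policy only when every publisher offers it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class Reliability(Enum):
    BEST_EFFORT = 1
    RELIABLE = 2


class Durability(Enum):
    VOLATILE = 1
    TRANSIENT_LOCAL = 2


@dataclass(frozen=True)
class QoS:
    reliability: Reliability
    durability: Durability


def adapt_request_to_offers(offers: Iterable[QoS]) -> QoS:
    """Build one subscription request that matches every offered profile."""
    offers = list(offers)
    if not offers:
        # Assumption: with no publishers visible, fall back to the laxest request.
        return QoS(Reliability.BEST_EFFORT, Durability.VOLATILE)
    all_reliable = all(o.reliability is Reliability.RELIABLE for o in offers)
    all_transient = all(o.durability is Durability.TRANSIENT_LOCAL for o in offers)
    return QoS(
        Reliability.RELIABLE if all_reliable else Reliability.BEST_EFFORT,
        Durability.TRANSIENT_LOCAL if all_transient else Durability.VOLATILE,
    )
```

With a single reliable, transient-local publisher the request keeps both policies. Adding one best-effort, volatile publisher downgrades the request for everyone, which is the trade-off described above.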

Implementation considerations

Will need this in both rclpy and rclcpp. Python is necessary to expose to ros2cli and some topic_tools. Given that, maybe this feature belongs in rcl or maybe even in some RMW utility package.

There are two current implementations that I know of for this feature, in rosbag2 and in topic_tools. Using those as a starting point for a "core" implementation makes sense to me.

See https://github.com/ros2/rosbag2/blob/master/rosbag2_transport/src/rosbag2_transport/qos.cpp for rosbag2 logic - those tools used around https://github.com/ros2/rosbag2/blob/master/rosbag2_transport/src/rosbag2_transport/recorder.cpp#L238

@jacobperron
Member

As another pointer, there's another implementation of this in domain_bridge (inspired by the rosbag2 logic).

Here are other threads discussing this feature:

I think it makes sense to have two functions in rcl (and maybe implemented in rmw): one for getting the best possible QoS for a given publisher and a second function for a given subscription. We can then provide rclcpp and rclpy wrappers for the rcl functions.

@emersonknapp I think these functions could easily satisfy your second and third behavioral considerations: get the highest possible QoS and account for multiple entities with different QoS. However, doing something like re-adapting to new pub/subs on the graph sounds difficult.

We considered re-adapting QoS during runtime for the domain_bridge, but it is not clear how to achieve this without recreating the publisher or subscription in question. If we recreate a subscription, then it becomes difficult to deal with messages that may be lost (or duplicated) during the transition. I'd like to hear from others about how this kind of feature might best be achieved. IMO, it seems like it is not worth the effort.

@fujitatomoya
Collaborator

@jacobperron

However, doing something like re-adapting to new pub/subs on the graph sounds difficult.
but it is not clear how to achieve this without recreating the publisher or subscription in question.

I think we would need a new RMW interface to reset the QoS of a publisher or subscriber at runtime to achieve this re-adapting of QoS. If I am not mistaken, it can be done with DDS. CC: @MiguelCompany @eboasson

@eboasson
Contributor

You're opening a can of worms here ... the very short answer is that in DDS it is impossible to make a data reader that will "reliably" get data from all writers for the same topic without getting duplicates:

  • The "ownership" would require having one for "shared" and one for "exclusive" ownership. Fortunately, "ownership" didn't make its way into ROS 2 (a good thing and not just because of this).
  • If best-effort writers exist, the reader would have to be best-effort to get the data. But if there are also reliable writers, then you'd need to additionally have a reliable reader to make sure you get everything you should be getting. The vast majority (perhaps even all) of the data would then also be visible in the best-effort reader.
  • The volatile/transient-local distinction gives you similar problems: the volatile reader would get all the data, except for the historical data from the transient-local writer. So you'd have to create a transient-local reader just to get the historical data, and then you get all newly published transient-local data twice.
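
The request-offered constraint behind these bullets can be illustrated with a small sketch. It assumes the usual RxO rule that a reader matches a writer only when it requests no more than the writer offers, and it models just reliability and durability; the enums and the `matches` helper are hypothetical names, not a DDS or rmw API.

```python
from enum import IntEnum
from itertools import product


class Reliability(IntEnum):
    BEST_EFFORT = 1
    RELIABLE = 2


class Durability(IntEnum):
    VOLATILE = 1
    TRANSIENT_LOCAL = 2


def matches(reader, writer):
    """RxO rule: a reader matches only if it requests no more than the writer offers."""
    (r_rel, r_dur), (w_rel, w_dur) = reader, writer
    return r_rel <= w_rel and r_dur <= w_dur


# Every (reliability, durability) combination a reader or writer could use.
ALL_PROFILES = list(product(Reliability, Durability))

# Readers that match every possible writer profile.
universal_readers = [
    reader for reader in ALL_PROFILES
    if all(matches(reader, writer) for writer in ALL_PROFILES)
]
```

Under these assumptions the only reader that matches every writer is best-effort and volatile, so a single reader that sees everything necessarily gives up reliable delivery and historical data.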

Furthermore, some of the QoS settings that do this problematic request-offered (RxO) matching and that are supported by ROS 2 cannot be changed: reliability, durability and liveliness are fixed at creation time of a reader/writer. If you really want to go ahead with this, I'd suggest recreating the publisher/subscription, make sure that the new one is discovered before deleting the old one and deal with any double arrivals separately.
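
Dealing with the double arrivals during such a handover might look like the sketch below. It assumes every message carries a publisher identifier and a per-publisher sequence number; the `Deduplicator` class and both parameter names are hypothetical, and whether usable sequence numbers are available depends on the RMW implementation.

```python
from collections import defaultdict


class Deduplicator:
    """Filter out messages already delivered through the other subscription."""

    def __init__(self):
        # Sequence numbers already delivered, per publisher id.
        # Assumption: this is only kept for the short handover window,
        # so the sets are left unbounded in this sketch.
        self._seen = defaultdict(set)

    def accept(self, publisher_id, sequence_number):
        """Return True if the message is new and should be delivered."""
        seen = self._seen[publisher_id]
        if sequence_number in seen:
            return False  # already delivered by the old or the new subscription
        seen.add(sequence_number)
        return True
```

Both the old and the new subscription callback would pass messages through the same `Deduplicator` instance while they overlap. Tracking a set rather than only the highest sequence number lets a message that the old best-effort reader lost still be accepted when the new reliable reader delivers it later.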

The only alternative would be DDS spec changes or vendor-specific extensions to implement different matching rules, which could be done in such a way that there'd be no need to adapt anything. Sadly, vendor-specific will get messy if interoperability is desired and the spec changes more slowly than a glacier melts in an ice age ...

@fujitatomoya
Collaborator

@eboasson thanks for your quick and informative comments.

I'd suggest recreating the publisher/subscription, make sure that the new one is discovered before deleting the old one and deal with any double arrivals separately.

i see. at least i do not have any other ways to achieve this. and it seems like it is not worth the effort...

@emersonknapp
Collaborator Author

@jacobperron (and others) I think it makes sense to provide the utilities to adapt a subscription to a known set of publisher qos profiles. We could skip re-adapting to newly discovered publishers, in the requirements for this particular ticket - and revisit that as its own feature, if desired, later.

@akash-roboticist

Assuming that publishers exist at subscription creation time and that we do not adapt to new publishers, here are a couple of approaches. My thought was to add the detection logic in rcl and update, or add a wrapper to, the subscription initialization interface rcl_subscription_init() so that it uses this logic and adapts to the publisher QoS.

Approach 1

  • Add an additional rmw_qos_profile_t (e.g. rmw_qos_profile_autodetect) to rmw/qos_profiles.h; passing it to rcl_subscription_init() would trigger the publisher QoS lookup logic (which would in turn use rcl_get_publishers_info_by_topic()).
  • On determining the QoS, the subscriber would be set with the appropriate QoS value.

Pros:

  • The bulk of the changes would be to rcl (for the logic), with the autodetect profile added to rmw (qos_profiles.h) and corresponding updates to rclcpp (qos.hpp) and rclpy (qos.py), enabling users to call rcl_subscription_init() using existing methods even while invoking this additional autodetection feature.

Cons:

  • Extending rmw_qos_profile_t seems like overkill for something that can be accomplished by passing just a boolean flag (to trigger the QoS detection logic).

Approach 2

  • Write a new subscription initialization function explicitly for this purpose, e.g. rcl_adaptive_qos_subscription_init(), which would essentially be a wrapper around rcl_subscription_init() that does not need the subscription options.
  • Instead, it first runs the QoS lookup logic and then uses the result to call the original function rcl_subscription_init().

Pros:

  • No change to currently in-use methods
  • Opens up for easier exploration in the future.
  • No edits to QoS profiles

Cons:

  • Additions needed to rclcpp and rclpy to be able to make use of this new method.
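
The difference between the two approaches can be sketched side by side in plain Python. Everything here is a hypothetical stand-in: `AUTODETECT` plays the role of the proposed rmw_qos_profile_autodetect, the `graph` dictionary stands in for a lookup such as rcl_get_publishers_info_by_topic(), and only reliability is modeled. None of these names are real rcl or rmw symbols.

```python
# Hypothetical stand-ins: none of these names are real rcl or rmw symbols.
AUTODETECT = object()  # plays the role of the proposed rmw_qos_profile_autodetect


def detect_reliability(offers):
    """Request 'reliable' only if at least one publisher exists and all are reliable."""
    return "reliable" if offers and all(o == "reliable" for o in offers) else "best_effort"


def subscription_init(topic, reliability, graph):
    """Approach 1: the existing entry point resolves the sentinel profile itself."""
    if reliability is AUTODETECT:
        reliability = detect_reliability(graph.get(topic, []))
    return {"topic": topic, "reliability": reliability}


def adaptive_qos_subscription_init(topic, graph):
    """Approach 2: a separate wrapper detects first, then calls the ordinary init."""
    detected = detect_reliability(graph.get(topic, []))
    return subscription_init(topic, detected, graph)
```

For example, with `graph = {"/scan": ["best_effort", "reliable"]}`, both `subscription_init("/scan", AUTODETECT, graph)` and `adaptive_qos_subscription_init("/scan", graph)` produce a best-effort subscription. The approaches differ only in where the caller opts in: through a special profile value or through a separate function.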

@jacobperron
Member

jacobperron commented Apr 19, 2022

After some iteration, I've proposed adding new policy enum values at the RMW layer for selecting the 'best available' policy (i.e. the policy that will match the most endpoints while maintaining the highest level of service). Feedback is welcome: ros2/rmw#320

This delegates to the middleware for how to implement the matching logic. For DDS implementations, we provide a common implementation that will query discovered endpoints at the time of creating a subscription/publisher and use that information to adapt the QoS policies: ros2/rmw_dds_common#60

6 participants