Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Design] Add design on recording and replay service #1359

Merged
merged 8 commits into from
Jul 11, 2023

Conversation

Barry-Xu-2018
Copy link
Contributor

Related to #773

@Barry-Xu-2018 Barry-Xu-2018 requested a review from a team as a code owner May 29, 2023 03:06
@Barry-Xu-2018 Barry-Xu-2018 requested review from gbiggs and james-rms and removed request for a team May 29, 2023 03:06
Copy link
Contributor

@fujitatomoya fujitatomoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Barry-Xu-2018 i have several comments.

@ihasdapie @emersonknapp thanks for sharing the status, since we have been away from rosbag2 activity for a while, it would be really appreciated if you can share details and thoughts on this.

CC: @MichaelOrlov

docs/design/rosbag2_record_replay_service.md Show resolved Hide resolved
docs/design/rosbag2_record_replay_service.md Outdated Show resolved Hide resolved
docs/design/rosbag2_record_replay_service.md Show resolved Hide resolved
docs/design/rosbag2_record_replay_service.md Outdated Show resolved Hide resolved
docs/design/rosbag2_record_replay_service.md Outdated Show resolved Hide resolved
Copy link
Member

@ihasdapie ihasdapie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few comments. I think some additional opinions on playback behaviour should be considered before doing anything concrete

docs/design/rosbag2_record_replay_service.md Show resolved Hide resolved
docs/design/rosbag2_record_replay_service.md Outdated Show resolved Hide resolved
docs/design/rosbag2_record_replay_service.md Outdated Show resolved Hide resolved
docs/design/rosbag2_record_replay_service.md Show resolved Hide resolved
docs/design/rosbag2_record_replay_service.md Outdated Show resolved Hide resolved
@MichaelOrlov MichaelOrlov requested a review from emersonknapp June 1, 2023 03:29
@Barry-Xu-2018 Barry-Xu-2018 changed the title Add design on recording and replay service [Design] Add design on recording and replay service Jun 1, 2023
@Barry-Xu-2018
Copy link
Contributor Author

While recording finish, a YAML file will be created to save some information for rosbag info.
After the above discussion, we want to provide the number of request/response for service. To get this information, we must deserialize the service event message and parse the content to get request/response info.
In order to keep record performance, I think deserialization should not be processed during record phase.

So, after recording, created YAML file only save service name without request/response info. While executing rosbag info command, record file will be scanned (deserialization and parse content for service event message) and get the number of request/response. And then update YAML file. That is, the next time you execute "rosbag info," you will have complete information without the need to scan the recording file again.

Of course, this approach would result in users not being able to obtain the complete information of services when they immediately view the YAML file after recording.

If you have any better suggestions, please share them with me.

@MichaelOrlov
Copy link
Contributor

@Barry-Xu-2018 Thanks for initiating discussion about service introspection recording and replay.
I have added item about it in agneda of the next Rosbag2 WG meeteng https://docs.google.com/document/d/1Dsg_9XZQPhihpKQGQWMYTz2doGH4P2cAaNqr60cuNgw/edit#heading=h.v4jyd0vn7d0k which is scheduled on tomorrow Friday 9:00 PST.
I would be glad to have a live discussion about it.
If timing is not good for you please let me know, perhaps we can reschedule meeting on the Monday afternoon PST.

Meanwhile I have a few comments in regards to the design in general.

  1. I agree that deserilizing and trying to parse any messages during recording is likely will cost in the performance penalty and eventually may lead to the highher probability of the messages drops.
  2. Updating YAML file after calling ros2 bag2 info command looks doable but not very nice by design. First of all it might be turns out that there are no message definition for the service type which is going to be deserialized on the machine where calling ros2 bag info command. We can do writing a new yaml file and deleting the old one. But the tricky part is that metadata.yaml file duplicatied inside bag file and we can't change it at least for MCAP backend.
  3. Many questions arise for replay case. How to replay servcies? As far as I understand from the design doc it will be desirable to replay service requests. But how to implement this?
    We simply don't have API for this, because service requests was wrapped in to the service event and we store it in the serialized form.

@Barry-Xu-2018
Copy link
Contributor Author

@MichaelOrlov

If timing is not good for you please let me know, perhaps we can reschedule meeting on the Monday afternoon PST.

It is 12:00 am Saturday in my local time.
I think I can attend this meeting.

@Barry-Xu-2018
Copy link
Contributor Author

  1. Updating YAML file after calling ros2 bag2 info command looks doable but not very nice by design. First of all it might be turns out that there are no message definition for the service type which is going to be deserialized on the machine where calling ros2 bag info command.

Service type can be gotten while recording phase (according to service event type). So it can be stored to metadata.yaml file like service name. Besides, service type can be gotten from MCAP file.

We can do writing a new yaml file and deleting the old one. But the tricky part is that metadata.yaml file duplicatied inside bag file and we can't change it at least for MCAP backend.

Do you think the information in the metadata.yaml file and the mcap file is different after only modifying metadata.yaml file?
In mcap file, it recorded service event message. In metadata.yaml, we want to show the request/response information which is gotten by analyzing service event message. So I think the information isn't different.

  1. Many questions arise for replay case. How to replay servcies? As far as I understand from the design doc it will be desirable to replay service requests. But how to implement this?
    We simply don't have API for this, because service requests was wrapped in to the service event and we store it in the serialized form.

Yes. There is no API.
Now I am considering to use type introspection information to pickup request contents from service event.

@MichaelOrlov
Copy link
Contributor

@Barry-Xu-2018 Instead of modifying metadata.yaml file after parsing service event messages in ros2 bag info I want to suggest adding --verbose CLI option.
The idea is that regular ros2 bag info will not parse messages and will output statistics about service events.
While ros2 bag info with --verbose key will do parsing of the service events and provide an extended version of the statistics with a breakout on the number of service requests/responses. And we will not update metadata.yaml for consistency.

@Barry-Xu-2018
Copy link
Contributor Author

@MichaelOrlov

Instead of modifying metadata.yaml file after parsing service event messages in ros2 bag info I want to suggest adding --verbose CLI option.
The idea is that regular ros2 bag info will not parse messages and will output statistics about service events.
While ros2 bag info with --verbose key will do parsing of the service events and provide an extended version of the statistics with a breakout on the number of service requests/responses. And we will not update metadata.yaml for consistency.

Totally agree. Thanks.
There is a little bad user experience for this scenario. User want to list detailed info more than once for a big record file. Parsing the number of request/response take a little more time when dealing with large recording file. Due to the absence of saved parsing information, it will take time each time to display detailed information.

@Barry-Xu-2018
Copy link
Contributor Author

Barry-Xu-2018 commented Jun 2, 2023

BTW, there is a quick way to parsing the number of request/response information.

Use AddTwoInts demo as an example.
About AddTwoInts_Event

struct AddTwoInts_Event_ {
...
  using _info_type =
    service_msgs::msg::ServiceEventInfo_<ContainerAllocator>;
  _info_type info;
  using _request_type =
    rosidl_runtime_cpp::BoundedVector<example_interfaces::srv::AddTwoInts_Request_<ContainerAllocator>, 1, typename std::allocator_traits<ContainerAllocator>::template rebind_alloc<example_interfaces::srv::AddTwoInts_Request_<ContainerAllocator>>>;
  _request_type request;
  using _response_type =
    rosidl_runtime_cpp::BoundedVector<example_interfaces::srv::AddTwoInts_Response_<ContainerAllocator>, 1, typename std::allocator_traits<ContainerAllocator>::template rebind_alloc<example_interfaces::srv::AddTwoInts_Response_<ContainerAllocator>>>;
  _response_type response;
...
}

The information regarding the request and response can be obtained from service_msgs::msg::ServiceEventInfo_.
Deserializing request and response is unnecessary.
ServiceEventInfo is the first item in AddTwoInts_Event.
So, regardless of how the request and response are defined, we only need to deserialize ServiceEventInfo instead of all event message.
I had tried the below code, it can correctly get event type from service event topic.

auto message_info = service_msgs::msg::ServiceEventInfo();
const rosidl_message_type_support_t * type_support_info =
    rosidl_typesupport_cpp::get_message_type_support_handle<service_msgs::msg::ServiceEventInfo>();
// `info` member is located in the first item in `example_interfaces::srv::AddTwoInts_Event`
ret = rmw_deserialize(
   &(serialized_message->get_rcl_serialized_message()),
   type_support_info,
   &message_info);
 if (ret == RMW_RET_OK) {
          RCLCPP_INFO(this->get_logger(), "Receive an event message, rmw_deserialize with `ServiceEventInfo`, type is: '%d'", message_info.event_type);
  } else {
     RCLCPP_INFO(this->get_logger(), "failed to rmw_deserialize");
  }

Since ServiceEventInfo has a fewer items (that is, Deserialization is fast), it is possible to get the number of request/response while recording.

# The type of event this message represents
uint8 event_type

# Timestamp for when the event occurred (sent or received time)
builtin_interfaces/Time stamp

# Unique identifier for the client that sent the service request
# Note, this is only unique for the current session
unique_identifier_msgs/UUID client_id

# Sequence number for the request
# Combined with the client ID, this creates a unique ID for the service transaction
int64 sequence_number

@fujitatomoya
Copy link
Contributor

@MichaelOrlov @Barry-Xu-2018 I cannot make it today's call, so if this takes place i will catch up with you later, thanks!

@MichaelOrlov
Copy link
Contributor

@Barry-Xu-2018 Am i understand correctly that in

ret = rmw_deserialize(
   &(serialized_message->get_rcl_serialized_message()),
   type_support_info,
   &message_info);

serialized_message->get_rcl_serialized_message()) - is supposed to be a recorded message with service event?

@Barry-Xu-2018
Copy link
Contributor Author

serialized_message->get_rcl_serialized_message()) - is supposed to be a recorded message with service event?

Yes. You are right.

@MichaelOrlov
Copy link
Contributor

@fujitatomoya @emersonknapp @ihasdapie We met with @Barry-Xu-2018 on today's Rosbag2 and Tooling WG meeting and come to the conclusion about the following:

  1. Decided to provide --verbose parameter for ros2 bag info and don’t update metadata.yaml file in regards to the number of service requests and responses when running info with the verbose option. Will parse service event messages each time when calling info with the --verbose option. It shouldn't take significant time even for the large bags since usually there are not too many service events and we will read out and skip most of the recorded messages and it's going to be pretty fast.
  2. Decided that --include-hidden-topics will record service events as well, since it is just a hidden topic.
  3. To be decided yet about if ros2 record --service without service names should record all service events. On one hand it would be user-friendly to record all services if no service names are mentioned explicitly, however from another hand it's not straightforward to implement since the parameter for the services is going to be a vector of strings with names - and an empty vector will be equal to the situation when no --service parameter at all.
    I am more inclined to not record all services if there are no service names after the --service parameter. Users will be able to record all service events with the --include-hidden-topics parameter or perhaps --all parameter.
  4. --all CLI ? To be decided how --all and --service and --include-hidden-topics will coexist together. Need some time for consideration and discussion.
    On one hand from a consistency point of view I am inclined to the option when --all will record only topics as before and will not record service events since they are just a hidden topics and we have --include-hidden-topics for them. However from another hand, if look from the user-friendliness point of view probably the user will expect that --all will record service events since there exists --service parameter and the user may not be aware that "service" in this case is just a specific hidden topic.
    Open to feedback and opinion about this item.
  5. We are not able to replay dedicated service requests due to the lack of the API currently because service requests are wrapped inside the service event and they are serialized. We decided that possibility of playback service requests needs to be further investigated and prototyped before we will make a final decision about including this option in the design.
    @Barry-Xu-2018 volunteer to try prototyping this option.

P.S. Your feedback and opinion are welcome.

@fujitatomoya
Copy link
Contributor

@MichaelOrlov @Barry-Xu-2018 thanks for leading this.

Decided to provide --verbose parameter for ros2 bag info and don’t update metadata.yaml

agree on not having it update metadata.yaml, and this behavior.

(just a comment) --verbose seems to be general option for topics as well, so it should be described what it is for with help message.

Decided that --include-hidden-topics will record service events as well

agree. if this option applies, it should collect all hidden topics.

I am more inclined to not record all services if there are no service names after the --service parameter.
Users will be able to record all service events with the --include-hidden-topics parameter or perhaps --all parameter.

IMO,

  • --service is the option to explicitly specify the target service. if that is NULL, that returns as invalid argument.
  • --all to record all topics and services. as developer point of view, it is understandable that service_events is hidden topic, but user. probably user expects that service is just different normal entity as well as topics when we support recording services feature.
  • if --all is specified, that prevails to --service to record all services as well.

We decided that possibility of playback service requests needs to be further investigated and prototyped before we will make a final decision about including this option in the design.

sounds good.

@Barry-Xu-2018
Copy link
Contributor Author

@fujitatomoya @MichaelOrlov

About record parameter for service,I list them based on current discussion result.

Parameter Description
--include-hidden-topics Record all hidden topics. Include service event topic.
--all Record all topics and service event topic. Other hidden topics are excluded.
--services ServiceName [ServiceName ...] Record service (service event topic) with specified service name.

If users want to only record all services, there is no single parameter that can fulfill this requirement. The closest parameter to the recent requirement is "--include-hidden-topics."

@MichaelOrlov
Copy link
Contributor

In this case, it will arise the same question if the user will want to record only all topics without services, there is no single parameter that can fulfill this requirement.

@Barry-Xu-2018
Copy link
Contributor Author

In this case, it will arise the same question if the user will want to record only all topics without services, there is no single parameter that can fulfill this requirement.

Yes.
Maybe should add 2 new parameters for all. What do you think ?

Parameter Description
--all Record all topics and service event topic. Other hidden topics are excluded.
--all-topics Only record all topics. Hidden topic are excluded.
--all-services Only record all service event topics.

@MichaelOrlov
Copy link
Contributor

Maybe should add 2 new parameters for all. What do you think ?

Adding --all-topics and --all-services along with --all sounds logically reasonable. 👍

@Barry-Xu-2018
Copy link
Contributor Author

Barry-Xu-2018 commented Jun 9, 2023

We are not able to replay dedicated service requests due to the lack of the API currently because service requests are wrapped inside the service event and they are serialized. We decided that possibility of playback service requests needs to be further investigated and prototyped before we will make a final decision about including this option in the design.

How to unwrap request part in service event message.
(I had written code to verify this way.)

Assume we know the below information

  1. service event topic --- /add_two_ints/_service_event
  2. service event type --- example_interfaces/srv/AddTwoInts_Event
// Use type support introspection 
std::shared_ptr<rcpputils::SharedLibrary>    ts_lib = rclcpp::get_typesupport_library(
      "example_interfaces/srv/AddTwoInts_Event", "rosidl_typesupport_cpp");
    
const rosidl_message_type_support_t *   ts = rclcpp::get_typesupport_handle(
      "example_interfaces/srv/AddTwoInts_Event",
      "rosidl_typesupport_cpp",
      *ts_lib);

const rosidl_message_type_support_t *  ts_handle = get_message_typesupport_handle(
      ts,
      rosidl_typesupport_introspection_cpp::typesupport_identifier);
    
const rosidl_typesupport_introspection_cpp::MessageMembers *  message_members = (rosidl_typesupport_introspection_cpp::MessageMembers *)ts_handle->data;

// Allocate memory for deseriliazed message
void * ros_message = (void *)malloc(message_members->size_of_);
      message_members->init_function(
ros_message, rosidl_runtime_cpp::MessageInitialization::ZERO);

// serialized_msg_ptr (std::shared_ptr<rcutils_uint8_array_t>) points to one service event seriliazed message

// Deserialize service event message
rmw_deserialize(serialized_msg_ptr.get(), ts, ros_message);

// Now ros_message point to all deseriliazed service topic event

// Get the offset part of request
// Now the size of ServiceEventInfo is fixed value. So the offset can be only calculated once.
//
// The strcut of service event message 
// {
//    ServiceEventInfo info
//    example_interfaces::srv::AddTwoInts_Request_ request
//    example_interfaces::srv::AddTwoInts_Response_ response
// }
// 
// Service Event Info structure
// {
//   uint8 event_type
//   builtin_interfaces/Time stamp
//   char[16] client_gid
//   int64 sequence_number
// }

size_t request_offset = 0;
size_t request_offset_tmp = 0;
  for(size_t i=0; i < message_members->member_count_; i++) {
    auto message_member = message_members->members_[i];
    if (message_member.name_ == std::string("request")) {
      request_offset = request_offset_tmp;
    } else {
      auto sub_members = (rosidl_typesupport_introspection_cpp::MessageMembers *)message_member.members_->data;
      request_offset_tmp += sub_members->size_of_;
    }
   if (request_offset != 0) {
      break;
    }
  }

// Create node  
auto node = std::make_shared<rclcpp::Node>("send_request_topic");

rclcpp::QoS qos = rclcpp::ServicesQoS();
rcl_client_options_t options = rcl_client_get_default_options();
options.qos = qos.get_rmw_qos_profile();

// Create client
// GenericClient should be new created. It can accept service type string to create client
// async_send_request() can accept void * as input message.
auto client = std::make_shared<rclcpp::GenericClient>(
    &*node->get_node_base_interface(),
     node->get_node_graph_interface(),
    "add_two_ints",   // Get from service event topic
    "example_interfaces/srv/AddTwoInts",  // Get from service event type
    options);

 // Check if service is ready
 while (!client->service_is_ready()) {
     std::printf("wait for service ready !\n");
     std::this_thread::sleep_for(std::chrono::seconds(1));
 }

// Send request message
// Input data will be serilized according to type example_interfaces::srv::AddTwoInts_Request_. 
 client->async_send_request((void *)((uint8_t *)ros_message + request_offset));

The definition of class GenericClient likes

namespace rclcpp {
class GenericClient : public ClientBase {
public:  
  GenericClient(
    rclcpp::node_interfaces::NodeBaseInterface * node_base,
    rclcpp::node_interfaces::NodeGraphInterface::SharedPtr node_graph,
    const std::string & service_name,
    const std::string & service_type,
    rcl_client_options_t & client_options)
  : ClientBase(node_base, node_graph)
  {
    ts_lib_ = rclcpp::get_typesupport_library(
      service_type, "rosidl_typesupport_cpp");
    
    // Now rclcpp::get_typesupport_handle only support message.  
    // We should make it support service
    auto srv_type_support = rclcpp::get_typesupport_handle_srv(
      service_type,
      "rosidl_typesupport_cpp",
      *ts_lib_);
    
    srv_type_support_handle_ = get_service_typesupport_handle(srv_type_support, "rosidl_typesupport_cpp");

    auto request_typesupport = srv_type_support_handle_->request_typesupport;
    request_members_ = (rosidl_typesupport_introspection_cpp::MessageMembers *)request_typesupport->data;

    auto response_typesupport = srv_type_support_handle_->response_typesupport;
    response_members_ = (rosidl_typesupport_introspection_cpp::MessageMembers *)response_typesupport->data;

    rcl_ret_t ret = rcl_client_init(
      this->get_client_handle().get(),
      this->get_rcl_node_handle(),
      srv_type_support_handle_,
      service_name.c_str(),
      &client_options);
    if (ret != RCL_RET_OK) {
      if (ret == RCL_RET_SERVICE_NAME_INVALID) {
        auto rcl_node_handle = this->get_rcl_node_handle();
        // this will throw on any validation problem
        rcl_reset_error();
        expand_topic_or_service_name(
          service_name,
          rcl_node_get_name(rcl_node_handle),
          rcl_node_get_namespace(rcl_node_handle),
          true);
      }
      rclcpp::exceptions::throw_from_rcl_error(ret, "could not create generic client");
    }
  }

  std::shared_ptr<void>
  create_response() override
  {
    auto response = (void *)malloc(response_members_->size_of_);
    response_members_->init_function(response, rosidl_runtime_cpp::MessageInitialization::ZERO);
    return std::shared_ptr<void>(response, [](void *p){free(p);});
  }

  std::shared_ptr<rmw_request_id_t>
  create_request_header() override
  {
    // TODO(wjwwood): This should probably use rmw_request_id's allocator.
    //                (since it is a C type)
    return std::shared_ptr<rmw_request_id_t>(new rmw_request_id_t);
  }

  void
  handle_response(
    std::shared_ptr<rmw_request_id_t> request_header,
    std::shared_ptr<void> response) override
  {
    erase_pending_request(request_header->sequence_number);
  }

  int64_t
  async_send_request(const void * request)
  {
    int64_t sequence_number;
    std::lock_guard<std::mutex> lock(pending_requests_mutex_);
    rcl_ret_t ret = rcl_send_request(get_client_handle().get(), request, &sequence_number);
    if (RCL_RET_OK != ret) {
      rclcpp::exceptions::throw_from_rcl_error(ret, "failed to send request");
    }
    pending_requests_.emplace(sequence_number);
    return sequence_number;
  }

protected:
  void
  erase_pending_request(int64_t request_number)
  {
    std::unique_lock<std::mutex> lock(pending_requests_mutex_);
    auto it = pending_requests_.find(request_number);
    if (it == this->pending_requests_.end()) {
      RCUTILS_LOG_DEBUG_NAMED(
        "rclcpp",
        "Received invalid sequence number. Ignoring...");
    } else {
      pending_requests_.erase(request_number);
    }
  }


  RCLCPP_DISABLE_COPY(GenericClient)

  std::set<int64_t> pending_requests_;
  std::mutex pending_requests_mutex_;

private:
  std::shared_ptr<rcpputils::SharedLibrary> ts_lib_;
  const rosidl_service_type_support_t * srv_type_support_handle_;
  const rosidl_typesupport_introspection_cpp::MessageMembers * request_members_;
  const rosidl_typesupport_introspection_cpp::MessageMembers * response_members_;
};
}  // rclcpp

rclcpp::get_typesupport_handle_srv() should be wrapped to rclcpp::get_typesupport_handle().
For verification, I made a new function

const rosidl_service_type_support_t *
get_typesupport_handle_srv(
  const std::string & type,
  const std::string & typesupport_identifier,
  rcpputils::SharedLibrary & library)
{
  std::string package_name;
  std::string middle_module;
  std::string type_name;
  std::tie(package_name, middle_module, type_name) = extract_type_identifier(type);

  auto mk_error = [&package_name, &type_name](auto reason) {
      std::stringstream rcutils_dynamic_loading_error;
      rcutils_dynamic_loading_error <<
        "Something went wrong loading the typesupport library for service type " << package_name <<
        "/" << type_name << ". " << reason;
      return rcutils_dynamic_loading_error.str();
    };

  try {
    std::string symbol_name = typesupport_identifier + "__get_service_type_support_handle__" +
      package_name + "__" + (middle_module.empty() ? "srv" : middle_module) + "__" + type_name;

    const rosidl_service_type_support_t * (* get_ts)() = nullptr;
    // This will throw runtime_error if the symbol was not found.
    get_ts = reinterpret_cast<decltype(get_ts)>(library.get_symbol(symbol_name));
    return get_ts();
  } catch (std::runtime_error &) {
    throw std::runtime_error{mk_error("Library could not be found.")};
  }
}

@Barry-Xu-2018
Copy link
Contributor Author

@MichaelOrlov Friendly ping. Please look at above description of property on how to send request according to recorded service event message.

1. Add '/srv/' to judge if type is for service event topic
2. Update and add new parameters for record command
3. Add '-v'/'--verbose' to info command

Signed-off-by: Barry Xu <[email protected]>
@Barry-Xu-2018
Copy link
Contributor Author

According to above discussion, I updated the design. Please refer to 13effd3

Signed-off-by: Barry Xu <[email protected]>
@MichaelOrlov
Copy link
Contributor

We are not able to replay dedicated service requests due to the lack of the API currently because service requests are wrapped inside the service event and they are serialized. We decided that possibility of playback service requests needs to be further investigated and prototyped before we will make a final decision about including this option in the design.

How to unwrap request part in service event message. (I had written code to verify this way.)

Assume we know the below information

  1. service event topic --- /add_two_ints/_service_event
  2. service event type --- example_interfaces/srv/AddTwoInts_Event
// Use type support introspection 
std::shared_ptr<rcpputils::SharedLibrary>    ts_lib = rclcpp::get_typesupport_library(
      "example_interfaces/srv/AddTwoInts_Event", "rosidl_typesupport_cpp");

@Barry-Xu-2018 Thank you for spending your time prototyping possible implementation for the service playback.
It will give us a sense of the level of work required to get it done.
We discussed it on today's Rosbag2 and Tooling WG meeting and it seems everybody agreed that it would be reasonable to split support for the services recording and replay on the two phases.

  • In the first phase, we will add support for the service events recording.
  • In the second phase we will add support for the service request publishing during playback.

And the rationale for that is that publishing service requests require support from the rclcpp layer and the level of the changes looks like a significant feature that will require some time to get it done.
We have tried to consider the option when we would implement rclcpp parts like GenericClient and get_typesupport_handle_srv temporarily inside rosbag2 packages. But we don't like this scenario for various reasons and would prefer to make it on the rclcpp layer from the very beginning.

As regards the implementation.
I think the following part

size_t request_offset = 0;
size_t request_offset_tmp = 0;
  for(size_t i=0; i < message_members->member_count_; i++) {
    auto message_member = message_members->members_[i];
    if (message_member.name_ == std::string("request")) {
      request_offset = request_offset_tmp;
    } else {
      auto sub_members = (rosidl_typesupport_introspection_cpp::MessageMembers *)message_member.members_->data;
      request_offset_tmp += sub_members->size_of_;
    }
   if (request_offset != 0) {
      break;
    }
  }

Has to be rewritten in some better way. And actually, I think there is some mistake in calculating the offset. Basically, we need to check the service event type and based on it. Perhaps we can cast the beginning of the deserialized ros message to the Service Event Info structure to be able to get the event_type. And the service request structure is going to be right after the ros_message + sizeof(Service Event Info structure).

As regards the GenericClient implementation it looks like a bare minimum implementation and in real life it will need a bit more code to support pruning of the sent or stale service requests and other helper stuff similar to what is already done in the similar templated service client class.

@Barry-Xu-2018
Copy link
Contributor Author

@MichaelOrlov

Thank you for sharing the results of your discussion

We discussed it on today's Rosbag2 and Tooling WG meeting and it seems everybody agreed that it would be reasonable to split support for the services recording and replay on the two phases.

  • In the first phase, we will add support for the service events recording.
  • In the second phase we will add support for the service request publishing during playback.

Totally agree. The second phase relies on modifications to the rclcpp for implementation.

We have tried to consider the option when we would implement rclcpp parts like GenericClient and get_typesupport_handle_srv temporarily inside rosbag2 packages. But we don't like this scenario for various reasons and would prefer to make it on the rclcpp layer from the very beginning.

Yes. I also prefer to place them in rclcpp.

I think the following part
...
Has to be rewritten in some better way. And actually, I think there is some mistake in calculating the offset.

Yes. I will update it. It can be simplified as your suggestion ros_message + sizeof(Service Event Info structure).

we need to check the service event type and based on it. Perhaps we can cast the beginning of the deserialized ros message to the Service Event Info structure to be able to get the event_type. And the service request structure is going to be right after the ros_message + sizeof(Service Event Info structure).

Yes. We should check if the type of event message is REQUEST_SENT or REQUEST_RECEIVED. And then send it.

As regards the GenericClient implementation it looks like a bare minimum implementation and in real life it will need a bit more code to support pruning of the sent or stale service requests and other helper stuff similar to what is already done in the similar templated service client class.

Yes. I only do simply implementation for sending request. Finally, GenericClient should be implemented as you suggested.

@ihasdapie
Copy link
Member

ihasdapie commented Jun 18, 2023

Small comment: I'm personally not a big fan of all the new options for selecting topics/services, after this lands we will have --services, --all-services, --topics, --all-topics, -S .... and so forth. It may make sense to support (and document/advertise!) wildcards instead, i.e. --all-services can be --services=*, etc. This has the added bonus of supporting queries like --services=/planner/* for free.

During initial prototyping for service introspection I recall having to do something similar with loading the typesupports at runtime, but for getting the request/response msg typesupports given a service. Not sure if it'll be useful, but here it is anyways: https://gist.github.com/ihasdapie/e3950d0802a23d23419947c3496980ca. Another caveat I recall is that there are differences between the _cpp and _c typesupports, and IIRC the _c typesupports are usable for python & c, but the _cpp ones are only good for cpp -- which made it difficult to implement at the rcl level. But then again rosbag2 lives above the client library level, so it shouldn't be impacted by this.

I second the two-phase implementation, playback is a lot more rather unrelated work and it'd quite annoying to block what is essentially a rosbag2 cli syntactic sugar PR because of it.

@Barry-Xu-2018
Copy link
Contributor Author

Thanks for your comments.

Small comment: I'm personally not a big fan of all the new options for selecting topics/services, after this lands we will have --services, --all-services, --topics, --all-topics, -S .... and so forth. It may make sense to support (and document/advertise!) wildcards instead, i.e. --all-services can be --services=, etc. This has the added bonus of supporting queries like --services=/planner/ for free.

Yes, that would reduce the number of parameters. However, the style and usage of topic are different (topic uses '-e REGEX').
Except the missing '-e REGEX' parameter, the service currently lacks another parameter '--exclude'. Whether do we add new parameter for service ? @MichaelOrlov Do you have any suggestion?

During initial prototyping for service introspection I recall having to do something similar with loading the typesupports at runtime, but for getting the request/response msg typesupports given a service. Not sure if it'll be useful, but here it is anyways: https://gist.github.com/ihasdapie/e3950d0802a23d23419947c3496980ca.

Thanks, Your example provides another way to get request/response msg typesupports.

I second the two-phase implementation, playback is a lot more rather unrelated work and it'd quite annoying to block what is essentially a rosbag2 cli syntactic sugar PR because of it.

Yes.

@ros-discourse
Copy link

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2023-06-15/32038/1

@MichaelOrlov
Copy link
Contributor

@Barry-Xu-2018 Sorry for my late response.

Yes, that would reduce the number of parameters. However, the style and usage of topic are different (topic uses '-e REGEX').
Except the missing '-e REGEX' parameter, the service currently lacks another parameter '--exclude'. Whether do we add new parameter for service ? @MichaelOrlov Do you have any suggestion?

On one hand it would be good to have support for wild cards, although on another hand it is going to be non-trivial in implementation. I would go with dedicated CLI options first.
As regards to the '-e REGEX' parameter it looks like it would be logically correct if it would be a more generic regex and will be applied for the both topics and services if they were enabled. However, it would be better to have separate exclude options for service and topics. But in this case we need to add a one more parameter and likely rename existent -exclude parameter.
In this case it would be challenging to create short cuts for those new "exclude" paramers. Probably the compromise solution would be to get rid from short names for those new CLI options.

@Barry-Xu-2018
Copy link
Contributor Author

As regards to the '-e REGEX' parameter it looks like it would be logically correct if it would be a more generic regex and will be applied for the both topics and services if they were enabled.

Agree.

Probably the compromise solution would be to get rid from short names for those new CLI options.

Agree.

So in order to keep the same style, we have to add new parameter for the service
For the record command, I list all parameters to clearly describe whether it affects the record of topic or service or both.

Parameter name Topic Service
-s {my_test_plugin,sqlite3,mcap}, --storage {my_test_plugin,sqlite3,mcap}
-a, --all
--all-topics
--all-services
-e REGEX, --regex REGEX
--exclude-topics EXCLUDE
--exclude-services EXCLUDE
-S [ServiceName ...], --services [ServiceName ...]
--include-unpublished-topics
--include-hidden-topics
--no-discovery
-p POLLING_INTERVAL, --polling-interval POLLING_INTERVAL
--ignore-leaf-topics
--qos-profile-overrides-path
-f {a,s}, --serialization-format {a,s}
-b MAX_BAG_SIZE, --max-bag-size MAX_BAG_SIZE
-d MAX_BAG_DURATION, --max-bag-duration MAX_BAG_DURATION
--max-cache-size MAX_CACHE_SIZE
--start-paused
--node-name NODE_NAME
--custom-data [KEY=VALUE ...]
--snapshot-mode
--storage-config-file STORAGE_CONFIG_FILE
--storage-preset-profile {none,fastwrite,zstd_fast,zstd_small}
--compression-queue-size COMPRESSION_QUEUE_SIZE
--compression-threads COMPRESSION_THREADS
--compression-mode {none,file,message}
--compression-format {zstd,fake_comp}

About --ignore-leaf-topics, service event topic has no subscription normally. So I think this parameter is only for recording topic.

Do you any comments on the above table? @MichaelOrlov @ihasdapie

@MichaelOrlov
Copy link
Contributor

@Barry-Xu-2018 Looks good to me.
I am only thinking about removing the shortcut for the -S [ServiceName ...] to avoid a mismatch with the -s {storage_plugin, sqlite3, mcap}

@Barry-Xu-2018
Copy link
Contributor Author

I am only thinking about removing the shortcut for the -S [ServiceName ...] to avoid a mismatch with the -s {storage_plugin, sqlite3, mcap}

Okay.

Copy link
Contributor

@MichaelOrlov MichaelOrlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Barry-Xu-2018 I think we are pretty close to being good to go with this design document.
Need to mention that services playback is gonna implemented as phase 2 after implementing relevant API on rclcpp layer (list API which we needed), also remove -s and --services from playback.

docs/design/rosbag2_record_replay_service.md Outdated Show resolved Hide resolved
docs/design/rosbag2_record_replay_service.md Outdated Show resolved Hide resolved
@Barry-Xu-2018
Copy link
Contributor Author

@MichaelOrlov

I think we are pretty close to being good to go with this design document.
Need to mention that services playback is gonna implemented as phase 2 after implementing relevant API on rclcpp layer (list API which we needed), also remove -s and --services from playback.

Yes. I have started implementing the main functionality of phase 1. Once the new relevant API is added to rclcpp, I will start the implementation of phase 2.

@Barry-Xu-2018
Copy link
Contributor Author

f0f8ddd will be updated for play option part based on the disccusion result of https://github.com/ros2/rosbag2/pull/1359/files#r1244556452

1. Add the descrption of `--include-hidden-topics`
2. Update play options
3. Add descripton of implementation phase

Signed-off-by: Barry Xu <[email protected]>
Copy link
Contributor

@MichaelOrlov MichaelOrlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Barry-Xu-2018 LGTM.
Thanks for your patients and versioning.

@Barry-Xu-2018
Copy link
Contributor Author

@MichaelOrlov Thanks.

I share the current status.
The implementation for phase 1 had been complete. Now I am adding tests.

@MichaelOrlov
Copy link
Contributor

@Barry-Xu-2018 As regards

The implementation for phase 1 had been complete. Now I am adding tests.

The draft PR will be appreciated.

@MichaelOrlov MichaelOrlov merged commit d788e51 into ros2:rolling Jul 11, 2023
@Barry-Xu-2018
Copy link
Contributor Author

@MichaelOrlov

The draft PR will be appreciated.

Okay. I will create draft PR.

@Barry-Xu-2018
Copy link
Contributor Author

@MichaelOrlov
I created a draft PR #1414.
Recording service and showing service info can work correctly.

@MichaelOrlov
Copy link
Contributor

Thanks, much appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants