Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid passing exception KeyboardInterrupt to the upper layer #788

Merged

Conversation

Barry-Xu-2018
Copy link
Contributor

Address #787

@Barry-Xu-2018 Barry-Xu-2018 requested a review from a team as a code owner June 22, 2021 08:30
@Barry-Xu-2018 Barry-Xu-2018 requested review from jaisontj and prajakta-gokhale and removed request for a team June 22, 2021 08:30
@Barry-Xu-2018 Barry-Xu-2018 force-pushed the topic-avoid-pass-KeyboardInterrupt branch from b3ef5b8 to aceac06 Compare June 22, 2021 08:38
Copy link
Collaborator

@emersonknapp emersonknapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, thanks. It might be nice to have a regression test for this, if you could add that

@emersonknapp
Copy link
Collaborator

Gist: https://gist.githubusercontent.com/emersonknapp/31a0f4acbf7a9178dfbedb80dccd2d01/raw/7573fb48ec7c6d3994fa585d98eb137bf3928e3b/ros2.repos
BUILD args: --packages-up-to ros2bag
TEST args: --packages-select ros2bag
Job: ci_launcher
ci_launcher ran: https://ci.ros2.org/job/ci_launcher/8611

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@Barry-Xu-2018
Copy link
Contributor Author

@emersonknapp

Thanks for the review.

looks good, thanks. It might be nice to have a regression test for this, if you could add that

I find there is a check.

// this call will make sure that the process does execute without issues before it is killed by
// the user in the test or, in case it runs until completion, that it has correctly executed.
EXPECT_THAT(WEXITSTATUS(child_return_code), Not(Eq(EXIT_FAILURE)));

But it checks if return is EXIT_FAILURE(1).
As you known, original codes return 2 (reference below code)
https://github.com/ros2/ros2cli/blob/aaeda774f6e8cc4fe43c97888c5fd633e010a56a/ros2cli/ros2cli/cli.py#L66-L70

So the check always pass. This is incorrect behavior.
So I modify this check as 38fbbac.

@Barry-Xu-2018
Copy link
Contributor Author

Barry-Xu-2018 commented Jun 23, 2021

@emersonknapp

I find CI report failure.

Rpr__rosbag2__ubuntu_focal_amd64 --- failure
According to https://build.ros2.org/job/Rpr__rosbag2__ubuntu_focal_amd64/881/testReport/, 3 test case failed.

The failure reason includes 2 points.

  1. While publishing the message, sometimes rosbag doesn't discover the topic to be recorded.
    e.g.
    rosbag2_test_common::PublicationManager pub_manager;
    pub_manager.setup_publisher("/test_topic", message, expected_test_messages);
    pub_manager.setup_publisher("/wrong_topic", wrong_message, 3);
    auto process_handle = start_execution(
    "ros2 bag record --max-cache-size 0 --output " + root_bag_path_.string() + " /test_topic");
    wait_for_db();
    pub_manager.run_publishers();

This leads to the error as the below (The number of publishing message doesn't match the number of recorded message)
https://build.ros2.org/job/Rpr__rosbag2__ubuntu_focal_amd64/881/testReport/rosbag2_tests/RecordFixture/record_end_to_end_test_with_cache/

Now we have no interface to make sure rosbag finish discovering topic, so I want to add sleep before publishing message to make sure discover is finished.

  1. It is possible that 2 processes access the created sqlite3 database at the same time.
    auto process_handle = start_execution(
    "ros2 bag record --max-cache-size 0 --output " + root_bag_path_.string() + " /test_topic");
    wait_for_db();

start_execution() will start rosbag command in new process.

void wait_for_db()
{
const auto database_path = get_bag_file_path(0);
while (true) {
try {
std::this_thread::sleep_for(50ms); // wait a bit to not query constantly
if (database_path.exists()) {
rosbag2_storage_plugins::SqliteWrapper db{
database_path.string(), rosbag2_storage::storage_interfaces::IOFlag::READ_ONLY};
return;
}

As the above codes, current process also access sqlite3 database.
This leads that the lock occurs.

[ RUN      ] RecordFixture.rosbag2_record_and_play_multiple_topics_with_filter
[INFO] [1624441972.716871010] [rosbag2_storage]: Opened database '/tmp/tmp_test_dir_OVIieg/rosbag2_record_and_play_multiple_topics_with_filter/rosbag2_record_and_play_multiple_topics_with_filter_0.db3' for READ_WRITE.
[INFO] [1624441972.716994585] [rosbag2_recorder]: Listening for topics...
[INFO] [1624441972.718430617] [rosbag2_recorder]: Subscribed to topic '/test_topic1'
[INFO] [1624441972.719189448] [rosbag2_cpp]: Writing remaining messages from cache to the bag. It may take a while
Error when processing SQL statement. SQLite error (5): database is locked

If run enough times, the same problems also occur in my local environment.

@Barry-Xu-2018
Copy link
Contributor Author

@emersonknapp

I made a fix and want to discuss with you on how to fix.
So please take a look faa1743.

@emersonknapp
Copy link
Collaborator

Thanks for analyzing the flaky test!

I don't like there being an arbitrary sleep for the OK case. I think we can have timeouts, but much of the flaky behavior comes from these sleep statements - I don't know if the fix you have proposed is appropriate.

However, your fix for the exit code is good, so I would prefer to break out the flaky-test fix into a separate PR, and move this fix forward independently since it is definitely OK.

@Barry-Xu-2018
Copy link
Contributor Author

Barry-Xu-2018 commented Jun 24, 2021

@emersonknapp

I don't like there being an arbitrary sleep for the OK case. I think we can have timeouts, but much of the flaky behavior comes from these sleep statements - I don't know if the fix you have proposed is appropriate.

Yeah. I also don't want to use sleep.
I will create another issue to discuss how to fix and we can continue to discuss it.

However, your fix for the exit code is good, so I would prefer to break out the flaky-test fix into a separate PR, and move this fix forward independently since it is definitely OK.

Okay.
I will make new commit to revert the latest commit faa1743

… database file"

This reverts commit faa1743.

Signed-off-by: Barry Xu <[email protected]>
@emersonknapp
Copy link
Collaborator

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@emersonknapp emersonknapp merged commit bb9c7ae into ros2:master Jun 24, 2021
bin-i pushed a commit to bin-i/rosbag2 that referenced this pull request Dec 16, 2021
* Avoid passing exception KeyboardInterrupt to the upper layer

Signed-off-by: Barry Xu <[email protected]>
MichaelOrlov pushed a commit that referenced this pull request Sep 29, 2022
…942)

* Avoid passing exception KeyboardInterrupt to the upper layer

Signed-off-by: Barry Xu <[email protected]>

Signed-off-by: Barry Xu <[email protected]>
Co-authored-by: Barry Xu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants