Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

👨‍🌾 rosbag2_transport can run out of memory in coverage jobs #872

Closed
sloretz opened this issue Sep 21, 2021 · 5 comments · Fixed by #1518
Closed

👨‍🌾 rosbag2_transport can run out of memory in coverage jobs #872

sloretz opened this issue Sep 21, 2021 · 5 comments · Fixed by #1518
Labels
enhancement New feature or request

Comments

@sloretz
Copy link
Contributor

sloretz commented Sep 21, 2021

Description

** Clear and concise description of the feature that you'd like to exist, from a user's perspective. Don't get into implementation details, describe what you want it to do. Put any consideration of alternate features that might solve your problem as well. **

The package `rosbag2_transport appears to have run out of memory while building in the Rolling coverage nightly.
https://ci.ros2.org/job/nightly_linux_coverage/1371/ . Building locally, I notice the package required ~15GB peak to build on my desktop, but the peak number probably varies with the number of CPU cores used.

Related Issues

Completion Criteria

Implementation Notes / Suggestions

I looked briefly at rosbag2_transport with include-what-you-use to see if there are excessive includes that could be increasing the memory required to compile. Nothing really sticks out to me except rclcpp/rclcpp.hpp can be replaced with one or two more specific headers in a lot of cases. I didn't try it to see what the impact to memory usage is. I attached the output from the tool in case anyone wants to try its suggestions. iwyu.txt

One thing that does standout is a lot of files get compiled multiple times when they don't need to be. rosbag2_transport_add_gmock calls ament_add_gmock with different RMW_IMPLEMENTATION environment variables set, but the code shouldn't need to be rebuilt because of that. A way of splitting building the exectuable from configuring the runtime properties of the tests would likely save a bunch of compilation time.

Testing Notes / Suggestions

@sloretz sloretz added the enhancement New feature or request label Sep 21, 2021
@clalancette clalancette changed the title rosbag2_transport ran out of memoryin Rolling coverage job rosbag2_transport ran out of memory in Rolling coverage job Oct 18, 2021
@clalancette
Copy link
Contributor

This is still happening with Humble and Rolling sometimes:

https://ci.ros2.org/job/nightly_linux_coverage/1606/
https://ci.ros2.org/job/nightly_linux_humble_coverage/17/

I think Shane's suggestion here is probably the best one; we don't need to recompile the tests multiple times for different RMW implementations, we just need to set the RMW_IMPLEMENTATION environment variable when running. For that matter, we could decide not to bother testing multiple RMW implementations in here, as the purpose of these unit tests is to test out rosbag2 and not the RMWs. That said, I'm less in favor of that since having multiple RMWs here at least gives us some additional coverage on all RMWs.

@MichaelOrlov
Copy link
Contributor

@clalancette From one hand you are right unit tests for rosbag2 should test rosbag2 and not RMW implementations.
However from another hand it's historically turns out that most of the tests from rosbag2_transport and rosbag2_tests packages are sort of integration tests which involves both publishing and subscriptions of the real messages on the "wires".
Very often those failing integration tests became entry point for real problems on some middleware layer. Some of the tests may be acting as sort of "stress" tests for middleware by their nature.
And we have had some meaningful amount of cases when issues in RMW implementation was revealed.
Very often when one of the such "integration" tests start failing we need to know if it's failing on some specific RMW or failure also happening on other RMW implementations.

Also very often issues related to the RMW implementation not reproduces or very difficult to reproduce on local machine. And they are more likely to be reproduced on CI.

One of the real examples of such failing test is rosbag2_transport.test_play_services__rmw_fastrtps_cpp and I have seen more in my practice and we have made a several fixes in CycloneDDS after debugging failures in rosbag2 test.

I wouldn't recommend to going away from running rosbag2 tests on all RMW implementations.
I'we tried to look on the way how we running tests for multiple RMWs. We actually generating new subset of tests for each RMW implementation in rosbag2_transport_add_gmock.
And unfortunately at the moment I have not found another way how to consistently change environment variable in runtime for running the same tests multiple times with another RMW implementation without recompile them.
If someone have a good idea or proposal how to do this please share.

I would propose to try reduce memory footprint during compilation by trying to address some findings and suggestions from include-what-you-use report.

@juanmed
Copy link

juanmed commented Sep 10, 2022

Hi,

I am experiencing this issue while building ros2-humble in Manjaron ARM on a pinebook pro laptop. The specific build command is pamac build ros2-humble. While most of the packages build successfully (after some tweaking), rosbag2_transport and dependent packages always fail due to the RAM usage, disregard of what I try.

My system is:

Operating System: Manjaro-ARM
KDE Plasma Version: 5.25.4
KDE Frameworks Version: 5.96.0
Qt Version: 5.15.5
Kernel Version: 5.19.1-2-MANJARO-ARM (64-bit)
Graphics Platform: Wayland
Processors: 4 × ARM Cortex-A53, 2 × ARM Cortex-A72
Memory: 3.7 GiB of RAM
Graphics Processor: Mali-T860

I am wondering if there are any workarounds or recommendations to successfully build packages that require a lot of RAM (>3GB) on limited platforms like raspberry-pi or the pinebook pro?

Any suggestions will be very appreciated!

@clalancette
Copy link
Contributor

I am wondering if there are any workarounds or recommendations to successfully build packages that require a lot of RAM (>3GB) on limited platforms like raspberry-pi or the pinebook pro?

So what you are running into is almost certainly different that this issue. This issue is specifically about building with coverage enabled.

That said, on resource-constrained platforms, you can usually get around RAM problems by building with something like the following:

MAKEFLAGS="-j1" colcon build --executor sequential

Which will cause all packages to be built one-by-one, while also only building a single file at a time. As you can imagine, this makes building very slow, but will use the minimum amount of RAM.

If you still can't build after this, then the platform just doesn't have enough memory. We would love to reduce the compilation memory usage, so if you find a way to reduce it, please do feel free to open PRs that we will be happy to review.

@Blast545 Blast545 changed the title rosbag2_transport ran out of memory in Rolling coverage job 👨‍🌾 rosbag2_transport can run out of memory in coverage jobs Dec 15, 2023
@Blast545
Copy link
Contributor

This has been a "known issue" for a while, we'll track it using this ticket. This was closed some time ago: #1015 reporting the same problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants