Possible log playback memory leak #47

Closed
osrf-migration opened this issue Jan 6, 2020 · 11 comments
Labels
bug Something isn't working

Comments

@osrf-migration

Original report (archived issue) by Ian Chen (Bitbucket: Ian Chen, GitHub: iche033).

The original report had attachments: state.tlog


Description

Log playback consumes a lot of memory. When playing back the attached state.tlog (2.7 MB), recorded in the levels.sdf world, memory usage grew steadily to more than 300 MB by the end of playback (~46 s of sim time).

Steps to Reproduce

  1. Play back the attached state.tlog (or any other log file), e.g. ign gazebo -v 4 -p [path_to_log_dir]
  2. Hit Play in the ign-gazebo GUI
  3. Open top and observe the memory usage of the ign-gazebo processes
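
As an alternative to watching top by hand in step 3, memory growth can be sampled programmatically. The following is only a minimal sketch, assuming a Linux host where /proc/<pid>/status exposes a VmRSS line; the helper names ParseVmRssKb and ReadVmRssKb are hypothetical and not part of ign-gazebo.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Extract the VmRSS value (in kB) from text in /proc/<pid>/status format.
// Returns -1 if no VmRSS line is found.
long ParseVmRssKb(const std::string &statusText)
{
  std::istringstream stream(statusText);
  std::string line;
  while (std::getline(stream, line))
  {
    // The line of interest looks like "VmRSS:     307200 kB".
    if (line.rfind("VmRSS:", 0) == 0)
    {
      std::istringstream fields(line.substr(6));
      long kb = -1;
      fields >> kb;
      return kb;
    }
  }
  return -1;
}

// Read the resident set size of a running process by PID (Linux only).
// Returns -1 if the status file cannot be read.
long ReadVmRssKb(int pid)
{
  std::ifstream file("/proc/" + std::to_string(pid) + "/status");
  std::stringstream buffer;
  buffer << file.rdbuf();
  return ParseVmRssKb(buffer.str());
}
```

Calling ReadVmRssKb once per second with the PID of the GUI or server process and logging the result would give a growth curve that can be compared before and after a fix.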

Expected behavior:

It is expected that memory may grow as states get loaded, but it should grow at a slower pace. There should also be a cap on how much memory can be consumed.

Actual behavior:

High memory usage ( >300MB over 46s sim time)

Reproduces how often:

All the time

Versions

ign-gazebo 2.0

Additional Information

Related subt issue: it was reported that playback used up all 32 GB of RAM.

@osrf-migration

Original comment by Ian Chen (Bitbucket: Ian Chen, GitHub: iche033).


  • Edited issue description

@osrf-migration

Original comment by Nate Koenig (Bitbucket: Nathan Koenig).


I can confirm that this is a problem.

@osrf-migration

Original comment by Nate Koenig (Bitbucket: Nathan Koenig).


I don't see a memory leak when running just the server during log playback.

@osrf-migration

Original comment by Nate Koenig (Bitbucket: Nathan Koenig).


  • set assignee_account_id to "557058:65740f22-cc56-4418-9608-7e17d0ed47b7"
  • set assignee to "carromj (Bitbucket: carromj, GitHub: mjcarroll)"

@osrf-migration

Original comment by Michael Carroll (Bitbucket: Michael Carroll, GitHub: mjcarroll).


I believe that I have tracked the leaks down to some of the move constructors and move assignment operators in sdformat. At a minimum, I believe that Geometry, Material, Collision, and Box are impacted; other classes may be affected too, and these may simply be the ones that result in the largest leaks.

Running an sdformat unit test under valgrind seems to agree in an isolated case:

(blueprint)➜  sdformat8 valgrind --tool=memcheck --leak-check=full ./src/UNIT_Material_TEST 
==19839== Memcheck, a memory error detector
==19839== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==19839== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==19839== Command: ./src/UNIT_Material_TEST
==19839== 
Running main() from gtest_main.cc
[==========] Running 8 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 7 tests from DOMMaterial
[ RUN      ] DOMMaterial.Construction
[       OK ] DOMMaterial.Construction (10 ms)
[ RUN      ] DOMMaterial.MoveConstructor
[       OK ] DOMMaterial.MoveConstructor (6 ms)
[ RUN      ] DOMMaterial.CopyConstructor
[       OK ] DOMMaterial.CopyConstructor (5 ms)
[ RUN      ] DOMMaterial.AssignmentOperator
[       OK ] DOMMaterial.AssignmentOperator (4 ms)
[ RUN      ] DOMMaterial.MoveAssignmentOperator
[       OK ] DOMMaterial.MoveAssignmentOperator (5 ms)
[ RUN      ] DOMMaterial.Set
[       OK ] DOMMaterial.Set (16 ms)
[ RUN      ] DOMMaterial.InvalidSdf
[       OK ] DOMMaterial.InvalidSdf (6 ms)
[----------] 7 tests from DOMMaterial (56 ms total)

[----------] 1 test from DOMAtmosphere
[ RUN      ] DOMAtmosphere.CopyAssignmentAfterMove
[       OK ] DOMAtmosphere.CopyAssignmentAfterMove (2 ms)
[----------] 1 test from DOMAtmosphere (3 ms total)

[----------] Global test environment tear-down
[==========] 8 tests from 2 test cases ran. (74 ms total)
[  PASSED  ] 8 tests.
==19839== 
==19839== HEAP SUMMARY:
==19839==     in use at exit: 232 bytes in 1 blocks
==19839==   total heap usage: 3,564 allocs, 3,563 frees, 3,916,665 bytes allocated
==19839== 
==19839== 232 bytes in 1 blocks are definitely lost in loss record 1 of 1
==19839==    at 0x4C3017F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19839==    by 0x4EC5EDA: sdf::v8::Material::Material() (Material.cc:66)
==19839==    by 0x11D0FA: DOMMaterial_MoveAssignmentOperator_Test::TestBody() (Material_TEST.cc:143)
==19839==    by 0x14D029: HandleSehExceptionsInMethodIfSupported<testing::Test, void> (gtest.cc:2421)
==19839==    by 0x14D029: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (gtest.cc:2457)
==19839==    by 0x142E29: Run (gtest.cc:2495)
==19839==    by 0x142E29: testing::Test::Run() (gtest.cc:2486)
==19839==    by 0x142F77: Run (gtest.cc:2671)
==19839==    by 0x142F77: testing::TestInfo::Run() (gtest.cc:2645)
==19839==    by 0x143054: Run (gtest.cc:2789)
==19839==    by 0x143054: testing::TestCase::Run() (gtest.cc:2774)
==19839==    by 0x14357B: testing::internal::UnitTestImpl::RunAllTests() (gtest.cc:5051)
==19839==    by 0x14D539: HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (gtest.cc:2421)
==19839==    by 0x14D539: bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (gtest.cc:2457)
==19839==    by 0x1436AB: testing::UnitTest::Run() (gtest.cc:4667)
==19839==    by 0x1194A8: RUN_ALL_TESTS (gtest.h:2329)
==19839==    by 0x1194A8: main (gtest_main.cc:37)
==19839== 
==19839== LEAK SUMMARY:
==19839==    definitely lost: 232 bytes in 1 blocks
==19839==    indirectly lost: 0 bytes in 0 blocks
==19839==      possibly lost: 0 bytes in 0 blocks
==19839==    still reachable: 0 bytes in 0 blocks
==19839==         suppressed: 0 bytes in 0 blocks
==19839== 
==19839== For counts of detected and suppressed errors, rerun with: -v
==19839== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
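
The trace above shows a block allocated in the Material default constructor being lost during the MoveAssignmentOperator test. That is consistent with a class that owns a raw pointer to private data and whose move assignment overwrites that pointer without freeing the old allocation. The sketch below illustrates that pattern and one possible fix; it is an assumption about the kind of bug, not the actual sdformat code, and the names WidgetPrivate, LeakyWidget, FixedWidget, and LeakedAfterMoveAssign are hypothetical. The fix shown (swapping pointers) is one common idiom and not necessarily what the sdformat pull requests do.

```cpp
#include <utility>

// Hypothetical private-data struct. It counts live instances so that a leak
// shows up as a non-zero count after all owners are destroyed.
struct WidgetPrivate
{
  static int live;
  WidgetPrivate() { ++live; }
  ~WidgetPrivate() { --live; }
  double value = 0.0;
};
int WidgetPrivate::live = 0;

// Leaky variant: move assignment overwrites dataPtr without freeing it.
class LeakyWidget
{
public:
  LeakyWidget() : dataPtr(new WidgetPrivate) {}
  ~LeakyWidget() { delete dataPtr; }
  LeakyWidget(const LeakyWidget &) = delete;
  LeakyWidget &operator=(const LeakyWidget &) = delete;
  LeakyWidget &operator=(LeakyWidget &&other) noexcept
  {
    dataPtr = other.dataPtr;  // the block from our own constructor is lost
    other.dataPtr = nullptr;
    return *this;
  }
private:
  WidgetPrivate *dataPtr = nullptr;
};

// Fixed variant: swap the pointers so that the moved-from object's
// destructor releases the allocation this object used to own.
class FixedWidget
{
public:
  FixedWidget() : dataPtr(new WidgetPrivate) {}
  ~FixedWidget() { delete dataPtr; }
  FixedWidget(const FixedWidget &) = delete;
  FixedWidget &operator=(const FixedWidget &) = delete;
  FixedWidget &operator=(FixedWidget &&other) noexcept
  {
    std::swap(dataPtr, other.dataPtr);
    return *this;
  }
private:
  WidgetPrivate *dataPtr = nullptr;
};

// Default-construct two objects, move-assign one into the other, destroy
// both, and report how many WidgetPrivate instances were never freed.
template <typename T>
int LeakedAfterMoveAssign()
{
  const int before = WidgetPrivate::live;
  {
    T a;
    T b;
    b = std::move(a);
  }
  return WidgetPrivate::live - before;
}
```

LeakedAfterMoveAssign<LeakyWidget>() returns 1, mirroring the single lost block in the valgrind report, while LeakedAfterMoveAssign<FixedWidget>() returns 0. Holding the private data in a std::unique_ptr would avoid the manual bookkeeping altogether.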

@osrf-migration

Original comment by Michael Carroll (Bitbucket: Michael Carroll, GitHub: mjcarroll).


Fixing the ones mentioned above changes the heap allocation profile.



There is still some growth, which could mean that other sdformat classes have the same issue. I’m going to audit the rest of the tests under valgrind and see what turns up.

@osrf-migration

Original comment by Michael Carroll (Bitbucket: Michael Carroll, GitHub: mjcarroll).


https://bitbucket.org/osrf/sdformat/pull-requests/641/fix-move-assignment-constructor-leaks

@osrf-migration

Original comment by Nate Koenig (Bitbucket: Nathan Koenig).


This issue is almost resolved. I believe @mjcarroll has one more leak to fix, based on this comment.

@osrf-migration

Original comment by Michael Carroll (Bitbucket: Michael Carroll, GitHub: mjcarroll).


This was the follow-up; all leaks should be addressed now: https://bitbucket.org/osrf/sdformat/pull-requests/644

@osrf-migration

Original comment by Nate Koenig (Bitbucket: Nathan Koenig).


  • changed state from "new" to "resolved"
