Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1122 elementstats is leaking #1137

Merged
merged 11 commits into from
Nov 18, 2020
Merged

1122 elementstats is leaking #1137

merged 11 commits into from
Nov 18, 2020

Conversation

nlslatt
Copy link
Collaborator

@nlslatt nlslatt commented Nov 4, 2020

The vectors holding stats in ElementStats grow without bound. This PR erases stats data for phases older than the lookback.

Fixes #1122

@nlslatt nlslatt force-pushed the 1122-elementstats-is-leaking branch 2 times, most recently from af9552d to e2f04a8 Compare November 4, 2020 18:16
@codecov
Copy link

codecov bot commented Nov 4, 2020

Codecov Report

Merging #1137 (fd5479f) into develop (cee23c1) will increase coverage by 0.09%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #1137      +/-   ##
===========================================
+ Coverage    79.45%   79.54%   +0.09%     
===========================================
  Files          718      719       +1     
  Lines        27241    27325      +84     
===========================================
+ Hits         21643    21735      +92     
+ Misses        5598     5590       -8     
Impacted Files Coverage Δ
src/vt/vrt/collection/balance/elm_stats.h 100.00% <ø> (ø)
src/vt/vrt/collection/balance/elm_stats.cc 84.69% <100.00%> (+7.03%) ⬆️
src/vt/vrt/collection/balance/elm_stats.impl.h 100.00% <100.00%> (ø)
tests/unit/collection/test_lbstats_retention.cc 100.00% <100.00%> (ø)
...lection/balance/model/persistence_median_last_n.cc 94.73% <0.00%> (+10.52%) ⬆️

Comment on lines +127 to +129
std::unordered_map<PhaseType, std::vector<TimeType>> subphase_timings_ = {};
std::unordered_map<PhaseType, std::vector<CommMapType>> subphase_comm_ = {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might not make much sense (performance/memory overhead?), but if we change this to unordered_map<PhaseType, unordered_map<SubphaseType, ...>> we might simplify code by getting rid of all resize()s and at()s.

Copy link
Member

@PhilMiller PhilMiller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks good to me. Your choice whether to address the mismatch I noted.

@PhilMiller
Copy link
Member

The failure of 141 - vt:*/TestNodeStatsDumper.test_node_stats_dumping_with_interval/*_proc_4 on clang-8/macosx/mpich looks like it could be real?

@jstrzebonski
Copy link
Contributor

jstrzebonski commented Nov 6, 2020

The failure of 141 - vt:*/TestNodeStatsDumper.test_node_stats_dumping_with_interval/*_proc_4 on clang-8/macosx/mpich looks like it could be real?

Don't want to say yes or now, but given that it failed only on one build it might be just flaky :'- (
There's an issue for this test -> #1110

@nlslatt
Copy link
Collaborator Author

nlslatt commented Nov 6, 2020

The failure of 141 - vt:*/TestNodeStatsDumper.test_node_stats_dumping_with_interval/*_proc_4 on clang-8/macosx/mpich looks like it could be real?

Don't want to say yes or now, but given that it failed only on one build it might be just flaky :'- (
There's an issue for this test -> #1110

It definitely looks like a legitimate failure, but I'm not sure yet if it's related to my changes.

Copy link
Collaborator

@lifflander lifflander left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@nlslatt nlslatt force-pushed the 1122-elementstats-is-leaking branch from 16b300d to fd5479f Compare November 18, 2020 18:31
@nlslatt nlslatt merged commit 81fa08d into develop Nov 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ElementStats is leaking memory from previous phases
4 participants