-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[bindings] Update memory leak test #22059
[bindings] Update memory leak test #22059
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+@jwnimmer-tri for feature review.
The results show that some tests miss the mark, and (perhaps) others may be missing. I don't know that we need to address that in this PR. YMMV.
Reviewable status: LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r1, all commit messages.
Reviewable status: 2 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers
bindings/pydrake/test/memory_leak_test.py
line 117 at r1 (raw file):
if key != 'self' and not any(isinstance(value, typ) for typ in [list, str])}
Everything from class Sentinel
up above through here is quite opaque to me, and indeed has few to no comments. We will need a lot more explanatory text in the first 100 lines to hand-hold drake developers through what all is happening.
bindings/pydrake/test/memory_leak_test.py
line 299 at r1 (raw file):
f" leaks_allowed count is stale, please update the test.")
nit Lint is here (blank line).
c93622e
to
8535a92
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers
bindings/pydrake/test/memory_leak_test.py
line 2 at r1 (raw file):
"""Eventually this program might grow up to be an actual regression test for memory leaks, but for now it merely serves to demonstrate such leaks.
stale
bindings/pydrake/test/memory_leak_test.py
line 117 at r1 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
Everything from
class Sentinel
up above through here is quite opaque to me, and indeed has few to no comments. We will need a lot more explanatory text in the first 100 lines to hand-hold drake developers through what all is happening.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers
bindings/pydrake/test/memory_leak_test.py
line 99 at r2 (raw file):
def _counts_for_cycle_parts(o, name): """Print relevant reference counts for the objects expected if `o` participates in a cycle created by drake::pydrake::internal::ref_cycle.
anachronism
6ba1b9b
to
3ad502b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: 3 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers
bindings/pydrake/test/memory_leak_test.py
line 62 at r3 (raw file):
def _object_generation(o) -> int | None: """Return the garbage collection generation of the passed object, or None
nit GSG mood.
Ditto throughout this file.
Suggestion:
Returns
bindings/pydrake/test/memory_leak_test.py
line 71 at r3 (raw file):
gen_id_list = [id(x) for x in gen_list] if id(o) in gen_id_list: return gen
BTW The way this is spelled seems odd to me. It is weird for performance reasons? Seems like this is simpler and more clear:
if any([x is o for x in gen_list]):
return gen
Code quote:
gen_id_list = [id(x) for x in gen_list]
if id(o) in gen_id_list:
return gen
bindings/pydrake/test/memory_leak_test.py
line 128 at r3 (raw file):
verbose: if True, print additional debug information. """ self._verbose = verbose
I can't understand why this is a member field. It seems like it's only ever changed by hand, by editing the code, and never just for one dut, rather for everything at once?
Under that premise, a test-wide global VERBOSE = False
would be much easier to read and understand even just for the if VERBOSE
logic, but then would also allow us to split DutRepeater into pieces. Right now it's doing too many jobs at once, and is very difficult to follow and reason about.
Once "verbose" is make into a global, then make_sentinel
no longer needs a self
, and so it can be hoisted up right next to class Sentinel
for cohesiveness. Its helper make_sentinels_from_locals
would also move up there for the same reasons (dropping self
).
After that, the dut
functions don't need to be member functions, so would go back to also being little stateless free functions. Having them be member functions with self._leaks
in their scope is very confusing.
After all of that, possibly it makes sense for DutRepeater
to just merge into the class TestMemoryLeaks
? Hard to say w/o seeing the next draft, but I'm struggling to understand what it's supposed to be encapsulating as a class. Possibly it still makes sense as a standalone thing. LMK.
3ad502b
to
b0ce33a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 unresolved discussion, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers
bindings/pydrake/test/memory_leak_test.py
line 71 at r3 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
BTW The way this is spelled seems odd to me. It is weird for performance reasons? Seems like this is simpler and more clear:
if any([x is o for x in gen_list]): return gen
I was trying not to bump ref counts of stuff on the verge of GC, but maybe it doesn't matter. We'll try it your way.
bindings/pydrake/test/memory_leak_test.py
line 128 at r3 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
I can't understand why this is a member field. It seems like it's only ever changed by hand, by editing the code, and never just for one dut, rather for everything at once?
Under that premise, a test-wide global
VERBOSE = False
would be much easier to read and understand even just for theif VERBOSE
logic, but then would also allow us to split DutRepeater into pieces. Right now it's doing too many jobs at once, and is very difficult to follow and reason about.Once "verbose" is make into a global, then
make_sentinel
no longer needs aself
, and so it can be hoisted up right next toclass Sentinel
for cohesiveness. Its helpermake_sentinels_from_locals
would also move up there for the same reasons (droppingself
).After that, the
dut
functions don't need to be member functions, so would go back to also being little stateless free functions. Having them be member functions withself._leaks
in their scope is very confusing.After all of that, possibly it makes sense for
DutRepeater
to just merge into theclass TestMemoryLeaks
? Hard to say w/o seeing the next draft, but I'm struggling to understand what it's supposed to be encapsulating as a class. Possibly it still makes sense as a standalone thing. LMK.
Done.
b0ce33a
to
6381009
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r5, all commit messages.
Reviewable status: 7 unresolved discussions, needs at least two assigned reviewers
bindings/pydrake/test/memory_leak_test.py
line 163 at r4 (raw file):
if VERBOSE: for key, value in locals().copy().items(): _counts_for_cycle_parts(value, key)
BTW Using value, key
here comes across as a bit unusual, things are typically "key, value" the other way around. Probably the "counts for ..." should require calling by kwargs, and so have name=key, o=value
or key=key, value=value
here at the caller.
bindings/pydrake/test/memory_leak_test.py
line 294 at r4 (raw file):
diagram.SetRandomContext(simulator.get_mutable_context(), random) simulator.AdvanceTo(0.5) return _make_sentinels_from_locals("full_simulator", locals())
nit This string doesn't match the function name, which is "full_example". I imagine we want them to be 100% identical?
bindings/pydrake/test/memory_leak_test.py
line 300 at r4 (raw file):
"""Calls dut() for count times in a row, performing a full garbage collection before and after each call. Tracks memory leaks of interest; the count of leaks is returned. if `VERBOSE=True`, additional debug
typo s/if/If/
bindings/pydrake/test/memory_leak_test.py
line 310 at r4 (raw file):
Returns: int: the total number of leaks detected by examining returned
BTW When I read this docstring, I took it to be the number of leaked objects, not the number of dut()
calls which leaked anything.
bindings/pydrake/test/memory_leak_test.py
line 330 at r4 (raw file):
class TestMemoryLeaks(unittest.TestCase): def do_test(self, *, dut, count, leaks_allowed=0): """Run the requested `dut` (see _repeat() above) for `count`
nit GSG "Runs"
bindings/pydrake/test/memory_leak_test.py
line 123 at r5 (raw file):
# TODO(rpoyner-tri): update or delete this when ref_cycle is adopted, changed, # or abandoned. def _counts_for_cycle_parts(o, name):
nit This needs a verb. Maybe "_print_counts...etc"? Currently "counts" seems like it's supposed to be the verb, which is somewhat confusing.
Actually, without any of the 22022 infra landed, this function seems somewhat premature to land now?
bindings/pydrake/test/memory_leak_test.py
line 360 at r5 (raw file):
# TODO(rpoyner-tri): investigate platform differences, if leaks persist # after fixing. leaks_by_platform = {
BTW This kind of thing is why I'm a little bit hesitant to have the "fail if the allowed number is too high". For my taste, grooming that number should be mostly manual, without CI failures nagging us about it. Since the goal number our work on this ticket is necessarily zero, I don't know that we need a lot of hand-holding to remind us to keep lowering it.
Our C++ limit_malloc has min_num_allocations
but it is rarely used. If you do want to keep the lower bound as part of this test, maybe "leaks_required=1" is a better phrasing that equating leaks_allowed to leaks_required implicitly?
6381009
to
090ef37
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+@sherm1 for next not-me platform review.
Reviewable status: LGTM missing from assignee sherm1(platform)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r6, all commit messages.
Reviewable status: LGTM missing from assignee sherm1(platform)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r1, 1 of 1 files at r6, all commit messages.
Reviewable status: 3 unresolved discussions
bindings/pydrake/test/memory_leak_test.py
line 99 at r6 (raw file):
print(f"sentinel alive? {finalizer.alive}") if finalizer.alive: o = finalizer.peek()[0]
BTW a comment on this line would be helpful for me. Would a more Python-savvy person automatically understand this? Here's my (likely wrong) guess after a lot of staring at the code: peek()
applied to a function returns a list of its arguments and then [0]
grabs the first one, which in this case is known to be the object of interest?
bindings/pydrake/test/memory_leak_test.py
line 283 at r6 (raw file):
Returns: int: the total number of leaked objectss detected by examining returned
typo: objectss
bindings/pydrake/test/memory_leak_test.py
line 297 at r6 (raw file):
_report_sentinels(sentinels, "after collect") leaks += any( [sentinel.finalizer.alive for sentinel in sentinels])
BTW IIUC this adds 1 to leaks
if any of this batch of sentinels is still active; it does not add 1 for each active sentinel. Is that the intention?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 unresolved discussions
bindings/pydrake/test/memory_leak_test.py
line 99 at r6 (raw file):
Previously, sherm1 (Michael Sherman) wrote…
BTW a comment on this line would be helpful for me. Would a more Python-savvy person automatically understand this? Here's my (likely wrong) guess after a lot of staring at the code:
peek()
applied to a function returns a list of its arguments and then[0]
grabs the first one, which in this case is known to be the object of interest?
Does this API reference help?
https://docs.python.org/3/library/weakref.html#weakref.finalize.peek
The finalizer: weakref.finalize
field is annotated as being of type weakref.finalize
, and that's the API docs for its peek()
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 unresolved discussions
bindings/pydrake/test/memory_leak_test.py
line 99 at r6 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
Does this API reference help?
https://docs.python.org/3/library/weakref.html#weakref.finalize.peek
The
finalizer: weakref.finalize
field is annotated as being of typeweakref.finalize
, and that's the API docs for itspeek()
method.
Yep, thanks. I tried googling "python what does peek do" but that returned info about something else with the same name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 unresolved discussions
bindings/pydrake/test/memory_leak_test.py
line 297 at r6 (raw file):
Previously, sherm1 (Michael Sherman) wrote…
BTW IIUC this adds 1 to
leaks
if any of this batch of sentinels is still active; it does not add 1 for each active sentinel. Is that the intention?
Ah, yes, the code and documentation are at odds. Will fix.
Convert the test to unittest format, rewrite the instrumentation, and track the current numbers of leaks, with the goal of reaching 0 once fixes are landed. As leaks are fixed, the test will complain about allowing more leaks than actually occurred, prompting update or removal of the leaks allowed.
090ef37
to
f87c852
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r7, all commit messages.
Reviewable status:complete! all discussions resolved, LGTM from assignees jwnimmer-tri(platform),sherm1(platform)
Convert the test to unittest format, rewrite the instrumentation, and track the current numbers of leaks, with the goal of reaching 0 once fixes are landed. As leaks are fixed, the test will complain about allowing more leaks than actually occurred, prompting update or removal of the leaks allowed.
Convert the test to unittest format, rewrite the instrumentation, and track the current numbers of leaks, with the goal of reaching 0 once fixes are landed.
As leaks are fixed, the test will complain about allowing more leaks than actually occurred, prompting update or removal of the leaks allowed.
This change is