-
-
Notifications
You must be signed in to change notification settings - Fork 419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix multiple races within actor/cycle detector interactions #4251
Conversation
1727806
to
36c2578
Compare
This is functionally working but needs (extensive) documentation and release notes. I hope to land this within a week. This will close 4 issues. |
@jemc I hope the level of documentation here is good enough. Let me know. |
Sylvan reviewed the core assumptions of the fixes herein with me as a sanity check, but did not review the actual implementation. |
Co-authored-by: Joe Eli McIlvain <[email protected]>
Co-authored-by: Joe Eli McIlvain <[email protected]>
Co-authored-by: Joe Eli McIlvain <[email protected]>
We discussed this today in the sync call and I asked Sean a few questions to confirm my understanding, then I approved the PR. |
The Pony runtime includes an optional cycle detector that is on by default. The
cycle detector runs and looks for groups of blocked actors that will have
reference counts above 0 but are unable to do any more work as all members are
blocked and don't have additional work to do.
Over time, we have made a number of changes to the cycle detector to improve
it's performance and mitigate it's impact on running Pony programs. In the
process of improving the cycle detectors performance, it has become more and
more complicated. That complication led to several race conditions that existed
in the interaction between actors and the cycle detector. Each of these race
conditions could lead to an actor getting freed more than once, causing an
application crash or an attempt to access an actor after it had been deleted.
I've identified and fixed what I believe are all the existing race conditions
in the current design. I intend to follow this commit up in the future with a
completely new design that will provide the same functionality as the cycle
detector but with better performance and maintenance characteristics.
These changes were all tested with the "short lived actors" cycle detector programs
exhibit similar memory usage and none of them nor other test programs I threw
at them caused any assertion failures or crashes.
Closes #4193
Closes #4221
Closes #4220
Closes #4219