Fix race conditions that can lead to a segfault #3667

Merged: 11 commits, Sep 29, 2020

dipinhora
Contributor

Prior to this commit, there could be race conditions with an actor
running concurrently on two different threads due to work being
done after an actor's queue was marked empty.

This commit reworks the logic to ensure that marking the actor's queue
empty is the last thing that occurs in `ponyint_actor_run`,
which prevents these race conditions.
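The ordering rule can be illustrated with a minimal sketch. This is not the runtime's code: `toy_actor`, `toy_send`, `toy_markempty`, and `toy_run` are hypothetical stand-ins, and the queue is reduced to a single atomic counter. It assumes that a sender who finds the queue marked empty becomes responsible for scheduling the actor, so the running thread must finish all of its own work before the mark-empty step.

```c
#include <stdatomic.h>
#include <stdbool.h>

// Hypothetical toy model, not the Pony runtime's pony_actor_t or messageq_t.
#define TOY_EMPTY (-1)  // sentinel: queue marked empty, actor not scheduled

typedef struct toy_actor {
  atomic_int state;      // TOY_EMPTY, or the number of pending messages
  int processed;         // touched only by the thread running the actor
  int bookkeeping_runs;  // touched only by the thread running the actor
} toy_actor;

static void toy_init(toy_actor* a) {
  atomic_init(&a->state, TOY_EMPTY);
  a->processed = 0;
  a->bookkeeping_runs = 0;
}

// Sender side. Returns true if the queue was marked empty, meaning the
// caller now has to schedule the actor.
static bool toy_send(toy_actor* a) {
  int cur = atomic_load(&a->state);
  for (;;) {
    int next = (cur == TOY_EMPTY) ? 1 : cur + 1;
    // On failure, cur is refreshed with the current value and we retry.
    if (atomic_compare_exchange_weak(&a->state, &cur, next))
      return cur == TOY_EMPTY;
  }
}

// Succeeds only if no message is pending at the moment of the swap.
static bool toy_markempty(toy_actor* a) {
  int expected = 0;
  return atomic_compare_exchange_strong(&a->state, &expected, TOY_EMPTY);
}

// Runner side. Returns true if the actor must be rescheduled.
static bool toy_run(toy_actor* a) {
  // Only the running thread decrements, so this check-then-subtract is safe.
  while (atomic_load(&a->state) > 0) {
    atomic_fetch_sub(&a->state, 1);
    a->processed++;
  }

  // All non-atomic bookkeeping happens BEFORE the queue is marked empty.
  a->bookkeeping_runs++;

  // Last step: once this succeeds, another thread may start running the
  // actor, so nothing below may touch the actor's state.
  return !toy_markempty(a);
}
```

If `bookkeeping_runs++` were moved below the `toy_markempty` call, a sender on another thread could reschedule the actor in between, and two threads would then touch the same non-atomic fields concurrently.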

This commit also adds logic in the cycle detector to not send any
messages to actors that are marked as pending destroy to ensure
that the cycle detector cannot create a race condition by sending
messages to a zombie actor (with `rc == 0`) that is about to be
reaped in the near future.
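A rough sketch of the cycle detector guard, under stated assumptions: `toy_view`, `TOY_FLAG_PENDINGDESTROY`, and `toy_cd_broadcast` are hypothetical names for illustration, and the flag is read non-atomically here, which real runtime code running across threads would presumably not do.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical flag and record, for illustration only.
enum { TOY_FLAG_PENDINGDESTROY = 1 << 0 };

typedef struct toy_view {
  uint32_t flags;  // status bits for the actor this record tracks
  size_t rc;       // reference count as last seen by the cycle detector
  int inbox;       // number of messages delivered in this toy model
} toy_view;

static bool toy_is_pending_destroy(const toy_view* v) {
  return (v->flags & TOY_FLAG_PENDINGDESTROY) != 0;
}

// Sends one message to every tracked actor, skipping any that are marked
// as pending destroy. Returns the number of messages actually sent.
static size_t toy_cd_broadcast(toy_view* actors, size_t n) {
  size_t sent = 0;
  for (size_t i = 0; i < n; i++) {
    // A zombie actor is about to be reaped; a new message could race
    // with its destruction, so it receives nothing.
    if (toy_is_pending_destroy(&actors[i]))
      continue;
    actors[i].inbox++;
    sent++;
  }
  return sent;
}
```

The point of the guard is that the check happens on the sending side, so a zombie actor's queue is never repopulated after it has been found empty.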

// mark the queue as empty or else destroy will hang
bool empty = ponyint_messageq_markempty(&actor->q);

// make sure the queue is actually empty as expected
Member

I think we should restate the assumptions you and I have privately spoken about here.

Those points might not be clear to others in the future.

  • If no block and rc of 0, then nothing could have sent a message since we did the empty check.
  • If not no block, but rc of 0 and the cycle detector (cd) has never been contacted, then nothing could have sent a message since we did the empty check.

someone who isn't

Comment on lines 520 to 522
// the invariant that should hold true at this point is that we have
// not sent the cycle detector a `block` message already
pony_assert(!has_flag(actor, FLAG_BLOCKED_SENT));
Member

are you sure? i'm not sure this is correct. i think we might have but don't want to again.

otoh, i really need to think about that, and I'm definitely not sure this isn't correct.

Member

so to get here, CD_CONTACTED would have to be set.

And that means we have sent a block at some point.

I don't see any guarantee that we haven't unblocked and then ended up here, which is the only way FLAG_BLOCKED_SENT would be set when we get here, right?

In the previous version of this code, this wasn't an assert, it was an if. I think we want an if not an assert here.
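For readers following along, the difference between the two approaches can be sketched roughly as below. The names (`toy_blocker`, `toy_maybe_send_block`, `toy_unblock`, and the `TOY_FLAG_*` constants) are hypothetical and only mirror the flags mentioned in this thread; the sketch shows the `if` form, which tolerates a block having been sent earlier instead of aborting on it.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical flags modeled on the names used in this review thread.
enum {
  TOY_FLAG_BLOCKED_SENT = 1 << 0,
  TOY_FLAG_CD_CONTACTED = 1 << 1
};

typedef struct toy_blocker {
  uint32_t flags;
  int blocks_sent;  // how many block messages this toy actor has sent
} toy_blocker;

// Guarded form: sends a block message only if one is not outstanding.
// An assert in place of the `if` would abort when the flag is already set.
static bool toy_maybe_send_block(toy_blocker* a) {
  if (a->flags & TOY_FLAG_BLOCKED_SENT)
    return false;  // already sent; do not send again

  a->flags |= TOY_FLAG_BLOCKED_SENT | TOY_FLAG_CD_CONTACTED;
  a->blocks_sent++;
  return true;
}

// Unblocking clears the outstanding-block flag, but the cycle detector
// has still been contacted at some point.
static void toy_unblock(toy_blocker* a) {
  a->flags &= ~(uint32_t)TOY_FLAG_BLOCKED_SENT;
}
```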

SeanTAllen changed the title from "Fix race conditions due to work being done after markempty" to "Fix race conditions that can lead to a segfault" on Sep 27, 2020
SeanTAllen added the "changelog - fixed" label (Automatically add "Fixed" CHANGELOG entry on merge) on Sep 27, 2020
@ponylang-main
Contributor

Hi @dipinhora,

The changelog - fixed label was added to this pull request; all PRs with a changelog label need to have release notes included as part of the PR. If you haven't added release notes already, please do.

Release notes are added by creating a uniquely named file in the .release-notes directory. We suggest you call the file 3667.md to match the number of this pull request.

The basic format of the release notes (using markdown) should be:

## Title

End user description of changes, why it's important,
problems it solves etc.

If a breaking change, make sure to include 1 or more
examples of what code would look like prior to this change
and how to update it to work after this change.

Thanks.

@SeanTAllen
Member

@dipinhora i'm going to be testing this with the various "short-lived actors" tests soon-ish.

@SeanTAllen
Member

Testing with the examples looks good.

SeanTAllen merged commit 6f0cfc3 into ponylang:master on Sep 29, 2020
github-actions bot pushed a commit that referenced this pull request Sep 29, 2020
github-actions bot pushed a commit that referenced this pull request Sep 29, 2020