Fix early quiescence/termination bug #4550

Merged: 3 commits, Nov 27, 2024

dipinhora
Contributor

Prior to this commit, the quiescence detection logic had a race condition with dynamic scheduler scaling: the runtime could incorrectly detect quiescence and terminate early if a scheduler thread suspended at just the right time.

This commit changes the quiescence logic to track exactly how many scheduler threads are active when the quiescence detection protocol begins, so that scheduler threads suspending or unsuspending can no longer cause an incorrect determination that leads to early termination of the runtime.
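
To illustrate the idea, here is a minimal, simplified model in C. It is not the runtime's code: `coordinator_t`, `begin_round`, `on_ack`, `round_complete`, and `racy_round_complete` are hypothetical names invented for this sketch, and the "racy" check only illustrates the general class of race described above (comparing against a count that can change mid-round), not necessarily the exact prior logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical, simplified model of a quiescence coordinator.
typedef struct
{
  uint32_t ack_token; // identifies the current confirmation round
  uint32_t ack_count; // ACKs still expected in the current round
} coordinator_t;

// Stand-in for broadcasting a confirmation request to all active
// schedulers. The number of schedulers messaged is snapshotted at the
// moment the round begins and returned to the caller.
static uint32_t begin_round(coordinator_t* c, uint32_t active_now)
{
  c->ack_token++;
  c->ack_count = active_now;
  return c->ack_count;
}

// Handle one ACK. ACKs carrying a stale token (from an earlier round)
// are ignored so they cannot complete the current round.
static void on_ack(coordinator_t* c, uint32_t token)
{
  if((token == c->ack_token) && (c->ack_count > 0))
    c->ack_count--;
}

// Snapshot-based check: the round is complete only when every scheduler
// that was messaged at the start of the round has answered.
static bool round_complete(const coordinator_t* c)
{
  return c->ack_count == 0;
}

// For contrast: a check against the *live* active count. If a scheduler
// suspends mid-round, active_now shrinks and this can report completion
// while an ACK is still outstanding.
static bool racy_round_complete(uint32_t acks_received, uint32_t active_now)
{
  return acks_received >= active_now;
}
```

In this model, if a round starts with 4 active schedulers, 3 of them ACK, and one scheduler then suspends (so the live count drops to 3), `racy_round_complete(3, 3)` reports completion while `round_complete` still waits for the fourth ACK.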

@ponylang-main ponylang-main added the discuss during sync Should be discussed during an upcoming sync label Nov 26, 2024
@dipinhora
Contributor Author

not included as part of this PR, but stopping the ASIO thread should likely be moved to happen after all the scheduler threads have stopped... maybe worth doing as a follow on thing..

@SeanTAllen SeanTAllen added the changelog - fixed Automatically add "Fixed" CHANGELOG entry on merge label Nov 26, 2024
@ponylang-main
Contributor

Hi @dipinhora,

The changelog - fixed label was added to this pull request; all PRs with a changelog label need to have release notes included as part of the PR. If you haven't added release notes already, please do.

Release notes are added by creating a uniquely named file in the .release-notes directory. We suggest you call the file 4550.md to match the number of this pull request.

The basic format of the release notes (using markdown) should be:

## Title

End user description of changes, why it's important,
problems it solves etc.

If a breaking change, make sure to include 1 or more
examples what code would look like prior to this change
and how to update it to work after this change.

Thanks.

@SeanTAllen
Member

> not included as part of this PR, but stopping the ASIO thread should likely be moved to happen after all the scheduler threads have stopped... maybe worth doing as a follow on thing..

That sounds like a good idea

- send_msg_all_active(sched->index, SCHED_CNF, sched->ack_token);
+ // save the # of active schedulers to expect ACK's from
+ sched->ack_count = send_msg_all_active(sched->index, SCHED_CNF, sched->ack_token);
+ pony_assert(sched->ack_count > 0);
Member

this assertion failed during a CI run.

Contributor Author

yep.. i missed an edge case.. will have a fix soonish...

Contributor Author

fix pushed

@SeanTAllen SeanTAllen merged commit 3ea24eb into ponylang:main Nov 27, 2024
22 checks passed
@ponylang-main ponylang-main removed the discuss during sync Should be discussed during an upcoming sync label Nov 27, 2024
github-actions bot pushed a commit that referenced this pull request Nov 27, 2024
github-actions bot pushed a commit that referenced this pull request Nov 27, 2024