Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix of flaky timedwait on Windows #115

Merged
merged 4 commits into from
Oct 10, 2022
Merged

Fix of flaky timedwait on Windows #115

merged 4 commits into from
Oct 10, 2022

Conversation

cmnrd
Copy link
Contributor

@cmnrd cmnrd commented Oct 7, 2022

The Windows implementation of lf_timed_wait calculates the difference between the timepoint to wait until and the current physical time. In some seldom cases, the current physical time could already be larger then the specified timepoint. This will lead to a negative delay passed to SleepConditionVariableCS. While the documentation does not specify this precisely, experimentation showed that negative delays are interpreted as infinite delays and hence the program would block forever.

This fix makes two changes. First, it checks for negative delays and returns without waiting if the delay is less or equal to zero. Second, the number of milliseconds to wait for is rounded up in order to prevent unnecessary busy waiting when the call to SleepConditionVariableCS returns early.

The Windows implementation of `lf_timed_wait` calculates between the timepoint
to wait until and the current physical time. In some seldom cases, the current physical
time could already be larger then the specified timepoint. This will lead to a
negative delay to `SleepConditionVariableCS`. While the documentation does not
specify this precisely, experimentation showed that negative delays are
interpreted as infinite delays and hence the program would block forever.

This fix makes to changes. First, it checks for negative delays and returns
without waiting if the delay is less or equal to zero. Second, the number of
milliseconds tow wait for is rounded up in order to prevent unnecessary busy
waiting when the call to `SleepConditionVariableCS` returns early.
@petervdonovan
Copy link
Contributor

Thank you Christian!

Co-authored-by: Peter Donovan <[email protected]>
Copy link
Member

@lhstrh lhstrh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Terrific work tracking this down, @cmnrd! I added some nitpicks and noticed some unreachable code on lines 241 and 246 that we might as well get rid of.

core/platform/lf_windows_support.c Outdated Show resolved Hide resolved
core/platform/lf_windows_support.c Outdated Show resolved Hide resolved
Copy link
Contributor

@edwardalee edwardalee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch about the negative time interval! This looks like a good solution to me, but rounding up does mean that time precision below one ms will be impossible on Windows machines. E.g., any deadline below 1 ms is likely to be violated. But this probably true anyway on any non-RTOS because of the OS jiffy interval (or whatever it is called in Windows).

@lhstrh lhstrh changed the title Fix flaky timedwait on Windows Fix of flaky timedwait on Windows Oct 10, 2022
@lhstrh lhstrh merged commit 8041dbb into main Oct 10, 2022
@lhstrh lhstrh deleted the fix-flaky-wait branch February 1, 2024 20:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants