Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-101558: Add time.sleep_until() #101559

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

haukex
Copy link

@haukex haukex commented Feb 4, 2023

This modifies pysleep to give it an "absolute" argument, and adds a time.sleep_until() function.

(I haven't been able to test this on Windows yet, I hope the GitHub Actions will take care of that)

@cpython-cla-bot
Copy link

cpython-cla-bot bot commented Feb 4, 2023

All commit authors signed the Contributor License Agreement.
CLA signed

@haukex

This comment was marked as off-topic.

@arhadthedev arhadthedev added the stdlib Python modules in the Lib dir label Feb 4, 2023
@eryksun eryksun requested a review from vstinner February 4, 2023 22:01
@abalkin abalkin self-assigned this Feb 6, 2023
Misc/ACKS Show resolved Hide resolved
Copy link
Member

@abalkin abalkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@CuriousLearner CuriousLearner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🌮

@@ -2168,7 +2206,8 @@ pysleep(_PyTime_t timeout)
int ret;
Py_BEGIN_ALLOW_THREADS
#ifdef HAVE_CLOCK_NANOSLEEP
ret = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &timeout_abs, NULL);
ret = clock_nanosleep(absolute ? CLOCK_REALTIME : CLOCK_MONOTONIC,
TIMER_ABSTIME, &timeout_abs, NULL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why need to use CLOCK_REALTIME? @vstinner mentioned in a mailing thread that CLOCK_REALTIME can jump forwards and backwards as the system time-of-day clock is changed.

Copy link
Author

@haukex haukex Feb 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quoting my reply from the email:

Yes, using the system clock has its caveats, which is why I mentioned it in the documentation, and I wouldn't mind highlighting these even more.

The use case in my specific case is long-running recurring measurements, for example a measurement every minute or every ten minutes over a period of many hours or often days. In these cases it makes much more sense to tie the measurements to wall-clock time, because often there are other measurement systems also using wall-clock time.

Although of course not everyone uses it, an NTP daemon like Chrony is able to adjust the system clock by changing the system clock frequency and thereby adjusting the system clock gradually, which keeps the system clock monotonic. Having multiple measurement computers stay synchronized via NTP is then much more reasonable. In the past I've built a system consisting of multiple machines, one of them using a GPS clock and acting as an NTP server, that does exactly this.

Since both clock_nanosleep and SetWaitableTimerEx allow sleeping until a specific time, my reasoning is simply that this functionality available in the lower-level API could be made available to the user so they can write simple loops - aside from the precision of clock_nanosleep appealing to my perfectionist side 😉

While there are schedulers like cron or similar, sometimes, a measurement may need to happen e.g. every second, where then the (admittedly small) inaccuracy introduced by calculating the sleep() timeout based on the current time can start to make a difference.

@haukex
Copy link
Author

haukex commented Feb 12, 2023

The force push from yesterday was just a rebase; the force push from today was a rebase + squash; no changes to the code itself.

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that I like the overall idea of exposing such function in Python. It can give false hope to users, since it's hard to write a reliable function to "sleep until". Did you actually test the behavior of the function when the system clock is changed on different operating systems (ex: Windows, Linux, macOS).

I tried to make such study when I designed time.monotonic(): https://peps.python.org/pep-0418/#monotonic-clocks

Additional potential sources of inaccuracies include:

* Because this function uses the system clock as a reference, this means the
reference clock is adjustable and may jump backwards.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear to me what is the expected behavior when the system clock jumps backward or forward. Does the function sleeps longer/shorter?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a caller asks to sleep until an absolute time on a given clock and that clock then "jumps", the caller should still wake up when that absolute time has elapsed. For example, if the clock is set backwards after the caller blocks, then the caller should sleep for a longer duration because it will take a longer time for the clock to reach the chosen absolute value. This is not a violation of the caller's request: they asked to block until an absolute time, not to block for a given duration.

If sleeping a longer/shorter duration is unacceptable for a caller, then either (a) the caller should use a relative sleep, or (b) the caller should schedule the sleep against a clock that cannot "jump".

Comment on lines +403 to +404
* On Unix, if ``clock_nanosleep()`` is not available, the absolute timeout
is emulated using ``nanosleep()`` or ``select()``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you consider that it's important to document the implementation, I suggest to do it in a separated section, since it's unclear to me what's the relationship between the chosen implementation and inaccuracies. If you care about the theorical clock resolution, you should document it, it's not obviously to users what is the resolution of these functions.

self.assertGreater(delta, -0.050)
# allow sleep_until to take up to 1s longer than planned
# (e.g. in case the system is under heavy load during testing)
self.assertLess(delta, 1.000)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my experience, such test tends to fail on busy buildbots. I would prefer to not test the "performance" (accuracy) of the function in such unit test.

I'm not sure that it's useful to write an unit test for this function. It's very hard to write a reliable on the clock accuracy.

What if the system clock changes during the test? :-(

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree, and I actually already had to increase that final assertLess because the test failed on a busy runner with a smaller value. When writing the test I took my cue from tests like test_monotonic, but I'm also fine with simplifying the test case, let me know.

Modules/timemodule.c Outdated Show resolved Hide resolved
@eryksun
Copy link
Contributor

eryksun commented Apr 6, 2023

behavior of the function when the system clock is changed on different operating systems (ex: Windows, Linux, macOS).

On Windows, an absolute due time doesn't change if the system time is adjusted.

@vstinner
Copy link
Member

vstinner commented Apr 6, 2023

On Windows, an absolute due time doesn't change if the system time is adjusted.

Sorry, it's unclear to me. It reminds me traveling in a flight trough different time zones which gives me deadheads...

Let's say that I do:

deadline = time.time() + 60.0
time.sleep_until(deadline)

During this sleep, the system clock is modified: either goes 30 seconds backward or forward. Does Python sleep 30, 60 or 90 seconds?

@eryksun
Copy link
Contributor

eryksun commented Apr 6, 2023

Let's say that I do:

deadline = time.time() + 60.0
time.sleep_until(deadline)

During this sleep, the system clock is modified: either goes 30 seconds backward or forward. Does Python sleep 30, 60 or 90 seconds?

A timer object is signaled as soon as the system time is greater than or equal to the scheduled time. If the system time is adjusted backward by 30 seconds, the sleep will be for 90 seconds. If the system time is adjusted forward by 30 seconds, the sleep will be for 30 to 60 seconds, depending on when the system time is adjusted.

On the other hand, adjusting the system time also adjusts relative due times by the same amount. So time.sleep(60) will sleep for 60 seconds regardless of changes to the system time.

@haukex
Copy link
Author

haukex commented Apr 10, 2023

@vstinner Thanks for all the feedback!

Did you actually test the behavior of the function when the system clock is changed on different operating systems (ex: Windows, Linux, macOS).

Let's say that I do:

deadline = time.time() + 60.0
time.sleep_until(deadline)

During this sleep, the system clock is modified: either goes 30 seconds backward or forward. Does Python sleep 30, 60 or 90 seconds?

The Windows SetWaitableTimerEx documentation says:

If the system time is adjusted, the due time of any outstanding absolute timers is adjusted.

And the Ubuntu Linux clock_nanosleep documentation says:

POSIX.1 specifies that after changing the value of the CLOCK_REALTIME clock via clock_settime(2), the new clock value shall be used to determine the time at which a thread blocked on an absolute clock_nanosleep() will wake up; if the new clock value falls past the end of the sleep interval, then the clock_nanosleep() call will return immediately.

I wrote a test that does a sleep_until for 60 seconds in the future, and then, 30 seconds later, the system clock is adjusted 30 seconds back; then in a second test case the system clock is adjusted 30 seconds to the future instead. The actual elapsed time is measured with the monotonic clock.

I ran this test on a Linux machine (4.15.0-206-generic #217-Ubuntu x86_64 GNU/Linux), a Linux VM (5.15.0-67-generic #74-Ubuntu x86_64 GNU/Linux), a Windows machine (Win 10 Pro 22H2 19045.2364), and a Windows VM (Win 10 Pro 22H2 19045.2486), and the results were the same on all four:

When the system clock is adjusted backwards by 30 seconds, the total time slept by sleep_until is 90 seconds, and when the system clock is adjusted forwards by 30 seconds, the total time slept is 30 seconds. In other words, it does sleep until the system clock reaches the specified wall-clock time (returning immediately if the system clock is adjusted past the target time).

This confirms what the documentation says and what @eryksun wrote - it appears to be well-defined behavior on systems with SetWaitableTimerEx or clock_nanosleep*.

I'm not sure that I like the overall idea of exposing such function in Python.

The way I look at it is that this is simply exposing a functionality already available in the OS's libraries, and in my comment above I name a couple of useful uses for it. Putting this function in an external module instead requires a ton of code duplication from timemodule.c. And since timemodule.c already uses SetWaitableTimerEx and clock_nanosleep, the changes there aren't so big.

* Anyway, one remaining question is of course what happens on *NIX systems that don't have clock_nanosleep, such as OS X. Here, the behavior is only emulated, and in fact changes to the system clock coupled with interrupts (which cause the timeout to be recalculated) could in theory lead to unpredictable sleep times. So at the moment I have three alternative suggestions for how to handle that:

  1. Simply document that on systems without SetWaitableTimerEx or clock_nanosleep, sleep_until may return early if the system clock is changed during a sleep. This of course requires users to read this and know whether their system is affected or not.
  2. Provide a boolean constant such as e.g. time.HAVE_NATIVE_SLEEP_UNTIL that tells users whether sleep_until will be emulated on their system or not and therefore be reliable or not (similar to e.g. os.supports_bytes_environ).
  3. Simply don't provide sleep_until on OSes that don't have SetWaitableTimerEx or clock_nanosleep, similar to e.g. os.mkfifo (hasattr(os, 'mkfifo') is false on Windows).

I hope you'll agree that sleep_until should be reliable on Windows and POSIX with clock_nanosleep and I hope this alleviates the worries you expressed about exposing the function. And I'd be happy to implement any of the above three solutions for other OSes. Let me know which one you would prefer and I'll take care of it.

As for your other comments that I haven't replied to above, I'll re-write the documentation once we've clarified the best way forward.

@eryksun
Copy link
Contributor

eryksun commented Apr 10, 2023

If the system time is adjusted, the due time of any outstanding absolute timers is adjusted.

This statement from the Windows documentation is vague and thus possibly misleading. Adjusting the system time actually has no effect on an absolute due time. If a timer is due at 17:00 hours, then the executive signals the timer as soon as the system time is greater than or equal to 17:00 hours, even if that's because the time was manually adjusted past the scheduled due time.

It's actually the case, in terms of the implementation in the kernel, that relative due times have to be adjusted when the system time is adjusted. Internally, due times are stored as absolute times, except relative due times are flagged as such. When the system time is adjusted, the executive has to recompute all of the 'relative' due times by the adjusted amount.

@haukex
Copy link
Author

haukex commented Apr 10, 2023

@eryksun Thanks for the clarification!

haukex and others added 2 commits April 11, 2023 22:35
Adds the `time.sleep_until` function, which allows sleeping until the
specified absolute time.
Co-authored-by: Victor Stinner <[email protected]>
@haukex
Copy link
Author

haukex commented May 19, 2023

Bump. Of the options I named above, I personally would probably prefer option 3, that is, simply don't provide sleep_until when it isn't available, so that on builds where hasattr(time, 'sleep_until') is True, the behavior is reliable.

.. function:: sleep_until(secs)

Like :func:`sleep`, but sleep until the specified time of the system clock (as
returned by :func:`time`). This can be used, for example, to schedule events
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced from this documentation what is the benefit of using this function instead of existing code:

dt = deadline - time.time()
if dt > 0: time.sleep(dt)

How does it behave differently? If it does behave the same, the function has no benefit?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is your point about improving the documentation or are you questioning the usefulness of the function in general? I'd be happy to improve the documentation if that's all that's missing, but I'd prefer to finalize the implementation first (see above).

The code you showed does differ from sleep_until in at least one IMHO significant way: The fact that sleep_until (when backed by the SetWaitableTimerEx or clock_nanosleep implementations) is sensitive to changes to the system clock is actually desirable in some situations. Imagine for example a measurement system consisting of multiple processes on a single machine, or spread across several machines, where time is kept synchronized by NTP, running for several days with a measurement interval on the order of minutes. On this timescale, differences in the clocks of a few ms don't matter, but if one of the measurement processes were to drift off relative to the others, as it could with the code you showed, that wouldn't be too good.

Other than that, for short measurement intervals, in theory the code you showed introduces a small offset due to there being several instructions in between taking the current time and scheduling the sleep, which may make a difference on a heavily loaded machine.

@jcalvinowens
Copy link

I'd love to see this merged. To implement a periodic task in a Python loop, one is currently forced to do a racy subtraction with the current time. This PR fixes that.

To print 'A' at 100hz, one might naively write:

    while True:
            print('A')
            time.sleep(0.01)

...but of course, that's not really going to yield a coherent 100hz. There are many applications where that matters, such as sampling a sensor to produce timeseries data.

If you actually care, you're forced to do something like this:

    p = time.time()
    while True:
            print('A')
            time.sleep(time.time() - p + 0.01)
            p = time.time()

...but that's inherently racy. If this proposal is implemented, that loop could become:

    p = time.time()
    while True:
            print('A')
            time.sleep_until(p + 0.01)
            p += 0.01

...which both simplifies the code, and eliminates the race condition. On platforms where there isn't support for TIMER_ABSTIME, it could fall back to the racy subtraction (which is what the user would've done anyway).

@vstinner
Copy link
Member

If this proposal is implemented, that loop could become: (...)

I'm not convinced that the time.sleep_until() code respects 100 Hz if the clock jumps backward or forward. You should use time.monotonic() for that.

time.sleep_until() uses the "realtime" clock which is affected by changes of a system administrator, NTP, or anything else.

Example using time.monotonic_ns():

import sys
import time

SEC_TO_NS = 10 ** 9

def sync_at_frequency(frequency):
    if frequency < 1:
        raise ValueError
    delay = SEC_TO_NS // frequency
    deadline = time.monotonic_ns() + delay
    while True:
        yield
        dt = deadline - time.monotonic_ns()
        if dt > 0.0:
            time.sleep(dt / SEC_TO_NS)
        deadline += delay

def main():
    stdout = sys.stdout
    i = 1
    for _ in sync_at_frequency(10):
        if (i % 10) == 0:
            stdout.write(f'. {time.time()}\n')
        else:
            stdout.write('.')
        stdout.flush()
        i += 1

main()

Output on Linux:

(...)
.......... 1706018345.30171
.......... 1706018346.3016381
.......... 1706018347.3015473
.......... 1706018348.301646
(...)

The precision is under 1 ms.

@vstinner
Copy link
Member

Hum, another more complete example which injects a random sleep between 0 and 100 ms:

import sys
import time
import random

SEC_TO_NS = 10 ** 9

def random_sleep():
    # max: 100 millisecond
    delay = 0.100 * random.random()
    time.sleep(delay)

def sync_at_frequency(frequency):
    if frequency < 1:
        raise ValueError
    delay = SEC_TO_NS // frequency
    deadline = time.monotonic_ns() + delay
    while True:
        yield
        dt = deadline - time.monotonic_ns()
        if dt > 0:
            time.sleep(dt / SEC_TO_NS)
        deadline += delay

def main():
    stdout = sys.stdout
    i = 1
    for _ in sync_at_frequency(10):
        if (i % 10) == 0:
            stdout.write(f'. {time.time()}\n')
        else:
            stdout.write('.')
        stdout.flush()
        # inject random delay to test sync_at_frequency() accuracy
        random_sleep()
        i += 1

main()

Output on an idle Linux system:

(...)
.......... 17060262 54.328 768
.......... 17060262 55.328 8026
.......... 17060262 56.328 745
.......... 17060262 57.328 686
.......... 17060262 58.328 6498
.......... 17060262 59.328 5427
.......... 17060262 60.328 5737
(...)

I manually added some spaces in timing numbers for readability, to highlight the precision of 1 ms.

You can see that there is a precision of 1 ms even with the random sleep. And the code is not affected by system clock changes.

@jcalvinowens
Copy link

jcalvinowens commented Jan 23, 2024

You can see that there is a precision of 1 ms even with the random sleep. And the code is not affected by system clock changes.

I can't believe you wasted time testing that... it was a toy example, you missed the point.

It is common to sample things at rates greater than 100hz. Obviously, sampling at 10khz, 1ms of inaccuracy is substantial....

@jcalvinowens
Copy link

jcalvinowens commented Jan 23, 2024

The precision is under 1 ms.

I also think you're missing that it is the sum of the error over time that matters: the timed sleep approach will drift in phase over long timescales. An abstime time sleep eliminates that possibility (if one assumes the local clock is accurate, of course...). If you care about this, you currently have to do all that ugly subtraction.

You might argue the error will effectively cancel out over time, but that's not true at least on Linux because it strictly over-sleeps its timers.

@jcalvinowens
Copy link

Re-reading this... I replied too quickly, I'm not sure which of these two things you're claiming:

  1. The precision of blindly calling sleep() is good enough
  2. Using TIMER_ABSTIME instead of doing the racy subtraction does not meaningfully increase precision

I think (1) is just flat out wrong, for the reasons I explained above.

If you're saying (2)... yes, that's true, I wasn't trying to claim it did. You could make the same argument against clock_nanosleep() supporting TIMER_ABSTIME in the first place. But it does exist, and given that it makes sampling code simpler, mathematically race free, and easier to read... why not include it?

I suppose something being "ugly" is a poor argument... but the inherent race condition is very ugly IMHO.

@haukex
Copy link
Author

haukex commented Jan 24, 2024

@vstinner Regarding both of the code examples you showed, imagine the kernel scheduler decides to hand a slice of time to another process right in between the calculation of the delay and the call to sleep itself, and blam. sleep_until doesn't have this particular problem because you specify the deadline directly instead of calculating it based on a constantly changing timebase.

Just to summarize a few of the points I made above:

  • Tests and documentation show the behavior of SetWaitableTimerEx or clock_nanosleep to be reliable in the face of jumps of the system time. (At least on the systems I tested.)
  • NTP daemons can make small adjustments to the system clock by slewing the clock, allowing it to remain monotonic.
  • For long-running measurements, synchronizing to wall-clock time may actually be desirable. (May not apply to the >=100Hz case of course.)
  • This change simply exposes an existing system API call, which is the most accurate way the system has to offer to accomplish the various tasks described in the thread above.

Are these not things that can be described in documentation? As I said, I'd be happy to rewrite it if the implementation is settled.

@jcalvinowens https://pypi.org/project/sleep-until/, though of course this duplicates a bunch of code from timemodule.c, which is why I proposed this PR. (I also haven't had a chance to see whether there have been changes to timemodule.c in the meantime that I need to port.)

On platforms where there isn't support for TIMER_ABSTIME, it could fall back to the racy subtraction (which is what the user would've done anyway).

I actually argued against this above, instead I now think it is better to only provide sleep_until on systems that have the corresponding underlying API calls to support its behavior fully.

@vstinner
Copy link
Member

clock_nanosleep() gives the choice of the clock, so you can use CLOCK_MONOTONIC. My concern here is that proposed time.sleep_until() function always use CLOCK_REALTIME which has issues. Extract of the change:

        ret = clock_nanosleep(absolute ? CLOCK_REALTIME : CLOCK_MONOTONIC,
                TIMER_ABSTIME, &timeout_abs, NULL);

@jcalvinowens
Copy link

I actually argued against this #101559 (comment), instead I now think it is better to only provide sleep_until on systems that have the corresponding underlying API calls to support its behavior fully.

You're proposing that I have to test for the existence of sleep_until before using it? That's uglier than the racy subtraction is IMHO, I would never use that API. I think it's clearly simpler on all fronts to fall back to the racy subtraction if TIMER_ABSTIME doesn't exist.

The point of this change is simplifying code, not increasing precision. The increase in precision is entirely meaningless on intervals long enough they can reliably work on a general purpose operating system. I'm still not sure if it was the point @vstinner was trying to make, but if so I completely agree.

What you're proposing (forcing the user to test for the existence of sleep_until) makes things more complicated, not simpler, IMHO.

@haukex
Copy link
Author

haukex commented Jan 24, 2024

@vstinner Can I interpret your comment to mean that if the user had a choice between the two clocks in the sleep_until call, you'd find the suggestion more acceptable? That's certainly something I could look to implement.

@jcalvinowens I understand your reservations, but take a look at the whole thread; you'll see that before the comment I linked to, I proposed three different alternative solutions, was asking which one I should implement, and was just stating my preference for the third - I now see it as a case of "do one thing and do it well". There was also some discussion of user expectations.

I think it's clearly simpler on all fronts to fall back to the racy subtraction if TIMER_ABSTIME doesn't exist.

The way you've worded this, I have to disagree. I feel like this amounts to telling the user "maybe we'll give you the accurate version of sleep_until, or maybe you'll get the version that suffers from the same problems as sleep". I'm a bit confused because I understood your comments above (like the one about 10kHz sampling) to mean that you seem to care about the precision improvement, so I would have thought that at the very least you might want to have a flag to check whether you're using the accurate version of sleep_until or not?

@jcalvinowens
Copy link

@vstinner Can I interpret your comment to mean that if the user had a choice between the two clocks in the sleep_until call, you'd find the suggestion more acceptable? That's certainly something I could look to implement.

@jcalvinowens I understand your reservations, but take a look at the whole thread; you'll see that before the comment I linked to, I proposed three different alternative solutions, was asking which one I should implement, and was just stating my preference for the third - I now see it as a case of "do one thing and do it well". There was also some discussion of user expectations.

I read the whole thread, I just disagree with the reviewers. I started to work on a patch to do this, but then discovered yours, I'm trying to jump in here rather than fragment the discussion with another PR.

Why not add os.nanosleep() if the above is the desired semantic? That avoids the whole realtime/monotonic issue too: it just becomes a flag and the user gets to pick. I don't think that's as ideal, because it limits the portability of scripts that use it, but it's much more intuitive than a general sounding function in time that isn't there some of the time.

I think it's clearly simpler on all fronts to fall back to the racy subtraction if TIMER_ABSTIME doesn't exist.

The way you've worded this, I have to disagree. I feel like this amounts to telling the user "maybe we'll give you the accurate version of sleep_until, or maybe you'll get the version that suffers from the same problems as sleep".

I understand where you're coming from. But it feels a bit circular to me: if no equivalent of ABSTIME exists, the user must necessarily emulate it themselves if they require it... but, because no ABSTIME exists, they have no way to do that, except by falling back to the racy math, themselves. I may be biased by my own usecase, but I think the vast majority of users would find it more convenient for the fallback behavior be part of the general API (not a hypothetical os.nanosleep() API, to be clear: that's different).

Emulating it by racily checking the current time is, on average, far more precise than required. One only "loses" the race if the process is interrupted between time() and sleep(), which would be extremely rare in practice. I think this is the point @vstinner was trying to make? That said, it will happen: the long tail is very long, nothing he's presented in this discussion is sufficient to see it.

I'm a bit confused because I understood your comments above (like the one about 10kHz sampling) to mean that you seem to care about the precision improvement, so I would have thought that at the very least you might want to have a flag to check whether you're using the accurate version of sleep_until or not?

I'm sorry for being confusing about this:

I was arguing that it is truly necessary to reference the current time if one cares about the phase of the timer expiry drifting, because the error accumulates. But it is not strictly necessary to have sleep_until to avoid the accumulating error: you can reference the current time each loop, and do the subtraction. In that case, the only inaccuracy introduced is the indeterminate latency between reading the current time and initiating the sleep, which is negligible the vast majority of the time.

@jcalvinowens
Copy link

Just to drive this home: I think the precision is a red herring, and we shouldn't be discussing it at all.

The problem I want to solve is that sleeping until a fixed time is a ubiquitous operation available in every programming environment I have ever worked in, except python.

My approach was to implement os.nanosleep(), on the theory it would be easier to get buy in on something that aped the POSIX API like other os functions.

In principle, I think the approach @haukex took is superior, because it's a more general portable API. But that generality benefit is lost if the function disappears on some platforms.

@haukex
Copy link
Author

haukex commented Jan 26, 2024

@jcalvinowens Thanks for the feedback, I understand better now where you're coming from. (Though several people in this thread have commented on whether the use of the absolute time is the best idea...)

I don't have the time to test right now, but I think that sleep_until(deadline) implemented via clock_nanosleep will behave differently from its "emulation" sleep(deadline - time.monotonic()) in the face of changes to the system clock. I have the suspicion that trying to describe this difference to the user in documentation (i.e. on which OSes and under which circumstances they get which behavior) would lead to confusion.

@vstinner As I said I'd be happy to look at implementing an option to sleep_until for the user to choose between the clocks if you think that would lead to more acceptance. However, as far as I can tell at the moment, it seems like this won't be possible to do natively on Windows.

As for my personal feelings on this, all of this discussion and potential issues mentioned above is driving me more towards the opinion of "just do one thing and do it well"... 😬

@haukex
Copy link
Author

haukex commented Jan 26, 2024

clock_nanosleep() gives the choice of the clock, so you can use CLOCK_MONOTONIC. My concern here is that proposed time.sleep_until() function always use CLOCK_REALTIME which has issues.

@vstinner Sorry for the multiple replies but I just realized your question actually has a really simple answer. (Also @Livius90 because you asked about the use of CLOCK_REALTIME too.)

CLOCK_MONOTONIC uses an arbitrary point of time in the past as its timebase, meaning it is relative. Therefore "sleeping until an absolute time", which is the whole point of sleep_until, makes no sense on the monotonic clock, only on wall-clock time.

Could you clarify "which has issues", is there anything else I didn't address in my research?

@vstinner
Copy link
Member

@vstinner As I said I'd be happy to look at implementing an option to sleep_until for the user to choose between the clocks if you think that would lead to more acceptance. However, as far as I can tell at the moment, it seems like this won't be possible to do natively on Windows.

I would prefer to only use a monotonic clock to sleep. If Windows doesn't support that, only add the function to operating systems supporting sleeping with an absolute time of a monotonic clock.

@vstinner
Copy link
Member

Could you clarify "which has issues", is there anything else I didn't address in #101559 (comment)?
When the system clock is adjusted backwards by 30 seconds, the total time slept by sleep_until is 90 seconds, and when the system clock is adjusted forwards by 30 seconds, the total time slept is 30 seconds.

If you ask to sleep 60 seconds, it should sleep 60 seconds: it should not sleep 30 or 90 seconds. That's a flaw in your API coming from the usage of the "realtime" clock (or called the "system" clock). I don't see the point of adding a new API with such flaw, whereas existing sleep(60) doesn't have such flaw. For me, it's a blocker issue.

IMO better accuracy is not enough to justify adding such function if there is such flaw.

@vstinner
Copy link
Member

@jcalvinowens:

I also think you're missing that it is the sum of the error over time that matters: the timed sleep approach will drift in phase over long timescales.

Would you mind to elaborate what drift in my example and how?

It is common to sample things at rates greater than 100hz. Obviously, sampling at 10khz, 1ms of inaccuracy is substantial....

Would you mind to elaborate which kind of code do you run every 10 us in Python? On which operating system? Which kind of accuracy do you expect?

In Python, it's not easy to have an "effective" precision better than 1 ms. Python is not designed for hard real time, it has a "stop the world" GC. Just reading a clock can take 1 us: https://peps.python.org/pep-0418/#performance On Windows, time.monotonic() has a precision of... 15.6 ms, it's far from 1 ms (16x worse).

@haukex
Copy link
Author

haukex commented Jan 30, 2024

@vstinner I don't follow your reasoning here:

If you ask to sleep 60 seconds, it should sleep 60 seconds: it should not sleep 30 or 90 seconds.

sleep_until is not sleep.

If you want to sleep 60 seconds, that's what sleep(60) is for.

sleep_until is for tasks like "sleep until 3pm" (sleep_until( datetime.datetime.combine(datetime.date.today(), datetime.time(15, 0, tzinfo=datetime.timezone.utc)).timestamp() )), and we've already talked about the flaws of using sleep for that.

Different task, different tool.

@jcalvinowens
Copy link

@vstinner my later replies clarified everything you asked about... could you respond to those points please?

@pganssle
Copy link
Member

pganssle commented Feb 1, 2024

The main UI issue I see with this function is that there are three slightly different use cases for it, and if the function is designed with one use case in mind, but the user expects another one, it's going to cause problems (and may be worse than if the interface didn't exist at all).

The three use cases I see:

  1. Sleep until the system local time is at least X (for, e.g. implementing something like a cron job). For this, you don't want to use a monotonic clock because you actually want the timer to end when the system local time is X, not after a specific period of time.
  2. Sleep until an absolute time — set some time in UTC and wait until that time. In that case, you are halfway between wanting a monotonic clock and wanting a local time clock, because you want the timer to end earlier if your local time adjusts to correct clock skew, but you don't want the timer to change if you change the time zone.
  3. Sleeping to align with some fixed frequency (e.g. the 100Hz example), where errors can accumulate over time if you keep doing sleep(interval) rather than sleep(interval - (time.monotonic() % interval)).

Writing it out like this, I think that #​1 sounds like the most useful use case, but the current implementation actually gives you #​2, which is a bit dangerous. I'm not sure why you'd want #​2 other than to implement something like #​3. I also think that if you are worried about phase drift accumulating in the case of #​3, doing something like this should suffice:

def sample_every(interval_in_ms: int, num_samples: n) -> Sequence[float]:
    output = []
    interval_in_ns = interval * 1_000_000
    for _ in range(n):
        # Find the next time the montonic clock will be an even multiple of the interval
        next_sample = ((time.monotonic_ns() // interval_in_ns) + 1) * interval_in_ms 
        time.sleep(next_sample)
        output.append(sample())
    return output

In this implementation, if you sample too fast, you'll probably miss some intervals, but the errors will not accumulate over time. I am not sure we could write a general function that handles this for everyone, though, because there are different trade-offs with different methods of writing a "clock" of this sort. Some people want to minimize the variation in the gap between samples (and thus would prefer to sample immediately if you miss a sample), while other people want to align as closely as possible to the clock even if it means missing samples. Other people might want a hybrid approach, where they are keeping track of phase shifts and switch strategies if the drift gets too high.

I feel like with regards to #​1 vs #​2, it might work to have a sleep_until that takes a datetime.datetime, where if it's naïve you get the behavior of #​1 and if it's aware you get something more like the behavior of #​2 (though even this might be a problem if you are expecting these jobs to run long enough that the time zones might change on disk during the run of the program). That has several problems, though:

  1. time doesn't have a dependency on datetime.datetime and shouldn't grow one for this interface.
  2. People are generally not super clear that naïve times mean "system local time" anyway.
  3. There are actually all kinds of problems with dealing with changing system local time. IIRC, Windows and Linux deal with it differently, where time.tzset() will work for Linux but not Windows. Some of this is covered by this blog post, though a lot of that is about the datetime interface itself.

When I started writing this comment, I thought it was going to be generally in favor of this and the interface just needed to be clarified, but the more I think about this, the more complicated it feels. I'd be inclined to look at the PyPI package and see if it gets wide adoption, and among the people adopting it, how many people end up using it incorrectly 😅.

@haukex
Copy link
Author

haukex commented Feb 2, 2024

@pganssle Thanks for the feedback!

... you don't want the timer to change if you change the time zone.

Seeing how clock_nanosleep/SetWaitableTimerEx react to adjustments of the system's time zone setting would actually be a good test that I haven't run yet.

Sleeping to align with some fixed frequency (e.g. the 100Hz example) ...

After reading @jcalvinowens's arguments on this, I no longer think this is the best example of a use case for this function - you'd need an RTOS to get a clean 100+ Hz.

Just to reiterate a real-world example - I've built several systems for work similar to this - imagine two Raspberry Pi's in different remote locations, just sitting out in a field somewhere, without any network connection but with a GPS receiver that provides a PPS signal. These RPis are to run for weeks and take simultaneous measurements every 15 minutes (or imagine any other time periods where you think clock drift can become significant). In this case you want sleep_until to wake at the appropriate wall-clock time including adjustments to the system clock. clock_nanosleep is the best way the OS provides to accomplish this; cron or others aren't appropriate here because you try asking a researcher whether s/he prefers the low accuracy or the high accuracy version, guess which one they want 😉

I'd be inclined to look at the PyPI package and see if it gets wide adoption, and among the people adopting it, how many people end up using it incorrectly 😅.

Sure, sleep-until works for now of course. It's just too bad I've had to duplicate so much code from timemodule.c that I also have to monitor for any bugfixes that I'd have to port, while integrating the functionality into timemodule.c isn't a big change. But anyway, I'm open to just waiting and seeing...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting merge stdlib Python modules in the Lib dir
Projects
None yet
Development

Successfully merging this pull request may close these issues.