-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x/sync/semaphore: skipping acquisition if the context is done is a breaking change #67723
Comments
When acquiring from a semaphore could proceed without contention, the previous behavior was to always do so, even when the provided context was done. This was the documented behavior, but it could lead to confusion. It isn't much more expensive to check the context error, so cancel acquisition if it's done. Fixes golang/go#63615. goos: linux goarch: amd64 pkg: golang.org/x/sync/semaphore cpu: 12th Gen Intel(R) Core(TM) i5-1235U │ old.bench │ new.bench │ │ sec/op │ sec/op vs base │ AcquireSeq/Weighted-acquire-1-1-1-12 26.45n ± 2% 27.25n ± 3% +3.04% (p=0.001 n=20) AcquireSeq/Weighted-acquire-2-1-1-12 26.96n ± 1% 27.12n ± 1% ~ (p=0.104 n=20) AcquireSeq/Weighted-acquire-16-1-1-12 26.07n ± 3% 27.48n ± 1% +5.45% (p=0.000 n=20) AcquireSeq/Weighted-acquire-128-1-1-12 26.19n ± 2% 27.24n ± 1% +4.01% (p=0.000 n=20) AcquireSeq/Weighted-acquire-2-2-1-12 25.61n ± 1% 25.99n ± 2% ~ (p=0.066 n=20) AcquireSeq/Weighted-acquire-16-2-8-12 209.6n ± 2% 211.0n ± 3% ~ (p=0.280 n=20) AcquireSeq/Weighted-acquire-128-2-64-12 1.669µ ± 1% 1.721µ ± 2% +3.09% (p=0.000 n=20) AcquireSeq/Weighted-acquire-2-1-2-12 51.08n ± 1% 53.03n ± 2% +3.82% (p=0.000 n=20) AcquireSeq/Weighted-acquire-16-8-2-12 52.48n ± 2% 53.66n ± 2% +2.26% (p=0.028 n=20) AcquireSeq/Weighted-acquire-128-64-2-12 52.27n ± 1% 53.71n ± 2% +2.75% (p=0.000 n=20) geomean 60.06n 61.69n +2.71% Change-Id: I0ae1a0bb6c027461ac1a9ee71c51efd8427ab308 Reviewed-on: https://go-review.googlesource.com/c/sync/+/536275 Auto-Submit: Bryan Mills <[email protected]> Reviewed-by: Bryan Mills <[email protected]> Reviewed-by: Michael Knyszek <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]>
The change in question is https://go.dev/cl/536275, written to address issue #63615. CC @szuecs @zephyrtronium |
From the description of your code, it sounds like you should be using the |
In that case, I need to figure out the timeout on the context, and call Since the timeout is set by client programs, it doesn't seem to be viable. I have to say, the vast majority of our clients don't set zero timeout on the context, while there are still very few performing fire-and-forget message publishing and expecting minimum latencies on back pressure. Thoughts? |
You can call the |
The 0 timeout that triggers done depends only on the context, so you should fix the context |
I think there's a reasonable argument to be made that CL 536275 was not a backwards-compatible change. There are three things a function can do when passed a pre-canceled context: Succeed, fail, or pick one at random. In general, we try to avoid making substantive changes that may cause existing programs to break, even if the documentation permits the change, except when the change is a clear improvement. I'm not convinced this change meets that bar. That said, the change was made about nine months ago. Reverting it at this time would be just more churn and opportunity for breakage. |
Go version
go version go1.21.1 darwin/arm64
Output of
go env
in your module/workspace:What did you do?
The PR below introduced a breaking change - a regression, and broke our services deployed in production environments.
golang/sync@14be23e
What did you see happen?
We explicitly make use of a zero-timeout context,
and pass the
ctx
tosem.Acquire(ctx, 1)
to minimize the wait time on a critical path. The goal is to avoid latencies causing by the wait timeouts due to upstream back pressure.For instance, we use
sem
to control in-flight messages sent to Kafka. When Kafka brokers experience back pressure, we would like the call tosem.Acquire
failed immediately to unblock the main thread, otherwise, it would introduce extra latencies to the critical, user-facing threads.With this change in the PR, ctx with 0 timeout immediately returns a context cancelled error from
sem.Acquire(ctx, 1)
even if the semaphore still has a full capacity.What did you expect to see?
Passing
ctx
with 0 timeout tosem.Acquire(ctx, int)
should successfully acquire the semaphore counts as long as there are still capacity immediately available on the semaphore.The text was updated successfully, but these errors were encountered: