Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: unsafe: add Unreachable #30582

Closed
josharian opened this issue Mar 4, 2019 · 39 comments
Closed

proposal: unsafe: add Unreachable #30582

josharian opened this issue Mar 4, 2019 · 39 comments

Comments

@josharian
Copy link
Contributor

josharian commented Mar 4, 2019

When optimizing code, I often run up against cases in which the compiler is missing a key fact. Sometimes it could but does not infer it; sometimes there’s no reasonable way for the compiler to know.

In those cases, there is nothing to do but drop to assembly. (In normal code you could write if !x { panic }, but in many of these cases, that is prohibitively expense.)

I propose that we add unsafe.Assume. It accepts a boolean expression. The expression is typechecked but never evaluated. However, the compiler may assume that it evaluates to true when compiling other code.

I imagine the most common uses would be things like unsafe.Assume(p != nil), unsafe.Assume(0 <= i && i < len(s)), and unsafe.Assume(x < 64), for nil checks, bounds checks, and shift amounts, respectively.

@gopherbot gopherbot added this to the Proposal milestone Mar 4, 2019
@jimmyfrasche
Copy link
Member

What happens if the assertion does not hold?

@josharian
Copy link
Contributor Author

The compiler may generate code that does all sorts of awful things. It is decidedly unsafe.

@cherrymui
Copy link
Member

To me, this sounds like "Assume". I would think "Assert" means if the assertion fails, the program crashes (more like the C assert function).

@josharian josharian changed the title proposal: Go2: unsafe: add Assert proposal: Go2: unsafe: add Assume Mar 5, 2019
@josharian
Copy link
Contributor Author

@cherrymui thanks, proposal updated.

@ianlancetaylor
Copy link
Member

Another approach would be unsafe.Unreachable(). The compiler can assume that no code path will lead to a call of that function. This effectively lets you write

    if p == nil {
        unsafe.Unreachable()
    } else {
        // Here the compiler knows that p != nil.
    }

and then at some point during code generation the compiler removes the p == nil test and the Unreachable call.

@josharian
Copy link
Contributor Author

josharian commented Mar 5, 2019

@ianlancetaylor I like it. One nice thing about that is that it mirrors the not-unsafe way to write it:

if p != nil {
  panic("unreachable")
}

becomes

if p != nil {
  unsafe.Unreachable()
}

And the specification is simpler.

The only downside I see is that it doesn't fit on one line. :P It also doesn't scream unsafe at me the way unsafe.Assume does, even though they are expressively equivalent...but I guess the package name suffices for that.

@josharian josharian changed the title proposal: Go2: unsafe: add Assume proposal: Go2: unsafe: add Assume or Unreachable Mar 5, 2019
@jimmyfrasche
Copy link
Member

What happens if you ask the compiler to assume something it knows is false? Like something that reduces to

x := -1
unsafe.Assume(x > 0)

@josharian
Copy link
Contributor Author

What happens if you ask the compiler to assume something it knows is false?

Then you're in UB world: It depends on the compiler and the details. I'm guessing that in practice, with cmd/compile right now, probably not too much. With gccgo, you probably start spewing random numbers until you generate the kubernetes code base.

@jimmyfrasche
Copy link
Member

If the compiler doesn't know that x <= -1, sure UB, but, if the compiler has proved that x <= -1 and you ask it to assume that x > 0, it seems like it should error out.

@josharian
Copy link
Contributor Author

it seems like it should error out.

Perhaps optionally? I think that a compiler that simply ignored unsafe.Unreachable/unsafe.Assume should be spec-compliant.

@seebs
Copy link
Contributor

seebs commented Mar 5, 2019

I like this a lot for my code, I'm terrified of what happens if other people have it.

@randall77
Copy link
Contributor

I'm not sure we should go down this road. This is basically a "performance annotation" and it's one more thing everyone needs to understand. I'm happy to give up 5% in performance to never have to make (or see) any such annotations.

There's a lot of other annotations people could want. unsafe.Unlikely, unsafe.UnrollThisLoop, etc.

If we do this I like Ian's API.

@mvdan
Copy link
Member

mvdan commented Mar 5, 2019

(In normal code you could write if !x { panic }, but in many of these cases, that is prohibitively expense.)

@josharian could you clarify what part is expensive? The added code size? Or perhaps making some small functions non-inlineable with the current compiler?

Another idea that comes to mind is if the compiler treated panic("unreachable") in a special way, generating less code for it as there would likely be many identical calls within a single binary.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/165358 mentions this issue: cmd/compile: add unsafe.Unreachable

@josharian
Copy link
Contributor Author

josharian commented Mar 5, 2019

I wrote a rudimentary implementation so that I could experiment with it: https://golang.org/cl/165358

I'm happy to give up 5% in performance to never have to make (or see) any such annotations.

Some data points, for addVV_g in math/big, based just on what I could hack in a half hour...

Going from my recently-optimized CLs to adding a few unsafe.Unreachable annotations:

name            old time/op    new time/op    delta
AddVV/1-8         5.49ns ± 5%    4.36ns ± 1%  -20.58%  (p=0.000 n=45+41)
AddVV/2-8         6.95ns ± 4%    4.88ns ± 1%  -29.81%  (p=0.000 n=48+43)
AddVV/3-8         8.25ns ± 6%    6.18ns ± 1%  -25.14%  (p=0.000 n=44+46)
AddVV/4-8         9.34ns ± 4%    7.01ns ± 2%  -24.94%  (p=0.000 n=47+47)
AddVV/5-8         10.7ns ± 2%     7.7ns ± 4%  -28.48%  (p=0.000 n=45+44)
AddVV/10-8        17.7ns ± 4%    11.1ns ± 2%  -37.22%  (p=0.000 n=50+39)
AddVV/100-8        159ns ± 1%     118ns ± 2%  -25.78%  (p=0.000 n=34+42)
AddVV/1000-8      1.43µs ± 5%    1.14µs ± 3%  -19.99%  (p=0.000 n=47+49)
AddVV/10000-8     14.2µs ± 5%    11.4µs ± 2%  -20.15%  (p=0.000 n=49+50)
AddVV/100000-8     143µs ± 4%     113µs ± 2%  -21.01%  (p=0.000 n=48+48)

Going from regular to an unrolled loop, which requires yet more annotations:

name            old time/op    new time/op    delta
AddVV/1-8         6.38ns ± 1%    4.31ns ± 1%   -32.50%  (p=0.000 n=38+43)
AddVV/2-8         9.52ns ± 1%    5.17ns ± 1%   -45.67%  (p=0.000 n=43+44)
AddVV/3-8         11.4ns ± 1%     6.3ns ± 2%   -44.14%  (p=0.000 n=46+43)
AddVV/4-8         13.6ns ± 1%     7.2ns ± 1%   -47.09%  (p=0.000 n=44+40)
AddVV/5-8         22.5ns ± 3%     9.8ns ± 2%   -56.62%  (p=0.000 n=45+43)
AddVV/10-8        40.6ns ± 4%    14.5ns ± 4%   -64.35%  (p=0.000 n=48+49)
AddVV/100-8        414ns ± 2%     163ns ± 2%   -60.61%  (p=0.000 n=49+50)
AddVV/1000-8      4.04µs ± 2%    1.62µs ± 4%   -59.89%  (p=0.000 n=35+46)
AddVV/10000-8     40.3µs ± 1%    16.8µs ± 4%   -58.38%  (p=0.000 n=32+49)
AddVV/100000-8     403µs ± 2%     173µs ± 8%   -57.18%  (p=0.000 n=34+49)

This gets us within striking distance of the hand-optimized assembly for small n. Going from my unrolled Go+Unreachable code to assembly:

name            old time/op    new time/op    delta
AddVV/1-8         4.07ns ± 3%    4.31ns ± 1%    +5.76%  (p=0.000 n=42+43)
AddVV/2-8         4.78ns ± 2%    5.17ns ± 1%    +8.17%  (p=0.000 n=42+44)
AddVV/3-8         5.84ns ± 2%    6.34ns ± 2%    +8.65%  (p=0.000 n=46+43)
AddVV/4-8         6.75ns ± 4%    7.18ns ± 1%    +6.35%  (p=0.000 n=50+40)
AddVV/5-8         7.51ns ± 2%    9.76ns ± 2%   +29.86%  (p=0.000 n=47+43)
AddVV/10-8        9.84ns ± 4%   14.48ns ± 4%   +47.11%  (p=0.000 n=49+49)
AddVV/100-8       49.5ns ± 5%   163.1ns ± 2%  +229.25%  (p=0.000 n=48+50)
AddVV/1000-8       434ns ± 4%    1622ns ± 4%  +273.53%  (p=0.000 n=50+46)
AddVV/10000-8     5.50µs ± 4%   16.79µs ± 4%  +204.95%  (p=0.000 n=41+49)
AddVV/100000-8    61.0µs ± 9%   172.8µs ± 8%  +183.11%  (p=0.000 n=49+49)

And the main difference now is that the hand-rolled assembly can do ADC mem reg, rather than loading from memory into a register and then ADC reg reg. That's fixable straightforwardly in the compiler. So I think it is possible this could let us remove the assembly entirely.

I plan to do the ADC upgrade anyway. I'll report back once I've done it; might be a few days.

None of the unsafe.Unreachable annotations I've provided could ever be inferred by the compiler; they come from upstream invariants.

@ALTree ALTree added the v2 An incompatible library change label Mar 5, 2019
@josharian
Copy link
Contributor Author

josharian commented Mar 5, 2019

And if you want to see what the simple, hinted code looks like:

func addVV_g(z, x, y []Word) (c Word) {
	for i := 0; i < len(z); i++ {
		if i >= len(x) || i >= len(y) {
			// The calling code ensures that x and y are no longer than z.
			// The compiler doesn't see this; this hint prevents bounds
			// checks in the bits.Add call below.
			unsafe.Unreachable()
		}
		zi, cc := bits.Add(uint(x[i]), uint(y[i]), uint(c))
		z[i] = Word(zi)
		c = Word(cc)
	}
	return
}

@josharian
Copy link
Contributor Author

There's a lot of other annotations people could want. unsafe.Unlikely, unsafe.UnrollThisLoop, etc.

Fair enough. That said, I think this is plausibly qualitatively different:

  • I can see wanting a way to hint about branch likeliness, although that's not actually unsafe. And there have been myriad performance compiler requests around nil checks, BCE, etc., and almost none around branch likeliness.

  • Loop unrolling can be done manually when you really need it; these annotations cannot.

@mundaym
Copy link
Member

mundaym commented Mar 6, 2019

I'd be interested to see the benchmarks for addVV_g with the bounds checks hoisted out of the loop:

func addVV_g(z, x, y []Word) (c Word) {
        n := len(z)
        x, y = x[:n:n], y[:n:n]
	for i := 0; i < n; i++ {
		zi, cc := bits.Add(uint(x[i]), uint(y[i]), uint(c))
		z[i] = Word(zi)
		c = Word(cc)
	}
	return
}

I suspect the code using unsafe.Unreachable() is faster for a small number of iterations. I would expect, however, for the performance of the two different implementations to converge as the number of iterations is increased.

Also, if the compiler inlined code using for loops (#14768) then we might see bounds checks optimized/merged away in real code (probably not in micro-benchmarks). Inlining seems likely to be a big win for addVV_g which has a very short and simple loop. It is also a massive advantage that pure Go implementations of small functions like this have over their assembly equivalents.

I can see annotations like unsafe.Unreachable() being useful when the slice index is coming from an external source that is known in-range. This happens a lot in the compiler. But this also seems like a scenario where we really do want a bounds check to catch bugs.

EDIT: dropped the length check panics, I don't think they are necessary for this use case and with them the function doesn't work when len(z) == 0.

@mundaym
Copy link
Member

mundaym commented Mar 7, 2019

I did a few benchmarks on the generic code to see what I got. Hoisting the bounds check on y seemed to make very little difference. Which is interesting because they did remove the bounds check from the loop, implying the critical path is the carry chain (I am using an i5 3570k).

Secondly, I tried applying David's for loop inlining experiment CL 148777. I still applied the bounds check hoisting change to get rid of the range statement since David's patch doesn't handle that. This did seem to make a big difference to small loop lengths, which makes sense since it saves writing 9 words to the stack (3 for each slice):

name          old time/op    new time/op    delta
AddVV/1         5.70ns ± 0%    1.91ns ± 3%   -66.56%  (p=0.000 n=7+10)
AddVV/2         6.83ns ± 0%    2.73ns ± 1%   -60.08%  (p=0.000 n=10+8)
AddVV/3         7.75ns ± 0%    3.59ns ± 1%   -53.69%  (p=0.000 n=9+10)
AddVV/4         9.20ns ± 0%    4.59ns ± 0%   -50.05%  (p=0.000 n=10+10)
AddVV/5         10.6ns ± 0%     5.5ns ± 2%   -48.16%  (p=0.000 n=10+10)
AddVV/10        16.8ns ± 0%    10.5ns ± 1%   -37.63%  (p=0.000 n=10+9)
AddVV/100        128ns ± 0%     120ns ± 1%    -6.38%  (p=0.000 n=10+10)
AddVV/1000      1.13µs ± 0%    1.16µs ± 1%    +2.14%  (p=0.000 n=10+10)
AddVV/10000     11.5µs ± 0%    11.8µs ± 1%    +2.89%  (p=0.000 n=10+7)
AddVV/100000     117µs ± 0%     120µs ± 2%    +2.68%  (p=0.000 n=9+10)

I suspect the slightly slower speed limit is down to alignment noise or a similar effect.

Patch to remove range statement and bounds check on y in the loop:
--- a/src/math/big/arith.go
+++ b/src/math/big/arith.go
@@ -54,7 +54,9 @@ func divWW_g(u1, u0, v Word) (q, r Word) {
 
 // The resulting carry c is either 0 or 1.
 func addVV_g(z, x, y []Word) (c Word) {
-       for i := range x[:len(z)] {
+       n := len(z)
+       x, y = x[:n:n], y[:n:n]
+       for i := 0; i < n; i++ {
                zi, cc := bits.Add(uint(x[i]), uint(y[i]), uint(c))
                z[i] = Word(zi)
                c = Word(cc)

@ianlancetaylor ianlancetaylor added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Mar 26, 2019
@ianlancetaylor ianlancetaylor changed the title proposal: Go2: unsafe: add Assume or Unreachable proposal: Go 2: unsafe: add Assume or Unreachable Mar 26, 2019
josharian added a commit to josharian/go that referenced this issue May 7, 2019
DO NOT SUBMIT

This is a rudimentary implementation,
so that I can experiment with it.

Updates golang#30582

Change-Id: I2e3a9e776490a9b01d4f166014b133625e419418
josharian added a commit to josharian/go that referenced this issue May 7, 2019
DO NOT SUBMIT

This is a rudimentary implementation,
so that I can experiment with it.

Updates golang#30582

Change-Id: I2e3a9e776490a9b01d4f166014b133625e419418
josharian added a commit to josharian/go that referenced this issue May 7, 2019
DO NOT SUBMIT

This is a rudimentary implementation,
so that I can experiment with it.

Updates golang#30582

Change-Id: I2e3a9e776490a9b01d4f166014b133625e419418
@bcmills
Copy link
Contributor

bcmills commented May 13, 2019

The expression is typechecked but never evaluated. However, the compiler may assume that it evaluates to true when compiling other code.

That seems dangerous to me: it adds a large space of undefined behavior (that is, any program that violates the assumption), but no way to validate that the program does not actually exhibit that behavior (contrast -fsanitize=undefined).

To me, the biggest advantage (by far!) of undefined behavior is that it can be sanitized: unlike a Go panic (which can be recovered and relied upon for control flow, as in the case of fmt with nil pointers), a violation of an unsafe.Assume condition always indicates a programmer error — and can be reported as such.

So instead, I would propose that the expression always be evaluated, and the result discarded except in sanitized (perhaps race-detector-enabled?) builds. That makes it possible to optimize away arithmetic expressions and pure functions (by far the common use-cases), but allows a sanitizer to verify the assumptions — without producing inconsistent behavior if the expression has side-effects.

@gopherbot gopherbot added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Sep 3, 2019
@komuw
Copy link
Contributor

komuw commented Jan 1, 2020

https://ziglang.org/documentation/master/#unreachable

@beoran
Copy link

beoran commented Jan 8, 2020

Interesting, but the idea that there should be a difference between debug mode and release mode is false. I have spent years doing life critical software in a large, famous, company, and there the rule was the production binary must be the tested binary, which almost always meant the binary with debug information available, in order to be able to do postmortem analysis in case of a crash.

Nevertheless unsafe.Unreachable seems to give some substantial performance increases. But I think one of the strong points of Go is that is has few undefined behavior. So I think unsafe.Unreachable can still serve as a performance hint for statement that follow it, but it should always cause a panic in case it is reached, and the check itself that leads to unafe.Unreachable should not be optimized out in any case.

@josharian
Copy link
Contributor Author

So I think unsafe.Unreachable can still serve as a performance hint for statement that follow it, but it should always cause a panic in case it is reached, and the check itself that leads to unafe.Unreachable should not be optimized out in any case.

We already have this: panic. In the cases under discussion, it is the check itself that is expensive, performance-wise.

@jfcg
Copy link

jfcg commented Aug 4, 2021

unsafe.Assume looks more elegant and compact, I think.

@griesemer
Copy link
Contributor

If I write your example above slightly differently (and encapsulate into a complete package so I can compile it):

package main

import "math/bits"

type Word uint64

func addVV_g(z, x, y []Word) (c Word) {
	if len(x) != len(z) || len(y) != len(z) {
		panic("vector lengths don't match")
	}
	// lengths of x, y, z are the same
	for i := 0; i < len(z); i++ {
		zi, cc := bits.Add(uint(x[i]), uint(y[i]), uint(c))
		z[i] = Word(zi)
		c = Word(cc)
	}
	return
}

func main() {}

It looks like the compiler is already smart enough to do the right thing. The inner loop appears to be:

	0x0033 00051 (x.go:13)	NEGL	DX
	0x0035 00053 (x.go:13)	MOVQ	(DI)(CX*8), SI
	0x0039 00057 (x.go:13)	MOVQ	(R9)(CX*8), R8
	0x003d 00061 (x.go:13)	ADCQ	SI, R8
	0x0040 00064 (x.go:14)	MOVQ	R8, (AX)(CX*8)
	0x0044 00068 (x.go:13)	SBBQ	DX, DX
	0x0047 00071 (x.go:12)	INCQ	CX
	0x004a 00074 (x.go:13)	NEGQ	DX
	0x004d 00077 (x.go:12)	CMPQ	BX, CX
	0x0050 00080 (x.go:12)	JGT	51

or, copying from godbolt.org:

        JMP     addVV_g_pc77
addVV_g_pc51:
        NEGL    DX
        MOVQ    (DI)(CX*8), SI
        MOVQ    (R9)(CX*8), R8
        ADCQ    SI, R8
        MOVQ    R8, (AX)(CX*8)
        SBBQ    DX, DX
        INCQ    CX
        NEGQ    DX
addVV_g_pc77:
        CMPQ    BX, CX
        JGT     addVV_g_pc51

So at least for this code we may not have to do anything.

@ianlancetaylor ianlancetaylor changed the title proposal: Go 2: unsafe: add Unreachable proposal: unsafe: add Unreachable Jun 21, 2023
@ianlancetaylor ianlancetaylor removed v2 An incompatible library change NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. labels Jun 21, 2023
@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Jun 21, 2023
@rsc
Copy link
Contributor

rsc commented Jun 21, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals Jun 21, 2023
@rsc
Copy link
Contributor

rsc commented Jun 21, 2023

@griesemer points out that we added unsafe to provide the ability to do things that the language could not express but that needed to be available, specifically type-unsafe conversions. unsafe.Unreachable is different: it's purely a compiler optimization, not giving new expressive power.

I worry a lot about debugability of buggy code, especially after years of suffering C and C++ compilers and "undefined behavior". In general Go's approach has been to prioritize safe code execution over absolute raw performance. C/C++ compilers have shown us where we end up when the compiler assumes things like "array indexes are always in bound" or "pointers are never null". It's true that unsafe.Unreachable would only produce more limited effects, since "always" and "never" are replaced by "in this specific instance". Even so, I fear that adding unsafe.Unreachable will encourage overuse for the sake of performance and lead to code that misbehaves in mysterious ways.

Separately, the fact that the compiler is not smart enough to optimize away certain checks creates a constructive tension on developers that makes us search out new analyses or idioms that optimize better but are still safe. An example of this is the _ = b[7] bounds check hint in encoding/binary's implementation, which replaces 8 bounds checks by one. That's not zero, but it stays safe and ends up being a significant win. If we'd had unsafe.Unreachable or unsafe.Assume, we'd have been under pressure to use it there instead of inventing something safe. I would rather keep the constructive tension and encourage people to look for other safe idioms, perhaps backed by new, safe compiler analyses.

@rsc
Copy link
Contributor

rsc commented Jun 28, 2023

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

@rsc rsc moved this from Active to Likely Decline in Proposals Jun 28, 2023
@rsc
Copy link
Contributor

rsc commented Jul 5, 2023

No change in consensus, so declined.
— rsc for the proposal review group

@rsc rsc moved this from Likely Decline to Declined in Proposals Jul 5, 2023
@rsc rsc closed this as completed Jul 5, 2023
@golang golang locked and limited conversation to collaborators Jul 4, 2024
@rsc rsc removed this from Proposals Jul 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests