Remove @inbounds in tuple iteration #48297
This was #48260, but CI didn't like that PR anymore.
@nanosoldier
base/compiler/tfuncs.jl
# The inbounds-ness assertion requires dynamic reachability, while
# :consistent needs to be true for all input values.
# However, as a special exception, we do allow literal `:boundscheck`.
# `:consitency`-will be tainted in any caller using `@inbounds` based
# `:consitency`-will be tainted in any caller using `@inbounds` based
# `:consistency`-will be tainted in any caller using `@inbounds` based
Wait, we fixed that already. Did I accidentally drop the fixup commit when rebasing?
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.
LLVM can prove this inbounds and the annotation weakens the inferable effects for tuple iteration, which has a surprisingly large inference performance and precision impact. Unfortunately, my previous changes to :inbounds tainting weren't quite strong enough yet, because `getfield` was still tainting consistency on unknown boundscheck arguments. To fix that, we pass through the fargs into the `getfield` effects to check if we're getting a literal `:boundscheck`, in which case the `:noinbounds` consistency-tainting logic I added in #48246 is sufficient to not require additional consistency tainting. Also add tests for both effects and codegen to make sure this doesn't regress.
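To see what "inferable effects for tuple iteration" means in practice, the following is a minimal sketch that asks the compiler for the effects it infers for `iterate` on a 4-tuple. It assumes `Base.infer_effects` is available; that helper is unexported and internal, so both its presence and the printed format depend on the Julia version, and the result on any given build may differ from what this PR aims for.

```julia
# Sketch only: `Base.infer_effects` is an internal, unexported helper
# (an assumption about the running Julia version, not a stable API).
effects = Base.infer_effects(iterate, (NTuple{4, Float64}, Int))

# Print the inferred effects; the exact representation (and whether
# consistency is reported as untainted) varies between Julia versions.
println(effects)
```

Comparing the printed effects on builds before and after this change is one way to check whether the `@inbounds` annotation was what tainted consistency.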
This got much faster. I'm assuming this means we regressed it at some point and didn't notice. Some memory regressions elsewhere. Probably from having to allocate more ArgInfo objects in the optimizer, but we shouldn't really be calling this from there anymore anyway, since we now annotate effectfulness flags in inference. But that's a separate change. I think this is good to go if CI comes back green.
So we don't need the previous changes on the
We need both
Ah, I just forgot we already merged that PR.
elseif length(argtypes) != 2
if length(argtypes) == 4
isvarargtype(argtypes[4]) && return false
if widenconst(argtypes[4]) !== Bool
if widenconst(argtypes[4]) !== Bool
if !hasintersect(widenconst(argtypes[4]), Bool)
I agree with you that this condition is suspect, but I'm not sure your suggested replacement is correct either, because that would just set the ordering to `:not_atomic`, which seems potentially wrong. However, this is pre-existing code, so I think we should punt that to another PR.
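For readers following the thread, here is a small sketch of how the two conditions differ on a few widened types. It uses the public `typeintersect` as a stand-in for the compiler-internal `hasintersect`, and the helper names `exact_bool` and `may_be_bool` are hypothetical, introduced only for illustration.

```julia
# Hypothetical helpers for illustration; neither name appears in the PR.
# `exact_bool` mirrors the existing `widenconst(...) !== Bool` style of check.
exact_bool(T) = T === Bool
# `may_be_bool` approximates the suggested `hasintersect(T, Bool)` using the
# public `typeintersect` (an assumption about how `hasintersect` behaves).
may_be_bool(T) = typeintersect(T, Bool) !== Union{}

for T in (Bool, Any, Union{Bool, Nothing}, Int)
    println(rpad(string(T), 24), "exact: ", exact_bool(T), "  may be Bool: ", may_be_bool(T))
end
```

The two checks agree on `Bool` and `Int` but disagree on `Any` and `Union{Bool, Nothing}`, which is the case the reply is cautious about.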
test/compiler/codegen.jl
@@ -794,3 +794,6 @@ end
f48085(@nospecialize x...) = length(x)
@test Core.Compiler.get_compileable_sig(which(f48085, (Vararg{Any},)), Tuple{typeof(f48085), Vararg{Int}}, Core.svec()) === nothing
@test Core.Compiler.get_compileable_sig(which(f48085, (Vararg{Any},)), Tuple{typeof(f48085), Int, Vararg{Int}}, Core.svec()) === Tuple{typeof(f48085), Any, Vararg{Any}}

# Make sure that the bounds check is elided in tuple iteration
@test !occursin("call void @", get_llvm(iterate, Tuple{NTuple{4, Float64}, Int64}))
These sorts of checks usually break our coverage and debug tests (since `-g2` may add `void @llvm.dbg.value` calls, for example). It is preexisting, so does not need to block merging. But can we do any better?
We could maybe check for the jl_ prefix, since the boundscheck we're trying to avoid here is an intrinsic function, but I did want to test that we don't call any other non-inlined function that might be hiding the call either.
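A narrower variant of the check discussed here could look for the bounds-error entry point by name instead of any `call void @`. The sketch below uses `code_llvm` from InteractiveUtils rather than the test suite's `get_llvm` helper, and it assumes the runtime's bounds-error functions contain the substring `jl_bounds_error`; that naming is an assumption about the runtime and may differ between Julia versions. As noted above, this narrower check would not catch a non-inlined callee that hides the bounds check.

```julia
using InteractiveUtils  # provides `code_llvm`

# Dump the optimized LLVM IR for tuple iteration as a string.
# `debuginfo=:none` drops source-location comments from the printed IR.
ir = sprint(io -> code_llvm(io, iterate, Tuple{NTuple{4, Float64}, Int64}; debuginfo=:none))

# Assumption: bounds-error entry points contain "jl_bounds_error" in their name.
has_bounds_error = occursin("jl_bounds_error", ir)
println("bounds-error call present: ", has_bounds_error)
```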