-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Eager finalizer insertion #45272
Eager finalizer insertion #45272
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome!
This is great! I think this approach will probably cover the cases that #44056 intended to address. Note: I could only follow the general gist of what you're doing here, so my review approval really just pertains to the idea in general. I haven't thoroughly scrutinized the implementation. |
ff35e48
to
458c7f6
Compare
I've addressed the correctness issues that made me call this "WIP", though I haven't yet expanded the legality analysis to allow all the cases I want yet. That said, I think that can be part of a follow up PR (including any potential interaction with EA), so I think this is basically good to go. |
@nanosoldier |
Your package evaluation job has completed - possible new issues were detected. A full report can be found here. |
# Handle finalizer | ||
if sig.f === Core._add_finalizer | ||
if isa(info, FinalizerInfo) | ||
# Only inline finalizers that are known nothrow and notls. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have this restriction? Would it be invalid to inline finalizers and have them run in the original task environment?
Obligatory GPU context: our finalizers access the TLS to inspect the current stream. On finalizer tasks, there's no such stream, so we perform a blocking free, whereas on 'regular' tasks the memory operation can be ordered against other operations on that stream. I had assumed that inlining finalizers would additionally get them to use the local stream, promoting blocking frees to asynchronously-ordered ones. Maybe that's too magical though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's to prevent things like #42752 where finalizers unexpectedly mess with task local state. I think we're now conceptually moving into the direction of finalizers having their own pseudo-task (maybe even a real task eventually), so inlining a finalizer that looks at task state would not be legal since it changes the task environment.
That said, for your case, you want to allow that to happen, even though it's not technically semantically legal. I think for that case, we could just put a notls effect override on the finalizer.
I worked through PkgEval and fixed 3 issues that I could reproduce, but none were related to this PR. The remaining issues are the known random precompile segfaults in the SciML stack, which as usual, I don't see locally. One of these days we'll need to rr a PkgEval run to catch those. Overall though, no issues I can see with merging this from a PkgEval perspective. |
Split out from #45272. This effect models the legality of moving code between tasks. It is somewhat related to effect-free/consistent, but only with respect to task-local state. As an example consider something like: ``` global glob function bar() @async (global glob = 1; some_other_code()) end ``` The newly created task is not effect-free, but it would be legal to inline the assignment of `glob` into `bar` (as long it is inlined before the creation of the task of `some_other_code` does not access `glob`). For comparison, the following is neither `notls`, nor `effect_free`: ``` function bar() @async (task_local_storage()[:var] = 1; some_other_code()) end ``` The same implies to implicit task-local state such as the RNG state. Implementation wise, there isn't a lot here, because the implicit tainting by ccall is the correct conservative default. In the future, we may want to annotate various ccalls as being permissible for notls, but let's worry about that when we have a case that needs it.
Split out from #45272. This is prepratory work towards adding optimization passes that recognize this builtin. This PR adds `Core.finalizer` with essentially the same interface as `Base.finalizer`, but without the error checking or raw-C-pointer feature. In future commits, the Core.finalizer interface will likely expand slightly, but Base.finalizer will remain unchanged and is the supported interface for this functionality.
- renamed `AddFianlizerUse` to `FinalizerUse` - removed dead pieces of code - improved type stabilities by handling edge cases
- renamed `AddFianlizerUse` to `FinalizerUse` - removed dead pieces of code - improved type stabilities by handling edge cases
- renamed `AddFianlizerUse` to `FinalizerUse` - removed dead pieces of code - improved type stabilities by handling edge cases
- renamed `AddFianlizerUse` to `FinalizerUse` - removed dead pieces of code - improved type stabilities by handling edge cases
- renamed `AddFianlizerUse` to `FinalizerUse` - removed dead pieces of code - improved type stabilities by handling edge cases
- added more comments to explain the purpose of code - properly handle `ConstantCase` - erase `Core.finalizer` call more aggressively - improved type stabilities by handling edge cases - renamed `AddFianlizerUse` to `FinalizerUse` - removed dead pieces of code
- added more comments to explain the purpose of code - properly handle `ConstantCase` - erase `Core.finalizer` call more aggressively - improved type stabilities by handling edge cases - renamed `AddFianlizerUse` to `FinalizerUse` - removed dead pieces of code
inlining: follow #45272, improve the finalizer inlining implementation
- added more comments to explain the purpose of code - properly handle `ConstantCase` - erase `Core.finalizer` call more aggressively - improved type stabilities by handling edge cases - renamed `AddFianlizerUse` to `FinalizerUse` - removed dead pieces of code
…plementation - added more comments to explain the purpose of code - properly handle `ConstantCase` - erase `Core.finalizer` call more aggressively - improved type stabilities by handling edge cases - renamed `AddFianlizerUse` to `FinalizerUse` - removed dead pieces of code
I added the flag in JuliaLang#45272, but didn't hook it up to the nothrow model. In the fullness of time, these should forwarded from inference and only recomputed if necessary, but for the moment, just make it work.
…plementation - added more comments to explain the purpose of code - properly handle `ConstantCase` - erase `Core.finalizer` call more aggressively - improved type stabilities by handling edge cases - renamed `AddFianlizerUse` to `FinalizerUse` - removed dead pieces of code
Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses are in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia Base.@assume_effects :nothrow safeprint(@nospecialize x...) = print(x...) @noinline nothrow_side_effect(x) = Base.@assume_effects :total !:effect_free @CCall jl_(x::Any)::Cvoid mutable struct DoAllocWithFieldInter x end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this nothrow_side_effect(nothing) end end let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end end end # the finalzier call gets inlined @test count(isinvoke(:nothrow_side_effect), src.code) == 1 end ``` There is a room for further improvement though -- if we have a way to compute the post-domination, we can also inline a `finalizer` call in a case when a `finalizer` block is post-dominated by all the def/uses e.g. ``` let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end register_finalizer!(o) end end # currently this is broken @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1 end ```
Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses are in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia Base.@assume_effects :nothrow safeprint(@nospecialize x...) = print(x...) @noinline nothrow_side_effect(x) = Base.@assume_effects :total !:effect_free @CCall jl_(x::Any)::Cvoid mutable struct DoAllocWithFieldInter x end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this nothrow_side_effect(nothing) end end let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end end end # the finalzier call gets inlined @test count(isinvoke(:nothrow_side_effect), src.code) == 1 end ``` There is a room for further improvement though -- if we have a way to compute the post-domination, we can also inline a `finalizer` call in a case when a `finalizer` block is post-dominated by all the def/uses e.g. ``` let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end register_finalizer!(o) end end # currently this is broken @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1 end ```
Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses are in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia Base.@assume_effects :nothrow safeprint(@nospecialize x...) = print(x...) @noinline nothrow_side_effect(x) = Base.@assume_effects :total !:effect_free @CCall jl_(x::Any)::Cvoid mutable struct DoAllocWithFieldInter x end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this nothrow_side_effect(nothing) end end let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end end end # the finalzier call gets inlined @test count(isinvoke(:nothrow_side_effect), src.code) == 1 end ``` There is a room for further improvement though -- if we have a way to compute the post-domination, we can also inline a `finalizer` call in a case when a `finalizer` block is post-dominated by all the def/uses e.g. ``` let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end register_finalizer!(o) end end # currently this is broken @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1 end ```
Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses are in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia Base.@assume_effects :nothrow safeprint(@nospecialize x...) = print(x...) @noinline nothrow_side_effect(x) = Base.@assume_effects :total !:effect_free @CCall jl_(x::Any)::Cvoid mutable struct DoAllocWithFieldInter x end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this nothrow_side_effect(nothing) end end let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end end end # the finalzier call gets inlined @test count(isinvoke(:nothrow_side_effect), src.code) == 1 end ``` There is a room for further improvement though -- if we have a way to compute the post-domination, we can also inline a `finalizer` call in a case when a `finalizer` block is post-dominated by all the def/uses e.g. ``` let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end register_finalizer!(o) end end # currently this is broken @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1 end ```
Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses are in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia Base.@assume_effects :nothrow safeprint(@nospecialize x...) = print(x...) @noinline nothrow_side_effect(x) = Base.@assume_effects :total !:effect_free @CCall jl_(x::Any)::Cvoid mutable struct DoAllocWithFieldInter x end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this nothrow_side_effect(nothing) end end let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end end end # the finalzier call gets inlined @test count(isinvoke(:nothrow_side_effect), src.code) == 1 end ``` There is a room for further improvement though -- if we have a way to compute the post-domination, we can also inline a `finalizer` call in a case when a `finalizer` block is post-dominated by all the def/uses e.g. ``` let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end register_finalizer!(o) end end # currently this is broken @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1 end ```
Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses are in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` There is a room for further improvement though -- if we have a way to compute the post-domination, we can also inline a `finalizer` call in a case when a `finalizer` block post-dominateds all the def/uses e.g. ```julia function cfg_finalization5(io) for i = -999:1000 o = DoAllocWithFieldInter(i) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end register_finalizer!(o) end end let src = code_typed1(cfg_finalization5, (IO,)) # TODO we can fix this case by checking a case when a finalizer block is post-dominated by all the def/uses @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1 end ```
Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses to be in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` There is a room for further improvement though -- if we have a way to compute the post-domination, we can also inline a `finalizer` call in a case when a `finalizer` block post-dominateds all the def/uses e.g. ```julia function cfg_finalization5(io) for i = -999:1000 o = DoAllocWithFieldInter(i) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end register_finalizer!(o) end end let src = code_typed1(cfg_finalization5, (IO,)) # TODO we can fix this case by checking a case when a finalizer block is post-dominated by all the def/uses @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1 end ```
Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses to be in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` There is a room for further improvement though -- if we have a way to compute the post-domination, we can also inline a `finalizer` call in a case when a `finalizer` block post-dominateds all the def/uses e.g. ```julia function cfg_finalization5(io) for i = -999:1000 o = DoAllocWithFieldInter(i) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end register_finalizer!(o) end end let src = code_typed1(cfg_finalization5, (IO,)) # TODO we can fix this case by checking a case when a finalizer block is post-dominated by all the def/uses @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1 end ```
Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses to be in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` To support this transformation, the domtree code also gains the ability to represent post-dominator trees, which is generally useful. Co-authored-by: Keno Fischer <[email protected]>
Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses to be in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` To support this transformation, the domtree code also gains the ability to represent post-dominator trees, which is generally useful. Co-authored-by: Keno Fischer <[email protected]>
Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses to be in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` To support this transformation, the domtree code also gains the ability to represent post-dominator trees, which is generally useful.
Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses to be in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` To support this transformation, the domtree code also gains the ability to represent post-dominator trees, which is generally useful.
Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses to be in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` To support this transformation, the domtree code also gains the ability to represent post-dominator trees, which is generally useful.
This is a variant of the eager-finalization idea
(e.g. as seen in #44056), but with a focus on the mechanism
of finalizer insertion, since I need a similar pass downstream.
Integration of EscapeAnalysis is left to #44056.
My motivation for this change is somewhat different. In particular,
I want to be able to insert finalize call such that I can
subsequently SROA the mutable object. This requires a couple
design points that are more stringent than the pass from #44056,
so I decided to prototype them as an independent PR. The primary
things I need here that are not seen in #44056 are:
entirely (requires additional legality analyis)
point (to enable subsequent SROA)
To this end, adding a finalizer is promoted to a builtin
that is recognized by inference and inlining (such that inference
can produce an inferred version of the finalizer for inlining).
The current status is that this fixes the minimal example I wanted
to have work, but does not yet extend to the motivating case I had.
Nevertheless, I felt that this was a good checkpoint to synchronize
with other efforts along these lines.
Currently working demo:
Thoughts @jpsamaroo @aviatesk ?