Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eager finalizer insertion #45272

Merged
merged 4 commits into from
Jun 7, 2022
Merged

Eager finalizer insertion #45272

merged 4 commits into from
Jun 7, 2022

Conversation

Keno
Copy link
Member

@Keno Keno commented May 11, 2022

This is a variant of the eager-finalization idea
(e.g. as seen in #44056), but with a focus on the mechanism
of finalizer insertion, since I need a similar pass downstream.
Integration of EscapeAnalysis is left to #44056.

My motivation for this change is somewhat different. In particular,
I want to be able to insert finalize call such that I can
subsequently SROA the mutable object. This requires a couple
design points that are more stringent than the pass from #44056,
so I decided to prototype them as an independent PR. The primary
things I need here that are not seen in #44056 are:

  • The ability to forgo finalizer registration with the runtime
    entirely (requires additional legality analyis)
  • The ability to inline the registered finalizer at the deallocation
    point (to enable subsequent SROA)

To this end, adding a finalizer is promoted to a builtin
that is recognized by inference and inlining (such that inference
can produce an inferred version of the finalizer for inlining).

The current status is that this fixes the minimal example I wanted
to have work, but does not yet extend to the motivating case I had.
Nevertheless, I felt that this was a good checkpoint to synchronize
with other efforts along these lines.

Currently working demo:

julia> const total_deallocations = Ref{Int}(0)
Base.RefValue{Int64}(0)

julia> mutable struct DoAlloc
           function DoAlloc()
               this = new()
               Core.finalizer(this) do this
                   global total_deallocations[] += 1
               end
               return this
           end
       end

julia> function foo()
           for i = 1:1000
               DoAlloc()
           end
       end
foo (generic function with 1 method)

julia> @code_llvm foo()
;  @ REPL[3]:1 within `foo`
define void @julia_foo_428() #0 {
top:
  %.promoted = load i64, i64* inttoptr (i64 4373384000 to i64*), align 64
;  @ REPL[3]:2 within `foo`
  %0 = add i64 %.promoted, 1000
;  @ REPL[3]:3 within `foo`
; ┌ @ REPL[2]:4 within `DoAlloc`
; │┌ @ REPL[2]:5 within `#3`
; ││┌ @ Base.jl within `setproperty!`
     store i64 %0, i64* inttoptr (i64 4373384000 to i64*), align 64
; └└└
;  @ REPL[3]:4 within `foo`
  ret void
}

julia> foo()

julia> total_deallocations[]
1000

Thoughts @jpsamaroo @aviatesk ?

Copy link
Member

@jpsamaroo jpsamaroo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!

test/compiler/inline.jl Outdated Show resolved Hide resolved
@jpsamaroo
Copy link
Member

This is great! I think this approach will probably cover the cases that #44056 intended to address.

Note: I could only follow the general gist of what you're doing here, so my review approval really just pertains to the idea in general. I haven't thoroughly scrutinized the implementation.

@Keno Keno force-pushed the kf/eagerfinalizers branch 2 times, most recently from ff35e48 to 458c7f6 Compare May 12, 2022 00:52
@Keno Keno changed the title Very WIP: Eager finalizer insertion Eager finalizer insertion May 16, 2022
@Keno Keno force-pushed the kf/eagerfinalizers branch from 458c7f6 to 67ec007 Compare May 16, 2022 04:58
@Keno
Copy link
Member Author

Keno commented May 16, 2022

I've addressed the correctness issues that made me call this "WIP", though I haven't yet expanded the legality analysis to allow all the cases I want yet. That said, I think that can be part of a follow up PR (including any potential interaction with EA), so I think this is basically good to go.

@Keno Keno force-pushed the kf/eagerfinalizers branch from 67ec007 to dc8a715 Compare May 16, 2022 18:05
@Keno
Copy link
Member Author

Keno commented May 16, 2022

@nanosoldier runtests(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

# Handle finalizer
if sig.f === Core._add_finalizer
if isa(info, FinalizerInfo)
# Only inline finalizers that are known nothrow and notls.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have this restriction? Would it be invalid to inline finalizers and have them run in the original task environment?

Obligatory GPU context: our finalizers access the TLS to inspect the current stream. On finalizer tasks, there's no such stream, so we perform a blocking free, whereas on 'regular' tasks the memory operation can be ordered against other operations on that stream. I had assumed that inlining finalizers would additionally get them to use the local stream, promoting blocking frees to asynchronously-ordered ones. Maybe that's too magical though.

Copy link
Member Author

@Keno Keno May 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's to prevent things like #42752 where finalizers unexpectedly mess with task local state. I think we're now conceptually moving into the direction of finalizers having their own pseudo-task (maybe even a real task eventually), so inlining a finalizer that looks at task state would not be legal since it changes the task environment.

That said, for your case, you want to allow that to happen, even though it's not technically semantically legal. I think for that case, we could just put a notls effect override on the finalizer.

@Keno
Copy link
Member Author

Keno commented May 18, 2022

I worked through PkgEval and fixed 3 issues that I could reproduce, but none were related to this PR. The remaining issues are the known random precompile segfaults in the SciML stack, which as usual, I don't see locally. One of these days we'll need to rr a PkgEval run to catch those. Overall though, no issues I can see with merging this from a PkgEval perspective.

src/builtins.c Outdated Show resolved Hide resolved
base/compiler/types.jl Outdated Show resolved Hide resolved
base/compiler/types.jl Outdated Show resolved Hide resolved
src/builtins.c Outdated Show resolved Hide resolved
test/compiler/inline.jl Outdated Show resolved Hide resolved
base/compiler/stmtinfo.jl Outdated Show resolved Hide resolved
base/gcutils.jl Outdated Show resolved Hide resolved
Keno added a commit that referenced this pull request May 22, 2022
Split out from #45272. This effect models the legality of moving code
between tasks. It is somewhat related to effect-free/consistent, but
only with respect to task-local state. As an example consider something
like:

```
global glob
function bar()
    @async (global glob = 1; some_other_code())
end
```

The newly created task is not effect-free, but it would be legal to inline
the assignment of `glob` into `bar` (as long it is inlined before the creation
of the task of `some_other_code` does not access `glob`). For comparison,
the following is neither `notls`, nor `effect_free`:

```
function bar()
    @async (task_local_storage()[:var] = 1; some_other_code())
end
```

The same implies to implicit task-local state such as the RNG state.

Implementation wise, there isn't a lot here, because the implicit
tainting by ccall is the correct conservative default. In the future,
we may want to annotate various ccalls as being permissible for
notls, but let's worry about that when we have a case that needs it.
@Keno Keno mentioned this pull request May 22, 2022
Keno added a commit that referenced this pull request May 22, 2022
Split out from #45272. This is prepratory work towards adding
optimization passes that recognize this builtin. This PR adds
`Core.finalizer` with essentially the same interface as
`Base.finalizer`, but without the error checking or raw-C-pointer
feature. In future commits, the Core.finalizer interface will
likely expand slightly, but Base.finalizer will remain unchanged
and is the supported interface for this functionality.
@Keno Keno mentioned this pull request May 22, 2022
aviatesk added a commit that referenced this pull request Jul 5, 2022
- renamed `AddFianlizerUse` to `FinalizerUse`
- removed dead pieces of code
- improved type stabilities by handling edge cases
aviatesk added a commit that referenced this pull request Jul 5, 2022
- renamed `AddFianlizerUse` to `FinalizerUse`
- removed dead pieces of code
- improved type stabilities by handling edge cases
aviatesk added a commit that referenced this pull request Jul 6, 2022
- renamed `AddFianlizerUse` to `FinalizerUse`
- removed dead pieces of code
- improved type stabilities by handling edge cases
aviatesk added a commit that referenced this pull request Jul 11, 2022
- renamed `AddFianlizerUse` to `FinalizerUse`
- removed dead pieces of code
- improved type stabilities by handling edge cases
aviatesk added a commit that referenced this pull request Jul 11, 2022
- renamed `AddFianlizerUse` to `FinalizerUse`
- removed dead pieces of code
- improved type stabilities by handling edge cases
aviatesk added a commit that referenced this pull request Jul 12, 2022
- added more comments to explain the purpose of code
- properly handle `ConstantCase`
- erase `Core.finalizer` call more aggressively
- improved type stabilities by handling edge cases
- renamed `AddFianlizerUse` to `FinalizerUse`
- removed dead pieces of code
aviatesk added a commit that referenced this pull request Jul 13, 2022
- added more comments to explain the purpose of code
- properly handle `ConstantCase`
- erase `Core.finalizer` call more aggressively
- improved type stabilities by handling edge cases
- renamed `AddFianlizerUse` to `FinalizerUse`
- removed dead pieces of code
aviatesk added a commit that referenced this pull request Jul 13, 2022
inlining: follow #45272, improve the finalizer inlining implementation
aviatesk added a commit that referenced this pull request Jul 13, 2022
- added more comments to explain the purpose of code
- properly handle `ConstantCase`
- erase `Core.finalizer` call more aggressively
- improved type stabilities by handling edge cases
- renamed `AddFianlizerUse` to `FinalizerUse`
- removed dead pieces of code
ffucci pushed a commit to ffucci/julia that referenced this pull request Aug 11, 2022
…plementation

- added more comments to explain the purpose of code
- properly handle `ConstantCase`
- erase `Core.finalizer` call more aggressively
- improved type stabilities by handling edge cases
- renamed `AddFianlizerUse` to `FinalizerUse`
- removed dead pieces of code
pcjentsch pushed a commit to pcjentsch/julia that referenced this pull request Aug 18, 2022
I added the flag in JuliaLang#45272, but didn't hook it up to the nothrow
model. In the fullness of time, these should forwarded from
inference and only recomputed if necessary, but for the moment,
just make it work.
pcjentsch pushed a commit to pcjentsch/julia that referenced this pull request Aug 18, 2022
…plementation

- added more comments to explain the purpose of code
- properly handle `ConstantCase`
- erase `Core.finalizer` call more aggressively
- improved type stabilities by handling edge cases
- renamed `AddFianlizerUse` to `FinalizerUse`
- removed dead pieces of code
aviatesk added a commit that referenced this pull request Sep 6, 2022
Eager `finalizer` inlining (#45272) currently has a restriction that
requires all the def/uses are in a same basic block.

This commit relaxes that restriction a bit by allowing def/uses to
involve control flow when all of them are dominated by a `finalizer`
call to be inlined, since in that case it is safe to insert the body of
`finalizer` at the end of all the def/uses, e.g.
```julia
Base.@assume_effects :nothrow safeprint(@nospecialize x...) = print(x...)
@noinline nothrow_side_effect(x) =
    Base.@assume_effects :total !:effect_free @CCall jl_(x::Any)::Cvoid
mutable struct DoAllocWithFieldInter
    x
end
function register_finalizer!(obj::DoAllocWithFieldInter)
    finalizer(obj) do this
        nothrow_side_effect(nothing)
    end
end

let src = code_typed1() do
        for i = -1000:1000
            o = DoAllocWithFieldInter(i)
            register_finalizer!(o)
            if i == 1000
                safeprint(o.x, '\n')
            elseif i > 0
                safeprint(o.x)
            end
        end
    end
    # the finalzier call gets inlined
    @test count(isinvoke(:nothrow_side_effect), src.code) == 1
end
```

There is a room for further improvement though -- if we have a way to
compute the post-domination, we can also inline a `finalizer` call
in a case when a `finalizer` block is post-dominated by all the def/uses
e.g.
```
let src = code_typed1() do
        for i = -1000:1000
            o = DoAllocWithFieldInter(i)
            if i == 1000
                safeprint(o.x, '\n')
            elseif i > 0
                safeprint(o.x)
            end
            register_finalizer!(o)
        end
    end
    # currently this is broken
    @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1
end
```
aviatesk added a commit that referenced this pull request Sep 6, 2022
Eager `finalizer` inlining (#45272) currently has a restriction that
requires all the def/uses are in a same basic block.

This commit relaxes that restriction a bit by allowing def/uses to
involve control flow when all of them are dominated by a `finalizer`
call to be inlined, since in that case it is safe to insert the body of
`finalizer` at the end of all the def/uses, e.g.
```julia
Base.@assume_effects :nothrow safeprint(@nospecialize x...) = print(x...)
@noinline nothrow_side_effect(x) =
    Base.@assume_effects :total !:effect_free @CCall jl_(x::Any)::Cvoid
mutable struct DoAllocWithFieldInter
    x
end
function register_finalizer!(obj::DoAllocWithFieldInter)
    finalizer(obj) do this
        nothrow_side_effect(nothing)
    end
end

let src = code_typed1() do
        for i = -1000:1000
            o = DoAllocWithFieldInter(i)
            register_finalizer!(o)
            if i == 1000
                safeprint(o.x, '\n')
            elseif i > 0
                safeprint(o.x)
            end
        end
    end
    # the finalzier call gets inlined
    @test count(isinvoke(:nothrow_side_effect), src.code) == 1
end
```

There is a room for further improvement though -- if we have a way to
compute the post-domination, we can also inline a `finalizer` call
in a case when a `finalizer` block is post-dominated by all the def/uses
e.g.
```
let src = code_typed1() do
        for i = -1000:1000
            o = DoAllocWithFieldInter(i)
            if i == 1000
                safeprint(o.x, '\n')
            elseif i > 0
                safeprint(o.x)
            end
            register_finalizer!(o)
        end
    end
    # currently this is broken
    @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1
end
```
aviatesk added a commit that referenced this pull request Sep 7, 2022
Eager `finalizer` inlining (#45272) currently has a restriction that
requires all the def/uses are in a same basic block.

This commit relaxes that restriction a bit by allowing def/uses to
involve control flow when all of them are dominated by a `finalizer`
call to be inlined, since in that case it is safe to insert the body of
`finalizer` at the end of all the def/uses, e.g.
```julia
Base.@assume_effects :nothrow safeprint(@nospecialize x...) = print(x...)
@noinline nothrow_side_effect(x) =
    Base.@assume_effects :total !:effect_free @CCall jl_(x::Any)::Cvoid
mutable struct DoAllocWithFieldInter
    x
end
function register_finalizer!(obj::DoAllocWithFieldInter)
    finalizer(obj) do this
        nothrow_side_effect(nothing)
    end
end

let src = code_typed1() do
        for i = -1000:1000
            o = DoAllocWithFieldInter(i)
            register_finalizer!(o)
            if i == 1000
                safeprint(o.x, '\n')
            elseif i > 0
                safeprint(o.x)
            end
        end
    end
    # the finalzier call gets inlined
    @test count(isinvoke(:nothrow_side_effect), src.code) == 1
end
```

There is a room for further improvement though -- if we have a way to
compute the post-domination, we can also inline a `finalizer` call
in a case when a `finalizer` block is post-dominated by all the def/uses
e.g.
```
let src = code_typed1() do
        for i = -1000:1000
            o = DoAllocWithFieldInter(i)
            if i == 1000
                safeprint(o.x, '\n')
            elseif i > 0
                safeprint(o.x)
            end
            register_finalizer!(o)
        end
    end
    # currently this is broken
    @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1
end
```
aviatesk added a commit that referenced this pull request Sep 7, 2022
Eager `finalizer` inlining (#45272) currently has a restriction that
requires all the def/uses are in a same basic block.

This commit relaxes that restriction a bit by allowing def/uses to
involve control flow when all of them are dominated by a `finalizer`
call to be inlined, since in that case it is safe to insert the body of
`finalizer` at the end of all the def/uses, e.g.
```julia
Base.@assume_effects :nothrow safeprint(@nospecialize x...) = print(x...)
@noinline nothrow_side_effect(x) =
    Base.@assume_effects :total !:effect_free @CCall jl_(x::Any)::Cvoid
mutable struct DoAllocWithFieldInter
    x
end
function register_finalizer!(obj::DoAllocWithFieldInter)
    finalizer(obj) do this
        nothrow_side_effect(nothing)
    end
end

let src = code_typed1() do
        for i = -1000:1000
            o = DoAllocWithFieldInter(i)
            register_finalizer!(o)
            if i == 1000
                safeprint(o.x, '\n')
            elseif i > 0
                safeprint(o.x)
            end
        end
    end
    # the finalzier call gets inlined
    @test count(isinvoke(:nothrow_side_effect), src.code) == 1
end
```

There is a room for further improvement though -- if we have a way to
compute the post-domination, we can also inline a `finalizer` call
in a case when a `finalizer` block is post-dominated by all the def/uses
e.g.
```
let src = code_typed1() do
        for i = -1000:1000
            o = DoAllocWithFieldInter(i)
            if i == 1000
                safeprint(o.x, '\n')
            elseif i > 0
                safeprint(o.x)
            end
            register_finalizer!(o)
        end
    end
    # currently this is broken
    @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1
end
```
aviatesk added a commit that referenced this pull request Sep 8, 2022
Eager `finalizer` inlining (#45272) currently has a restriction that
requires all the def/uses are in a same basic block.

This commit relaxes that restriction a bit by allowing def/uses to
involve control flow when all of them are dominated by a `finalizer`
call to be inlined, since in that case it is safe to insert the body of
`finalizer` at the end of all the def/uses, e.g.
```julia
Base.@assume_effects :nothrow safeprint(@nospecialize x...) = print(x...)
@noinline nothrow_side_effect(x) =
    Base.@assume_effects :total !:effect_free @CCall jl_(x::Any)::Cvoid
mutable struct DoAllocWithFieldInter
    x
end
function register_finalizer!(obj::DoAllocWithFieldInter)
    finalizer(obj) do this
        nothrow_side_effect(nothing)
    end
end

let src = code_typed1() do
        for i = -1000:1000
            o = DoAllocWithFieldInter(i)
            register_finalizer!(o)
            if i == 1000
                safeprint(o.x, '\n')
            elseif i > 0
                safeprint(o.x)
            end
        end
    end
    # the finalzier call gets inlined
    @test count(isinvoke(:nothrow_side_effect), src.code) == 1
end
```

There is a room for further improvement though -- if we have a way to
compute the post-domination, we can also inline a `finalizer` call
in a case when a `finalizer` block is post-dominated by all the def/uses
e.g.
```
let src = code_typed1() do
        for i = -1000:1000
            o = DoAllocWithFieldInter(i)
            if i == 1000
                safeprint(o.x, '\n')
            elseif i > 0
                safeprint(o.x)
            end
            register_finalizer!(o)
        end
    end
    # currently this is broken
    @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1
end
```
aviatesk added a commit that referenced this pull request Sep 8, 2022
Eager `finalizer` inlining (#45272) currently has a restriction that
requires all the def/uses are in a same basic block.

This commit relaxes that restriction a bit by allowing def/uses to
involve control flow when all of them are dominated by a `finalizer`
call to be inlined, since in that case it is safe to insert the body of
`finalizer` at the end of all the def/uses, e.g.
```julia
const FINALIZATION_COUNT = Ref(0)
init_finalization_count!() = FINALIZATION_COUNT[] = 0
get_finalization_count() = FINALIZATION_COUNT[]
@noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x
@noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...))
mutable struct DoAllocWithFieldInter
    x::Int
end
function register_finalizer!(obj::DoAllocWithFieldInter)
    finalizer(obj) do this
        add_finalization_count!(this.x)
    end
end

function cfg_finalization3(io)
    for i = -999:1000
        o = DoAllocWithFieldInter(i)
        register_finalizer!(o)
        if i == 1000
            safeprint(io, o.x, '\n')
        elseif i > 0
            safeprint(io, o.x)
        end
    end
end
let src = code_typed1(cfg_finalization3, (IO,))
    @test count(isinvoke(:add_finalization_count!), src.code) == 1
end
let
    init_finalization_count!()
    cfg_finalization3(IOBuffer())
    @test get_finalization_count() == 1000
end
```

There is a room for further improvement though -- if we have a way to
compute the post-domination, we can also inline a `finalizer` call
in a case when a `finalizer` block post-dominateds all the def/uses e.g.
```julia
function cfg_finalization5(io)
    for i = -999:1000
        o = DoAllocWithFieldInter(i)
        if i == 1000
            safeprint(io, o.x, '\n')
        elseif i > 0
            safeprint(io, o.x)
        end
        register_finalizer!(o)
    end
end
let src = code_typed1(cfg_finalization5, (IO,))
    # TODO we can fix this case by checking a case when a finalizer block is post-dominated by all the def/uses
    @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1
end
```
aviatesk added a commit that referenced this pull request Sep 8, 2022
Eager `finalizer` inlining (#45272) currently has a restriction that
requires all the def/uses to be in a same basic block.

This commit relaxes that restriction a bit by allowing def/uses to
involve control flow when all of them are dominated by a `finalizer`
call to be inlined, since in that case it is safe to insert the body of
`finalizer` at the end of all the def/uses, e.g.
```julia
const FINALIZATION_COUNT = Ref(0)
init_finalization_count!() = FINALIZATION_COUNT[] = 0
get_finalization_count() = FINALIZATION_COUNT[]
@noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x
@noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...))
mutable struct DoAllocWithFieldInter
    x::Int
end
function register_finalizer!(obj::DoAllocWithFieldInter)
    finalizer(obj) do this
        add_finalization_count!(this.x)
    end
end

function cfg_finalization3(io)
    for i = -999:1000
        o = DoAllocWithFieldInter(i)
        register_finalizer!(o)
        if i == 1000
            safeprint(io, o.x, '\n')
        elseif i > 0
            safeprint(io, o.x)
        end
    end
end
let src = code_typed1(cfg_finalization3, (IO,))
    @test count(isinvoke(:add_finalization_count!), src.code) == 1
end
let
    init_finalization_count!()
    cfg_finalization3(IOBuffer())
    @test get_finalization_count() == 1000
end
```

There is a room for further improvement though -- if we have a way to
compute the post-domination, we can also inline a `finalizer` call
in a case when a `finalizer` block post-dominateds all the def/uses e.g.
```julia
function cfg_finalization5(io)
    for i = -999:1000
        o = DoAllocWithFieldInter(i)
        if i == 1000
            safeprint(io, o.x, '\n')
        elseif i > 0
            safeprint(io, o.x)
        end
        register_finalizer!(o)
    end
end
let src = code_typed1(cfg_finalization5, (IO,))
    # TODO we can fix this case by checking a case when a finalizer block is post-dominated by all the def/uses
    @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1
end
```
Keno pushed a commit that referenced this pull request Sep 21, 2022
Eager `finalizer` inlining (#45272) currently has a restriction that
requires all the def/uses to be in a same basic block.

This commit relaxes that restriction a bit by allowing def/uses to
involve control flow when all of them are dominated by a `finalizer`
call to be inlined, since in that case it is safe to insert the body of
`finalizer` at the end of all the def/uses, e.g.
```julia
const FINALIZATION_COUNT = Ref(0)
init_finalization_count!() = FINALIZATION_COUNT[] = 0
get_finalization_count() = FINALIZATION_COUNT[]
@noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x
@noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...))
mutable struct DoAllocWithFieldInter
    x::Int
end
function register_finalizer!(obj::DoAllocWithFieldInter)
    finalizer(obj) do this
        add_finalization_count!(this.x)
    end
end

function cfg_finalization3(io)
    for i = -999:1000
        o = DoAllocWithFieldInter(i)
        register_finalizer!(o)
        if i == 1000
            safeprint(io, o.x, '\n')
        elseif i > 0
            safeprint(io, o.x)
        end
    end
end
let src = code_typed1(cfg_finalization3, (IO,))
    @test count(isinvoke(:add_finalization_count!), src.code) == 1
end
let
    init_finalization_count!()
    cfg_finalization3(IOBuffer())
    @test get_finalization_count() == 1000
end
```

There is a room for further improvement though -- if we have a way to
compute the post-domination, we can also inline a `finalizer` call
in a case when a `finalizer` block post-dominateds all the def/uses e.g.
```julia
function cfg_finalization5(io)
    for i = -999:1000
        o = DoAllocWithFieldInter(i)
        if i == 1000
            safeprint(io, o.x, '\n')
        elseif i > 0
            safeprint(io, o.x)
        end
        register_finalizer!(o)
    end
end
let src = code_typed1(cfg_finalization5, (IO,))
    # TODO we can fix this case by checking a case when a finalizer block is post-dominated by all the def/uses
    @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1
end
```
Keno added a commit that referenced this pull request Sep 21, 2022
Eager `finalizer` inlining (#45272) currently has a restriction that
requires all the def/uses to be in a same basic block.

This commit relaxes that restriction a bit by allowing def/uses to
involve control flow when all of them are dominated by a `finalizer`
call to be inlined, since in that case it is safe to insert the body of
`finalizer` at the end of all the def/uses, e.g.
```julia
const FINALIZATION_COUNT = Ref(0)
init_finalization_count!() = FINALIZATION_COUNT[] = 0
get_finalization_count() = FINALIZATION_COUNT[]
@noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x
@noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...))
mutable struct DoAllocWithFieldInter
    x::Int
end
function register_finalizer!(obj::DoAllocWithFieldInter)
    finalizer(obj) do this
        add_finalization_count!(this.x)
    end
end

function cfg_finalization3(io)
    for i = -999:1000
        o = DoAllocWithFieldInter(i)
        register_finalizer!(o)
        if i == 1000
            safeprint(io, o.x, '\n')
        elseif i > 0
            safeprint(io, o.x)
        end
    end
end
let src = code_typed1(cfg_finalization3, (IO,))
    @test count(isinvoke(:add_finalization_count!), src.code) == 1
end
let
    init_finalization_count!()
    cfg_finalization3(IOBuffer())
    @test get_finalization_count() == 1000
end
```

To support this transformation, the domtree code also gains the ability
to represent post-dominator trees, which is generally useful.

Co-authored-by: Keno Fischer <[email protected]>
Keno added a commit that referenced this pull request Sep 21, 2022
Eager `finalizer` inlining (#45272) currently has a restriction that
requires all the def/uses to be in a same basic block.

This commit relaxes that restriction a bit by allowing def/uses to
involve control flow when all of them are dominated by a `finalizer`
call to be inlined, since in that case it is safe to insert the body of
`finalizer` at the end of all the def/uses, e.g.
```julia
const FINALIZATION_COUNT = Ref(0)
init_finalization_count!() = FINALIZATION_COUNT[] = 0
get_finalization_count() = FINALIZATION_COUNT[]
@noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x
@noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...))
mutable struct DoAllocWithFieldInter
    x::Int
end
function register_finalizer!(obj::DoAllocWithFieldInter)
    finalizer(obj) do this
        add_finalization_count!(this.x)
    end
end

function cfg_finalization3(io)
    for i = -999:1000
        o = DoAllocWithFieldInter(i)
        register_finalizer!(o)
        if i == 1000
            safeprint(io, o.x, '\n')
        elseif i > 0
            safeprint(io, o.x)
        end
    end
end
let src = code_typed1(cfg_finalization3, (IO,))
    @test count(isinvoke(:add_finalization_count!), src.code) == 1
end
let
    init_finalization_count!()
    cfg_finalization3(IOBuffer())
    @test get_finalization_count() == 1000
end
```

To support this transformation, the domtree code also gains the ability
to represent post-dominator trees, which is generally useful.

Co-authored-by: Keno Fischer <[email protected]>
aviatesk added a commit that referenced this pull request Sep 21, 2022
Eager `finalizer` inlining (#45272) currently has a restriction that
requires all the def/uses to be in a same basic block.

This commit relaxes that restriction a bit by allowing def/uses to
involve control flow when all of them are dominated by a `finalizer`
call to be inlined, since in that case it is safe to insert the body of
`finalizer` at the end of all the def/uses, e.g.
```julia
const FINALIZATION_COUNT = Ref(0)
init_finalization_count!() = FINALIZATION_COUNT[] = 0
get_finalization_count() = FINALIZATION_COUNT[]
@noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x
@noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...))
mutable struct DoAllocWithFieldInter
    x::Int
end
function register_finalizer!(obj::DoAllocWithFieldInter)
    finalizer(obj) do this
        add_finalization_count!(this.x)
    end
end

function cfg_finalization3(io)
    for i = -999:1000
        o = DoAllocWithFieldInter(i)
        register_finalizer!(o)
        if i == 1000
            safeprint(io, o.x, '\n')
        elseif i > 0
            safeprint(io, o.x)
        end
    end
end
let src = code_typed1(cfg_finalization3, (IO,))
    @test count(isinvoke(:add_finalization_count!), src.code) == 1
end
let
    init_finalization_count!()
    cfg_finalization3(IOBuffer())
    @test get_finalization_count() == 1000
end
```

To support this transformation, the domtree code also gains the ability
to represent post-dominator trees, which is generally useful.
aviatesk added a commit that referenced this pull request Sep 21, 2022
Eager `finalizer` inlining (#45272) currently has a restriction that
requires all the def/uses to be in a same basic block.

This commit relaxes that restriction a bit by allowing def/uses to
involve control flow when all of them are dominated by a `finalizer`
call to be inlined, since in that case it is safe to insert the body of
`finalizer` at the end of all the def/uses, e.g.
```julia
const FINALIZATION_COUNT = Ref(0)
init_finalization_count!() = FINALIZATION_COUNT[] = 0
get_finalization_count() = FINALIZATION_COUNT[]
@noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x
@noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...))
mutable struct DoAllocWithFieldInter
    x::Int
end
function register_finalizer!(obj::DoAllocWithFieldInter)
    finalizer(obj) do this
        add_finalization_count!(this.x)
    end
end

function cfg_finalization3(io)
    for i = -999:1000
        o = DoAllocWithFieldInter(i)
        register_finalizer!(o)
        if i == 1000
            safeprint(io, o.x, '\n')
        elseif i > 0
            safeprint(io, o.x)
        end
    end
end
let src = code_typed1(cfg_finalization3, (IO,))
    @test count(isinvoke(:add_finalization_count!), src.code) == 1
end
let
    init_finalization_count!()
    cfg_finalization3(IOBuffer())
    @test get_finalization_count() == 1000
end
```

To support this transformation, the domtree code also gains the ability
to represent post-dominator trees, which is generally useful.
Keno pushed a commit that referenced this pull request Sep 21, 2022
Eager `finalizer` inlining (#45272) currently has a restriction that
requires all the def/uses to be in a same basic block.

This commit relaxes that restriction a bit by allowing def/uses to
involve control flow when all of them are dominated by a `finalizer`
call to be inlined, since in that case it is safe to insert the body of
`finalizer` at the end of all the def/uses, e.g.
```julia
const FINALIZATION_COUNT = Ref(0)
init_finalization_count!() = FINALIZATION_COUNT[] = 0
get_finalization_count() = FINALIZATION_COUNT[]
@noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x
@noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...))
mutable struct DoAllocWithFieldInter
    x::Int
end
function register_finalizer!(obj::DoAllocWithFieldInter)
    finalizer(obj) do this
        add_finalization_count!(this.x)
    end
end

function cfg_finalization3(io)
    for i = -999:1000
        o = DoAllocWithFieldInter(i)
        register_finalizer!(o)
        if i == 1000
            safeprint(io, o.x, '\n')
        elseif i > 0
            safeprint(io, o.x)
        end
    end
end
let src = code_typed1(cfg_finalization3, (IO,))
    @test count(isinvoke(:add_finalization_count!), src.code) == 1
end
let
    init_finalization_count!()
    cfg_finalization3(IOBuffer())
    @test get_finalization_count() == 1000
end
```

To support this transformation, the domtree code also gains the ability
to represent post-dominator trees, which is generally useful.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants