Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing backedge somewhere on master? #48047

Closed
Keno opened this issue Dec 30, 2022 · 11 comments · Fixed by #48054 or #48309
Closed

Missing backedge somewhere on master? #48047

Keno opened this issue Dec 30, 2022 · 11 comments · Fixed by #48054 or #48309
Labels
regression Regression in behavior compared to a previous version
Milestone

Comments

@Keno
Copy link
Member

Keno commented Dec 30, 2022

It appears that something around backedge tracking broke somewhat recently (I'm assuming with pkgimages, but haven't bisected yet). In particular, changing Core.Compiler no longer gets picked up by Cthulhu. To reproduce, apply a patch to base/compiler, e.g.

diff --git a/base/compiler/abstractinterpretation.jl b/base/compiler/abstractinterpretation.jl
index b3e18a6ee7..fc8e2d1aee 100644
--- a/base/compiler/abstractinterpretation.jl
+++ b/base/compiler/abstractinterpretation.jl
@@ -547,6 +547,8 @@ function abstract_call_method(interp::AbstractInterpreter, method::Method, @nosp
         return MethodCallResult(Any, false, false, nothing, Effects())
     end
 
+    @show sig
+
     # Limit argument type tuple growth of functions:
     # look through the parents list to see if there's a call to the same method
     # and from the same method.

This will not show up in Cthulhu:

julia> using Revise

julia> Revise.track(Core.Compiler)

julia> using Cthulhu

julia> @descend optimize=false sin(1.0)
[snip]
sin(x::T) where T<:Union{Float32, Float64} @ Base.Math special/trig.jl:29

However, if I force-reevaluate do_typeinf!, it does show up:

julia> Core.eval(Cthulhu, quote
       function do_typeinf!(interp::AbstractInterpreter, mi::MethodInstance)
           result = InferenceResult(mi)
           # we may want to handle the case when `InferenceState(...)` returns `nothing`,
           # which indicates code generation of a `@generated` has been failed,
           # and show it in the UI in some way ?
           # branch on https://github.com/JuliaLang/julia/pull/42082
           frame = @static hasmethod(InferenceState, (InferenceResult,Symbol,AbstractInterpreter)) ?
                   InferenceState(result, #=cache=# :global, interp)::InferenceState :
                   InferenceState(result, #=cached=# true, interp)::InferenceState
           CC.typeinf(interp, frame)
           return nothing
       end
       end)

julia> @descend optimize=false sin(1.);
:sig = Tuple{typeof(Base.abs), Float64}
:sig = Tuple{Type{Float64}, Base.Irrational{:π}}
:sig = Tuple{typeof(Base.:(/)), Float64, Int64}
:sig = Tuple{typeof(Base.promote), Float64, Int64}
:sig = Tuple{typeof(Base._promote), Float64, Int64}

This suggests that there is some cached version of do_typeinf! that fails to notice the revision of the Core.Compiler method. @timholy any ideas - is this known?

@Keno
Copy link
Member Author

Keno commented Dec 30, 2022

If I do info = info_cachefile("Cthulhu") and then look at info.edges, I don't see an entry for Cthulhu.do_typeinf! - shouldn't there be one?

@Keno
Copy link
Member Author

Keno commented Dec 30, 2022

julia> using Revise

julia> Revise.track(Core.Compiler)

julia> using Cthulhu

julia> mi = Cthulhu.get_specialization(Tuple{typeof(gcd), Int, Int})
MethodInstance for gcd(::Int64, ::Int64)

julia> interp = Cthulhu.CthulhuInterpreter();

julia> Cthulhu.do_typeinf!(interp, mi)

julia> Core.Compiler.typeinf(interp, Core.Compiler.InferenceState(Core.Compiler.InferenceResult(mi), :global, interp))
:sig = Tuple{typeof(Base.:(==)), Int64, Int64}
:sig = Tuple{typeof(Base.Checked.checked_abs), Int64}
:sig = Tuple{typeof(Base.:(==)), Int64, Int64}
:sig = Tuple{typeof(Base.Checked.checked_abs), Int64}
:sig = Tuple{typeof(Base._gcd), Int64, Int64}
:sig = Tuple{typeof(Base.signbit), Int64}
:sig = Tuple{typeof(Base.__throw_gcd_overflow), Int64, Int64}
true

@Keno
Copy link
Member Author

Keno commented Dec 30, 2022

I think one piece of the puzzle is that Cthulhu sets compile=min for its methods.

@Keno
Copy link
Member Author

Keno commented Dec 30, 2022

Aha, I think I have the smoking gun. We collect the edges in

        jl_prepare_serialization_data(mod_array, newly_inferred, jl_worklist_key(worklist), &extext_methods, &new_specializations, &method_roots_list, &ext_targets, &edges);

but then only call

            *_native_data = jl_precompile_worklist(worklist, extext_methods, new_specializations);

afterwards, so if we do extra inference there, those edges are missed.

@vtjnash
Copy link
Member

vtjnash commented Dec 30, 2022

Extra inference seems forbidden there, since we should have already prepped the method list for serialization and decided the total content to include. What changed?

@Keno
Copy link
Member Author

Keno commented Dec 30, 2022

By what mechanism is it supposed to be forbidden?

#0  jl_type_infer (mi=mi@entry=0x7f96486a8380, world=33942, force=force@entry=0) at /home/keno/julia/src/gf.c:278
#1  0x00007f9651e5305c in jl_ci_cache_lookup (cgparams=..., src_out=0x7ffd4ab60c08, ci_out=<synthetic pointer>, 
    world=<optimized out>, mi=0x7f96486a8380) at /home/keno/julia/src/aotcompile.cpp:250
#2  jl_create_native_impl (methods=0x7f964a99f4c0, llvmmod=<optimized out>, 
    cgparams=0x7f9651f07e60 <jl_default_cgparams>, _policy=0, _imaging_mode=<optimized out>, 
    _external_linkage=<optimized out>) at /home/keno/julia/src/aotcompile.cpp:335
#3  0x00007f96523d631c in jl_precompile_ (external_linkage=external_linkage@entry=1, m=<optimized out>, m=<optimized out>)
    at /home/keno/julia/src/precompile_utils.c:254
#4  0x00007f96523e3b55 in jl_precompile_worklist (extext_methods=<optimized out>, extext_methods=<optimized out>, 
    new_specializations=<optimized out>, new_specializations=<optimized out>, worklist=0x7f9648c268f0)
    at /home/keno/julia/src/precompile_utils.c:303
#5  ijl_create_system_image (_native_data=<optimized out>, worklist=<optimized out>, emit_split=<optimized out>, 
    s=<optimized out>, z=<optimized out>, udeps=<optimized out>, srctextpos=<optimized out>)
    at /home/keno/julia/src/staticdata.c:2624

@vtjnash
Copy link
Member

vtjnash commented Dec 30, 2022

It is not supposed to be calling anything Julia related (finalizers are also disabled here, and threads are supposed to be stopped), since it will corrupt the state, though nothing checks it to detect that sort of error. Sounds like someone broke it recently then?

@vtjnash vtjnash added the regression Regression in behavior compared to a previous version label Dec 30, 2022
@vtjnash vtjnash added this to the 1.9 milestone Dec 30, 2022
@Keno
Copy link
Member Author

Keno commented Dec 30, 2022

This diff might work?

diff --git a/src/precompile_utils.c b/src/precompile_utils.c
index f251d00f76..6154a9ced7 100644
--- a/src/precompile_utils.c
+++ b/src/precompile_utils.c
@@ -188,7 +188,7 @@ static int precompile_enq_specialization_(jl_method_instance_t *mi, void *closur
                 (jl_ir_inlining_cost((jl_array_t*)inferred) == UINT16_MAX)) {
                 do_compile = 1;
             }
-            else if (jl_atomic_load_relaxed(&codeinst->invoke) != NULL || jl_atomic_load_relaxed(&codeinst->precompile)) {
+            else if (jl_atomic_load_relaxed(&codeinst->precompile)) {
                 do_compile = 1;
             }
         }

@vtjnash
Copy link
Member

vtjnash commented Dec 30, 2022

do_compile seems likely to corrupt the precompile state prepared by jl_prepare_serialization_data, if it gets called at all

Keno added a commit that referenced this issue Dec 30, 2022
As noted in #48047, we're currently attempting to infer extra
methods during incremental image saving, which causes us to miss
edges in the image. In particular, in the case of #48047, Cthulhu
had the `compile=min` option set, which caused the code instance
for `do_typeinf!` to not be infered. However, later it was
nevertheless queued for precompilation, causing inference to
occur at an inopportune time.

This PR simply prevents code instances that don't explicitly have
the `->precompile` flag set (e.g. the guard instance created for
the interpreter) from being enqueued for precompilation. It is
not clear that this is necessarily the correct behavior - we may
in fact want to infer these method instances, just before we set
up the serializer state, but for now this fixes #48047 for me.

I also included an appropriate test and a warning message if
we attempt to enter inference when this is not legal, so any
revisit of what should be happening here can hopefully make
use of those.
Keno added a commit that referenced this issue Dec 30, 2022
As noted in #48047, we're currently attempting to infer extra
methods during incremental image saving, which causes us to miss
edges in the image. In particular, in the case of #48047, Cthulhu
had the `compile=min` option set, which caused the code instance
for `do_typeinf!` to not be infered. However, later it was
nevertheless queued for precompilation, causing inference to
occur at an inopportune time.

This PR simply prevents code instances that don't explicitly have
the `->precompile` flag set (e.g. the guard instance created for
the interpreter) from being enqueued for precompilation. It is
not clear that this is necessarily the correct behavior - we may
in fact want to infer these method instances, just before we set
up the serializer state, but for now this fixes #48047 for me.

I also included an appropriate test and a warning message if
we attempt to enter inference when this is not legal, so any
revisit of what should be happening here can hopefully make
use of those.
@Keno
Copy link
Member Author

Keno commented Dec 30, 2022

I've PR'd that diff as #48054 since if fixes my immediate issue. If there's a large fix to be had, you might need to discuss with @vchuravy .

Keno added a commit that referenced this issue Dec 31, 2022
As noted in #48047, we're currently attempting to infer extra
methods during incremental image saving, which causes us to miss
edges in the image. In particular, in the case of #48047, Cthulhu
had the `compile=min` option set, which caused the code instance
for `do_typeinf!` to not be infered. However, later it was
nevertheless queued for precompilation, causing inference to
occur at an inopportune time.

This PR simply prevents code instances that don't explicitly have
the `->precompile` flag set (e.g. the guard instance created for
the interpreter) from being enqueued for precompilation. It is
not clear that this is necessarily the correct behavior - we may
in fact want to infer these method instances, just before we set
up the serializer state, but for now this fixes #48047 for me.

I also included an appropriate test and a warning message if
we attempt to enter inference when this is not legal, so any
revisit of what should be happening here can hopefully make
use of those.
Keno added a commit that referenced this issue Jan 2, 2023
As noted in #48047, we're currently attempting to infer extra
methods during incremental image saving, which causes us to miss
edges in the image. In particular, in the case of #48047, Cthulhu
had the `compile=min` option set, which caused the code instance
for `do_typeinf!` to not be infered. However, later it was
nevertheless queued for precompilation, causing inference to
occur at an inopportune time.

This PR simply prevents code instances that don't explicitly have
the `->precompile` flag set (e.g. the guard instance created for
the interpreter) from being enqueued for precompilation. It is
not clear that this is necessarily the correct behavior - we may
in fact want to infer these method instances, just before we set
up the serializer state, but for now this fixes #48047 for me.

I also included an appropriate test and a warning message if
we attempt to enter inference when this is not legal, so any
revisit of what should be happening here can hopefully make
use of those.
Keno added a commit that referenced this issue Jan 2, 2023
As noted in #48047, we're currently attempting to infer extra
methods during incremental image saving, which causes us to miss
edges in the image. In particular, in the case of #48047, Cthulhu
had the `compile=min` option set, which caused the code instance
for `do_typeinf!` to not be infered. However, later it was
nevertheless queued for precompilation, causing inference to
occur at an inopportune time.

This PR simply prevents code instances that don't explicitly have
the `->precompile` flag set (e.g. the guard instance created for
the interpreter) from being enqueued for precompilation. It is
not clear that this is necessarily the correct behavior - we may
in fact want to infer these method instances, just before we set
up the serializer state, but for now this fixes #48047 for me.

I also included an appropriate test and a warning message if
we attempt to enter inference when this is not legal, so any
revisit of what should be happening here can hopefully make
use of those.
Keno added a commit that referenced this issue Jan 2, 2023
As noted in #48047, we're currently attempting to infer extra
methods during incremental image saving, which causes us to miss
edges in the image. In particular, in the case of #48047, Cthulhu
had the `compile=min` option set, which caused the code instance
for `do_typeinf!` to not be infered. However, later it was
nevertheless queued for precompilation, causing inference to
occur at an inopportune time.

This PR simply prevents code instances that don't explicitly have
the `->precompile` flag set (e.g. the guard instance created for
the interpreter) from being enqueued for precompilation. It is
not clear that this is necessarily the correct behavior - we may
in fact want to infer these method instances, just before we set
up the serializer state, but for now this fixes #48047 for me.

I also included an appropriate test and a warning message if
we attempt to enter inference when this is not legal, so any
revisit of what should be happening here can hopefully make
use of those.
Keno added a commit that referenced this issue Jan 4, 2023
As noted in #48047, we're currently attempting to infer extra
methods during incremental image saving, which causes us to miss
edges in the image. In particular, in the case of #48047, Cthulhu
had the `compile=min` option set, which caused the code instance
for `do_typeinf!` to not be infered. However, later it was
nevertheless queued for precompilation, causing inference to
occur at an inopportune time.

This PR simply prevents code instances that don't explicitly have
the `->precompile` flag set (e.g. the guard instance created for
the interpreter) from being enqueued for precompilation. It is
not clear that this is necessarily the correct behavior - we may
in fact want to infer these method instances, just before we set
up the serializer state, but for now this fixes #48047 for me.

I also included an appropriate test and a warning message if
we attempt to enter inference when this is not legal, so any
revisit of what should be happening here can hopefully make
use of those.
Keno added a commit that referenced this issue Jan 5, 2023
As noted in #48047, we're currently attempting to infer extra
methods during incremental image saving, which causes us to miss
edges in the image. In particular, in the case of #48047, Cthulhu
had the `compile=min` option set, which caused the code instance
for `do_typeinf!` to not be infered. However, later it was
nevertheless queued for precompilation, causing inference to
occur at an inopportune time.

This PR simply prevents code instances that don't explicitly have
the `->precompile` flag set (e.g. the guard instance created for
the interpreter) from being enqueued for precompilation. It is
not clear that this is necessarily the correct behavior - we may
in fact want to infer these method instances, just before we set
up the serializer state, but for now this fixes #48047 for me.

I also included an appropriate test and a warning message if
we attempt to enter inference when this is not legal, so any
revisit of what should be happening here can hopefully make
use of those.
@vtjnash vtjnash reopened this Jan 5, 2023
@timholy
Copy link
Member

timholy commented Jan 5, 2023

xref analysis in #48054 (comment)

KristofferC pushed a commit that referenced this issue Jan 10, 2023
As noted in #48047, we're currently attempting to infer extra
methods during incremental image saving, which causes us to miss
edges in the image. In particular, in the case of #48047, Cthulhu
had the `compile=min` option set, which caused the code instance
for `do_typeinf!` to not be infered. However, later it was
nevertheless queued for precompilation, causing inference to
occur at an inopportune time.

This PR simply prevents code instances that don't explicitly have
the `->precompile` flag set (e.g. the guard instance created for
the interpreter) from being enqueued for precompilation. It is
not clear that this is necessarily the correct behavior - we may
in fact want to infer these method instances, just before we set
up the serializer state, but for now this fixes #48047 for me.

I also included an appropriate test and a warning message if
we attempt to enter inference when this is not legal, so any
revisit of what should be happening here can hopefully make
use of those.

(cherry picked from commit 80aeebe)
vtjnash added a commit that referenced this issue Jan 17, 2023
vtjnash added a commit that referenced this issue Jan 17, 2023
Make sure things are properly ordered here, so that when serializing,
nothing is mutating the system at the same time.

Fix #48047
vtjnash added a commit that referenced this issue Jan 17, 2023
Make sure things are properly ordered here, so that when serializing,
nothing is mutating the system at the same time.

Fix #48047
vtjnash added a commit that referenced this issue Jan 19, 2023
Make sure things are properly ordered here, so that when serializing,
nothing is mutating the system at the same time.

Fix #48047
vtjnash added a commit that referenced this issue Feb 6, 2023
Make sure things are properly ordered here, so that when serializing,
nothing is mutating the system at the same time.

Fix #48047

(cherry picked from commit 87b8896)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
regression Regression in behavior compared to a previous version
Projects
None yet
3 participants