-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
loading CxxWrap.jl causes slowdown in GAP.jl #258
Comments
I minimized the example in the sense that while it still relies on GAP.jl, it does so in a minimal fashion: via ccall: using GAP, BenchmarkTools
function func(ptr::Ptr{Cvoid})::Any
return ccall(:julia_gap, Any, (Ptr{Cvoid},), ptr)
end
function test_foo()
for i = 1:1000000
v = ccall(:GAP_ValueGlobalVariable, Ptr{Cvoid}, (Ptr{UInt8},), "IsAbelian")
x = func(v)
end
end
@btime test_foo() However, I failed reduce this further, by replacing function func(ptr::Ptr{Cvoid})::Any
return ccall(:strlen, Cint, (Ptr{Cvoid},), ptr)
end
function test_foo()
for i = 1:1000000
v = ccall(:strchr, Ptr{Cvoid}, (Ptr{UInt8}, Cint), "test", 't')
x = func(v)
end
end
@btime test_foo() Note that the C function |
Do you have a stack trace of how it gets to |
Here are two profiles from running the original test under 1. Code
2. Profile with CxxWrap3. Profile without CxxWrap |
CxxWrap adds a lot (more than 300) of convert methods, maybe these interfere here somehow. You can check if that is the problem by commenting out the evals that define |
I managed to narrow it down further: function func_slow(ptr::Ptr{Cvoid})::Any
return ccall(:julia_gap, Any, (Ptr{Cvoid},), ptr)
end
function test_slow()
for i = 1:1000000
x = func_slow(C_NULL)
end
end
@btime test_slow() This goes from 2.2ms to 185ms when CxxWrap.jl is loaded. Let me point out once again that Aaanyway: What I noticed is this: Before loading julia> @code_typed test_slow()
CodeInfo(
1 ─ goto #7 if not true
2 ┄ %2 = φ (#1 => 1, #6 => %9)::Int64
│ $(Expr(:foreigncall, :(:julia_gap), Any, svec(Ptr{Nothing}), 0, :(:ccall), :($(QuoteNode(Ptr{Nothing} @0x0000000000000000))), :($(QuoteNode(Ptr{Nothing} @0x0000000000000000)))))::Any
│ %4 = (%2 === 1000000)::Bool
└── goto #4 if not %4
3 ─ goto #5
4 ─ %7 = Base.add_int(%2, 1)::Int64
└── goto #5
5 ┄ %9 = φ (#4 => %7)::Int64
│ %10 = φ (#3 => true, #4 => false)::Bool
│ %11 = Base.not_int(%10)::Bool
└── goto #7 if not %11
6 ─ goto #2
7 ┄ return
) => Nothing
julia> @code_typed func_slow(C_NULL)
CodeInfo(
1 ─ %1 = $(Expr(:foreigncall, :(:julia_gap), Any, svec(Ptr{Nothing}), 0, :(:ccall), :(ptr), :(ptr)))::Any
└── return %1
) => Any After loading CxxWrap.jl, I get this instead: julia> @code_typed test_slow()
CodeInfo(
1 ─ goto #7 if not true
2 ┄ %2 = φ (#1 => 1, #6 => %9)::Int64
│ invoke Main.func_slow(Main.C_NULL::Ptr{Nothing})::Any
│ %4 = (%2 === 1000000)::Bool
└── goto #4 if not %4
3 ─ goto #5
4 ─ %7 = Base.add_int(%2, 1)::Int64
└── goto #5
5 ┄ %9 = φ (#4 => %7)::Int64
│ %10 = φ (#3 => true, #4 => false)::Bool
│ %11 = Base.not_int(%10)::Bool
└── goto #7 if not %11
6 ─ goto #2
7 ┄ return
) => Nothing
julia> @code_typed func_slow(C_NULL)
CodeInfo(
1 ─ %1 = Main.Any::Core.Compiler.Const(Any, false)
│ %2 = $(Expr(:foreigncall, :(:julia_gap), Any, svec(Ptr{Nothing}), 0, :(:ccall), :(ptr), :(ptr)))::Any
│ %3 = Base.convert(%1, %2)::Any
└── return %3
) => Any That is, it fails to inline And if I drop the I suspect this is because of the CxxWrap.jl method listed here:
So whenever Julia code contains a call All this suggests how I might resolve the issue (by dropping that |
... by dropping the `::Any` return value annotation in `RAW_GAP_TO_JULIA`: Normally, Julia optimizes this away as a no-op. But CxxWrap.jl installs a method for converting `CxxWrap.CxxWrapCore.SmartPointer{DerivedT}` to `Any`; this precludes Julia from optimizing the conversion away. See also: - oscar-system#485 - JuliaInterop/CxxWrap.jl#258
... by dropping the `::Any` return value annotation in `RAW_GAP_TO_JULIA`: Normally, Julia optimizes this away as a no-op. But CxxWrap.jl installs a method for converting `CxxWrap.CxxWrapCore.SmartPointer{DerivedT}` to `Any`; this precludes Julia from optimizing the conversion away. See also: - #485 - JuliaInterop/CxxWrap.jl#258
Does the slowdown also happen if instead of loading CxxWrap, you add a method do convert? e.g. do this before running the test: julia> struct Foo end
julia> Base.convert(::Type{Any}, x::Foo) = x |
Yes indeed, it does. |
OK, then I guess it's best to open an issue for this with Julia itself? |
Well, one might argue that It still might be a good idea to get some core Julia people involved, just as a form of sanity check? |
Fixed in the master branch! |
The package GAP.jl does not use CxxWrap.jl at all, but is part of an umbrella project (Oscar.jl), which happens to include two other packages which both use CxxWrap.jl; as a result, GAP.jl and CxxWrap.jl are often loaded at the same time.
I recently discovered that just the act of loading CxxWrap.jl will substantially slow down GAP.jl. For example, take this snippet (taken from oscar-system/GAP.jl#485):
This takes 48 ms for me in Julia 1.4.2 and CxxWrap v0.10.2. However, if I then do
using CxxWrap
, the same test suddenly takes 238ms. With v0.11.0 it seems to have gotten a bit worse and now takes ~290 ms. I see no difference with Julia 1.5, I think (but i have to carefully remeasure everything if anybody thinks that the Julia version might matter).@rbehrends looked briefly into it and reported the following:
I am going to try to reduce this to a smaller example, ideally one that doesn't involve GAP.jl; but in the meantime, I thought I should already post this, in case anything of the above rings a bell for anybody.
The text was updated successfully, but these errors were encountered: